How to save websites to your hard drive
There are several ways to save a website to your local hard drive, and which one to use depends largely on your needs. If you only want to save textual information, you can simply copy and paste the contents into a local text file on your computer. If you want to preserve the links, you need to save the page in HTML format. Most browsers have an option to save a page locally, but what if you need more than one page, or would like the linked pages as well?
You could open every page and save it individually, but this has some disadvantages. First, there is no link structure between the saved pages. If you want to open page 1, you have to find the index file for page 1, which is different from that of every other page. It's great for single pages but not great for entire websites or networks.
Before I start with the solution, I'd like to point out some of the reasons why someone would want to save a website to a local drive:
- Fear that the site will be deleted. (Maybe it's hosted at GeoCities or a similar service; everyone knows that sites tend to come and go pretty fast on free web hosts.)
- For offline browsing. Maybe you don't have a flat-rate connection and have to pay for the minutes you are online. It could also be that you would like to transfer the website to a PC that has no internet connection. This includes the case where you want to install a new OS, e.g. Linux, and have difficulties configuring the internet connection. You could save tutorial sites on your PC before you make the change.
- You are a collector. Maybe you want to download a site where images, music files, or game cheat codes are posted on a daily basis.
We will be using the free tool HTTrack, which is available for Windows, Mac OS X and Linux personal computers. Download it from the official website, httrack.com.
Every website that you save to your local drive is stored in a project file. The first step after you've started HTTrack is to create a new project by clicking NEXT.
Enter some basic information about the project: its name, its category, and the path where you want to save it. I suggest a drive with enough space for all of the website's files. Please note that you can't create a new directory in the program itself.
The next screen holds the most important options for your project. You select an action and add the URLs that the action should be performed on. If you want to download an entire website, select Download web site(s) and add the URLs to the web address field.
If you only want to download certain file types, select Get separated files. You specify the file types by clicking Set Options and selecting Scan Rules.
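To make the idea of scan rules more concrete, here is a minimal sketch in Python of how include / exclude rules based on wildcards can select file types. It assumes rules written as a + or - sign followed by a wildcard pattern, and it assumes that the last matching rule decides. The function name is_allowed and the example URLs are made up for illustration, and Python's wildcard matching is only an approximation of whatever matching HTTrack performs internally.

```python
from fnmatch import fnmatchcase

def is_allowed(url, rules, default=False):
    """Return True if the URL passes the rules.

    Each rule is '+pattern' (include) or '-pattern' (exclude).
    Assumption: when several rules match, the last one wins.
    """
    allowed = default
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(url, pattern):
            allowed = (sign == "+")
    return allowed

# Hypothetical rule set: reject everything, then allow two image types.
rules = ["-*", "+*.jpg", "+*.png"]

print(is_allowed("http://www.example.com/img/photo.jpg", rules))
print(is_allowed("http://www.example.com/index.html", rules))
```

With these rules the first URL is accepted and the second is rejected. Swapping the order so that "-*" comes last would reject everything, which is why the order of the rules matters in this sketch.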
You can add URLs by simply typing them into the text field or by clicking Add URL. Clicking Add URL lets you enter a website you want to download and add login information for that website. HTTrack can also capture URLs by using a proxy.
Set Options leads to the project's options page. You can specify many settings here: the depth of the website scan, whether to follow external links, which files and directories to include or exclude, and much more.
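The scan depth is easier to picture with a small example. The following Python sketch walks a made-up, in-memory link graph breadth-first and stops following links once a maximum depth is reached. The names pages_within_depth and site are hypothetical, nothing is downloaded, and the way HTTrack itself counts depth levels may differ from this illustration.

```python
from collections import deque

def pages_within_depth(start, links, max_depth):
    """Breadth-first walk; returns {page: depth} for every page that
    can be reached from start in at most max_depth clicks."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if depths[page] == max_depth:
            continue  # depth limit reached, do not follow links from here
        for target in links.get(page, []):
            if target not in depths:  # skip pages already seen
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical site: each page maps to the pages it links to.
site = {
    "index.html": ["about.html", "news.html"],
    "news.html": ["story1.html"],
    "story1.html": ["comments.html"],
}

print(pages_within_depth("index.html", site, 1))
```

With a depth of 1 only the start page and the two pages it links to are collected. Raising the depth to 2 adds story1.html, and comments.html only appears at depth 3.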
With the default settings, HTTrack downloads all pages that belong to the website itself and refuses to download pages from external websites.
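The distinction between internal and external links can be sketched in a few lines of Python. This example assumes that "internal" simply means "same host name as the page the link was found on". That is an assumption made for illustration, and HTTrack's own rule for what belongs to a site may be more refined. The function name is_internal and the URLs are made up.

```python
from urllib.parse import urljoin, urlsplit

def is_internal(link, page_url):
    """True if link, resolved relative to page_url, stays on the same host."""
    target = urljoin(page_url, link)  # turns relative links into absolute ones
    return urlsplit(target).hostname == urlsplit(page_url).hostname

page = "http://www.example.com/docs/index.html"

for link in ["chapter2.html", "/images/logo.png", "http://other.example.org/"]:
    print(link, is_internal(link, page))
```

The two relative links resolve to www.example.com and count as internal, while the link to other.example.org counts as external and would be skipped under the default behaviour described above.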
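That means that if you only want to download a single website, you can try the default settings first and take a look at the result. PHP files will be saved as HTML, since what the server delivers is the generated HTML page rather than the PHP source.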
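As a rough illustration of why a page such as view.php ends up as an HTML file on disk, here is a simplified Python sketch that maps a URL to a local file name. It is not HTTrack's actual naming scheme: the function local_name, the choice to treat only .php as a dynamic extension, and the decision to ignore the query string are all assumptions made for this example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Assumption: only .php is treated as a dynamic page in this sketch.
DYNAMIC_EXTENSIONS = {".php"}

def local_name(url):
    """Map a URL to a relative local file name (simplified sketch)."""
    path = urlsplit(url).path  # the query string is ignored here
    if path == "" or path.endswith("/"):
        path += "index.html"   # directory URLs get an index file
    local = PurePosixPath(path.lstrip("/"))
    if local.suffix.lower() in DYNAMIC_EXTENSIONS:
        local = local.with_suffix(".html")  # store the generated page as HTML
    return str(local)

print(local_name("http://www.example.com/forum/view.php?id=3"))
print(local_name("http://www.example.com/img/logo.png"))
print(local_name("http://www.example.com/"))
```

Because the query string is dropped, two URLs that differ only in their parameters would map to the same file in this sketch. A real mirroring tool needs some way to keep such pages apart.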