Convert a full website to PDF
Victor sent me an email some days ago asking if I knew of a way to convert a full website into PDF format. I knew that there were several ways to convert a single page to PDF but I was not so sure about complete websites. I ran all kinds of searches but never came up with free software that would convert a full website to PDF. That is, until I stumbled upon the most obvious choice of them all: the trial version of Adobe Acrobat.
Adobe Acrobat can be used as a trial version for 30 days and it has exactly the functionality that Victor is looking for. You can download it directly from Adobe, though not without some nagging on their part. Before you can download the file you need to register an account and request the download link, which will be sent to your email account. My advice to Adobe is that if they want to lower their sales further they should make it even more complicated for users to download the trial versions.
Download and installation take a while since the download has a size of more than 250 Megabytes. Once installed, though, everything could not be smoother. Press the keyboard shortcut Shift-Ctrl-O or click on the Create PDF > From Web Page button. A menu opens that asks for a URL and offers several options on how to proceed from here.
You can specify how many levels of links to follow from the starting page, or that you want to convert the whole website. It is normally a good idea to stay on the same server, and even on the same path. If you do not, Acrobat may download a lot of unrelated pages from other websites or from other areas of the same website. The transfers are shown in a box while they run, and the finished document is displayed at the end.
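To make the "stay on the server" and "stay on the path" options more concrete, here is a minimal Python sketch of how such a link filter could work. It is not Acrobat's actual implementation; the function name in_scope and its parameters are made up for illustration, and "path" is assumed to mean the directory of the starting page.

```python
from urllib.parse import urljoin, urlsplit


def in_scope(start_url, link, stay_on_server=True, stay_on_path=True):
    """Return True if `link` should be followed when capturing from `start_url`.

    Hypothetical helper: it only mimics the idea of the two options,
    it is not taken from any real product.
    """
    start = urlsplit(start_url)
    # Resolve relative links such as "page2.html" against the start page.
    target = urlsplit(urljoin(start_url, link))

    # Skip mailto:, javascript: and other non-web links.
    if target.scheme not in ("http", "https"):
        return False

    # "Stay on server": the host must match the starting host.
    if stay_on_server and target.netloc.lower() != start.netloc.lower():
        return False

    # "Stay on path": the link must live under the start page's directory,
    # e.g. "/docs/" for a start page of "/docs/index.html".
    if stay_on_path:
        base_dir = start.path.rsplit("/", 1)[0] + "/"
        if not (target.path or "/").startswith(base_dir):
            return False

    return True


start = "http://example.com/docs/index.html"
print(in_scope(start, "page2.html"))
print(in_scope(start, "/blog/post.html"))
print(in_scope(start, "http://other.org/docs/x.html"))
```

With both options on, only the first link is followed. Turning stay_on_path off would also let in /blog/post.html, which is how a capture can balloon into unrelated areas of the same site.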
Works perfectly. The result can then be saved as a PDF document.
Personally using http://pdfmyurl.com/
They offer a very complete API to convert web pages to PDF including full CSS2 and JavaScript, but also many layout options and options to apply a watermark or password to the resulting PDF.
PDFmyURL has an HTML to PDF API as well as a full-blown SDK that saves you a lot of programming. Definitely a time saver. Also, PDFmyURL offers a very extensive web to PDF API with a lot of options. Their pricing is also very transparent compared with many other providers: they charge by the number of PDFs per month instead of credits that nobody understands.
It worked really cool. Thanks a ton !
Most web page (not website) to PDF converters will not save “hidden” drop-down button contents.
The “hidden” drop-down buttons appear in Google Chrome print to PDF but are not visible on the website. The content (not the button) is visible on the website but there is no way to save it because Chrome only prints the open button.
A perfect example of this problem is “Job Descriptions” at http://www.USAjobs.gov for CBP Officer. The job description is visible on the website but not the “hidden” drop-down button, so you cannot save it to PDF.
The tools below were able to convert the “hidden” drop-down button, but with a huge, irritating watermark:
http://www.htm2pdf.co.uk/
http://pdfmyurl.com/
http://kitpdf.com/web_to_pdf/
http://pdfmyurl.com was mentioned here before and now also offers full website to pdf conversion – see http://pdfmyurl.com/batch-web-to-pdf-api
It’s in BETA right now, but seems to do exactly what this article is about.
I tried to convert my site, but the conversion stops where it asks for the user name and password. What can I do to continue?
Please help
Good Luck with Adobe.
As of Acrobat 11.0.10, if the site is behind a login/password, you will not be able to convert the website to PDF.
This is a 10-year-old issue that Adobe has yet to address, which is surprising due to their stellar reputation of being considerate to their customers…who pay $200+ a seat for their software.
Hello everyone! My suggestion is to have a look at this free tool http://kitpdf.com/web_to_pdf/ and try the conversion. Simple and easy to use, just upload the URL and see the results. You can also convert pdf files into epub or mobi formats, when needed.
Yah, but it only converts a single page, not a full web site (which is the subject of this thread).
You can try http://website2pdf.net/
It is recommended to enter subdirectories, otherwise the conversion lasts a long time.
thank you saved me
thank you but where is the Link.
This site does not convert websites. In fact, it doesn’t do much better than any PDF print driver. It does not follow links on pages, and it does not download and convert anything but what you see on the screen.
Now if you add an option “Follow links on pages Y/N”, and “Stay inside the domain Y/N” then it will be worth something.
thanks a billion
Hello,
thank you very much. If it is still true, that would have saved me endless searching! :-)
Wow, the download is really huge.
Greetings
Juy Juka
Valuable info. Lucky me, I found your website accidentally, and I am stunned that this twist of fate didn’t take place earlier! I bookmarked it.
Not good, it only does it for a single page… I need this for my whole website!
there’s a little online tool at http://www.renderhtml.com that can convert websites to pdf.
Yah, but it only converts a single page, not a full web site (which is the subject of this thread).
thank you very much, I was searching for this for quite a while
Thanks for this tip. I just converted a whole html e-book (over 300 pages with illustrations) and it worked beautifully!
Adobe Acrobat can be a great tool for archiving websites to PDF when it works, but it's severely limited for bigger jobs by crashes due to out of memory errors. The out of memory or IO errors that come up mean the program terminates, and the PDF capture you see on the screen can never be saved or recovered, even if you preserve the temp file.
I do legal research, and often I need to capture a blog (e.g. on Blogger) so I have a permanent record of what was written and what the links link to (since blogs can change and content can be deleted). If I try to do a 2-level capture, it's usually OK, but it does not get enough data: it links out one level and that is it. Usually you need at least 3 (link out to another page, and the subsequent content/document hosted there). However, when I do a 3-level capture of an entire blog site, after about 8000 PDF pages I get out of memory errors in XP or Win7 on machines with 1 to 4 GB of RAM and 40 to 500 GB of free hard disk space.
I thought it was just me, but I tried it on 4 different machines and read the forums, and this is a common unresolved problem. It has something to do with the size of the temporary folder, and/or the file size limitations of Acrobat and Windows. It seems that Acrobat creates a single temp PDF file in a directory of the downloaded site, and either that file gets too big for it to handle, or when you try to save the file it gets too big for the temp folder, regardless of how much disk space you have. Emptying the temp folder does not help. There is no resolution to this issue yet: I had this problem with Acrobat 7 and 8, and it still exists in the Acrobat X trial I used last week. Adobe should try to re-engineer the save mechanism so that it caps the temp file size and then processes it into a saved PDF before continuing, or allows you to set a page limit, or enables you to continue a capture from a certain page/file count.
Bottom line, it's a great tool if you use it mildly and limit the depth/size of your captures. If you don’t, you will waste hours of time and bandwidth, since the resulting file will just crash the program and be irrecoverable, so be cautious. Peace-out from Ottawa.
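To illustrate how quickly a capture grows with each extra level, and what the page limit suggested above could look like, here is a rough Python sketch. It walks a small made-up link map instead of fetching real pages. The function plan_capture, its parameters and the sample data are all hypothetical, and it assumes the starting page counts as level 1.

```python
from collections import deque


def plan_capture(links, start, max_levels, max_pages=None):
    """List the pages reachable within `max_levels` of `start`, breadth first.

    `links` maps each page to the pages it links to. This is hypothetical
    sample data; a real crawler would download and parse each page instead.
    """
    seen = {start}
    order = []
    queue = deque([(start, 1)])  # assumption: the start page is level 1
    while queue:
        page, level = queue.popleft()
        # Optional safety cap, so the capture stops before it gets too large.
        if max_pages is not None and len(order) >= max_pages:
            break
        order.append(page)
        if level == max_levels:
            continue  # do not follow links beyond the last level
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, level + 1))
    return order


links = {
    "blog": ["post1", "post2"],
    "post1": ["source.pdf"],
    "post2": ["post1", "other-site"],
}
print(plan_capture(links, "blog", 2))
print(plan_capture(links, "blog", 3))
print(plan_capture(links, "blog", 3, max_pages=2))
```

In this toy example the 2-level plan stops at the two posts, while the 3-level plan also pulls in the documents they link to. The max_pages cap is the kind of page limit the comment asks for: it ends the run early instead of letting it grow until something breaks.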
oh oh oh.. I’ve got a problem with Adobe Acrobat. The pages are not in the proper sequence. Example:
link1->link-2-[data]
link1->link-3-[data]
then again
link1->link-2-[data]…
I think all the data under link1 should come continuously.. which is not the case for me… I can’t make proper page numbers because of this problem…
Has anyone faced the same problem???
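For what it's worth, the order described here, where everything under one link stays together, is a depth-first order. The sketch below does not change how Acrobat arranges pages; it only shows how that order could be worked out from a link map, for example as a guide for rearranging pages afterwards. The function contiguous_order and the sample data are made up for illustration.

```python
def contiguous_order(links, start):
    """Depth-first (pre-order) page order: each page is followed directly
    by everything reachable under it, before the next sibling starts.

    `links` maps a page to the pages it links to (hypothetical sample data).
    """
    order = []
    seen = set()

    def visit(page):
        if page in seen:
            return  # a page linked from several places is listed only once
        seen.add(page)
        order.append(page)
        for child in links.get(page, []):
            visit(child)

    visit(start)
    return order


links = {
    "link1": ["link2", "link3"],
    "link2": ["data2a", "data2b"],
    "link3": ["data3"],
}
print(contiguous_order(links, "link1"))
```

Here all the data under link2 is listed before link3 begins, so page numbers run continuously within each section.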
yes, I am facing the problem given below:
1) The convert web page to PDF option does not convert web pages in Hindi or Devanagari script. How can I do it?
Thanks, I did it with Adobe Acrobat. Yesterday I wasted 5 hours on this task and now the software did it for me in 15 minutes.
I like this site very much
Now that over a year has passed, do we know if there are any good WordPress plugins that will convert a series of posts into a single PDF? I tried poking around myself but still see that the default is to convert a single page or post. Any updates would be appreciated.
Hi,
I am getting “Authorization Failure” error message when I give the Windows Authenticated SharePoint portal URL. Could you please let me know how to fix this?
Regards,
Prakash
Try pdfmyurl, it’s free and you don’t need to download or install anything.
http://pdfmyurl.com
Yah, but it only converts a single page, not a full web site (which is the subject of this article).
Actually it also converts a whole website to PDF now. Please check out http://pdfmyurl.com/entire-website-to-pdf
thanks
Very nice indeed! Very useful.
Does anybody know if the links are preserved in the conversion?
thank you but where is the Link
Acrobat did convert my website to pdf.
However, I have a page with tree nodes that calls a .js file for its expansion functions. After conversion, the tree nodes could not be opened in the PDF document.
Any help on how I could get the tree nodes to expand and contract in the PDF document would be much appreciated.
It LOOKS like it’s going to work then I get a message…
‘Nothing Done’
Authorization failure and the url
Is there a way to get around this?
Thanks
great tutorial!!! But hey, you can use PrimoPDF too, I think
Do you know of any open source products that will convert websites to PDFs or is Acrobat my only option?
This converter is online and free http://website2pdf.net/
Wow, this is a nice choice. This converts whole website into pdf. I myself downloaded 2000 pages of a learning site!!
how can i convert it to pdf
Using Firefox and the “Scrapbook” add-on I captured a website of 450+ pages–over 17 MB in size–but all links intact, all graphics, etc. Using Acrobat’s create pdf from webpage as Martin described, the final file was just over 2 MB–and worked perfectly. Bravo Martin!!
where is the link
iamdrin check this site:
http://www.htm2pdf.co.uk/default.aspx
That only converts a page, not a whole site.
How to convert single page into pdf? Webpage from hard drive…please help :0)
use cutepdf
cool feature, I’ll have to give it a try
great,
thank you very much.
If you don’t want to use Acrobat but want to pull whole websites into your computer use HTTrack (http://www.httrack.com).
What it does: It’s like the Google Chrome “Save Page as HTML” but it pulls ALL pages on the site into one folder, not only the current page.
It’s so good you won’t even think you’re accessing the site offline—from .png, .gif to .zip and .mp3, nothing is left behind. Site is perfectly copied.
Practical for HUGE sites and if you don’t have enough memory to open 5000+ pages PDFs.
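HTTrack also has a command-line version. As a rough sketch, the Python helper below only builds a command line for it and leaves running it to you. The helper httrack_command and its parameters are made up for illustration. The -O (output folder) and -rN (mirror depth) options and the "+" filter syntax are taken from the HTTrack manual as I remember it, so check them against the documentation of your installed version.

```python
from urllib.parse import urlsplit


def httrack_command(start_url, output_dir, depth=None):
    """Build an argument list for the httrack command-line tool.

    Hypothetical helper: it only assembles the arguments and does not
    run anything.
    """
    host = urlsplit(start_url).netloc
    cmd = [
        "httrack",
        start_url,
        "-O", output_dir,   # folder that receives the mirrored site
        f"+{host}/*",       # filter: only keep pages from the starting host
    ]
    if depth is not None:
        cmd.append(f"-r{depth}")  # limit how many levels of links to follow
    return cmd


print(httrack_command("http://www.example.com/", "mirror", depth=3))
```

If HTTrack is installed and on your PATH, the list can be passed to subprocess.run(cmd, check=True) to start the mirror.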
I use Print Friendly’s BOOKMARKLET. If using Firefox, just drag the icon where it says “Add Print Friendly to your browser” on this web page: http://www.printfriendly.com/ to your browser client and you are all set. I turned on VIEW > TOOLBARS> BOOKMARKS so you can drag it to the Bookmarks toolbar that now appears below the Address bar. When you find a web page you want to print, just click on the Print Friendly ICON on the Bookmarks toolbar and voila! You have created a beautiful PDF doc without all the popups and other extraneous junk on the web page.
http://www.printfriendly.com/
ONCE AGAIN, drag the “Print Friendly” icon button to your browser toolbar, preferably the Bookmarks toolbar and that is all you need to do. No install reqd. No need to use your PRINT TO from the File menu.