Huge memory improvements coming to Firefox 29's pdf.js PDF reader
Mozilla launched Firefox's native PDF reader in Firefox 19 to provide users of the browser with an alternative to plugin-based readers such as Adobe Reader or Foxit Reader.
The idea was to reduce the browser's dependency on plugins, and the creation of a native PDF reader did just that for PDF-related plugins.
Although the reader is built into the browser, Firefox users can still switch to a different PDF viewer if they want to. This makes sense under certain circumstances, for instance when a feature is required that PDF.js does not support.
If you have been using Firefox's built-in PDF reader, you may have noticed that its memory consumption can shoot through the roof quite easily.
It is not uncommon for memory usage to jump by a couple of hundred megabytes when PDF documents are opened in PDF.js. While that depends largely on the document itself, memory usage often appears to be higher than it should be.
Mozilla's master of memory Nicholas Nethercote just confirmed that improvements are coming to PDF.js that significantly reduce its memory consumption under certain conditions.
He notes that the PDF viewer's high memory consumption secured it a place on the top 5 list of Mozilla's MemShrink project.
Nicholas implemented four improvements that greatly reduce memory consumption for certain kinds of documents:
- Image Masks - These types of images determine which parts of an image need to be drawn. The change skips one of the processing steps entirely, which significantly reduces memory usage when these types of images are processed. Nicholas noticed a reduction in memory use of up to 50%.
- Image Copies - Some PDF documents consist only of images that have been added to them (one image per page). PDF.js makes five copies of each image (three in JavaScript, two in C++). Nicholas managed to reduce the size of copies 3 to 5 without causing any slow-downs in the process. In addition, some processing steps are skipped "in simple cases", which reduces memory consumption further. According to Nicholas, this saves about "128 MiB of allocations" per page.
- Black and White scanned documents - The same optimization technique that was used for Image Masks has been applied to black and white scanned documents as well. By avoiding one step, both memory consumption and rendering time are reduced significantly. Nicholas mentions one large PDF document that pushed Firefox's memory consumption to 7800 MiB when he scrolled through it quickly. With the patch applied, this dropped to about 700 MiB.
- Parsing - The only improvement that is not related to images. Strings parsed by PDF.js are often too short to benefit from SpiderMonkey's string optimization feature. Nicholas managed to get around this by combining the strings into arrays.
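To make the parsing change a little more concrete, here is a minimal TypeScript sketch of the general idea of collecting pieces in an array and joining them once, instead of growing a string piece by piece. It is not taken from PDF.js: the helper name `readToken` and the space-delimited input are assumptions made purely for illustration, and the actual patch may look quite different.

```typescript
// Hypothetical helper, not PDF.js code: reads characters from `start`
// up to the next space and returns them as one string.
function readToken(input: string, start: number): string {
  // Collect the characters in an array instead of appending to a string.
  const parts: string[] = [];
  for (let i = start; i < input.length && input[i] !== " "; i++) {
    parts.push(input[i]);
  }
  // Build the final string in a single step.
  return parts.join("");
}

// Index 4 is the "1" of "12", so this prints "12".
console.log(readToken("obj 12 0 R", 4));
```

Whether this pattern saves memory depends on the JavaScript engine and on how long the strings are, so it is best treated as an illustration of the approach rather than a general recommendation.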
The changes improve Firefox's built-in PDF reader significantly when documents that benefit from these optimizations are opened. The main gain is lower memory consumption, but the loading time of PDF documents may improve as well.
The changes will be released with Firefox 29, which means that Aurora and Nightly users benefit from them already.
Pdf.js has two fundamental problems.
1. It’s based on Javascript.
2. Javascript sucks.
They should have just used an already existing fast library for pdf. Like the ones SumatraPDF used.
I agree.
Oh, sure Martin, rub Nightly into my face again! :-D
Or you can just grab the beta pdf.js and enjoy the improvements now ;)
Interadasting, i thought they had stopped updating the extension.
It is what it is ;)