PDF Masher, Turn PDF Documents Into HTML Documents

Martin Brinkmann
Jul 12, 2011
Updated • Nov 22, 2012

PDF Masher has been designed for users who read ebooks on their mobile devices. PDF is not the best format for that purpose, considering that it is not possible to change a document's font size, for instance. While it is possible to use the device's zoom function to read the document, it is usually not a very comfortable option, especially for large documents.

The HTML format offers an alternative. While it is often not that pretty to look at, it provides better controls to read and work with the text of a document. Tools like Calibre can convert PDF documents into various formats. Their disadvantage is that they often do not get it completely right, so that headers, footers and other text that is not needed to read the document end up in the output.

Enter PDF Masher. The open source software turns PDF documents into HTML pages. Instead of relying on guesswork or an algorithm to extract text from the PDF document, it asks the user to identify and select the text that should be available in the new document.

You can load a PDF document via the Open File button at the top of the interface. PDF Masher scans the document and displays all the text that it found in a table-like structure.

The sortable table displays the font size, x and y position, text length and the text itself, among other data. This makes it relatively easy to identify the text that you want included in the resulting document. A click on a row displays that row's text in the lower half of the screen. Here it is possible to add, edit or delete text directly. That's helpful if the automatic text detection made mistakes that need to be corrected.

It is furthermore possible to mark one or more text elements as ignored so that they do not turn up in the new document.

Lines can also be set as footnotes and titles. Footnotes, for instance, are automatically moved to the last page of the new document, so that they do not interrupt the main text.
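The workflow described above can be illustrated with a small Python sketch. This is not PDF Masher's actual code: the `TextElement` class, its field names, the state labels and the `build_html` function are all hypothetical, chosen only to mirror the columns and options mentioned in this article. It assumes each piece of extracted text carries a font size, a position and a user-assigned state.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class TextElement:
    """One row of the table (hypothetical structure, not PDF Masher's own)."""
    page: int
    x: float
    y: float
    font_size: float
    text: str
    state: str = "normal"  # "normal", "title", "footnote" or "ignored"


def build_html(elements):
    """Turn user-labelled text elements into a simple HTML fragment."""
    body = []
    footnotes = []
    for el in elements:
        if el.state == "ignored":
            continue  # headers, footers, page numbers and the like are dropped
        if el.state == "title":
            body.append(f"<h1>{escape(el.text)}</h1>")
        elif el.state == "footnote":
            footnotes.append(f"<li>{escape(el.text)}</li>")  # collected for the end
        else:
            body.append(f"<p>{escape(el.text)}</p>")
    if footnotes:
        body.append("<ol>" + "".join(footnotes) + "</ol>")
    return "\n".join(body)


if __name__ == "__main__":
    sample = [
        TextElement(1, 72, 20, 8, "Chapter 1 | Page 1", "ignored"),
        TextElement(1, 72, 60, 18, "Introduction", "title"),
        TextElement(1, 72, 100, 11, "Body text & more."),
        TextElement(1, 72, 700, 8, "A note.", "footnote"),
    ]
    print(build_html(sample))
```

The point of the sketch is that the user's labels, not an automatic heuristic, decide what ends up in the output: ignored rows vanish, titles become headings, and footnotes are gathered at the end.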

The developer has created a short video that demonstrates the program's functionality.

PDF Masher is a handy program for users who want better control and readability on their mobile devices. The manual conversion options may take longer than automatic conversions, but they ensure that the accessibility of the document is improved.

Users who want to convert multiple documents at once need to look at other programs for the job. If it is just one document, then PDF Masher is one of the best options, provided that you are fine with the resulting HTML format.

PDF Masher is available for Mac OS X, Linux and Windows operating systems. It can be downloaded from the developer website.


Comments

  1. nad rosenberg said on July 27, 2011 at 5:22 pm

    How does PDF Masher handle bullets? How about tables?

  2. computerfella said on July 13, 2011 at 12:25 am

    You should have mentioned you are limited to 40+ hours of use before it quits working. I had downloading stuff only to find out it is begware.
