Convert HTML files to Plain Text properly - gHacks Tech News

Convert HTML files to Plain Text properly

There are several reasons why you'd want to convert local or online HTML files to the plain text format (.txt). Maybe you want to move the files to a device that can't read or display HTML files properly, maybe you'd like to turn multiple HTML documents into a single text document for easier archiving, or maybe you just need the textual information from the documents for work.

You can use copy and paste to do that, or go through the source code manually, but you will quickly realize that either approach takes time. Going through the source code is usually not the best option, as you may end up copying HTML tags to the new document, where they are not interpreted and show up as literal markup in the plain txt file. Depending on the structure of the HTML file, you may also have issues copying its textual contents when you view it in a browser.

Nirsoft's HTMLAsText comes to the rescue, as it provides you with an automated way of converting HTML files to plain text. The program has been designed to work with single and multiple HTML files, as long as the documents are stored in a single folder or folder structure on your hard drive. You can use wildcards to select the HTML files on your drive, and wildcards for the corresponding txt files as well.

You simply select the HTML root folder and define whether you want to convert a single file or multiple files using wildcards. If you have HTML documents in subfolders, select the scan subfolders option here as well.
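To make the wildcard and subfolder selection concrete, here is a minimal Python sketch of the same idea. It is not how HTMLAsText works internally; the function names `find_html_files` and `text_target` and the `scan_subfolders` parameter are made up for this illustration.

```python
from pathlib import Path


def find_html_files(root, pattern="*.html", scan_subfolders=False):
    """Return the files under `root` that match the wildcard `pattern`."""
    root = Path(root)
    # rglob() also walks subfolders, glob() stays in the root folder.
    matches = root.rglob(pattern) if scan_subfolders else root.glob(pattern)
    return sorted(p for p in matches if p.is_file())


def text_target(html_path):
    """Map page.html to page.txt in the same folder (hypothetical naming rule)."""
    return Path(html_path).with_suffix(".txt")
```

Calling `find_html_files("C:/pages", "*.htm*", scan_subfolders=True)` would list every matching file in the folder tree, and `text_target()` gives the name of the text file each one would be written to.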

The conversion options define several output parameters. Here you can select the maximum number of characters per line and which characters should represent the bullets of unordered lists. HTMLAsText not only extracts the text from HTML documents but preserves part of the document formatting as well.
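If you are curious what a line-width limit and a bullet character mean in practice, the following Python sketch shows one possible way to do it with the standard library only. It is an illustration under my own assumptions, not the program's actual method; `TextExtractor`, `html_to_text`, `width` and `bullet` are hypothetical names.

```python
import textwrap
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text block by block and mark text from <li> items."""

    BLOCK_TAGS = {"p", "div", "br", "li", "ul", "ol", "tr",
                  "h1", "h2", "h3", "h4", "h5", "h6"}
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.blocks = []     # finished blocks as (is_list_item, text)
        self.current = []    # text pieces of the block being built
        self.skip_depth = 0  # > 0 while inside <script> or <style>
        self.in_item = False

    def _flush(self):
        # Collapse runs of whitespace, as a browser would.
        text = " ".join("".join(self.current).split())
        self.current = []
        if text:
            self.blocks.append((self.in_item, text))

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self._flush()
            if tag == "li":
                self.in_item = True

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in self.BLOCK_TAGS:
            self._flush()
            if tag in ("li", "ul", "ol"):
                self.in_item = False

    def handle_data(self, data):
        if not self.skip_depth:
            self.current.append(data)


def html_to_text(html, width=80, bullet="*"):
    """Convert HTML to text wrapped at `width`, with `bullet` before list items."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()  # emit any text that was not closed by a block tag
    lines = []
    for is_item, text in parser.blocks:
        prefix = f"{bullet} " if is_item else ""
        # Continuation lines of a list item are indented under its text.
        lines.append(textwrap.fill(prefix + text, width=width,
                                   subsequent_indent=" " * len(prefix)))
    return "\n".join(lines)
```

For example, `html_to_text(page, width=60, bullet="-")` would wrap every paragraph at 60 characters and start each list item with a dash.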

Additional formatting-related options let you highlight heading tags (h1 to h6) with underlines, skip the title tag, enclose bold text in characters you select, and allow centered or right-aligned text.
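The heading, title and bold options can be sketched in a few lines of Python as well. Again, this is only a rough illustration of the idea and not HTMLAsText's implementation; the class `FormattingExtractor`, the function `render_formatting` and the parameters `underline`, `bold_mark` and `skip_title` are assumptions made for the example. Text alignment is left out.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
BOLD = {"b", "strong"}
BLOCKS = {"p", "div", "li", "br"}


class FormattingExtractor(HTMLParser):
    """Keep a little formatting: underlined headings and marked bold text."""

    def __init__(self, underline="=", bold_mark="*", skip_title=True):
        super().__init__()
        self.underline = underline
        self.bold_mark = bold_mark
        self.skip_title = skip_title
        self.lines = []
        self.current = []
        self.in_title = False

    def _flush(self, heading=False):
        text = " ".join("".join(self.current).split())
        self.current = []
        if not text:
            return
        self.lines.append(text)
        if heading:
            # Underline the heading with one character per letter.
            self.lines.append(self.underline * len(text))

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag in BOLD:
            self.current.append(self.bold_mark)
        elif tag in HEADINGS or tag in BLOCKS:
            self._flush()

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False
            self._flush()
        elif tag in BOLD:
            self.current.append(self.bold_mark)
        elif tag in HEADINGS:
            self._flush(heading=True)
        elif tag in BLOCKS:
            self._flush()

    def handle_data(self, data):
        # Drop the <title> text when the skip option is on.
        if self.in_title and self.skip_title:
            return
        self.current.append(data)


def render_formatting(html, **options):
    parser = FormattingExtractor(**options)
    parser.feed(html)
    parser.close()
    parser._flush()  # emit trailing text, if any
    return "\n".join(parser.lines)
```

With the defaults, a page that contains `<h1>Intro</h1><p>Some <b>bold</b> words.</p>` comes out as the line "Intro", a row of equals signs of the same length, and "Some *bold* words."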

You can save the configuration and load it at any time in the future, which may be useful if you need to convert HTML documents to text regularly. The conversion itself does not take longer than a second for a single document, and the quality of the output is quite good. While you may still need to manually edit the text document, for instance by removing navigational elements or menus that you do not need, the program's formatting preservation helps to limit that to a fraction of the time you'd normally spend doing so.

Comments

  1. tom said on December 26, 2012 at 12:59 pm
    Martin,

    Merry Christmas.

    Recently I came across a case where I needed to convert .lyr to JPG, but no free software seems to be available. Do you know of any such tools?

  2. Shawn said on December 26, 2012 at 3:47 pm
    @tom…

    Which program are the lyr files from? .lyr can be lyrics, DataCad, GPS imagery files, and is even used in the medical field.

    Knowing the source of the file will help the rest of us help you out…

  3. jmjsquared said on December 26, 2012 at 9:26 pm
    @tom – Give this a try: ArcGIS Explorer Desktop. It’s part of a mapping suite of software, is free and opens MXD LYR 3DD files.

    http://www.esri.com/software/arcgis/explorer/download

  4. Jim said on December 27, 2012 at 11:36 am
    @tom
    I’ve been using a program called pearl mountain image converter, which will convert lyr to jpg etc. (I don’t use it for this but for other reasons). Anyway, I’ll give you my key, as there’s a watermark on the trial and you seem to need to convert the lyr badly.

    download from here: http://www.pearlmountainsoft.com/pearlmountain-image-converter/index.html

    my serial http://pastebin.com/r9kNw4G1
    cheers