Take advantage of md5 checksums for download validity

Jack Wallen
Nov 20, 2009
Updated • Dec 28, 2012
Linux

I'm fairly confident that you have, at one time or another, run across an md5 checksum file as you have perused the internet. Whether it accompanied a download or an application upgrade, those md5 files are there for a reason. But just what is the reason?

When someone puts a file up on a server for download, how does the host or the end user know, for sure, that the file they are about to download (or are serving up) is the valid file? What if someone hacked into the server and replaced the file with a bogus file that contained malicious code? It's happened before and it will happen again. Fortunately there is a way to avoid downloading invalid files: checking the md5 hash. The only problem is that this method only works if the host and the user know how to use md5 tools. In this tutorial you will learn how to create an md5 checksum for a file and how to run a check on a file you have downloaded.

What is md5 and checksum?

Before we continue with the actual steps, you might benefit from knowing exactly how the process of checksumming works. MD5 stands for Message-Digest algorithm 5, a cryptographic hash function that produces a 128-bit value and serves as a "fingerprint" for a digital file. A checksum is a fixed-size datum that is computed from a block of data. When it is crucial for a piece of data (such as a download) to be valid, the checksum is recomputed from the downloaded copy and compared with the checksum published for the original. When the two md5 checksums match, the user/host can be confident the file arrived unchanged. When they do not match, a red flag should immediately go up and the downloaded file should be discarded. If a file changes by so much as a byte, the checksum will no longer match.
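To make the "so much as a byte" point concrete, here is a minimal sketch. It assumes the GNU md5sum tool is installed; the two text files are throwaway stand-ins created just for the demonstration, not files from this tutorial.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)

# Two files that differ in exactly one byte (the last letter).
printf 'hello world\n' > "$workdir/original.txt"
printf 'hello worle\n' > "$workdir/changed.txt"

# md5sum prints "<hash>  <filename>"; keep only the hash field.
sum_original=$(md5sum "$workdir/original.txt" | cut -d ' ' -f 1)
sum_changed=$(md5sum "$workdir/changed.txt" | cut -d ' ' -f 1)

echo "original: $sum_original"
echo "changed:  $sum_changed"
```

The two printed strings should have the same length (32 hexadecimal characters, which is 128 bits) but look completely unrelated, even though the inputs differ by a single character.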

For most users these tasks are handled from the command line. There are GUI tools available (such as GtkHash) that can tackle the same tasks, but for the purposes of this tutorial we will stick with the command line tool.

Creating an md5 sum

If you plan on hosting files for download, you will want to know how to create an md5 sum. This is very simple. Open up a terminal and change to the directory holding the file you want to work with. Say, for example, you want to create an md5 sum for the file /var/www/files/download.tgz. To do this you would change to the /var/www/files directory and issue the following command:

md5sum download.tgz

The above command will output something like:

632668fb5bb3fe578033a42b4ba718f2  download.tgz

If you want to make an md5 checksum file available, you can run that command and redirect the output to a file like so:

md5sum download.tgz > download.md5

Now you can upload the download.md5 file alongside the download.tgz file so the users can run a checksum.
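The sketch below illustrates why it helps to change into the file's directory first: md5sum records the file name exactly as you typed it, so a bare name in the checksum file still makes sense to users who download both files into one folder. It assumes GNU md5sum, and the download.tgz created here is a small stand-in file, not a real archive.

```shell
# Throwaway directory standing in for /var/www/files.
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the real download (an assumption for this example).
printf 'example payload\n' > download.tgz

# Create the checksum file next to the download.
md5sum download.tgz > download.md5

# Show what was written: "<hash>  download.tgz"
cat download.md5

# The hash is field 1; after the two-space separator the name is field 3.
stored_name=$(cut -d ' ' -f 3 download.md5)
echo "name recorded in checksum file: $stored_name"
```

Had the command been run from another directory with a full path, that full path would have been recorded instead of the bare file name.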

Running a checksum

Now that you have both files, you want to run your checksum to make sure the .tgz file is the legitimate file. First, display the checksum the host published by issuing the command:

cat download.md5

The output of the above command should look familiar (if you created the md5 sum):

632668fb5bb3fe578033a42b4ba718f2  download.tgz

Now run the md5sum command on the .tgz file like this:

md5sum download.tgz

The output should reveal the exact same string as shown above:

632668fb5bb3fe578033a42b4ba718f2  download.tgz

If that string of characters isn't the same, the checksum didn't pass and you might be dealing with a corrupted or altered file. In that case, discard the file and contact the host of the file or the developer. But if the strings match, you know the checksum passed and the file is the same one the checksum was created from.
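Comparing two 32-character strings by eye is error-prone, so it may help to let the tool do the comparison. The following is a minimal sketch that assumes GNU md5sum, whose -c option reads a checksum file and rechecks every file listed in it. The file contents are stand-ins, and the appended byte only simulates a damaged download.

```shell
# Set up a stand-in download and its checksum file in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"
printf 'example payload\n' > download.tgz
md5sum download.tgz > download.md5

# -c rehashes each file named in download.md5 and prints OK or FAILED.
# The exit status is 0 only when every listed file matches.
if md5sum -c download.md5; then
    result_intact="match"
else
    result_intact="mismatch"
fi

# Simulate a corrupted download by appending a single byte.
printf 'x' >> download.tgz
if md5sum -c download.md5 2>/dev/null; then
    result_modified="match"
else
    result_modified="mismatch"
fi

echo "intact file:   $result_intact"
echo "modified file: $result_modified"
```

Because the exit status reflects the result, the same check can be dropped into a download script that refuses to continue on a mismatch.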

Final thoughts

MD5 sums have been in use for quite some time. Whenever given the chance you should always take advantage of that system. Who knows, it might save you from installing a piece of malicious software some day.


Comments

  1. Kirill said on November 22, 2009 at 12:58 am

    @martin english: File Checksum Integrity Verifier is very good. Thank you. But when I create an exceptions list and try to use it with the -exc parameter, it does not work. What can you advise?

  2. Rico said on November 21, 2009 at 9:47 am

    Honestly, the reason i use checksums these days is to verify that large files have been downloaded completely. i’ve had the occasional Linux ISO that seemed to download correctly but after some troubleshooting, i’d find that the checksum didn’t match the one posted. It’s not a concern with torrents, but torrents can be slower than a superfast web server.

    There is the issue of malicious code, but honestly i trust my [open] sources and while it’s possible malicious code could be slipped in, it’s such a rare occurrence that i’m really not concerned. Not the most secure security model, i know, but i’ve yet to have it fail me in the fifteen or so years i’ve been downloading from remote sources.

  3. martin english said on November 21, 2009 at 7:18 am

    Windows users wanting to create or verify MD5 or SHA-1 checksums need to look at http://support.microsoft.com/kb/841290

  4. Jack Wallen said on November 20, 2009 at 2:23 pm

    @Captain Canuck: That is true. You could, however, create your MD5 hash and then digitally sign it.

    1. Captain Canuck said on November 20, 2009 at 11:03 pm

      Is it common for people to digitally sign their md5 checksum?

  5. Captain Canuck said on November 20, 2009 at 1:17 am

    “What if someone hacked into the server and replaced the file with a bogus file that contained malicious code?”

    Wouldn’t that someone also upload another md5 hash matching the malicious replacement?
    Usually the md5 hash file is in the same directory as the download.
