Take advantage of md5 checksums for download validity

I'm fairly confident that you have, at one time or another, run across an md5 checksum file while browsing the internet. Whether it sat next to a file download or an application upgrade, those md5 files are there for a reason. But just what is that reason?

When someone puts a file up on a server for download, how does the host or the end-user know, for sure, that the file they are about to download (or are serving up) is the valid file? What if someone hacked into the server and replaced the file with a bogus file that contained malicious code? It's happened before and it will happen again. Fortunately there is a way to avoid downloading invalid files - checking the md5 hash. The only problem is that this method only works if both the host and the user know how to use md5 tools. In this tutorial you will learn how to create an md5 checksum for a file and how to run a check on a file you have downloaded.

What are md5 and checksums?

Before we continue with the actual steps, you might benefit from knowing exactly how the process of checksumming works. MD5 stands for Message Digest algorithm 5, a 128-bit hash function that serves as a "fingerprint" for a digital file. A checksum is a fixed-size datum that is computed from a block of data. When it is crucial for a piece of data (such as a download) to be valid, the checksum is recomputed from the copy you have and compared to the checksum the host published. When the two match, the user/host can be reasonably confident the file arrived intact. When they do not match, a red flag should immediately go up and the downloaded file should be discarded. If a file changes by so much as a single bit, the checksum will no longer match. Keep in mind that practical collision attacks against MD5 are known, so a match guards against accidental corruption far better than against a determined attacker.
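
To see that fingerprint behaviour for yourself, here is a minimal sketch, assuming a Linux system with GNU md5sum available. The two input strings are made up for illustration and differ by a single character.

```shell
# Hash two inputs that differ by one character.
# md5sum reads standard input when no file name is given.
sum_a=$(printf 'hello' | md5sum | cut -d' ' -f1)
sum_b=$(printf 'hellp' | md5sum | cut -d' ' -f1)

echo "hello -> $sum_a"
echo "hellp -> $sum_b"

# Each digest is 32 hexadecimal characters (128 bits), yet the two do not match.
if [ "$sum_a" != "$sum_b" ]; then
    echo "The digests differ"
fi
```

The cut command keeps only the hash and drops the file name column, which makes the two results easy to compare in a script.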

For most users these tasks are handled from the command line. There are GUI tools available (such as GtkHash) that can tackle the same tasks, but for the purposes of this tutorial we will stick with the command line tool.

Creating an md5 sum

For those who plan on hosting files for download, you will want to know how to create an md5 sum. This is very simple. Open up a terminal and change to the directory holding the file you want to work with. Say, for example, you want to create an md5 sum of the file /var/www/files/download.tgz. To do this you would change to the /var/www/files directory and issue the following command (md5sum is part of GNU coreutils on Linux; macOS and the BSDs ship a similar md5 command whose default output is formatted differently):

md5sum download.tgz

The above command will output something like:

632668fb5bb3fe578033a42b4ba718f2  download.tgz

If you want to make an md5 checksum file available, run the same command and redirect the output to a file like so:

md5sum download.tgz > download.md5

Now you can upload the download.md5 file alongside the download.tgz file so the users can run a checksum.
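
If you host more than one file, the same step can be looped. What follows is only a sketch under stated assumptions: GNU md5sum is installed, and the temporary directory and the two file names are stand-ins invented for the demonstration (in practice you would change to your real download directory instead).

```shell
# Sketch: write one .md5 file per archive in a directory.
# The directory and archive names below are made up for this demonstration.
dir=$(mktemp -d)                        # stand-in for /var/www/files
printf 'first archive'  > "$dir/download.tgz"
printf 'second archive' > "$dir/extras.tgz"

cd "$dir"
for f in *.tgz; do
    # Hash the bare file name so the .md5 file records no directory path.
    md5sum "$f" > "${f%.tgz}.md5"
done

# Show what would be uploaded alongside the archives.
cat download.md5 extras.md5
```

Running md5sum from inside the directory matters: the file name is stored in the .md5 file exactly as it was typed, and users will usually have the download in their current directory when they check it.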

Running a checksum

Now that you have both files, you want to run your checksum to make sure the .tgz file is the legitimate file. First, print the contents of the .md5 file to see the checksum the host published:

cat download.md5

The output of the above command should look familiar (if you created the md5sum):

632668fb5bb3fe578033a42b4ba718f2  download.tgz

Now run the md5sum command on the .tgz file like this:

md5sum download.tgz

The output should reveal the exact same string as shown above:

632668fb5bb3fe578033a42b4ba718f2  download.tgz

If that string of characters isn't the same, the checksum didn't pass and you might be dealing with a corrupted or altered file. In that case, discard the download and contact the host of the file or the developer. But if the strings match, you know the checksum passed and the file you have is the one the checksum was created from.
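
Comparing 32 characters by eye is error-prone, so you may prefer to let the tool do it. GNU md5sum has a -c (--check) option that reads a checksum file and reports whether each listed file matches. The sketch below assumes GNU md5sum; the file contents are invented so the example can run anywhere.

```shell
# Sketch: let md5sum do the comparison with its -c (--check) option.
# The files are created here only so the example is self-contained.
dir=$(mktemp -d)
cd "$dir"
printf 'pretend this is the archive' > download.tgz
md5sum download.tgz > download.md5      # what the host would publish

# Intact download: md5sum prints "download.tgz: OK" and exits with status 0.
md5sum -c download.md5
good_status=$?

# Simulate a corrupted or altered download by appending one byte.
printf 'x' >> download.tgz

# Altered download: md5sum prints "download.tgz: FAILED" and exits non-zero.
md5sum -c download.md5
bad_status=$?

echo "intact: $good_status, altered: $bad_status"
```

Because the result is also available as an exit status, the check is easy to use in a script that should stop before unpacking a bad download.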

Final thoughts

MD5 sums have been in use for quite some time. Whenever given the chance you should always take advantage of that system. Who knows, it might save you from installing a piece of malicious software some day.


Responses to Take advantage of md5 checksums for download validity

  1. Captain Canuck November 20, 2009 at 1:17 am #

    "What if someone hacked into the server and replaced the file with a bogus file that contained malicious code?"

    Wouldn't that someone also upload another md5 hash matching the malicious replacement?
    Usually the md5 hash file is in the same directory as the download.

  2. Jack Wallen November 20, 2009 at 2:23 pm #

    @Captain Canuck: That is true. You could, however, create your MD5 hash and then digitally sign it.

    • Captain Canuck November 20, 2009 at 11:03 pm #

      Is it common for people to digitally sign their md5 checksum?

  3. martin english November 21, 2009 at 7:18 am #

    Windows users wanting to create or verify MD5 or SHA-1 checksums need to look at http://support.microsoft.com/kb/841290

  4. Rico November 21, 2009 at 9:47 am #

    Honestly, the reason i use checksums these days is to verify that large files have been downloaded completely. i've had the occasional Linux ISO that seemed to download correctly but after some troubleshooting, i'd find that the checksum didn't match the one posted. It's not a concern with torrents, but torrents can be slower than a superfast web server.

    There is the issue of malicious code, but honestly i trust my [open] sources and while it's possible malicious code could be slipped in, it's such a rare occurrence that i'm really not concerned. Not the most secure security model, i know, but i've yet to have it fail me in the fifteen or so years i've been downloading from remote sources.

  5. Kirill November 22, 2009 at 12:58 am #

    @martin english: File Checksum Integrity Verifier is very good. Thank you. But when I create an exceptions list and try to use it with -exc parameter it is not working. What can you advise?
