I'm fairly confident that you have, at one time or another, run across an md5checksum file as you have perused the internet. Whether it was a download file or even an application upgrade, those md5 files are there for a reason. But just what is the reason?
When someone puts a file up on a server for download, how does the host or the end-user know, for sure, the file they are about to download (or are serving up) is the valid file? What if someone hacked into the server and replaced the file with a bogus file that contained malicious code? It's happened before and it will happen again. Fortunately there is a way to avoid downloading invalid files - checking the md5 hash. The only problem is that this method only works if the host and user knows how to use md5 tools. In this tutorial you will learn how to add an md5 checksum to a file and how to run a check on a file you have downloaded.
What is md5 and checksum?
Before we continue with the actual steps, you might benefit from knowing exactly how the process of checksumming works. MD5 stands for Message Digest algorithm 5, which is a cryptographic 128 bit hash function and serves as a "fingerprint" for a digital file. A checksum is a fixed-size datum that is computed from a block of data. When it is crucial for a piece of data (such as a download) to be valid, the datum is compared to the original block the datum was computed from to check for a match. When an md5 checksum matches, the user/host can be certain the file is valid. When the md5 checksum does not match, a red flag should immediately go up and the original block of data should be discarded. If a file changes by so much as a byte, the checksum will fail.
For most users these tasks are handled from the command line. There are GUI tools available (such as GtkHASH) that can tackle the same tasks. But for the purposes of this tutorial we will stick with the command line tool.
Creating an md5 sum
For those who plan on hosting files for download, you will want to know how to create an md5 sum. This is very simple. Open up a terminal and change to the directory holding the file you want to work with. Say, for example, you want to create an md5 on the file /var/www/files/download.tgz. To do this you would change to the /var/www/files directory and issue the following command:
The above command will output something like:
Now for those that are wanting to have an md5 checksum file available you can run that command and pipe the output to a file like so:
md5 download.tgz > download.md5
Now you can upload the download.md5 file alongside the download.tgz file so the users can run a checksum.
Running a checksum
Now that you have both files, you want to run your checksum to make sure the .tgz file is the legitimate file. To do this you would issue the command:
The output of the above command should look familiar (if you created the md5sum):
Now run the md5sum command on the .tgz file like this:
The output should reveal the exact same string as shown above (the only difference being the file name will be different):
If that string of characters isn't the same, the checksum didn't pass and you might be dealing with a corrupted file. In case of a corrupted file you will want to contact the host of the file or the developer. But if the strings match you know the checksum passed and the file should be safe to use.
MD5 sums have been in use for quite some time. Whenever given the chance you should always take advantage of that system. Who knows, it might save you from installing a piece of malicious software some day.
Advertising revenue is falling fast across the Internet, and independently-run sites like Ghacks are hit hardest by it. The advertising model in its current form is coming to an end, and we have to find other ways to continue operating this site.
We are committed to keeping our content free and independent, which means no paywalls, no sponsored posts, no annoying ad formats (video ads) or subscription fees.
If you like our content, and would like to help, please consider making a contribution:
Ghacks is a technology news blog that was founded in 2005 by Martin Brinkmann. It has since then become one of the most popular tech news sites on the Internet with five authors and regular contributions from freelance writers.