Computer users are exposed to a variety of tracking technologies when they browse the Internet, from traditional third-party tracking cookies to local storage, Flash cookies and fingerprinting.
Companies that develop browsers aim to reduce the tracking their users are exposed to on the Internet, for instance by implementing Do Not Track options or changing the way third-party cookies are handled.
While that takes care of some forms of tracking, it does not touch others.
Fingerprinting became a topic back in 2010 when the EFF released an online tool to compute a browser's fingerprint. It was a first attempt to demonstrate that fingerprinting could indeed be used to track users on the Internet.
While it was common knowledge that fingerprinting was used, it was not clear how widespread it was.
A recent study suggests that at least 1% of the top 10,000 websites use fingerprinting techniques to track users. The researchers used the rankings provided by Alexa, an Amazon company, for their study.
All fingerprinting techniques have in common that they extract data, either directly during connection attempts or afterwards by parsing log files, to identify unique data sets that can be associated with individual Internet users.
It is, for instance, possible to retrieve the list of installed fonts, the screen size or the installed plugins from a user's system.
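To make the idea more concrete, here is a minimal Python sketch of how a handful of attributes could be combined into a single identifier. It is not the method used by the researchers or by any particular tracking provider. The function name `compute_fingerprint`, the attribute names and all sample values are made up for illustration, and hashing a sorted JSON representation with SHA-256 is just one plausible way to do it.

```python
import hashlib
import json


def compute_fingerprint(attributes: dict) -> str:
    """Return a SHA-256 hex digest for a set of browser attributes.

    Hypothetical helper for illustration; real trackers may combine
    attributes in entirely different ways.
    """
    # Assumption: the order of fonts or plugins is not treated as a signal,
    # so list-like values are sorted before hashing.
    normalized = {
        key: sorted(value) if isinstance(value, (list, tuple, set)) else value
        for key, value in attributes.items()
    }
    # sort_keys yields the same serialization regardless of key order.
    canonical = json.dumps(normalized, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Sample values below are invented, not taken from the study.
visitor_a = {
    "fonts": ["Arial", "Calibri", "Verdana"],
    "screen": "1920x1080",
    "plugins": ["Flash", "Java"],
}
# Same system as visitor_a, attributes reported in a different order.
visitor_a_again = {
    "plugins": ["Java", "Flash"],
    "screen": "1920x1080",
    "fonts": ["Verdana", "Arial", "Calibri"],
}
visitor_b = {
    "fonts": ["Arial", "Verdana"],
    "screen": "1366x768",
    "plugins": ["Flash"],
}

if __name__ == "__main__":
    for name, visitor in [("a", visitor_a), ("a again", visitor_a_again), ("b", visitor_b)]:
        print(name, compute_fingerprint(visitor))
```

The two entries for the first visitor produce the same digest even though no cookie is involved, while the second visitor gets a different one. The more attributes a site collects, the more likely it is that a given combination belongs to a single user.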
The program the researchers used crawled the top 1 million websites according to Alexa to determine if common fingerprinting techniques were used by the sites.
While at least 1% of the top 10,000 sites were found to use fingerprinting, only 404 of the top 1 million sites according to Alexa (roughly 0.04%) were found to do so.
The actual number may well be larger than that. First, the researchers were not able to determine whether a website used server-side fingerprinting. Second, there is no common fingerprinting standard, which means that some attempts may not have been detected correctly.
One interesting result is a list of fingerprinting providers that the researchers discovered.
The research paper lists detailed information about the methodology used to crawl the sites, counter-measures, and other information that you may find useful.
The script used to crawl the sites will be published in the future on the website linked above. This is also where the research paper can be downloaded as a PDF document.
Ghacks is a technology news blog that was founded in 2005 by Martin Brinkmann. It has since then become one of the most popular tech news sites on the Internet with five authors and regular contributions from freelance writers.