What websites know about you and how to protect yourself
This is the second part of a mini series about privacy on the Internet. Check out the first part about IP addresses here.
Whenever you connect to a website using a web browser, mobile application or program that supports Internet connections, information is automatically made available to that site.
We have talked about the IP address before and while it is one of the most important ones from a privacy point of view, it does not end there.
Each time your device or program makes a connection, so-called header information is transferred along with it. You can check the user-agent here on this page for example.
The headers reveal information about the web browser that is being used, the operating system and architecture, and also where you came from (the referrer).
Some sites use the information to display different types of content to users, or to prevent users from using the site or some functionality completely.
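To make the header point concrete, here is a minimal sketch of what a server can read from a single request. The request text, the browser string and the referring URL are all made up for illustration; real requests usually carry more headers than this.

```python
# A made-up HTTP request as a server might receive it (all values are illustrative).
RAW_REQUEST = (
    "GET /article HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0\r\n"
    "Accept-Language: en-US,en;q=0.5\r\n"
    "Referer: https://search.example.org/?q=privacy\r\n"
    "\r\n"
)


def parse_headers(raw_request):
    """Return the request headers as a dict, skipping the request line."""
    headers = {}
    for line in raw_request.split("\r\n")[1:]:
        if not line:  # an empty line marks the end of the header block
            break
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers


headers = parse_headers(RAW_REQUEST)
for name in ("User-Agent", "Accept-Language", "Referer"):
    print(f"{name}: {headers.get(name)}")
```

Even this short request tells the site which browser and operating system are in use, the preferred language, and the page the visitor came from.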
There are other methods and options to retrieve additional information. Below is a list of common technologies:
- IP Address - Always submitted, can reveal approximate location in the world and ISP/Company. You can look up your IP and the information that can be retrieved from it (geolocation) on sites like this.
- User Agent - Reveals information about the operating system and web browser.
- Cookies - Can be used to track users across sessions and across domains.
- Geolocation - Can pinpoint the user's location in the world.
- JavaScript - Scripts can reveal a lot, from the language and system time to screen resolution, supported plugins, additional persistent storage options for cookie-like snippets, and everything that is introduced in HTML5.
- HTML5 - Introduces new options including Canvas Fingerprinting.
- Plugins - Flash, Java or Silverlight can dig even deeper. They may reveal installed fonts and other system environment information.
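None of the items above has to identify you on its own. A rough sketch of why they matter in combination: a site can join several values into one string and hash it. The attribute names and values below are assumptions chosen for illustration, and real fingerprinting scripts collect far more signals than this.

```python
import hashlib


def fingerprint(attributes):
    """Hash a set of browser attributes into one short identifier."""
    # Sort the keys so the same attributes always produce the same string.
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical values a script could read from a browser.
visitor = {
    "user_agent": "Mozilla/5.0 (Windows NT 6.1; rv:40.0) Gecko/20100101 Firefox/40.0",
    "language": "en-US",
    "timezone_offset_minutes": -60,
    "screen": "1920x1080x24",
    "plugins": "Flash 18.0;Java 8",
}

same_visitor_later = dict(visitor)                   # no cookies needed to match
other_visitor = dict(visitor, screen="1366x768x24")  # one attribute differs

print(fingerprint(visitor) == fingerprint(same_visitor_later))
print(fingerprint(visitor) == fingerprint(other_visitor))
```

The same set of values always yields the same identifier, which is why changing or hiding individual attributes can reduce how recognizable a browser is.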
Test your system
You can test what is revealed about your system when you connect to websites by visiting sites such as Panopticlick, the Browser Fingerprinting page, or Browserleaks.
How to protect yourself
There is no universal solution that works for all users. There are, however, guidelines and best practices that limit your exposure on the Internet.
- IP Address - You can use a web proxy, virtual private network or a system like Tor to hide your device's IP address.
- User Agent - User Agents can be changed in the browser. Firefox users can use an add-on like User Agent Switcher, for example, and Chrome users can use User-Agent Switcher for Chrome.
- Cookies - It is highly recommended to disable third-party cookies in the browser. This means that only the site you connect to can set cookies while other sites that it may load data from cannot. This gets rid of most tracking cookies planted on systems by advertising or social media buttons.
- Geolocation - Browsers ask for permission by default before they share your location, so it is not a problem unless the default preference has been modified.
- JavaScript and HTML5 - If you use NoScript or another script blocker, JavaScript is blocked by default. Many sites will work without JavaScript while some won't. You can whitelist trusted sites so that scripts can be executed on those sites. JavaScript can be disabled completely in the browser as well. Some HTML5 features, such as Canvas, can be disabled in some browsers either directly or by using add-ons.
- Plugins - If you set plugins to click to play or disable them outright, they cannot be used by sites unless you give permission first.
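The User Agent item above boils down to sending a different header value. The sketch below shows the idea outside the browser with Python's standard library; the spoofed string and the URL are placeholders, and nothing is actually sent over the network here.

```python
import urllib.request

# Placeholder string; any value can be sent, the server cannot verify it.
SPOOFED_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0"


def build_request(url, user_agent):
    """Prepare a request that announces the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_request("https://www.example.com/", SPOOFED_AGENT)

# urllib stores header names capitalized as "User-agent".
print(request.get_header("User-agent"))
```

Keep in mind that this changes only the header. Scripts running on a page can still read other properties, which is why a changed user agent alone does not stop fingerprinting.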
Now You: Did I miss something? Have additional tips? Share your thoughts in the comments below.
uMatrix + uBlock Origin + Ghostery + FlashControl + Random Agent Spoofer is the best combo IMO.
RAS doesn’t just spoof the user agent, but also protects from the various forms of fingerprinting techniques.
QuickJava is also very useful for controlling activation/deactivation of JavaScript, Java, Flash, Cookies, etc.
Anyone that uses Chrome must enjoy giving Google all the data they want.
I agree with Kelsey and add that tracking adds up to tens of millions of pieces of data each day. It is distilled by whatever means and becomes generic, or is kept discrete and used to track an individual, which costs money (law enforcement, the RIAA).
The value of cookies and DOM storage is that they can track what you bought to target ads using inexpensive automation.
I hate to disappoint anybody, but nobody cares which browser you are using or what your screenres is.
There are apps and services that can track how users behave when they visit a website. Heatmap technology is one of them. I used one of those to track a website before and the data is remarkable. It tells you who is visiting your site: which countries, who their internet providers are, where they click on your site, and even how long they stay and how they navigate through the site. Learning all that data is definitely a boon for the website owner, and potentially a bane for the website visitor. “Potentially,” as, who knows how the data would be used. It can also potentially be a win-win situation, if the website owner decides to develop ideas to answer the behavior and needs of his market. It all depends on which angle you’re looking at, really. As for me, I think we should just accept that technology is developing at hyperspeed and keep trucking along. :)
Thanks for this great mini series Martin
I simply use Privoxy. The software can do every kind of protection that I need; you just write a filter. Even canvas fingerprinting can be blocked.
^^ +1
Indeed. Been using it for years. Basically I block almost everything in the defaults, and allow specific websites in the user actions. I also have a toggle button so I can quickly flick from local proxy to none in FF for some sites if needed (but FF still has layers of protection – vast bulk is stopped by NoScript and RequestPolicy). But yes … layers of protection : modem and router protections -> then OS (hosts, privoxy, blocklists (eg peerblock or something) etc) -> then browser level
I’ve shared two programs that help me: one converts AdBlock Plus lists to Privoxy, and the other helps Privoxy filter HTTPS. Here is the download link, just scroll to the comment section: http://siderite.blogspot.com/2013/05/adblock-easylist-filter-and-action.html
For me, Privoxy can do what Greasemonkey and Stylish do. I’ve created a program named convert2privoxy that helps users quickly create many kinds of filters for Privoxy, and I bundled it into a pack that I would like to share here. I think one day I will share it on other forums like Wildersecurity. Development of Privoxy is slow now because the developer works on it alone, so I really want more users to notice Privoxy and to help him if they can code. Privoxy is nice software, and I don’t want it to die like other ad filtering software.
https://www.dropbox.com/s/1suuqhpkfr80afr/Privoxy.rar?dl=0
At this time I have a lot of Privoxy filters that can do almost everything that Firefox add-ons can do, for example blocking off-site requests, Greasemonkey, Stylish, blocking 100% of popups, and blocking DHTML effectively using a JavaScript injection method.
Proxomitron is also a good choice. It has skidi’s filter set, which can filter a lot of ads and protect user privacy. But Proxomitron’s author has passed away.
Proximodo is a clone of Proxomitron, but it never reached Privoxy’s level. There is much left to do, and the developer no longer maintains the project.
WebCleaner is written in Python, so it is slow and a resource hog, but it can do whatever Proxomitron can do and can filter HTTPS.
AdMuncher is now free and fast, but it cannot filter HTTPS and it is hard to add filters.
AdGuard can do everything, but it is slow and a resource hog.
In conclusion, I think Privoxy is still the best because it is free, yeah free, fast, and easy to add filters to. It is easy to write add-on software using AutoHotkey that helps you quickly add and remove filters, it can do almost everything that AdGuard can do, and it can filter HTTPS with ProxHTTPSProxy.
Where did my posts from 3 hours ago go?
Reposted:
– Extensions that spy/inject-ads on you (maybe they were good once, but have since been bought out)
– Header referrals
– E-Tags
– Tracking embedded in urls (eg google search results) or unique guids added to links etc
– DOM (Local Storage)
– WebGL
.. and
– browser settings – dns precache (dns settings on pc/os), google autocomplete .. yada yada yada
– extensions that collect your data (ghostery, web rating sites etc) that share/collate where you go (may be anonymized, I’m talking in general here – or may be kept in house such as google for their own adsense) .. in other words, you could be super effective at blocking almost everything, but leave the front door wide open
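For the "tracking embedded in urls" item in the list above, one common countermeasure is to strip known tracking parameters before following or sharing a link. A minimal sketch follows; the parameter list is an assumption based on commonly seen names and is nowhere near complete, and the example URL is made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking parameters; real link cleaners maintain much longer lists.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content", "gclid",
}


def strip_tracking(url):
    """Return the URL with known tracking parameters removed from its query string."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(strip_tracking("https://www.example.com/page?id=42&utm_source=newsletter&gclid=abc123"))
# prints: https://www.example.com/page?id=42
```

This only handles identifiers carried in the query string. Redirect links that wrap the real destination, as some search result pages use, need a different approach.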
Not all is bad in the world with this kind of data. I use useragent and javascript to collect data on browser type and screen resolution. It helps me keep my websites in tune with my users.
Excellent and complete article on Web Browser’s security. Thank you Sir! :-)
(One remark: the hyperlink in “How to protect yourself” about User Agent Switcher must be corrected as “https://addons.mozilla.org/en-US/firefox/addon/User-Agent-Switcher/” …)
:)
You are right, thanks and corrected!
DOM storage is of course as much a privacy challenge as cookies, perhaps more so. I think you’ve touched on this in your article on customizing the Firefox about:config file. From BestVPN’s writeup:
“A feature of HTML5 (the much vaunted replacement to Flash) is Web storage (also known as DOM (Document Object Model) storage). Even creepier and much more powerful than cookies, web storage is a way analogous to cookies of storing data in a web browser, but which is much more persistent, has a much greater storage capacity, and which cannot normally be monitored, read, or selectively removed from your web browser. Unlike regular HTTP cookies which contain 4 kB of data, web storage allows 5 MB per origin in Chrome, Firefox, and Opera, and 10 MB in Internet Explorer. Websites have a much greater level of control over web storage and, unlike cookies, web storage does not automatically expire after a certain length of time (i.e. it is permanent by default). When Ashkan Soltani and a team of researchers at UC Berkeley conducted a study of web tracking in 2011, they found that of the top 100 websites surveyed, 17 used web storage, including twitter.com, tmz.com, squidoo.com, nytimes.com, hulu.com, foxnews.com, and cnn.com. Most of these connected to a third party analytics service such as Meebo, KISSanalytics, or Polldaddy.”
In Firefox’s about:config, set dom.event.clipboardevents.enabled to FALSE and dom.storage.enabled to FALSE.
Just combine my 3 posts into one – add in DOM (Local Storage) and WebGL
Header referrals, E-Tags, tracking embedded in urls (eg google search results) or unique guids added to links
and
Your ISP adding tracking headers in unencrypted traffic
Extensions that spy on you (maybe they were good once, but have since been bought out)
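E-Tags are probably the least obvious item in that list, so here is a toy simulation of how they can act like a cookie. A server hands out a unique ETag with a cached resource, and the browser sends it back in the If-None-Match header on the next visit. The class below is a hypothetical stand-in for a server, not code from any real tracker.

```python
import uuid


class TrackingServer:
    """Toy server that abuses ETag values as visitor IDs (illustration only)."""

    def __init__(self):
        self.visits = {}  # ETag value -> number of requests seen

    def handle(self, request_headers):
        # A returning browser echoes the cached ETag in If-None-Match.
        etag = request_headers.get("If-None-Match")
        if etag not in self.visits:
            etag = f'"{uuid.uuid4().hex}"'  # unknown visitor: hand out a unique tag
            self.visits[etag] = 0
        self.visits[etag] += 1
        return {"ETag": etag}


server = TrackingServer()
first = server.handle({})                                   # first visit, no cache
second = server.handle({"If-None-Match": first["ETag"]})    # browser revalidates
stranger = server.handle({})                                # a different browser

print(second["ETag"] == first["ETag"])    # same visitor recognized
print(server.visits[first["ETag"]])       # two requests counted for that visitor
print(stranger["ETag"] == first["ETag"])  # new visitor gets a different tag
```

No cookie is involved at any point, which is why clearing the browser cache matters as much as clearing cookies for this technique.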
Hey Martin,
How about Android and iOS? What can I do to protect myself cuz I cannot install extensions on Chrome and Firefox has limited privacy add-ons for mobile?
You could connect to a VPN, but that is only helping in regards to IP and location. I don’t really use mobile browsers, maybe someone else can chime in?
You can install Adblock Plus on Firefox for Android and subscribe to any of the tracking blocklists, and that should take care of web trackers in the browser. But that would be the least of your worries.
Apps on mobile are contained unto themselves and they can include all kinds of trackers without your knowledge. On Android, Google Analytics is basically baked into the operating system. Once you hit that agree button on the first boot of your new smartphone, you’ve essentially agreed to have all your telemetry information sent to Google and iCloud servers. But at least you can trust Google and Apple to some extent, and there are opt-out options. Third-party apps can roll any trackers they want and often there’s no way to opt out.
On Android, the only way to block trackers is installing a hosts file blocklist (similar to the Windows hosts file). But you have to root your phone, and that’s often problematic if you don’t know what you’re doing.
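For readers who have not seen a hosts blocklist, it is just a text file that maps unwanted domain names to an address that goes nowhere. A small sketch of generating such entries follows; the domain names are placeholders, and the script only prints the lines rather than touching any system file.

```python
def hosts_entries(domains, sink_ip="0.0.0.0"):
    """Return hosts-file lines that point each domain at a dead-end address."""
    # Normalize and de-duplicate so each domain appears only once.
    unique = sorted({d.strip().lower() for d in domains if d.strip()})
    return [f"{sink_ip} {domain}" for domain in unique]


# Placeholder domains; a real blocklist would contain thousands of tracker hosts.
BLOCKED = ["tracker.example.com", "ads.example.net", "Tracker.example.com"]

for line in hosts_entries(BLOCKED):
    print(line)
# prints:
# 0.0.0.0 ads.example.net
# 0.0.0.0 tracker.example.com
```

Because the lookup fails before any connection is made, this blocks trackers for every app on the device, not just the browser.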
My strategy for mobile safety is ecosystem diversification. For example, if you use an Android phone, which is highly connected to Google services, you should use as few Google services as possible so that Google doesn’t have all your data. Use Nokia Here maps instead of Google Maps, Outlook/Yahoo/etc. mail/calendar instead of Gmail, other cloud services for photo/video backup instead of Google Drive, other browsers instead of Google Chrome and so on…
This strategy works both ways in that Google doesn’t have the full picture of your information and third parties don’t have the full telemetry of your patterns. It also isolates your data so that if any one of these services is compromised, not all of your data is stolen.
Diversification works not only for your finances, but for your information as well.