When we upload something embarrassing about ourselves to, let's say, Facebook, that's completely our fault. But there are other, subtler ways to gather information about us. Let's say a few words about tracking.
Every time you visit a website, you request HTML that will be rendered in your local browser. This code may include external references (images, scripts or style sheets hosted on other domains), so your browser will request them as well. Nothing to be afraid of so far.
But what happens when these external requests are used to track you? Is that possible? Let's suppose you are enjoying your favorite social network, let's say Facebook again. Even after you log out, your browser keeps some cookies that identify you to Facebook. Then you visit some random website (www.randomwebsite.com) that includes “Like” buttons, which are in fact external references to Facebook. Because the browser sends Facebook's own cookies along with any request to Facebook, it receives a request with the HTTP Referer header set to www.randomwebsite.com and with the cookie that identifies you. Result: Facebook knows every site you visit, as long as that site includes a reference to Facebook.
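To make the mechanism concrete, here is a minimal sketch of what the third party could do with those requests. It is an illustration only, not how any real company implements it: the log entries, cookie values and the `build_profiles` helper are all made up.

```python
from collections import defaultdict
from urllib.parse import urlparse


def build_profiles(requests):
    """Group the referring sites seen by a third party under each cookie ID."""
    profiles = defaultdict(set)
    for req in requests:
        cookie_id = req.get("cookie")
        referer = req.get("referer")
        if not cookie_id or not referer:
            continue  # no identifying cookie or no Referer: nothing to link
        host = urlparse(referer).hostname
        if host:
            profiles[cookie_id].add(host)
    return dict(profiles)


# Hypothetical log of requests for an embedded button, as the third party sees them.
button_requests = [
    {"cookie": "user-123", "referer": "http://www.randomwebsite.com/article"},
    {"cookie": "user-123", "referer": "http://news.example.org/"},
    {"cookie": "user-456", "referer": "http://www.randomwebsite.com/"},
    {"cookie": None, "referer": "http://shop.example.net/"},
]

profiles = build_profiles(button_requests)
for cookie_id, sites in profiles.items():
    print(cookie_id, sorted(sites))
```

Note that the request without a cookie cannot be attached to anyone in this sketch, which is why simply deleting cookies looks like a tempting fix.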
We may think this is easily solved by just getting rid of all cookies. Bad news: there are many other techniques that can identify users by their browser's fingerprint. A very interesting study shows that, by combining your browser, the plugins installed and basic information about your computer, 83% of users have a unique fingerprint. And this is without even being aggressive! You can imagine how often external references include JavaScript code that collects a lot of information from your computer in order to identify you.
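The idea behind fingerprinting can be sketched in a few lines. This is a toy example under my own assumptions, not the method used in the study: the attribute names, the sample visitors and the `fingerprint` and `unique_share` helpers are invented for illustration.

```python
import hashlib
from collections import Counter


def fingerprint(attrs):
    """Hash a browser's attributes into a short, stable identifier."""
    # Sort the keys so the same attributes always produce the same string.
    canonical = "|".join(f"{key}={attrs[key]}" for key in sorted(attrs))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def unique_share(visitors):
    """Fraction of visitors whose fingerprint is shared with nobody else."""
    prints = [fingerprint(v) for v in visitors]
    counts = Counter(prints)
    return sum(1 for p in prints if counts[p] == 1) / len(prints)


# Hypothetical visitors: no cookies involved, only what the browser reveals.
visitors = [
    {"user_agent": "BrowserA/1.0", "plugins": "flash,java", "screen": "1280x800"},
    {"user_agent": "BrowserA/1.0", "plugins": "flash", "screen": "1280x800"},
    {"user_agent": "BrowserB/5.1", "plugins": "flash", "screen": "1920x1080"},
    {"user_agent": "BrowserB/5.1", "plugins": "flash", "screen": "1920x1080"},
]

share = unique_share(visitors)
print(f"{share:.0%} of these visitors have a unique fingerprint")
```

The more attributes are combined, the fewer visitors end up sharing the same hash, which is what makes this work without storing anything in the browser.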
OK, time for some hands-on work. To see the magnitude of this tracking problem, I ran a few small experiments. I plan to write a more extensive article with the details, but let me show you some numbers.
Experiment 1: Browsing a very popular Spanish newspaper
The result of sniffing traffic:
30 different domains requested, 15 of them used for tracking or advertising, 10 cookies created in my browser.
Not bad for a single request!
Experiment 2: Browsing the top 250 sites in Spain (according to Alexa.com)
20% of traffic goes to tracking/advertising sites, 11.2 tracking requests per site, 93% of sites have external references to tracking sites.
Google and Facebook are the top companies tracking users here, with almost the same number of requests.
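Figures of this kind could be computed from captured traffic along the following lines. This is a minimal sketch, not the script used for the experiment: the tracker list, the capture and every domain in it are made up, and a real run would need a proper list of tracking and advertising domains.

```python
from urllib.parse import urlparse

# Hypothetical list of tracking/advertising domains.
TRACKER_DOMAINS = {"tracker.example", "ads.example"}


def is_tracker(url):
    """True if the URL's host is a listed tracker domain or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in TRACKER_DOMAINS)


def summarize(capture):
    """capture maps each visited site to the URLs the browser requested for it."""
    total = sum(len(urls) for urls in capture.values())
    per_site = {site: sum(is_tracker(u) for u in urls) for site, urls in capture.items()}
    tracking = sum(per_site.values())
    return {
        "tracking_share": tracking / total,
        "tracking_requests_per_site": tracking / len(capture),
        "sites_with_trackers": sum(1 for n in per_site.values() if n) / len(capture),
    }


# Hypothetical capture for two visited sites.
capture = {
    "site-a.example": [
        "http://site-a.example/",
        "http://cdn.site-a.example/app.js",
        "http://pixel.tracker.example/p.gif",
        "http://ads.example/banner.js",
    ],
    "site-b.example": [
        "http://site-b.example/",
        "http://static.site-b.example/style.css",
    ],
}

stats = summarize(capture)
print(stats)
```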
Finally, a few words about default options. During Kaspersky's SAS 2012 I had the chance to attend the excellent talk by Christopher Soghoian, where he showed how default options are not completely innocent. In this case, Google Chrome and Safari have different defaults for whether third-party cookies and requests are allowed when you visit a website. I will leave it to the reader to imagine which of the two is more interested in tracking you, but I wanted to check this out in a final experiment.
Experiment 3: Browsing the top 100 sites in Spain using Chrome and Safari with default options
Basically, visiting the same sites with different browsers and capturing the traffic.
To my surprise, there was no big difference in either the number of requests or the number of tracking domains requested by each browser. However, there is a huge difference when it comes to cookies:
Chrome: 1029 cookies, Safari: 269 cookies
Cutting each list down to the top 100 domains requested by each browser (representing 75% of the total number of requests) and looking for differences between the two lists, there are only five tracking domains in Chrome's list that were not requested by Safari. So the balance tilts slightly towards Chrome being more tracking-aggressive than Safari. That does not mean there are no tracking sites in Safari's list.
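A comparison like this one can be sketched with a counter and a set difference. Again, this is an illustration under my own assumptions rather than the actual analysis: the host lists, the tracker set and the helper names are hypothetical.

```python
from collections import Counter


def top_domains(requested_hosts, n):
    """The n most frequently requested hosts, most requested first."""
    return [host for host, _ in Counter(requested_hosts).most_common(n)]


def coverage(requested_hosts, n):
    """Share of all requests that go to the n most requested hosts."""
    counts = Counter(requested_hosts)
    return sum(c for _, c in counts.most_common(n)) / len(requested_hosts)


def trackers_only_in_first(first, second, n, trackers):
    """Tracker hosts in the first browser's top-n list but not in the second's."""
    only_first = set(top_domains(first, n)) - set(top_domains(second, n))
    return sorted(only_first & trackers)


# Hypothetical hosts requested by each browser, one entry per request.
chrome_hosts = ["a.example"] * 3 + ["t1.example"] * 2 + ["t2.example"] * 2
safari_hosts = ["a.example"] * 3 + ["t1.example"] * 2 + ["b.example"] * 2
trackers = {"t1.example", "t2.example"}  # hypothetical tracker list

extra = trackers_only_in_first(chrome_hosts, safari_hosts, 3, trackers)
print("Tracker domains only in the first list:", extra)
print("Coverage of the first list's top 3:", coverage(chrome_hosts, 3))
```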
Final thoughts
This topic is really interesting, especially when you start finding out who is behind the tracking and advertising companies. Then you see who is getting all the data, which is rather scary once you check the privacy policies. But I will wait until I finish more experiments before reaching any conclusions about this.
However, the question is: why so much tracking? The answer: money. It’s not about advertisements, it’s about profiling users. Just think about what happens when you ask your bank for a loan: you have a profile, and no matter what you say, you will get the loan if and only if the computer finds that your profile fits the requirements. In the near future, all companies could have access to super-profiles where all our data is available, and then the computer will decide …
Where is my privacy?