Tuesday, February 22, 2011

Your browser's fingerprints

This has been out for about a year now, but I just stumbled onto it:

In order to provide a better browsing experience, your browser is prepared to tell any site it visits a number of things about itself.  For example, it may divulge what fonts it knows about, how big your screen is, which browser it is, and so forth.  This is a useful thing to do, and you might not think it would give away much information -- after all, it shouldn't be giving away too much to say you can print Zapf Dingbats and a bunch of other fonts.

Thing is, it gives away quite a bit.  The EFF provides a site, panopticlick, to let you test your own browser setup.  You might not find the results it gives particularly comforting.

In the particular case of fonts, it's not the fonts themselves, so much as the order they appear in.  This turns out to be an artifact of how your particular computer happened to stash the files when you or whoever set up your system installed them, and that's fairly random.  The odds that Zapf Dingbats happens to appear before American Typewriter Condensed Light are closer to 50% than the 0% you might expect assuming the list is sorted alphabetically.  The ordering doesn't seem to be completely arbitrary, but enough so that only a small percentage of browsers out there will actually have the exact same list, taking order into account.
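To make the point about ordering concrete, here is a minimal sketch in Python. It is an illustration only, not how panopticlick computes anything: the `fingerprint` helper and the two font lists are made up, and SHA-256 is just a convenient stand-in for "some function of the list as reported".

```python
import hashlib

def fingerprint(fonts):
    """Hash a font list exactly as reported, so the order affects the result."""
    joined = "\n".join(fonts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

# Hypothetical machines with the same fonts installed, reported in different orders.
machine_a = ["Zapf Dingbats", "American Typewriter Condensed Light", "Helvetica"]
machine_b = ["American Typewriter Condensed Light", "Helvetica", "Zapf Dingbats"]

# As reported, the two lists hash differently.
same_as_reported = fingerprint(machine_a) == fingerprint(machine_b)

# Sorting first throws away the ordering information, so the hashes agree.
same_after_sorting = fingerprint(sorted(machine_a)) == fingerprint(sorted(machine_b))

print(same_as_reported, same_after_sorting)  # False True
```

Same fonts, different fingerprints, purely because of the order. Sorting removes that difference, though the set of fonts itself can still be distinctive.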

Plugins are even worse.  It's quite possible that only you have your exact combination of plugins, even after sorting (for whatever reason, browsers don't seem to report plugins in a consistent order over time, so the order doesn't provide a stable fingerprint).  I haven't tracked down exactly why this is so, but I believe it's because some plugins are installed on demand as you visit sites.  Which ones you have and haven't collected will depend on which sites you've visited so far, and the exact versions will be affected by when you visited.

Some things that make little or no difference:
  • Whether you have cookies enabled
  • Whether you're using a normal or anonymous window (e.g., Chrome's incognito feature)
  • In at least some cases, which actual browser you're using -- different browsers may still send the same signature under the covers.
For bonus points, several popular privacy-protection mechanisms can actually make your fingerprint more distinctive, because they leave traces in the fingerprint and relatively few people use them.  Among them: at least some means of disabling JavaScript (see the EFF's paper for details).

Have I mentioned supercookies?

All in all, 80-90% of the browsers that connected to the EFF's site had a unique fingerprint.  Fewer than 1% had a fingerprint shared by more than one other browser (more technically, had an anonymity set with more than two members).  For whatever it's worth, the distribution of fingerprints displays a fine example of a "long tail".
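For anyone who wants to see what "anonymity set" means in practice, here is a rough Python sketch. The function names and the ten-item `visits` list are invented toy data, not the EFF's numbers, and it assumes each entry is one browser's fingerprint.

```python
from collections import Counter

def anonymity_set_sizes(fingerprints):
    """For each browser, count how many browsers share its fingerprint (itself included)."""
    counts = Counter(fingerprints)
    return [counts[fp] for fp in fingerprints]

def fraction_unique(fingerprints):
    """Share of browsers whose anonymity set has exactly one member."""
    sizes = anonymity_set_sizes(fingerprints)
    return sum(1 for size in sizes if size == 1) / len(sizes)

def fraction_in_sets_larger_than(fingerprints, members):
    """Share of browsers whose anonymity set has more than `members` members."""
    sizes = anonymity_set_sizes(fingerprints)
    return sum(1 for size in sizes if size > members) / len(sizes)

# Toy data: one letter per browser, same letter means same fingerprint.
visits = ["a", "b", "b", "c", "d", "d", "d", "e", "f", "g"]

print(fraction_unique(visits))                  # 0.5
print(fraction_in_sets_larger_than(visits, 2))  # 0.3
```

In the toy data half the browsers are unique and only the three "d" browsers sit in a set with more than two members. The real distribution is far more lopsided toward unique.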

So, yikes.

What to do?  In order of increasing effort:
  • Do nothing.  Roll over, go back to sleep.
  • Do nothing, but bear in mind that browsing is almost certainly not an anonymous activity.  Not a bad assumption in any case.
  • Read the EFF's (and anyone else's) summary of the situation and understand the pitfalls better.
  • Use an anonymizer (but first, read up a bit on the topic -- there are several good links scattered through the posts here tagged "anonymity").
  • Hack your browser to give out less specific information.  Sort those font lists.  Say "version 3.1" instead of "version 3.1.4.1.5.9".
  • Hack your browser to tell randomly varying harmless lies about its setup.  Randomness is important.  Fingerprints will drift naturally over time, but it turns out to be easy to connect a later version (X, Y, Z etc. plus a new plugin) with an earlier one (X, Y and Z).
  • Get the browser developers to change their APIs (e.g., don't give out lists of fonts at all).
  • Get the standards committees to make the underlying protocols more anonymous -- and then get the implementers to implement the standards.
Once you've done all that, sleep soundly knowing that panopticlick was just a proof-of-concept and that people seriously trying to fingerprint browsers have means at their disposal well beyond those mentioned here.
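Two of the ideas in the list above are easy to sketch in Python: reporting coarser information, and the way a drifted fingerprint can be tied back to an earlier one. Everything here is hypothetical. The `report` dictionary layout, the function names and the plugin names X, Y, Z are made up for illustration, and the superset check is a deliberately naive stand-in for whatever matching a real tracker would use.

```python
def coarsen_version(version, keep=2):
    """Keep only the first `keep` dot-separated parts of a version string."""
    return ".".join(version.split(".")[:keep])

def coarsen_report(report):
    """Return a less specific copy of a browser report: short version, sorted lists."""
    return {
        "version": coarsen_version(report["version"]),
        "fonts": sorted(report["fonts"]),
        "plugins": sorted(report["plugins"]),
    }

def could_be_later_snapshot(earlier_plugins, later_plugins):
    """Naive linking check: the later set contains everything in the earlier one."""
    return set(earlier_plugins) <= set(later_plugins)

# Hypothetical report, using the version number from the list above.
report = {
    "version": "3.1.4.1.5.9",
    "fonts": ["Zapf Dingbats", "American Typewriter Condensed Light"],
    "plugins": ["Z", "X", "Y"],
}
coarse = coarsen_report(report)
print(coarse["version"])  # 3.1
print(coarse["plugins"])  # ['X', 'Y', 'Z']

# The same browser later, with one new plugin: different fingerprint, easy to link.
later_plugins = ["X", "Y", "Z", "NewPlugin"]
print(could_be_later_snapshot(report["plugins"], later_plugins))  # True
```

The last line is the reason randomness matters: coarsening gives the same answer every time, so a later snapshot still lines up with an earlier one.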

The name panopticlick is a play on Jeremy Bentham's panopticon, a prison design meant to induce "a sentiment of an invisible omniscience" or, in Bentham's own words, "a new mode of obtaining power of mind over mind, in a quantity hitherto without example."

Lovely stuff, that.
