Sunday, January 11, 2009

A bit more on MD5 cracking in practice

While we're on the topic, I should point out that Sotirov et. al. were not the first to put the theoretical weakness of MD5 into practice. The timeline from my previous post is
  • A while ago, someone published a paper showing that MD5 was vulnerable.
  • Late in 2008, Sotirov et. al. disclosed that they had forged a root certificate.
  • Soon thereafter (right about now), the CAs got serious about updating their root certificates.
All well and good, but there are a few missing pieces (and even then, this is just scratching the surface):
  • Rivest introduced MD5 in 1991
  • The first theoretical indication that MD5 was weak came in 1996, when Dobbertin published a paper on the subject.
  • In 2005 Wang and Yu published a paper with the straightforward title "How to Break MD5 and Other Hash Functions", including fully-worked examples.
  • Also in 2005, Lenstra, Wang and de Weber published a paper announcing that, using Wang's method, they had produced colliding X.509 certificates (the kind everyone uses). At this point, one could make a very strong argument that the cat was out of the bag as far as applications like SSL were concerned. A couple of key quotes:
    • "With this construction we show that MD5 collisions can be crafted easily in such a way that the principles underlying the trust in Public Key Infrastructure [the basis for SSL] are violated."
    • Below is an example pair of colliding certificates in full detail (byte dump).
  • In 2007, Lenstra and de Weber, along with Stevens, got some press by claiming to have predicted the outcome of the 2008 presidential election, and then, once they had your attention, explaining that they'd really just made multiple predictions with identical MD5 hashes.
  • Finally, in late 2008, we join our story currently in progress.
In other words, MD5 has been known to be weak in theory for more than a decade, and in practice for several years. As usual, Wikipedia has more background and pointers to original sources (several of which I used here).

No comments: