MD5 – The hash algorithm is now Broken!!!
Author:
This blog is authored by Aniruddha Shrotri a fellow CRYPTOcrat. Aniruddha is the CTO and Co-founder of E-Lock. E-Lock specializes in Digital Signature & Electronic Signature Software Solutions. Aniruddha has been active in the PKI domain for a long time and in a great position to write a note about what this recent NEWS about MD5 means.
You can find more information about Aniruddha on his LinkedIn Profile.
Abstract
Here is wishing all the CRYPTOcratians a very Happy and Secure New Year! But I am afraid the New Year has brought with it a disturbing piece of news – the hash algorithm MD5 is broken – not just in theory but in practice too.
Here is a very recent and good article http://news.cnet.com/8301-1009_3-10129693-83.html that describes how the weakness in MD5 was successfully exploited to create a fake website with HTTPS that would pass the browser test. Since 2004 it has been known that a weakness existed in MD5 but this is the first time this weakness has been exploited to create a practical live demonstration.
Issues that are raised by this demo
Clearly MD5 is broken and the consequences of this can be quite grave. Here are a few points to ponder over this development:
- The weakness was demonstrated by faking a certificate so that an SSL enabled fake website can successfully spoof a genuine website. This will mean that when people browse and they see the URL to contain HTTPS and/or a lock icon in the browser status bar, they will no longer be 100% assured that they are looking at a genuine website. If this assurance is gone, then how else is one supposed to verify whether one is dealing with a genuine or a fake website?
- The resources required to create this exploit were not enormously large – 200 Playstations 3 for 2 weeks. This is certainly not beyond the means of a determined hacker. As the article mentions, this much computing power could probably be acquired for $1500 from Amazon! Consider this effort against the pay-off that a rogue hacker might have in terms of harvesting identity information of hundreds of thousand unsuspecting customers of important financial institutions like banks.
- In faking a website, another issue that the hacker has to tackle is how to get victims to connect to the rogue website as opposed to the genuine website. There are a number of phishing attacks possible, starting from sending genuine looking fake emails with links that apparently go to the genuine website but the actually URL they have is of the rogue website. More advanced techniques include poisoning the DNS Cache of some common public DNS servers. This is very insidious as there is no genuine way in the hell the victim can make out that he or she is going to the fake website – even the correct URL will take him to the wrong website! Thirdly, the hackers can setup public Wi-fi access points and provide “open” and “free” network access to users. Users of such free Internet won’t even know that some of the links they are visiting are actually spoofs. In all, it is not very difficult to make unsuspecting victims go to a fake website (phishing attacks) and now it turns out it is not difficult either to give a false sense of comfort to victims by enabling the phished web sites with SSL/HTTPS.
- Currently this attack utilizes weakness in MD5. Remember sometime back it was shown that SHA-1 too is weak (though nowhere near as weak as MD5). It is only a matter of time that similar attack becomes possible on SHA-1 also. This is a grave scenario as the only two hash algorithms in use in practice are MD5 and SHA-1. The new replacement for SHA-1, i.e., SHA-2 and SHA-3 is still a long way out. SHA-3 candidates challenge has only recently been thrown open. And practical implementations of them will take even longer to come.
- This attack is not only about faking websites – it applies equally to Digital Signature applications as well. Basically, breaking of MD5 means being able to create another set of data whose MD5 hash is identical to the MD5 hash of some given data. This would mean that if I get a digitally signed document, I can create another document whose MD5 has matches with the MD5 hash of the original document and then I can simply transplant the digital signature from the original document to the new document. I can now take this new document with the proper digital signature of the original signer and the signer would never be able to repudiate having signed this new fake document. Imagine changing someone’s will, or changing the terms of a contract this way!
- One might argue that it may not be worthwhile for an attacker to spend great resources to fake a single document (remember, MD5 hash for different documents would be almost unique, so in order to fake another document, the whole exercise of faking would have to be done again). This may not be true depending on the value of the document, but this is not even required. As demonstrated in this attack referred to in the above link, it is sufficient to create a fake certificate. The fake certificate can then be used to sign any number of documents without having to spend large resources – the resources would be required only to create fake certificate.
- If a hacker fakes the certificate of an end user, he / she will have to spend again a similar amount of resources to fake each end user’s certificate he / she wants. However, if a CA’s certificate is faked, it can be used to issue fake certificates for any number of users without spending big resources for each end user certificate.
How was the fake certificate created?
A link is given in the article referred to above to a fake SSL enabled site, which the interested parties can see by clicking here: https://i.broke.the.internet.and.all.i.got.was.this.t-shirt.phreedom.org/. If you observe the SSL certificate chain, it seems the middle certificate in the chain is the fake one. That is, the root CA ”Equifax Secure Global eBusiness CA-1” has actually never issued a certificate to “MD5 Collisions Inc. (http://phreedom.org/md5)”, but if you look at the certificate chain, you would be lead to believe that it has. This is the fake certificate.
Question is how it was created. Obviously, they constructed the certificate to contain whatever they want and then transplanted the Equifax root CA’s signature from some other legitimate certificate to this fake one. This of course assumes that the MD5 hash of the to-be-signed part of the both the certificates was identical. But getting the MD5 hash of some exact data you want (the new fake certificate content) to be identical to some pre-determined value (the MD5 hash of the original certificate from which the signature was transplanted) is nearly impossible. So, it would be required to keep adding some extra data to the data you want and keep trying to match the hash. In the case of a certificate, such extra data can easily be put in a field called “certificate extension” which is pretty much allowed to contain anything including your cat’s picture. As long as the extension is not marked “critical”, the receivers are supposed to ignore any extension they do not understand.
So, I had expected to see some unknown extension in the fake certificate with potentially large random looking data, so that when this data was taken together with the rest of the certificate content, the hash would match some pre-decided hash. To my surprise, I found no such extension – all the information in the certificate is standard and well recognized by all browsers.
There is a good and detailed explanation of how the fake certificate was created if you follow the links given on the website. It is worth reading if you are interested in the details and it is explained in terms not far from what laymen would understand. Still it is very intricate in details. To summarize: they did not transplant signature from any old certificate that the root CA had signed – they specially constructed a real certificate, which they got the root CA to sign. They required the root CA to issue certificates with predictable validity period and serial number. They used the chosen prefix collision attack in which some prefix of the both the colliding data can be chosen by the attacker – this alone would probably rule out transplanting the signature from any old certificate onto the fake certificate. The major portion of collision block was absorbed in the RSA Public key modulus. Some “tumor” (as they call it) is visible in the fake certificate in the form of strange content of the “Netscape Comment” extension in the certificate.
What can be done
Having scared you enough with different things the hackers can do to make your life miserable, lets look into what can be done to alleviate the pain a bit:
- All the CAs and all digital signature softwares should stop using MD5 as algorithm.
- All CAs should revoke any certificates that used MD5 and re-issue certificates wih SHA1 or better algorithms.Note that this and the previous step actually may not help much. The reason is that the hackers wont be using the CA-issued certificate directly anyway – they will create a new fake certificate that LOOKS like being issued by the said CA and transplant the signature from a previously issued certificate that used MD5. Even though this particular attack required the team to create a new valid certificate that “looked like” the fake certificate they were going to create, it is conceivable that with some improvements, for transplanting, the hacker would just need a certificate issued by the CA that uses MD5 – does not matter whether it is expired, revoked or live.
- The browsers should start issuing at least a warning when they find any certificate that uses MD5 in a trust chain during an SSL transaction.Given both the points above, it is a wonder that companies like Verisign and Microsoft have chosen to downplay this attack.
- The digital signature softwares should all start doing the same during any signature verification – flag a warning at least if it detects use of MD5 in the signature or in any certificate up the trust chain.
- Even though in this instance there was no “unknown” extension in the certificate, it would be prudent for the browsers and digital signature verification software to flag a warning if they find any certificate in the chain to contain an unknown extension (even though marked “not critical”) which contain suspect looking data (i.e., contains ASN Octet String or Bit String data types).
Please feel free to send in your inputs using the comments section below. Additionally you could also use our group’s discussion forum link to send in your comments for Aniruddha.












