The blog of dlaa.me

Ambiguous contract is ambiguous [Minor bug fix for CRC32 and MD5Managed HashAlgorithm implementations]

Kind reader Gregor Zurowski contacted me over the weekend to let me know that he was using my free CRC-32 HashAlgorithm implementation in his project and he'd found that omitting the call to the Initialize method lead to incorrect hash values being returned. My first thought was, "Well, sure, calling the Initialize method before using the class is a necessary part of the HashAlgorithm contract - if you don't satisfy the contract then problems like this are entirely possible.". But Gregor went on to say that he got perfectly good results from the .NET Framework's MD5/SHA1 implementations without needing to call the Initialize method. I wondered if maybe the Framework classes had anticipated this scenario and allowed callers to skip the initialize call, but I was beginning to suspect that my understanding of the contract was wrong and that my implementation had a bug...

Sure enough, when I consulted the documentation for the Initialize method there was no mention of a contractual need to call it prior to doing anything else: "Initializes an implementation of the HashAlgorithm class.". I poked around a bit more trying to find some kind of justification for my beliefs, but I couldn't and was soon forced to conclude that I'd simply been mistaken by assuming that the .NET Framework followed the same pattern as COM or Win32. What's worse, I'd made the mistake twice: once for my CRC32 class and again with my MD5Managed class. :(

Well, as long as I was taking some time to validate my assumptions, I decided to check up on my assumption that a final call to TransformFinalBlock was required prior to fetching the hash value. Fortunately, I was right about this one and the TransformFinalBlock documentation notes that "You must call the TransformFinalBlock method after calling the TransformBlock method but before you retrieve the final hash value.". So while I'd clearly started out on the wrong foot, at least I didn't botch the landing, too. :)

The good news here is that this mistake is trivial to detect during application development (failing to call the Initialize method always yields the wrong hash value) and it's also trivial to work around: simply call the Initialize method after constructing a new instance of one of these HashAlgorithm subclasses. In fact, if your application already does this (as my ComputeFileHashes suite does), then you are completely unaffected by this issue.

Of course, I didn't want to perpetuate this error any further than I already had, so I corrected the code for the CRC32 and MD5Managed classes by performing the initialization as part of the object construction process. I then updated the following resources:

As I note above, there was no need to update the ComputeFileHashes application binaries because they were written under the assumption that a call to Initialize was required - and therefore are correct whether or not a call to Initialize actually is required.

Again, my thanks go out to Gregor for bringing this issue to my attention - I sincerely hope that no one else was affected by this mistake. While I do try to strive for perfection, I obviously need all the help I can get along the way! :)