Ambiguous contract is ambiguous [Minor bug fix for CRC32 and MD5Managed HashAlgorithm implementations]
Kind reader Gregor Zurowski contacted me over the weekend to let me know that he was using my free CRC-32 HashAlgorithm implementation in his project and had found that omitting the call to the Initialize method led to incorrect hash values being returned. My first thought was, "Well, sure, calling the Initialize method before using the class is a necessary part of the HashAlgorithm contract - if you don't satisfy the contract then problems like this are entirely possible." But Gregor went on to say that he got perfectly good results from the .NET Framework's MD5/SHA1 implementations without needing to call the Initialize method. I wondered if maybe the Framework classes had anticipated this scenario and allowed callers to skip the Initialize call, but I was beginning to suspect that my understanding of the contract was wrong and that my implementation had a bug...
Sure enough, when I consulted the documentation for the Initialize method there was no mention of a contractual need to call it prior to doing anything else: "Initializes an implementation of the HashAlgorithm class." I poked around a bit more trying to find some kind of justification for my beliefs, but I couldn't and was soon forced to conclude that I'd simply been mistaken by assuming that the .NET Framework followed the same pattern as COM or Win32. What's worse, I'd made the mistake twice: once for my CRC32 class and again with my MD5Managed class. :(
Well, as long as I was taking some time to validate my assumptions, I decided to check up on my assumption that a final call to TransformFinalBlock was required prior to fetching the hash value. Fortunately, I was right about this one and the TransformFinalBlock documentation notes that "You must call the TransformFinalBlock method after calling the TransformBlock method but before you retrieve the final hash value." So while I'd clearly started out on the wrong foot, at least I didn't botch the landing, too. :)
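To make that ordering concrete, here is a minimal sketch of the block-at-a-time pattern using the Framework's own MD5 class. The class name BlockHashingExample, the helper HashInChunks, and the chunk size are made up for illustration; they are not part of my code or of the Framework.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class BlockHashingExample
{
    // Hypothetical helper: feeds data to the algorithm in chunks, then finalizes.
    public static byte[] HashInChunks(HashAlgorithm algorithm, byte[] data, int chunkSize)
    {
        int offset = 0;
        while (offset < data.Length)
        {
            int count = Math.Min(chunkSize, data.Length - offset);
            // A null output buffer means "just hash it, don't copy the input anywhere".
            algorithm.TransformBlock(data, offset, count, null, 0);
            offset += count;
        }

        // The final call is what completes the hash; an empty final block is allowed.
        algorithm.TransformFinalBlock(new byte[0], 0, 0);

        // Only now is it valid to read the hash value.
        return algorithm.Hash;
    }

    public static void Main()
    {
        byte[] data = Encoding.ASCII.GetBytes("The quick brown fox jumps over the lazy dog");

        byte[] chunked;
        using (MD5 md5 = MD5.Create())
        {
            chunked = HashInChunks(md5, data, 8);
        }

        byte[] oneShot;
        using (MD5 md5 = MD5.Create())
        {
            oneShot = md5.ComputeHash(data);
        }

        // Both approaches should produce the same digest.
        Console.WriteLine(BitConverter.ToString(chunked));
        Console.WriteLine(BitConverter.ToString(oneShot));
    }
}
```

Note that nothing here calls Initialize on the freshly created MD5 instance, which matches what Gregor observed with the Framework classes.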
The good news here is that this mistake is trivial to detect during application development (with the affected classes, failing to call the Initialize method always yields the wrong hash value) and it's also trivial to work around: simply call the Initialize method after constructing a new instance of one of these HashAlgorithm subclasses. In fact, if your application already does this (as my ComputeFileHashes suite does), then you are completely unaffected by this issue.
Of course, I didn't want to perpetuate this error any further than I already had, so I corrected the code for the CRC32 and MD5Managed classes by performing the initialization as part of the object construction process. I then updated the following resources:
- My CRC-32 blog post which includes the source code for the CRC-32 implementation
- My managed MD5 blog post which includes the source code for the managed MD5 implementation
- The MD5Managed test harness source code download which contains the MD5Managed.cs file
- The ComputeFileHashes source code download which contains both CRC32.cs and MD5Managed.cs
As I note above, there was no need to update the ComputeFileHashes application binaries because they were written under the assumption that a call to Initialize was required - and are therefore correct whether or not a call to Initialize is actually required.
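For anyone curious what "performing the initialization as part of the object construction process" looks like, here is a rough sketch. It is not the actual code from CRC32.cs; the class name Crc32Sketch, the ResetState helper, and the big-endian output order are all assumptions made for this example. The only point is that the constructor establishes the starting state, so a caller who never calls Initialize still gets the right answer.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical illustration only; not the CRC32 class from the downloads.
public sealed class Crc32Sketch : HashAlgorithm
{
    private const uint Polynomial = 0xEDB88320u; // reflected CRC-32 polynomial
    private static readonly uint[] Table = BuildTable();
    private uint _crc;

    public Crc32Sketch()
    {
        HashSizeValue = 32;
        // The fix: set up the starting state during construction.
        ResetState();
    }

    // Initialize still works, but is no longer required before first use.
    public override void Initialize()
    {
        ResetState();
    }

    private void ResetState()
    {
        _crc = 0xFFFFFFFFu;
    }

    protected override void HashCore(byte[] array, int ibStart, int cbSize)
    {
        for (int i = ibStart; i < ibStart + cbSize; i++)
        {
            _crc = Table[(_crc ^ array[i]) & 0xFF] ^ (_crc >> 8);
        }
    }

    protected override byte[] HashFinal()
    {
        uint result = ~_crc;
        // Assumption for this sketch: emit the checksum most significant byte first.
        return new[] { (byte)(result >> 24), (byte)(result >> 16), (byte)(result >> 8), (byte)result };
    }

    private static uint[] BuildTable()
    {
        uint[] table = new uint[256];
        for (uint n = 0; n < 256; n++)
        {
            uint value = n;
            for (int bit = 0; bit < 8; bit++)
            {
                value = (value & 1) != 0 ? (value >> 1) ^ Polynomial : value >> 1;
            }
            table[n] = value;
        }
        return table;
    }
}

public static class Crc32SketchDemo
{
    public static void Main()
    {
        // No Initialize call anywhere; the constructor has already done the work.
        byte[] hash = new Crc32Sketch().ComputeHash(Encoding.ASCII.GetBytes("123456789"));
        Console.WriteLine(BitConverter.ToString(hash)); // prints "CB-F4-39-26"
    }
}
```

The value 0xCBF43926 is the standard CRC-32 check value for the ASCII string "123456789", which makes it a handy sanity test. Had the sketch left _crc at its default of zero and relied on the caller to invoke Initialize, the same input would produce a different (wrong) result.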
Again, my thanks go out to Gregor for bringing this issue to my attention - I sincerely hope that no one else was affected by this mistake. While I do strive for perfection, I obviously need all the help I can get along the way! :)