I find myself needing to generate a checksum for a string of data, for consistency purposes. The broad idea is that the client can regenerate the checksum based on the payl
Javascript implementation of MD4, MD5 and SHA1. BSD license.
Other people have mentioned CRC32 already, but here's a link to the W3C implementation of CRC-32 for PNG, as one of the few well-known, reputable sites with a reference CRC implementation.
(A few years back I tried to find a well-known site with a CRC algorithm or at least one that cited the source for its algorithm, & was almost tearing my hair out until I found the PNG page.)
In my search for a JavaScript implementation of a good checksum algorithm I came across this question. Andrzej Doyle rightfully chose Adler32 as the checksum, as it is indeed easy to implement and has some excellent properties. DroidOS then provided an actual implementation in JavaScript, which demonstrated the simplicity.
However, the algorithm can be further improved upon as detailed in the Wikipedia page and as implemented below. The trick is that you need not determine the modulo in each step. Rather, you can defer this to the end. This considerably increases the speed of the implementation, up to 6x faster on Chrome and Safari. In addition, this optimalisation does not affect the readability of the code making it a win-win. As such, it definitely fits in well with the original question as to having an algorithm / implementation that is computationally light.
function adler32(data) {
var MOD_ADLER = 65521;
var a = 1, b = 0;
var len = data.length;
for (var i = 0; i < len; i++) {
a += data.charCodeAt(i);
b += a;
}
a %= MOD_ADLER;
b %= MOD_ADLER;
return (b << 16) | a;
}
edit: imaya created a jsperf comparison a while back showing the difference in speed when running the simple version, as detailed by DroidOS, compared to an optimised version that defers the modulo operation. I have added the above implementation under the name full-length to the jsperf page showing that the above implementation is about 25% faster than the one from imaya and about 570% faster than the simple implementation (tests run on Chrome 30): http://jsperf.com/adler-32-simple-vs-optimized/6
edit2: please don't forget that, when working on large files, you will eventually hit the limit of your JavaScript implementation in terms of the a and b variables. As such, when working with a large data source, you should perform intermediate modulo operations as to ensure that you do not exceed the maximum value of the integer that you can reliably store.