I\'m using the chechsum function in sql server 2008 R2 and I would like to get the same int values in a C# app. Is there any equivalent method in c# that returns the values
In case you need to do a checksum on a GUID, change dna2's answer to this:
private int SQLBinaryChecksum(byte[] text)
With a byte array, the value from SQL will match the value from C#. To test:
var a = Guid.Parse("DEAA5789-6B51-4EED-B370-36F347A0E8E4").ToByteArray();
Console.WriteLine(SQLBinaryChecksum(a));
vs SQL:
select BINARY_CHECKSUM(CONVERT(uniqueidentifier,'DEAA5789-6B51-4EED-B370-36F347A0E8E4'))
both answers will be -1897092103.
On SQL Server Forum, at this page, it's stated:
The built-in CHECKUM function in SQL Server is built on a series of 4 bit left rotational xor operations. See this post for more explanation.
I was able to port the BINARY_CHECKSUM to c# and it seems to be working... I'll be looking at the plain CHECKSUM later...
private int SQLBinaryChecksum(string text)
{
long sum = 0;
byte overflow;
for (int i = 0; i < text.Length; i++)
{
sum = (long)((16 * sum) ^ Convert.ToUInt32(text[i]));
overflow = (byte)(sum / 4294967296);
sum = sum - overflow * 4294967296;
sum = sum ^ overflow;
}
if (sum > 2147483647)
sum = sum - 4294967296;
else if (sum >= 32768 && sum <= 65535)
sum = sum - 65536;
else if (sum >= 128 && sum <= 255)
sum = sum - 256;
return (int)sum;
}
@Dan's implementation of BinaryChecksum can be greatly simplified down in c# down to
int SqlBinaryChecksum(string text)
{
uint accumulator = 0;
for (int i = 0; i < text.Length; i++)
{
var leftRotate4bit = (accumulator << 4) | (accumulator >> -4);
accumulator = leftRotate4bit ^ text[i];
}
return (int)accumulator;
}
This also makes it clearer what the algorithm is doing. For each character, a 4 bit circular shift then an xor with character's byte
The T-SQL documentation does not specify what algorithm is used by checksum()
outside of this:
CHECKSUM computes a hash value, called the checksum, over its list of arguments. The hash value is intended for use in building hash indexes. If the arguments to CHECKSUM are columns, and an index is built over the computed CHECKSUM value, the result is a hash index. This can be used for equality searches over the columns.
It's unlikely to compute an MD5 hash, since its return value (the computed hash) is a 32-bit integer; an MD5 hash is 128 bits in length.
CHECKSUM
docs don't disclose how it computes the hash. If you want a hash you can use in T-SQL and C#, pick from the algorithms supported in HashBytes