CHECKSUM and CHECKSUM_AGG: What's the algorithm?

Submitted by 放荡痞女 on 2019-12-28 04:32:10

Question


We compute checksums of some data in SQL Server as follows:

declare @cs int;
select 
    @cs = CHECKSUM_AGG(CHECKSUM(someid, position))
from 
    SomeTable
where 
    userid = @userId
group by 
    userid;

This data is then shared with clients. We'd like to be able to recompute the checksum on the client side, but there doesn't seem to be any documentation of how the two functions above calculate their results. Can anyone enlighten me?


Answer 1:


A page on the SQL Server Forum states:

"The built-in CHECKSUM function in SQL Server is built on a series of 4-bit left rotational XOR operations. See this post for more explanation."
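
To make "4-bit left rotation" concrete, here is a minimal JavaScript sketch of that single operation on a 32-bit integer. The function name rotl4 is my own, and nothing here is taken from SQL Server itself. It only illustrates the arithmetic: bits shifted off the top re-enter at the bottom, so eight rotations of 4 bits bring a 32-bit value back to where it started.

```javascript
// Rotate a 32-bit integer left by 4 bits.
// The top 4 bits wrap around and become the bottom 4 bits.
function rotl4(x) {
  return ((x << 4) | (x >>> 28)) | 0; // "| 0" keeps a signed 32-bit result
}

console.log(rotl4(0x00000001));     // 16
console.log(rotl4(0x80000000 | 0)); // 8 (the top bit wrapped around)

// 8 rotations x 4 bits = 32 bits, so the value returns to its start.
let v = 0x12345678;
for (let i = 0; i < 8; i++) {
  v = rotl4(v);
}
console.log(v === 0x12345678);      // true
```

One consequence, if the description above is accurate, is that a character's contribution lands on the same bit positions again every eight characters.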




Answer 2:


The CHECKSUM function doesn't produce a very good quality checksum and IMO is pretty useless for most purposes. As far as I know, the algorithm isn't published. If you want a check that you can reproduce yourself, use the HASHBYTES function with one of the standard, published algorithms such as MD5 or SHA.
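
As a rough illustration of the client side of that approach, here is a sketch using Node.js's built-in crypto module. The helper name md5Hex is hypothetical. The encoding choice is an assumption you would need to check against your own column types: as far as I know, HASHBYTES hashes the raw bytes of its input, so ASCII varchar data would correspond to single-byte encoding and nvarchar data to UTF-16LE.

```javascript
const crypto = require('crypto');

// Hash the bytes of `text` in the given encoding and return lowercase hex.
// Assumption: the encoding must match how SQL Server stores the value
// ('utf8' or 'latin1' for ASCII varchar data, 'utf16le' for nvarchar).
function md5Hex(text, encoding) {
  return crypto
    .createHash('md5')
    .update(Buffer.from(text, encoding))
    .digest('hex');
}

console.log(md5Hex('abc', 'utf8'));    // 900150983cd24fb0d6963f7d28e17f72
console.log(md5Hex('abc', 'utf16le')); // a different digest: the bytes differ
```

To combine several columns or rows into one value, both sides would also have to agree on how values are converted to text and concatenated (separators, ordering, NULL handling) before hashing.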




Answer 3:


// Quick hash sum with mirrored (matching) SQL and C# implementations. Ukraine

     private Int64 HASH_ZKCRC64(byte[] Data)
    {
        Int64 Result = 0x5555555555555555;
        if (Data == null || Data.Length <= 0) return 0;
        int SizeGlobalBufer = 8000;
        int Ost = Data.Length % SizeGlobalBufer;
        int LeftLimit = (Data.Length / SizeGlobalBufer) * SizeGlobalBufer;

        for (int i = 0; i < LeftLimit; i += 64)
        {
            Result = Result
            ^ BitConverter.ToInt64(Data, i)
            ^ BitConverter.ToInt64(Data, i + 8)
            ^ BitConverter.ToInt64(Data, i + 16)
            ^ BitConverter.ToInt64(Data, i + 24)
            ^ BitConverter.ToInt64(Data, i + 32)
            ^ BitConverter.ToInt64(Data, i + 40)
            ^ BitConverter.ToInt64(Data, i + 48)
            ^ BitConverter.ToInt64(Data, i + 56);
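            // Bit 7 of the lowest byte here is the sign bit that the SQL version tests with @I64<0
            // (the result is byte-reversed at the end to match SQL's big-endian CAST).
            if ((Result & 0x0000000000000080) != 0)
                Result = Result ^ BitConverter.ToInt64(Data, i + 28);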
        }

        if (Ost > 0)
        {
           byte[] Bufer = new byte[SizeGlobalBufer];
           Array.Copy(Data, LeftLimit, Bufer, 0, Ost);
           for (int i = 0; i < SizeGlobalBufer; i += 64)
           {
               Result = Result
               ^ BitConverter.ToInt64(Bufer, i)
               ^ BitConverter.ToInt64(Bufer, i + 8)
               ^ BitConverter.ToInt64(Bufer, i + 16)
               ^ BitConverter.ToInt64(Bufer, i + 24)
               ^ BitConverter.ToInt64(Bufer, i + 32)
               ^ BitConverter.ToInt64(Bufer, i + 40)
               ^ BitConverter.ToInt64(Bufer, i + 48)
               ^ BitConverter.ToInt64(Bufer, i + 56);
               if ((Result & 0x0000000000000080) != 0)
                   Result = Result ^ BitConverter.ToInt64(Bufer, i + 28);
           }
        }

        byte[] MiniBufer = BitConverter.GetBytes(Result);
        Array.Reverse(MiniBufer);
        return BitConverter.ToInt64(MiniBufer, 0);

        #region SQL_FUNCTION
        /*  CREATE FUNCTION [dbo].[HASH_ZKCRC64] (@data as varbinary(MAX)) Returns bigint
            AS
            BEGIN
            Declare @I64 as bigint Set @I64=0x5555555555555555
            Declare @Bufer as binary(8000)
            Declare @i as int Set @i=1
            Declare @j as int 
            Declare @Len as int Set @Len=DataLength(@data)  -- DataLength counts bytes; Len is documented as excluding trailing spaces

            if ((@data is null) Or (@Len<=0)) Return 0

              While @i<=@Len
              Begin
               Set @Bufer=Substring(@data,@i,8000)
               Set @j=1
                   While @j<=8000
                   Begin
                    Set @I64=@I64 
                    ^ CAST(Substring(@Bufer,@j,   8) as bigint) 
                    ^ CAST(Substring(@Bufer,@j+8, 8) as bigint) 
                    ^ CAST(Substring(@Bufer,@j+16,8) as bigint)
                    ^ CAST(Substring(@Bufer,@j+24,8) as bigint)
                    ^ CAST(Substring(@Bufer,@j+32,8) as bigint)
                    ^ CAST(Substring(@Bufer,@j+40,8) as bigint)
                    ^ CAST(Substring(@Bufer,@j+48,8) as bigint)
                    ^ CAST(Substring(@Bufer,@j+56,8) as bigint)
                    if @I64<0 Set @I64=@I64 ^ CAST(Substring(@Bufer,@j+28,8) as bigint)      
                    Set @j=@j+64    
                   End;  
               Set @i=@i+8000
              End
            Return @I64
            END
         */
        #endregion

   }



Answer 4:


I figured out the CHECKSUM algorithm, at least for ASCII characters. I created a proof of it in JavaScript (see https://stackoverflow.com/a/59014293/9642).

In a nutshell: for each character, rotate the running value 4 bits left and XOR it with a code for that character. The trick was figuring out the "XOR codes". Here's the table of those:

var xorcodes = [
    0, 1, 2, 3, 4, 5, 6, 7,
    8, 9, 10, 11, 12, 13, 14, 15,
    16, 17, 18, 19, 20, 21, 22, 23,
    24, 25, 26, 27, 28, 29, 30, 31,
    0, 33, 34, 35, 36, 37, 38, 39,  //  !"#$%&'
    40, 41, 42, 43, 44, 45, 46, 47,  // ()*+,-./
    132, 133, 134, 135, 136, 137, 138, 139,  // 01234567
    140, 141, 48, 49, 50, 51, 52, 53, 54,  // 89:;<=>?@
    142, 143, 144, 145, 146, 147, 148, 149,  // ABCDEFGH
    150, 151, 152, 153, 154, 155, 156, 157,  // IJKLMNOP
    158, 159, 160, 161, 162, 163, 164, 165,  // QRSTUVWX
    166, 167, 55, 56, 57, 58, 59, 60,  // YZ[\]^_`
    142, 143, 144, 145, 146, 147, 148, 149,  // abcdefgh
    150, 151, 152, 153, 154, 155, 156, 157,  // ijklmnop
    158, 159, 160, 161, 162, 163, 164, 165,  // qrstuvwx
    166, 167, 61, 62, 63, 64, 65, 66,  // yz{|}~
];

The main thing to note is the bias towards alphanumerics (their codes are similar and ascending). English letters use the same code regardless of case.

I haven't tested high codes (128+) nor Unicode.
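
For readers who want to see how the table would be applied, here is a minimal sketch of the rotate-then-XOR loop as described in this answer. The function name checksumSketch and the tiny sample table are mine, for illustration only. The sample holds just four entries copied from the table above (a/A = 142, b/B = 143). Throwing on a character with no code is my own choice, since codes outside the table are untested. I have not verified these outputs against SQL Server.

```javascript
// For each character: rotate the 32-bit accumulator left by 4 bits,
// then XOR in that character's code. `codes` maps charCodeAt() values
// to XOR codes (for example, the xorcodes array from this answer).
function checksumSketch(text, codes) {
  let acc = 0;
  for (let i = 0; i < text.length; i++) {
    const code = codes[text.charCodeAt(i)];
    if (code === undefined) {
      // Assumption: refuse characters the table does not cover.
      throw new RangeError('no XOR code for character at index ' + i);
    }
    acc = ((acc << 4) | (acc >>> 28)) ^ code;
  }
  return acc | 0; // signed 32-bit result
}

// Illustration-only excerpt of the table: 'A'/'a' -> 142, 'B'/'b' -> 143.
const sample = { 65: 142, 66: 143, 97: 142, 98: 143 };

console.log(checksumSketch('a', sample));  // 142
console.log(checksumSketch('ab', sample)); // 2159
console.log(checksumSketch('AB', sample)); // 2159 (same codes regardless of case)
```

Passing the table in as a parameter keeps the loop separate from the data, so the full xorcodes array can be swapped in without touching the function.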



Source: https://stackoverflow.com/questions/16316009/checksum-and-checksum-agg-whats-the-algorithm
