Is there a checksum algorithm that also supports “subtracting” data from it?

旧时难觅i 2021-02-20 13:43

I have a system with roughly 100 million documents, and I'd like to keep track of their modifications between mirrors. In order to exchange information about modifications ef

1 Answer
  • 2021-02-20 14:16

    How about

    hash = X(documents, 0, function(document) { ... })
    

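    where X is an aggregate XOR (in JavaScript):

    // XOR the per-document hash f(document) of every document into x.
    // Note: ^ on JavaScript Numbers operates on 32-bit integers.
    function X(documents, x, f)
    {
       for (const document of documents)
       {
          x ^= f(document);
       }
       return x;
    }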
    

    and f() is a hash of each document's information (timestamp, filename, ID, or whatever)?

    Because XOR is its own inverse, XORing a document's hash into the aggregate a second time "subtracts" that document out, while hashing each document individually preserves the hash-like quality of detecting small changes.
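A minimal sketch of the "subtract" step, assuming Node.js. The names `docHash` and `xorAggregate`, the `id` and `modified` fields, and the choice of a truncated SHA-256 are all hypothetical illustrations rather than part of the answer above. BigInt values are used so the aggregate keeps 64 bits instead of the 32 bits that `^` gives on plain Numbers.

```javascript
const crypto = require("crypto");

// Hypothetical f(): the first 8 bytes of SHA-256 over the document's
// id and modification time, returned as a 64-bit BigInt.
function docHash(doc) {
  const digest = crypto
    .createHash("sha256")
    .update(`${doc.id}:${doc.modified}`)
    .digest();
  return digest.readBigUInt64BE(0);
}

// Aggregate XOR: fold every per-document hash into x.
function xorAggregate(documents, x, f) {
  for (const doc of documents) {
    x ^= f(doc);
  }
  return x;
}

// Made-up sample documents for illustration only.
const docs = [
  { id: "a", modified: 1 },
  { id: "b", modified: 2 },
  { id: "c", modified: 3 },
];

const full = xorAggregate(docs, 0n, docHash);

// "Subtract" document b by XORing its hash in a second time.
const withoutB = full ^ docHash(docs[1]);

// The result matches an aggregate computed without b from scratch.
const expected = xorAggregate([docs[0], docs[2]], 0n, docHash);
console.log(withoutB === expected);
```

Updating a modified document works the same way: XOR out the hash of the old version, then XOR in the hash of the new one. Two properties of XOR are worth keeping in mind with this approach: the aggregate does not depend on document order, and two entries with identical hashes cancel each other out.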
