mySQL: get hash value for each row?

前端 未结 5 1799
长发绾君心
长发绾君心 2021-01-04 07:00

Currently I\'m manually creating a string where I concatenate all the values in each row in my table. I\'m hashing this string for each row to get a hash value for the curre

相关标签:
5条回答
  • 2021-01-04 07:43

    Late to the party, but...

    You could add an updated field that gets updated whenever the row changes.
    Then you only have to keep track of the timestamps, that show like this

    updated: 2017-12-30 17:51:19 
    

    This way not only you know if a row changed, but you also know when was the last update.
    MySQL command:

    ALTER TABLE mytable ADD `updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
    ON UPDATE CURRENT_TIMESTAMP;
    

    The default precision is a second (that should be enough, if you only check from time to time if a row changed). But you could set a fraction of a second precision, like microseconds, if necessary

    ALTER TABLE mytable ADD `updated` timestamp(6) NOT NULL
    DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6);
    

    The timestamp will then show like

    updated: 2017-12-30 17:51:19.123456 
    

    Link to the documentation, this is for 8.0 but I've been using that on 5.7 successfully.

    0 讨论(0)
  • 2021-01-04 07:52

    Well I made a little script that could do excactly what you want, and maybe what others want... so here it goes...for PHP that is... first you have to make a list of columns of the table, then you make a "case when" statement for each column based on their type and put that in the concat_ws statement and then you hash it with sha1...i've used this method on very large tables (600000+ records) and the speed is quite good when selecting all records. also I think that it is faster to concat the required data in a concat_ws and explode it in php or whatever you are using, but that is just a hunch...

    <?
    $query= mysql_query("SHOW COLUMNS FROM $table", $linklive);
            while ($col = mysql_fetch_assoc($query)) {
                $columns[] = mysql_real_escape_string($col['Field']);
                if ($col['Key'] == 'PRI') {
                    $key = mysql_real_escape_string($col['Field']);
                }
                $columnsinfo[$col['Field']] = $col;
            }
            $dates = array("date","datetime","time");
                        $int = array("int","decimal");
                        $implcols = array();
                        foreach($columns as $col){
                            if(in_array($columnsinfo[$col]['Type'], $dates)){
                                $implcols[] = "(CASE WHEN (UNIX_TIMESTAMP(`$col`)=0 || `$col` IS NULL) THEN '[$col EMPTY]' ELSE `$col` END)";
                            }else{
                                list($type, $rest) = explode("(",$columnsinfo[$col]['Type']);
                                if(in_array($columnsinfo[$col]['Type'], $dates)){
                                    $implcols[] = "(CASE WHEN ( `$col`=0 || `$col` IS NULL ) THEN '[$col EMPTY]' ELSE `$col` END)";
                                }else{
                                    $implcols[] = "(CASE WHEN ( `$col`='' || `$col` IS NULL ) THEN '[$col EMPTY]' ELSE `$col` END)";
                                }
                            }
                        }
                        $keyslive = array();
                        //echo "SELECT $key SHA1(CONCAT_WS('',".implode(",", $columns).")) as compare FROM $table"; exit;
                        $q = "SELECT $key as `key`, SHA1(CONCAT_WS('',".implode(", ",$implcols).")) as compare FROM $table";
        ?>
    
    0 讨论(0)
  • 2021-01-04 07:56

    you could do something like

    SELECT MD5(concat(field1, field2, field3, ...)) AS rowhash
    

    but you can't get away from listing which fields you want, as concat(*) is not an option (syntax error).

    0 讨论(0)
  • 2021-01-04 07:56

    It's better to use concat_ws(). e.g. two adjacent column: 12,3 => 1,23 .

    Sorry, this still has some problems. Think about the null value, empty string, string can contain ',', etc...

    A program is required to generate the hash statement, which should replace null to specific value (for null-able columns), and also use the seldom used char/byte as separator.

    0 讨论(0)
  • 2021-01-04 08:00

    There are problems with CONCAT, e.g. CONCAT('ab', 'c') vs CONCAT('a', 'bc'). Two different rows, but result is the same. You could use CONCAT_WS(';', 'ab', 'c') to get ab;c but in case of CONCAT_WS(';', ';', '') vs CONCAT_WS(';', '', ';') you still get the same result.

    Also CONCAT(NULL, 'c') returns NULL.

    I think the best way is to use QUOTE:

    SELECT MD5(CONCAT(QUOTE(c1), QUOTE(c2), QUOTE(c3))) AS row_hash FROM t1;

    Result of: select (concat(quote('a'), quote('bc'), quote('NULL'), quote(NULL), quote('\''), quote('')));

    is: 'a''bc''NULL'NULL'\''''

    Also, don't use GROUP_CONCAT() to get hash of table, it has limit: https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_group_concat_max_len

    Instead, CHECKSUM TABLE might be better, but you can't skip columns with CHECKSUM TABLE https://dev.mysql.com/doc/refman/5.7/en/checksum-table.html

    0 讨论(0)
提交回复
热议问题