Currently I'm manually creating a string where I concatenate all the values in each row in my table. I'm hashing this string for each row to get a hash value for the curre
There are problems with CONCAT, e.g. CONCAT('ab', 'c') vs CONCAT('a', 'bc'). These are two different rows, but the result is the same. You could use CONCAT_WS(';', 'ab', 'c') to get ab;c, but in the case of CONCAT_WS(';', ';', '') vs CONCAT_WS(';', '', ';') you still get the same result.
Also, CONCAT(NULL, 'c') returns NULL, so a single NULL column makes the whole row string (and therefore its hash) NULL.
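The collisions described above can be checked directly. This is a minimal sketch that uses only literals, so no table is assumed; the column aliases are just illustrative names:

```sql
-- Each comparison returns 1 when the two inputs cannot be told apart.
SELECT
    CONCAT('ab', 'c') = CONCAT('a', 'bc')             AS concat_collides,
    CONCAT_WS(';', ';', '') = CONCAT_WS(';', '', ';') AS concat_ws_collides,
    CONCAT(NULL, 'c') IS NULL                         AS null_swallows_row;
```

All three columns should come back as 1.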
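I think the best way is to use QUOTE, which wraps each value in single quotes, escapes quotes inside the value, and turns NULL into the unquoted word NULL: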
SELECT MD5(CONCAT(QUOTE(c1), QUOTE(c2), QUOTE(c3))) AS row_hash FROM t1;
The result of:
SELECT CONCAT(QUOTE('a'), QUOTE('bc'), QUOTE('NULL'), QUOTE(NULL), QUOTE('\''), QUOTE(''));
is:
'a''bc''NULL'NULL'\''''
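As a rough check that quoting removes the ambiguities discussed earlier, the same pairs can be compared after wrapping each value in QUOTE. Again only literals are used, and the aliases are illustrative:

```sql
-- Each comparison returns 0 when the two inputs stay distinguishable.
SELECT
    CONCAT(QUOTE('ab'), QUOTE('c')) = CONCAT(QUOTE('a'), QUOTE('bc'))   AS boundary_collides,
    CONCAT(QUOTE(';'), QUOTE('')) = CONCAT(QUOTE(''), QUOTE(';'))       AS delimiter_collides,
    CONCAT(QUOTE(NULL), QUOTE('c')) = CONCAT(QUOTE('NULL'), QUOTE('c')) AS null_collides,
    MD5(CONCAT(QUOTE(NULL), QUOTE('c'))) IS NULL                        AS hash_is_null;
```

All four columns should come back as 0: the value boundaries are preserved, a real NULL differs from the string 'NULL', and the hash is no longer NULL when a column is NULL.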
Also, don't use GROUP_CONCAT() to get a hash of the whole table: its result is truncated at group_concat_max_len (1024 by default): https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_group_concat_max_len
Instead, CHECKSUM TABLE might be better, but you can't skip columns with CHECKSUM TABLE: https://dev.mysql.com/doc/refman/5.7/en/checksum-table.html
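For reference, a short sketch of both points, assuming the table t1 from the earlier query exists. The current truncation limit can be inspected before relying on GROUP_CONCAT, and CHECKSUM TABLE accepts only table names, which is why columns cannot be excluded:

```sql
-- Current GROUP_CONCAT truncation limit (1024 unless it has been changed).
SELECT @@group_concat_max_len;

-- Checksum over the entire contents of t1; the statement takes no column list.
CHECKSUM TABLE t1;
```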