问题
I have two databases for logging stuff, which I want to check for synchronization.
The solution approved is to send periodically (lets say hourly) a select to both, generate a hash of the result set and compare them. If they match then great, otherwise generate some alarms.
Currently I'm doing it by (bash script):
log_table="SELECT column1, column2, column3 FROM log_table where to_char(timestamp, '$ts_format') = '$tx_moment'";
PSQL="psql -t -q -h $_gp_host -U $_gp_user -d log_schema -c ";
echo "`${PSQL} $tx_fix${log_table} | sort | cksum`";
I would like to do the cksum/hash on the postgres side. Because currently it downloads the whole result set (which can have even 25MB or more) and generates the checksum on the server side.
Google didn't help.
Any suggestions?
Thanks.
回答1:
You could use md5:
log_table="
SELECT
md5(column1 || column2 || column3) AS hash,
column1, column2, column3
FROM log_table where to_char(timestamp, '$ts_format') = '$tx_moment'";
回答2:
If you want to do a hash over all of it at once, that's going to use up a lot of memory server-side as well. And once you hit 1Gb, it won't work anymore since a single string can't be longer than that.
Perhaps something like this will work, which basically does a hash of each row, and then hashes those hashes. It will still break whe nthe length of hashes go above 1Gb - you'll need to write a custom md5 aggregate to get around that.
SELECT md5(concat(md5(column1 || column2 || column3))) FROM log_table WHERE ...
This requires that you have created the custom aggregate concat like this:
CREATE AGGREGATE concat (
BASETYPE = text,
SFUNC = textcat,
STYPE = text,
INITCOND = ''
);
来源:https://stackoverflow.com/questions/3533314/how-to-generate-a-hash-of-the-result-set-in-postgress