Question
I have a field in a table with large content (text or binary data). If I want to know whether another text is equal to this one, I can use a checksum to compare the two texts. I can also define this field as UNIQUE to avoid repeated content.
My question is whether creating a checksum field will speed up this comparison. Does PostgreSQL already do this (without programmer intervention), or do I need to do it manually?
EDIT: What is better: create a checksum for a TEXT field, use a checksum for it, or are the two ways the same thing?
Answer 1:
There is no default "checksum" for large columns in PostgreSQL; you will have to implement one yourself.
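One way to implement this yourself, without storing an extra column, is a unique expression index on a hash of the content. The following is a minimal sketch, not necessarily what the answerer had in mind. The table `documents`, its columns, and the index name are assumptions made up for illustration, and `md5()` is simply one built-in hash function that could be used.

```sql
-- Hypothetical table; all names here are assumptions for illustration.
CREATE TABLE documents (
    id      serial PRIMARY KEY,
    content text NOT NULL
);

-- Unique expression index on the checksum of the content.
-- It rejects rows whose content hashes to an existing value
-- and lets lookups by checksum use the index.
CREATE UNIQUE INDEX documents_content_md5_idx
    ON documents (md5(content));

-- Equality check: the checksum condition can use the index,
-- and the second condition rules out a hash collision.
SELECT id
FROM documents
WHERE md5(content) = md5('some candidate text')
  AND content = 'some candidate text';
```

Note that a unique index on the checksum would also reject two different texts that happen to share the same MD5 value, so this trades a very small risk of false rejections for a much smaller index than one on the full content.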
Reply to comment
Hash indexes provide fast equality checks, and they are updated automatically. But they are not fully integrated in PostgreSQL (yet): before version 10 they are not WAL-logged, so they are not crash-safe and their use is discouraged. Read the manual for your version.
You also cannot query the values stored in a hash index or use them in your application, for instance. You could do that with a checksum column, but you would need to maintain the column and, if your table is big, add an index on it for performance. I would use a trigger BEFORE INSERT OR UPDATE for that.
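The checksum column maintained by a trigger could look roughly like the sketch below. The table, column, function, and trigger names are hypothetical, and `md5()` is an assumed choice of checksum; any deterministic hash function would serve the same purpose.

```sql
-- Hypothetical table with an explicit checksum column.
CREATE TABLE documents (
    id               serial PRIMARY KEY,
    content          text NOT NULL,
    content_checksum text NOT NULL
);

-- Trigger function: recompute the checksum from the new content.
CREATE FUNCTION documents_set_checksum() RETURNS trigger AS $$
BEGIN
    NEW.content_checksum := md5(NEW.content);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Keep the column in sync on every insert and update.
CREATE TRIGGER documents_checksum_trg
    BEFORE INSERT OR UPDATE ON documents
    FOR EACH ROW
    EXECUTE PROCEDURE documents_set_checksum();

-- Index the checksum so lookups stay fast on a big table.
CREATE INDEX documents_checksum_idx
    ON documents (content_checksum);

-- Usage: the application never sets the checksum itself.
INSERT INTO documents (content) VALUES ('some large text');

SELECT id, content_checksum
FROM documents
WHERE content_checksum = md5('some large text');
```

Unlike the values inside a hash index, `content_checksum` here is an ordinary column, so the application can read it and compare it against a checksum it computed on its own side. Declaring the index as UNIQUE would additionally reject repeated content.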
So, a hash index may or may not be for you. @A.H.'s idea certainly fits the problem ...
Answer 2:
You might read the Index Types chapter of the manual, because basically you want to do the same thing as a hash index, but with your bare hands. So you might read up on the pros and cons of a hash index in PostgreSQL.
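For comparison with the hand-built approach, a hash index is created as in the sketch below. The table and index names are assumptions for illustration only.

```sql
-- Hypothetical table; names are assumptions for illustration.
CREATE TABLE documents (
    id      serial PRIMARY KEY,
    content text NOT NULL
);

-- Hash index on the large column. PostgreSQL maintains it automatically.
-- Hash indexes support only the = operator and cannot be declared UNIQUE.
CREATE INDEX documents_content_hash_idx
    ON documents USING hash (content);

-- An equality comparison like this one is what a hash index can serve.
SELECT id
FROM documents
WHERE content = 'some candidate text';
```

Because a hash index cannot enforce uniqueness, it speeds up the comparison but does not by itself prevent repeated content.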
Source: https://stackoverflow.com/questions/9607063/checksum-field-in-postgresql-to-content-comparison