问题
I want to decide between using a citext
column with an index or using a text
column with an index on lower()
.
I performed some benchmarks. To my surprise the search with the index on lower()
causes an index scan, but in the citext
case I get an index only scan. I was expecting the index on lower()
to cause an index only scan too.
Also, the total cost with the citext
index is 4.44, but with the index on lower()
, the total cost is 8.44.
So the first thing coming to my mind is that the citext
column index is better than the expression index on a text
column.
CREATE TABLE test_citext(a citext NOT NULL);
INSERT INTO test_citext
SELECT cast(x as text)
FROM generate_series(1, 1000000) AS x;
VACUUM (FREEZE, ANALYZE) test_citext;
create index citextind on test_citext(a);
Select * from test_citext where a = 'test';
--Index Only Scan.Total cost 4.44
CREATE TABLE test_textlowerindex(a text NOT NULL);
INSERT INTO test_textlowerindex
SELECT cast(x as text)
FROM generate_series(1, 1000000) AS x;
VACUUM (FREEZE, ANALYZE) test_textlowerindex;
create index lowertextind on test_textlowerindex(lower(a));
Select * from test_textlowerindex where lower(a) = 'test';
--Index Scan.Total cost 8.44
Am I right?
Mister Laurenz Albe thanks for your answer.I changed my above script as you said. Result :
CREATE TABLE test_citext(a citext NOT NULL);
INSERT INTO test_citext
SELECT cast(x as text)
FROM generate_series(1, 1000000) AS x;
create index citextind on test_citext(a);
VACUUM (FREEZE, ANALYZE) test_citext;
Select count(*) from test_citext where a = 'test';
--Index Only Scan 4.44 + 4.46
CREATE TABLE test_textlowerindex(a text NOT NULL);
INSERT INTO test_textlowerindex
SELECT cast(x as text)
FROM generate_series(1, 1000000) AS x;
create index lowertextind on test_textlowerindex(lower(a));
VACUUM (FREEZE, ANALYZE) test_textlowerindex;
Select count(*) from test_textlowerindex where lower(a) = 'test';
--Index Scan 8.44 + 8.46
But anything did not changed Even If I run analyze after creating index and using count(*) in select.Index Scan still continues with an index on lower().
回答1:
Your test is misleading. There are two problems here:
You didn't run
ANALYZE
after creating indexlowertextind
.Without that, PostgreSQL doesn't know how
lower(a)
is distributed and will probably produce a wrong cost estimate.By using
SELECT *
you inadvertently allowed an index only scan to be used for the first query, but not for the second. This is because the first index contains all table columns, but the second doesn't.Since the second index doesn't contain
a
, the value has to be fetched from the table, causing additional work.You could use
SELECT count(*) FROM ...
for a fairer benchmark.
来源:https://stackoverflow.com/questions/55627892/postgresql-citext-index-vs-lower-expression-index-performance