How to index a string array column for pg_trgm `'term' % ANY (array_column)` query?

失恋的感觉 2021-02-03 12:02

I have tried an ordinary Postgres gin index as well as the pg_trgm gin_trgm_ops and gist_trgm_ops indexes (using this workaround: https://

2 answers
  •  情书的邮戳
    2021-02-03 12:14

    I created a test table, filled by a helper function getNArray that picks random words, and a function called f that only casts the array to text.

    -- Build an array of "count" elements picked at random from "el"
    CREATE OR REPLACE FUNCTION getNArray(el text[], count int) RETURNS text[] AS $$
      SELECT array_agg(el[random()*(array_length(el,1)-1)+1]) FROM generate_series(1,count) g(i)
    $$
    VOLATILE
    LANGUAGE SQL;
    
    DROP TABLE IF EXISTS testGin;
    CREATE TABLE testGin(id serial PRIMARY KEY, array_column text[]);
    
    -- words.list (one word per line) is read relative to the server's data directory
    WITH t(ray) AS(
      SELECT (string_to_array(pg_read_file('words.list')::text,E'\n')) 
    ) 
    -- 100000 rows, each holding 4 random words
    INSERT INTO testGin(array_column)
    SELECT getNArray(T.ray, 4) FROM T, generate_series(1,100000);
    

    The cast function:

    CREATE OR REPLACE FUNCTION f(arr text[]) RETURNS text AS $$
       SELECT arr::text
    $$
    LANGUAGE SQL IMMUTABLE;
    
    -- gin_trgm_ops is provided by the pg_trgm extension
    CREATE EXTENSION IF NOT EXISTS pg_trgm;
    CREATE INDEX ON testGin USING GIN(f(array_column) gin_trgm_ops);
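
    As a quick illustration of why a substring search on the cast value can reach every element: casting a text[] to text yields the whole array literal as one string. The array values below are made up for the example.

    ```sql
    -- Example values are hypothetical; only the cast behaviour matters here.
    SELECT ARRAY['alpha','beta']::text AS as_text;
    -- as_text: {alpha,beta}
    ```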
    

    The usage with ILIKE:

    postgres=# EXPLAIN SELECT id FROM testgin WHERE f(array_column) ilike '%test%';
                                      QUERY PLAN                                   
    -------------------------------------------------------------------------------
     Bitmap Heap Scan on testgin  (cost=34.82..1669.63 rows=880 width=4)
       Recheck Cond: (f(array_column) ~~* '%test%'::text)
       ->  Bitmap Index Scan on testgin_f_idx  (cost=0.00..34.60 rows=880 width=0)
             Index Cond: (f(array_column) ~~* '%test%'::text)
    (4 rows)
    

    If you want a more accurate search using the % operator, you can do as below. The planner scans the trigram index for the regular-expression condition and then applies the % condition as a filter on the rows it finds:

    postgres=# explain SELECT id, array_column FROM testgin WHERE 'response' % ANY (array_column) and f(array_column) ~ 'response';
                                      QUERY PLAN                                  
    ------------------------------------------------------------------------------
     Bitmap Heap Scan on testgin  (cost=76.08..120.38 rows=1 width=85)
       Recheck Cond: (f(array_column) ~ 'response'::text)
       Filter: ('response'::text % ANY (array_column))
       ->  Bitmap Index Scan on testgin_f_idx  (cost=0.00..76.08 rows=11 width=0)
             Index Cond: (f(array_column) ~ 'response'::text)
    (5 rows)
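
    The % filter in the plan above passes an element only when its trigram similarity to the search term is greater than pg_trgm's threshold. A minimal sketch for inspecting that, assuming PostgreSQL 9.6 or later (where the pg_trgm.similarity_threshold setting exists) and that pg_trgm is installed; the word 'responses' is only an example value:

    ```sql
    -- Score between the search term and one candidate element (range 0 to 1).
    SELECT similarity('response', 'responses');

    -- Threshold that % compares against; pg_trgm's documented default is 0.3.
    SHOW pg_trgm.similarity_threshold;

    -- Lower it for the current session so looser matches pass the filter.
    SET pg_trgm.similarity_threshold = 0.2;
    ```

    Note that the regular-expression condition still requires the literal string 'response' to appear in the row, so changing the threshold only affects which of those rows pass the filter.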
    
