Single character text search alternative

Submitted by 戏子无情 on 2020-01-22 02:47:20

Question


Requirement: a case-insensitive single-character text search over compound columns should be processed as efficiently as possible, including sorting by relevance weight.
Given the table
create table test_search (id int primary key, full_name varchar(300) not null, short_name varchar(30) not null);
with 3 million rows, a suggester API call sends queries to the database starting from the first input character, and the first 20 results ordered by relevance should be returned.

Options/disadvantages:

  • like lower() / ilike with '%c%': slow on a big dataset, no relevance;
  • pg_trgm trigram-based like/ilike search with a compound gin/gist index: a single character cannot be split into several trigrams, so the search falls back to a full table scan, no relevance;
  • full-text search via setweight(to_tsvector(lower())) with a gin/gist index: relevance-based output, but fewer results because the tokens exclude single characters;

Are there other options for improving single-character search? How can the options above be improved or combined to get the best result? How can full-text search be forced to skip the stop list and create all possible lexemes, as is possible in SQL Server?


Answer 1:


Full-text search won't help you at all with this, because only whole words are indexed, and you cannot search for substrings.

The best you can probably do is use this function:

CREATE FUNCTION get_chars(text) RETURNS char(1)[]
   LANGUAGE sql IMMUTABLE AS
$$SELECT array_agg(DISTINCT x)::char(1)[] FROM regexp_split_to_table($1, '') AS x$$;

Then index

CREATE INDEX ON test_search USING gin (get_chars(full_name || short_name));

and search like

SELECT * FROM test_search
WHERE get_chars(full_name || short_name) @> ARRAY['c']::char(1)[];

For frequent characters, this query should still use a sequential scan, since that is the best access method in that case. For rare characters, the index may make the query faster.
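
The function above is case-sensitive and the query returns rows in no particular order, while the question asks for a case-insensitive search that returns the first 20 results by relevance. A minimal sketch of how the same idea might be extended follows. The name get_chars_ci and the ranking rule (rows whose short_name or full_name starts with the character come first) are assumptions made for illustration; they are not part of the original answer, and a real relevance definition would have to come from the application.

```sql
-- Hypothetical case-insensitive variant of get_chars: lower-case the input
-- before splitting it into its distinct characters.
CREATE FUNCTION get_chars_ci(text) RETURNS text[]
   LANGUAGE sql IMMUTABLE AS
$$SELECT array_agg(DISTINCT x) FROM regexp_split_to_table(lower($1), '') AS x$$;

-- Expression index; the WHERE clause below must use the same expression.
CREATE INDEX ON test_search USING gin (get_chars_ci(full_name || short_name));

-- Search with a lower-cased character and keep only the top 20 rows.
-- The ORDER BY is an assumed relevance heuristic, not a built-in ranking.
SELECT id, full_name, short_name
FROM test_search
WHERE get_chars_ci(full_name || short_name) @> ARRAY['c']
ORDER BY (strpos(lower(short_name), 'c') = 1) DESC,  -- short_name starts with 'c'
         (strpos(lower(full_name), 'c') = 1) DESC,   -- full_name starts with 'c'
         id
LIMIT 20;
```

Note that the ORDER BY has to be evaluated for every matching row before LIMIT applies, so for a frequent character this sketch is likely to be no faster than the sequential scan mentioned above.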



Source: https://stackoverflow.com/questions/59389553/single-character-text-search-alternative
