Question
I want to remove the stop words from my data, but I do not want to stem the words, since the exact words matter to me. I used this query:
SELECT to_tsvector('english', colName) FROM tblName ORDER BY lower ASC;
Is there any way to remove stop words without stemming the words?
Thanks.
Answer 1:
Create your own text search dictionary and configuration:
CREATE TEXT SEARCH DICTIONARY simple_english
(TEMPLATE = pg_catalog.simple, STOPWORDS = english);
CREATE TEXT SEARCH CONFIGURATION simple_english
(copy = english);
ALTER TEXT SEARCH CONFIGURATION simple_english
ALTER MAPPING FOR asciihword, asciiword, hword, hword_asciipart, hword_part, word
WITH simple_english;
It works like this:
SELECT to_tsvector('simple_english', 'many an ox eats the houses');
┌─────────────────────────────────────┐
│ to_tsvector │
├─────────────────────────────────────┤
│ 'eats':4 'houses':5 'many':1 'ox':3 │
└─────────────────────────────────────┘
(1 row)
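To check the dictionary on its own and then apply the configuration to the data from the question, something like the following sketch may help. It assumes the simple_english dictionary and configuration created above, and reuses the table and column names from the question (tblName, colName). ts_lexize is the standard PostgreSQL function for testing a single dictionary.

```sql
-- A stop word is rejected by the dictionary: an empty array is returned.
SELECT ts_lexize('simple_english', 'the');

-- Any other word is only lower-cased, not stemmed.
SELECT ts_lexize('simple_english', 'Houses');

-- Apply the configuration to the column from the question
-- (tblName and colName are the asker's names, assumed to exist).
SELECT to_tsvector('simple_english', colName) FROM tblName;
```

Because the simple template does no stemming, "houses" should stay "houses" instead of being reduced to "hous" as the english configuration would do.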
You can set the parameter default_text_search_config to simple_english to make it your default text search configuration.
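As a rough sketch, the parameter can be set for the current session or persisted for a database. The database name mydb below is hypothetical, and the configuration name may need a schema prefix (for example public.simple_english) depending on where it was created and on your search_path.

```sql
-- Set the default for the current session only.
SET default_text_search_config = 'simple_english';

-- Persist the default for new sessions on one database
-- (mydb is a placeholder name, not from the original answer).
ALTER DATABASE mydb SET default_text_search_config = 'simple_english';

-- With the default in place, the configuration argument can be omitted.
SELECT to_tsvector('many an ox eats the houses');
```

The one-argument form of to_tsvector uses whatever default_text_search_config is in effect, so it is worth confirming the current value with SHOW default_text_search_config; before relying on it.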
Source: https://stackoverflow.com/questions/42052173/remove-stop-words-without-stemming-in-postgresql