Remove stop words without stemming in PostgreSQL

*爱你&永不变心* posted on 2019-12-21 01:59:25

Question


I want to remove the stop words from my data, but I do not want to stem the words, because the exact word forms matter to me. I used this query:

SELECT to_tsvector('english', colName) FROM tblName ORDER BY lower ASC;

Is there any way to remove stop words without stemming the words?

Thanks.


Answer 1:


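Create your own text search dictionary and configuration. A dictionary built on the simple template lowercases each token and discards stop words, but does not stem: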

CREATE TEXT SEARCH DICTIONARY simple_english
   (TEMPLATE = pg_catalog.simple, STOPWORDS = english);

CREATE TEXT SEARCH CONFIGURATION simple_english
   (copy = english);
ALTER TEXT SEARCH CONFIGURATION simple_english
   ALTER MAPPING FOR asciihword, asciiword, hword, hword_asciipart, hword_part, word
   WITH simple_english;

It works like this:

SELECT to_tsvector('simple_english', 'many an ox eats the houses');
┌─────────────────────────────────────┐
│             to_tsvector             │
├─────────────────────────────────────┤
│ 'eats':4 'houses':5 'many':1 'ox':3 │
└─────────────────────────────────────┘
(1 row)
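
To see the difference from the built-in configuration, it can help to run the same sentence through both and to inspect which dictionary handled each token. The following is a minimal sketch, assuming the simple_english dictionary and configuration above have already been created; ts_debug is a standard PostgreSQL function, and its dictionary column should report simple_english for the word tokens.

```sql
-- Built-in configuration: stop words are removed AND the remaining words are stemmed.
SELECT to_tsvector('english', 'many an ox eats the houses');

-- Custom configuration: stop words are removed, the remaining words stay unstemmed.
SELECT to_tsvector('simple_english', 'many an ox eats the houses');

-- Inspect, token by token, which dictionary produced which lexemes.
-- Stop words show up with an empty lexemes array.
SELECT alias, token, dictionary, lexemes
FROM ts_debug('simple_english', 'many an ox eats the houses');
```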

You can set the parameter default_text_search_config to simple_english to make it your default text search configuration.
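
A sketch of how that setting might be applied, either for the current session or persistently for one database. The database name mydb is a placeholder, not something from the question; colName and tblName are the names used in the question.

```sql
-- Session-level: applies only to the current connection.
SET default_text_search_config = 'simple_english';

-- Persistent for one database (mydb is a hypothetical name);
-- takes effect for new sessions connecting to that database.
ALTER DATABASE mydb SET default_text_search_config = 'simple_english';

-- With the default in place, the one-argument form uses simple_english.
SELECT to_tsvector(colName) FROM tblName;
```

If you also build queries with to_tsquery or plainto_tsquery, use the same configuration there, so that the query terms are left unstemmed in the same way as the indexed text.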



Source: https://stackoverflow.com/questions/42052173/remove-stop-words-without-stemming-in-postgresql
