Full Text Search: Noise words are being searched for

Submitted by 这一生的挚爱 on 2019-12-21 17:56:52

Question


I have a database in SQL Server 2008 with Full Text Search indexes. I have defined the stopword 'al' in the stoplist. However, when I search for any phrase containing the keyword 'al', the word 'al' is still used in ranking.

This might be related to the fact that I am breaking up the search terms and reconstructing them. I then search across multiple fields and rank the results: http://pastebin.com/fdce11ff. That function breaks up a search such as

'al hamra' 

into

("*al*" ~ "*hamra*") OR ("*al*" OR "*hamra*") 

for the Full Text Search.

Imagine this scenario:

Name: Al Hamra, Author: Jack Brown, Genre: Fiction
Name: Al Karawan, Author: Al Hanz, Genre: Romance

Now a search for 'al hamra' will return 'Al Karawan', even though 'al' is in the stoplist. Why is this? I thought words in a stoplist would carry no weight in ranking.


Answer 1:


Noise words (stopwords) are specific to a language, identified by its LCID; have you added yours under the right one? You can test this with sys.dm_fts_parser, as in the query below. It might also work better than the manual word breaking in your code (or not).

SELECT special_term, display_term
FROM sys.dm_fts_parser
  (' "al hamra" ', 1033, 0, 0)

This assumes the language is 1033 (the LCID for US English). If your noise word is defined for the language you expect, it should appear in the output with special_term set to Noise Word.
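
One more thing that may be worth checking: the third argument of sys.dm_fts_parser is a stoplist ID, and 0 refers to the system stoplist, so the query above does not consult a custom stoplist. The sketch below shows one way to look up where 'al' is registered and to parse the phrase against a custom stoplist. The stoplist name MyStoplist is hypothetical, not taken from the question; substitute your own.

```sql
-- Assumption: the custom stoplist is called MyStoplist (hypothetical name).

-- 1. In which stoplists, and under which language, is 'al' registered?
SELECT sl.stoplist_id, sl.name, sw.stopword, sw.language_id
FROM sys.fulltext_stoplists AS sl
JOIN sys.fulltext_stopwords AS sw
  ON sw.stoplist_id = sl.stoplist_id
WHERE sw.stopword = N'al';

-- 2. Which stoplist is each full-text index actually using?
SELECT OBJECT_NAME(object_id) AS table_name, stoplist_id
FROM sys.fulltext_indexes;

-- 3. Parse the phrase against the custom stoplist instead of the system one.
DECLARE @stoplist_id int =
  (SELECT stoplist_id FROM sys.fulltext_stoplists WHERE name = N'MyStoplist');

SELECT special_term, display_term
FROM sys.dm_fts_parser(N'"al hamra"', 1033, @stoplist_id, 0);
```

If step 1 returns no row whose language_id matches the language of the indexed columns, or step 2 shows the index bound to a different stoplist, the stopword would not be applied to your searches.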



Source: https://stackoverflow.com/questions/1875237/full-text-search-noise-words-are-being-searched-for
