Question
I have a synonyms.txt file with the following content:
car accessories, gadi marmat
I am indexing car accessories as a single token so that it expands to both car accessories and gadi marmat.
I want the whole multi-word synonym to match, so that a query for gadi marmat returns the records containing car accessories.
I am using ShingleFilterFactory to expand the query, so a search for gadi marmat is expanded to gadi, gadi marmat, and marmat. Since gadi marmat is then queried as a single token, it should match car accessories and return the record, but it does not. Searching for car accessories, on the other hand, does return the record. So the problem seems to be with indexing synonyms that consist of multiple words.
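To make the expected query expansion concrete, here is a minimal plain-Java sketch of what shingling does to the query text. It does not use Lucene or Solr; the class name ShingleSketch and the method shingles are hypothetical, and the single-space separator and emission order are assumptions chosen to mirror the expansion described above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: mimics the word n-grams a shingle filter produces,
// without depending on Lucene. Names here are hypothetical.
public class ShingleSketch {

    // Returns every run of 1..maxSize consecutive words, joined by one space.
    static List<String> shingles(String text, int maxSize) {
        String[] words = text.trim().toLowerCase().split("\\s+");
        List<String> out = new ArrayList<>();
        for (int start = 0; start < words.length; start++) {
            StringBuilder sb = new StringBuilder();
            for (int len = 1; len <= maxSize && start + len <= words.length; len++) {
                if (len > 1) {
                    sb.append(' ');
                }
                sb.append(words[start + len - 1]);
                out.add(sb.toString());
            }
        }
        return out;
    }
}
```

Calling ShingleSketch.shingles("gadi marmat", 2) returns the three terms gadi, gadi marmat, and marmat. For the match to succeed, the index must contain a token that is exactly equal to one of these strings.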
Please suggest.
Answer 1:
The synonym file is only used to replace a term that you are searching for. So if you write
car accessories => gadi marmat
then when the analyzer matches "car accessories", it tries to match "gadi marmat" instead.
The phrase works like a single token.
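The difference between the comma form used in the question and the => form above can be sketched in plain Java. This is not Solr's parser: the class SynonymRuleSketch and its methods are hypothetical, and the sketch assumes the documented Solr rule semantics, where a comma-separated line maps every phrase to the whole group when expand is true and to the first phrase only when expand is false.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a simplified reading of one synonyms.txt line.
// Names here are hypothetical and not part of Solr or Lucene.
public class SynonymRuleSketch {

    // Splits "a b, c d" into trimmed phrases, dropping empty entries.
    static List<String> splitPhrases(String s) {
        List<String> out = new ArrayList<>();
        for (String part : s.split(",")) {
            String phrase = part.trim();
            if (!phrase.isEmpty()) {
                out.add(phrase);
            }
        }
        return out;
    }

    // Maps each input phrase of the rule to the phrases that replace it.
    static Map<String, List<String>> parseRule(String line, boolean expand) {
        Map<String, List<String>> map = new LinkedHashMap<>();
        List<String> inputs;
        List<String> outputs;
        int arrow = line.indexOf("=>");
        if (arrow >= 0) {
            // Explicit mapping: the left side is replaced by the right side.
            inputs = splitPhrases(line.substring(0, arrow));
            outputs = splitPhrases(line.substring(arrow + 2));
        } else {
            // Equivalent phrases: the result depends on the expand flag.
            inputs = splitPhrases(line);
            if (inputs.isEmpty()) {
                return map;
            }
            outputs = expand ? inputs : inputs.subList(0, 1);
        }
        for (String in : inputs) {
            map.put(in, new ArrayList<>(outputs));
        }
        return map;
    }
}
```

Under these assumptions, parseRule("car accessories, gadi marmat", false) maps both phrases to car accessories only, while parseRule("car accessories => gadi marmat", false) rewrites car accessories to gadi marmat and leaves gadi marmat itself unmapped.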
You can get good results by combining analyzer elements like this:
@AnalyzerDef(name = "integram",
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
    filters = {
        // Normalize case before any other filtering
        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
        // Drop stop words listed in the given file
        @TokenFilterDef(factory = StopFilterFactory.class, params = {
            @Parameter(name = "words", value = "lucene/dictionary/stopwords.txt"),
            @Parameter(name = "ignoreCase", value = "true"),
            @Parameter(name = "enablePositionIncrements", value = "true")
        }),
        // Stem the tokens before the synonym lookup
        @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = {
            @Parameter(name = "language", value = "English")
        }),
        // Apply the synonym rules; expand=false does not expand to all equivalents
        @TokenFilterDef(factory = SynonymFilterFactory.class, params = {
            @Parameter(name = "synonyms", value = "lucene/dictionary/synonyms.txt"),
            @Parameter(name = "expand", value = "false")
        }),
        // Second stemming pass, applied to the tokens emitted by the synonym filter
        @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = {
            @Parameter(name = "language", value = "English")
        })
    })
Source: https://stackoverflow.com/questions/10714720/multiword-synonyms-with-solr-and-hibernate-search