Question
I'm having trouble with the full-text search functionality in MongoDB, specifically when trying to match exact phrases.
I'm testing out the functionality in the mongo shell, but ultimately I'll be using Spring Data MongoDB with Java.
I first ran this command to search for the words "delay" and "late" and the phrase "on time":
db.mycollection.find( { $text: { $search: "delay late \"on time\"" } }).explain(true);
And the resulting explain query told me:
"parsedTextQuery" : {
    "terms" : [
        "delay",
        "late",
        "time"
    ],
    "negatedTerms" : [ ],
    "phrases" : [
        "on time"
    ],
    "negatedPhrases" : [ ]
},
The issue is that I don't want to search for the word "time" on its own, but rather for the phrase "on time". I do want to search for "delay" and "late", and ideally I don't want to disable stemming for them.
I tried a few different permutations, e.g.:
db.mycollection.find( { $text: { $search: "delay late \"'on time'\"" } }).explain(true);
db.mycollection.find( { $text: { $search: "delay late \"on\" \"time\"" } }).explain(true);
But I couldn't get the right results, and I can't see anything obvious in the documentation about this.
For my purposes, should I use the full-text search for individual words and the regex search functionality for phrases?
Currently working with MongoDB version 2.6.5. Thanks.
Answer 1:
Did you actually run the text search to check whether it misbehaves? It works as expected for me on MongoDB 2.6.7:
> db.test.drop()
> db.test.insert({ "t" : "I'm on time, not late or delayed" })
> db.test.insert({ "t" : "I'm either late or delayed" })
> db.test.insert({ "t" : "Time flies like a banana" })
> db.test.ensureIndex({ "t" : "text" })
> db.test.find({ "$text" : { "$search" : "time late delay" } }, { "_id" : 0 })
{ "t" : "I'm on time, not late or delayed" }
{ "t" : "Time flies like a banana" }
{ "t" : "I'm either late or delayed" }
> db.test.find({ "$text" : { "$search" : "late delay" } }, { "_id" : 0 })
{ "t" : "I'm on time, not late or delayed" }
{ "t" : "I'm either late or delayed" }
> db.test.find({ "$text" : { "$search" : "late delay \"on time\"" } }, { "_id" : 0 })
{ "t" : "I'm on time, not late or delayed" }
Why is "time" in the terms array in the explain output? Because if the phrase "on time" occurs in a document, the term "time" must occur in it as well. MongoDB uses the text index as far as it can to locate candidate documents, and then checks those index results to see which ones actually match the full phrase and not just the terms in the phrase.
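To make the two-step behaviour concrete, here is a rough sketch in plain JavaScript of how such a search could work. It is a simplified model and not MongoDB's actual implementation: STOP_WORDS, stem(), parseSearch() and textSearch() are hypothetical names made up for illustration, the stop word list and the suffix stripping are crude stand-ins for the real language-specific ones, negated terms are ignored, and phrases are matched as plain substrings.

```javascript
// Simplified model of $text matching; NOT MongoDB's actual implementation.
// STOP_WORDS and stem() are crude, hypothetical stand-ins for the real
// language-specific stop word list and stemmer.
const STOP_WORDS = new Set(["a", "i", "m", "not", "on", "or"]);

// Very rough suffix stripping, just enough to map "delayed" to "delay".
function stem(word) {
  return word.replace(/(ed|ing|s)$/, "");
}

// Lowercase, split on non-letters, drop stop words, stem what is left.
function toTerms(text) {
  return text
    .toLowerCase()
    .split(/[^a-z]+/)
    .filter((w) => w.length > 0 && !STOP_WORDS.has(w))
    .map(stem);
}

// Split a search string into index terms and quoted phrases.
// Words inside a phrase are also added to the terms, which is why
// "time" shows up in the terms array of the explain output.
function parseSearch(search) {
  const phrases = [];
  const phraseRe = /"([^"]*)"/g;
  let m;
  while ((m = phraseRe.exec(search)) !== null) {
    phrases.push(m[1].toLowerCase());
  }
  const terms = [...new Set(toTerms(search.replace(/"/g, " ")))];
  return { terms, phrases };
}

function textSearch(docs, search) {
  const { terms, phrases } = parseSearch(search);
  // Step 1: "index" lookup, any document sharing at least one term.
  const candidates = docs.filter((doc) => {
    const docTerms = new Set(toTerms(doc));
    return terms.some((t) => docTerms.has(t));
  });
  // Step 2: keep only candidates whose raw text contains every phrase.
  return candidates.filter((doc) =>
    phrases.every((p) => doc.toLowerCase().includes(p))
  );
}

const docs = [
  "I'm on time, not late or delayed",
  "I'm either late or delayed",
  "Time flies like a banana",
];

console.log(parseSearch('delay late "on time"'));
// { terms: [ 'delay', 'late', 'time' ], phrases: [ 'on time' ] }
console.log(textSearch(docs, 'late delay "on time"'));
// [ "I'm on time, not late or delayed" ]
```

In this model the term "time" only widens the candidate set in step 1; the phrase check in step 2 is what removes "Time flies like a banana" from the result. The sketch does no scoring, so it returns documents in insertion order rather than in the order the shell session above shows.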
Source: https://stackoverflow.com/questions/28368883/mongodb-full-text-search-matching-words-and-exact-phrases