Sorting By Relevancy in MongoDB without Exceeding the Memory Buffer

Question

I'm playing with the recent support for full text search in MongoDB, but I'm finding its limitations so severe that it's not very usable. Perhaps I am misunderstanding how it works and someone could enlighten me. I want to display the most relevant results first, which means my query needs to look like:

db.properties.find({$text: {$search: "My Search"}}, {score: { $meta: "textScore" }}).sort({score: {$meta: "textScore"}})

But I find that unless my search is VERY specific, I quickly get:

Executor error: Overflow sort stage buffered data usage of 33566146 bytes exceeds internal limit of 33554432 bytes

A little research shows that the sort happens in memory, and the data being sorted (i.e. the score for each record) must fit in 32 MB or you get this error.
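To make the numbers in the error message concrete, here is a small arithmetic check. It only restates the two byte counts quoted in the error above; the variable names are mine.

```javascript
// Byte counts copied from the error message quoted in the question.
const bufferedBytes = 33566146;
const internalLimitBytes = 33554432;

// The internal limit is exactly 32 MiB.
const limitIsExactly32MiB = internalLimitBytes === 32 * 1024 * 1024;

// How far the buffered sort data went over the limit.
const overflowBytes = bufferedBytes - internalLimitBytes;

console.log(limitIsExactly32MiB); // true
console.log(overflowBytes);       // 11714
```

In other words, the query fails as soon as the buffered data crosses the 32 MB mark; in this case it was only about 11 KB over.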

A bit more research and I found I can narrow the search by making my full text index part of a compound index. If I can automatically scope my query to a certain subset of the data, then I am less likely to get this error, as the result set will be small enough to fit in the 32 MB.
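A minimal sketch of what such a compound index and scoped query might look like. The field names `city` and `description` and the value "Austin" are hypothetical, since the question does not show the actual schema. The `db` calls are guarded so the snippet only touches a database when run inside the mongo shell.

```javascript
// Hypothetical fields: "city" is the scoping field, "description" holds the
// searchable text. The scoping key comes before the text key in the index.
const indexSpec = { city: 1, description: "text" };

// The scoping field is matched by simple equality alongside the $text search.
const scopedFilter = { city: "Austin", $text: { $search: "My Search" } };
const scoreProjection = { score: { $meta: "textScore" } };
const scoreSort = { score: { $meta: "textScore" } };

// Only runs where a `db` global exists (the mongo shell).
if (typeof db !== "undefined") {
  // A collection can have only one text index, so this call fails if a
  // different text index already exists on the collection.
  db.properties.createIndex(indexSpec);
  db.properties.find(scopedFilter, scoreProjection).sort(scoreSort).forEach(printjson);
}
```

With this shape, only documents whose `city` equals the given value are candidates for the text search, so fewer scored documents reach the in-memory sort.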

The nature of my application allows me to do that partially, but not fully, due to the restrictions on the compound index. The fields I scope on must use simple equality filters, so I cannot use operators like $nin.
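The equality restriction can be illustrated with a small pre-check. This is a sketch only: `isPlainEquality` and `canUseCompoundTextIndex` are hypothetical helpers written for this example, not part of MongoDB, and they deliberately treat any operator expression (including `$nin`) on a prefix field as unusable.

```javascript
// Hypothetical helper: true when the filter value is a literal to match by
// equality rather than an operator expression such as { $nin: [...] }.
function isPlainEquality(value) {
  if (value === null || typeof value !== "object") return true;
  if (Array.isArray(value) || value instanceof Date) return true;
  return !Object.keys(value).some((key) => key.startsWith("$"));
}

// Hypothetical helper: true when every field that precedes the text key in
// the compound index is present in the filter as a simple equality match.
function canUseCompoundTextIndex(filter, prefixFields) {
  return prefixFields.every(
    (field) => field in filter && isPlainEquality(filter[field])
  );
}

const search = { $text: { $search: "My Search" } };

// "city" is an assumed scoping field, as in a { city: 1, <field>: "text" } index.
console.log(canUseCompoundTextIndex({ city: "Austin", ...search }, ["city"]));             // true
console.log(canUseCompoundTextIndex({ city: { $nin: ["Austin"] }, ...search }, ["city"])); // false
console.log(canUseCompoundTextIndex({ ...search }, ["city"]));                             // false
```

A check like this could run before the query is sent, to decide whether a user's filters are compatible with the scoped index at all.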

End result is:

- If a user's query results in too many documents, I get an error.
- I can pre-scope their query to limit this, but the filters I can use in this case must be very simplistic.

This all makes me think that full text search is really only useful for small data sets. Once the data set gets too large, you can no longer sort by relevancy unless you can limit the scope back down to a small data set through very simplistic operators.

Source: https://stackoverflow.com/questions/33311158/sorting-by-relevancy-in-mongodb-without-exceeding-the-memory-buffer

Tags

mongodb

full-text-search

relevance