MongoDB Full and Partial Text Search

Posted by こ雲淡風輕ζ on 2019-11-26 05:27:17

Question


Env:

  • MongoDB (3.2.0) with MongoS

Collection:

  • users

Text Index creation:

  BasicDBObject keys = new BasicDBObject();
  keys.put("name", "text");

  BasicDBObject options = new BasicDBObject();
  options.put("name", "userTextSearch");
  options.put("unique", Boolean.FALSE);
  options.put("background", Boolean.TRUE);

  userCollection.createIndex(keys, options); // using MongoTemplate

Document:

  • {"name":"LEONEL"}

Queries:

  • db.users.find( { "$text" : { "$search" : "LEONEL" } } ) => FOUND
  • db.users.find( { "$text" : { "$search" : "leonel" } } ) => FOUND (search caseSensitive is false)
  • db.users.find( { "$text" : { "$search" : "LEONÉL" } } ) => FOUND (search with diacriticSensitive is false)
  • db.users.find( { "$text" : { "$search" : "LEONE" } } ) => FOUND (Partial search)
  • db.users.find( { "$text" : { "$search" : "LEO" } } ) => NOT FOUND (Partial search)
  • db.users.find( { "$text" : { "$search" : "L" } } ) => NOT FOUND (Partial search)

Any idea why I get 0 results when the query is "LEO" or "L"?

Regular expressions are not allowed in a text index search, so both of the following return nothing:

db.getCollection('users')
     .find( { "$text" : { "$search" : "/LEO/i",
                          "$caseSensitive": false,
                          "$diacriticSensitive": false }} )
     .count() // 0 results

db.getCollection('users')
     .find( { "$text" : { "$search" : "LEO",
                          "$caseSensitive": false,
                          "$diacriticSensitive": false }} )
     .count() // 0 results

Mongo Documentation:

  • https://docs.mongodb.com/v3.2/text-search/
  • https://docs.mongodb.com/manual/reference/operator/query/text/
  • https://docs.mongodb.com/manual/core/index-text/
  • https://jira.mongodb.org/browse/SERVER-15090

Answer 1:


As of MongoDB 3.4, the text search feature is designed to support case-insensitive searches on text content with language-specific rules for stopwords and stemming. Stemming rules for supported languages are based on standard algorithms which generally handle common verbs and nouns but are unaware of proper nouns.

There is no explicit support for partial or fuzzy matches, but terms that stem to a similar result may appear to be working as such. For example: "taste", "tastes", and "tasteful" all stem to "tast". Try the Snowball Stemming Demo page to experiment with more words and stemming algorithms.

Your results that match are all variations on the same word "LEONEL", and vary only by case and diacritic. Unless "LEONEL" can be stemmed to something shorter by the rules of your selected language, these are the only type of variations that will match.

If you want to do efficient partial matches you'll need to take a different approach. For some helpful ideas see:

  • Efficient Techniques for Fuzzy and Partial matching in MongoDB by John Page
  • Efficient Partial Keyword Searches by James Tan

There is a relevant improvement request you can watch/upvote in the MongoDB issue tracker: SERVER-15090: Improve Text Indexes to support partial word match.
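
One common way to get indexed prefix matches is to store the prefixes ("edge n-grams") of each word alongside the document and query that array with an exact match. The sketch below is an illustration of that general idea, not necessarily the technique described in the articles above; the field name namePrefixes and the helpers edgeNgrams and prefixQuery are hypothetical.

```javascript
// Generate the lowercase prefixes ("edge n-grams") of every word in a string.
// minLength is a hypothetical tuning knob, not a MongoDB setting.
function edgeNgrams(text, minLength = 1) {
  const grams = new Set();
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  for (const word of words) {
    for (let end = minLength; end <= word.length; end++) {
      grams.add(word.slice(0, end));
    }
  }
  return [...grams];
}

// A document as it might be stored. "namePrefixes" is an assumed field name
// that would carry a regular (non-text) index.
const doc = { name: "LEONEL", namePrefixes: edgeNgrams("LEONEL") };

// Filter for a prefix search: an exact match against the prefix array.
function prefixQuery(term) {
  return { namePrefixes: term.toLowerCase() };
}
```

With an index such as db.users.createIndex({ namePrefixes: 1 }), a call like db.users.find(prefixQuery("LEO")) would match the document above, because "leo" is one of the stored prefixes. The trade-off is larger documents and indexes, and this variant only matches from the start of a word.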




Answer 2:


Since MongoDB does not currently support partial search by default, I created a simple static method.

import mongoose from 'mongoose'

// Escape regex metacharacters so the search term is matched literally.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')

const PostSchema = new mongoose.Schema({
    title: { type: String, default: '', trim: true },
    body: { type: String, default: '', trim: true },
});

PostSchema.index({ title: "text", body: "text" },
    { weights: { title: 5, body: 3 } })

PostSchema.statics = {
    // Substring match on title or body (case-insensitive regex).
    searchPartial: function(q, callback) {
        const pattern = new RegExp(escapeRegExp(q), "i");
        return this.find({
            $or: [
                { "title": pattern },
                { "body": pattern },
            ]
        }, callback);
    },

    // Whole-word match through the text index.
    searchFull: function (q, callback) {
        return this.find({
            $text: { $search: q, $caseSensitive: false }
        }, callback)
    },

    // Try the text index first; fall back to the regex search if it finds nothing.
    search: function(q, callback) {
        this.searchFull(q, (err, data) => {
            if (err) return callback(err, data);
            if (data.length) return callback(null, data);
            return this.searchPartial(q, callback);
        });
    },
}

export default mongoose.models.Post || mongoose.model('Post', PostSchema)

How to use:

import Post from '../models/post'

Post.search('Firs', function(err, data) {
   console.log(data);
})



Answer 3:


Without creating an index, we could simply use:

db.users.find({ name: /<full_or_partial_text>/i}) (case insensitive)
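
When the search text comes from user input, it usually needs to be escaped before it is placed in a regex, otherwise characters such as "." or "(" change the pattern. A minimal sketch, assuming hypothetical helper names escapeRegExp and partialMatchFilter, that builds the same kind of filter with the $regex operator:

```javascript
// Escape regex metacharacters so the term is matched literally.
function escapeRegExp(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a filter for a case-insensitive substring match on one field.
// Both helper names are illustrative, not part of MongoDB.
function partialMatchFilter(field, term) {
  return { [field]: { $regex: escapeRegExp(term), $options: "i" } };
}

// Filter equivalent to { name: /LEO/i } for the users collection.
const filter = partialMatchFilter("name", "LEO");
```

The result can be passed straight to db.users.find(filter). Keep in mind that an unanchored, case-insensitive regex generally cannot use an index efficiently, so this approach tends to suit small collections better than large ones.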




Answer 4:


I wrapped @Ricardo Canelas' answer in a mongoose plugin, mongoose-partial-full-search, published on npm.

Two changes made:

  • Uses promises
  • Searches on any field with type String

Here's the important source code:

// mongoose-partial-full-search

module.exports = exports = function addPartialFullSearch(schema, options) {
  schema.statics = {
    ...schema.statics,
    makePartialSearchQueries: function (q) {
      if (!q) return {};
      const $or = Object.entries(this.schema.paths).reduce((queries, [path, val]) => {
        val.instance == "String" &&
          queries.push({
            [path]: new RegExp(q, "gi")
          });
        return queries;
      }, []);
      return { $or }
    },
    searchPartial: function (q, opts) {
      return this.find(this.makePartialSearchQueries(q), opts);
    },

    searchFull: function (q, opts) {
      return this.find({
        $text: {
          $search: q
        }
      }, opts);
    },

    search: function (q, opts) {
      return this.searchFull(q, opts).then(data => {
        return data.length ? data : this.searchPartial(q, opts);
      });
    }
  }
}

exports.version = require('../package').version;

Usage

// PostSchema.js
import addPartialFullSearch from 'mongoose-partial-full-search';
PostSchema.plugin(addPartialFullSearch);

// some other file.js
import Post from '../wherever/models/post'

Post.search('Firs').then(data => console.log(data));



Answer 5:


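import re

# "db" is a pymongo database handle and "text" is the search string.
# re.escape makes the search string match literally.
pattern = re.compile(re.escape(text), re.IGNORECASE)
db.collection.find({"$or": [{"your field name": pattern}, {"another field name": pattern}]})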


Source: https://stackoverflow.com/questions/44833817/mongodb-full-and-partial-text-search
