How to exact match an entire document?

花落未央 2021-01-21 01:33

Exact matching subdocuments is easy, but is there a way to exact match an entire document in a collection?

I have a lot of documents with similar data, and I only need e

3 answers
  • 2021-01-21 02:18

    I don't think this is possible outright, but a possible solution is to hash the document.

    When saving, always create a hash of the document:

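    var crypto = require('crypto');

    var doc = {}; // the document you are about to save
    delete doc.hash; // never include the hash itself in the calculation
    // JSON.stringify is sensitive to key order; hex makes the hash a plain string
    doc.hash = crypto.createHash('sha256').update(JSON.stringify(doc)).digest('hex');
    db.collection.insert(doc);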
    

    Then when querying, you can query by hash:

    db.collection.find({
      hash: hash
    })
    

    This might be annoying if you frequently do atomic updates on the document, since the hash has to be recomputed after every change.
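One caveat with hashing JSON.stringify(doc) directly is that two documents with the same fields in a different order produce different hashes. Below is a minimal sketch of one way around that, assuming the documents contain only plain JSON values (no dates, ObjectIds, or undefined). The helper names canonicalize and hashDocument are made up for this illustration and are not part of any MongoDB driver.

```javascript
const crypto = require('crypto');

// Serialize a value with object keys sorted, so key order does not matter.
// Assumption: only plain JSON values (numbers, strings, booleans, null,
// arrays, plain objects) appear in the document.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return '[' + value.map(function (item) { return canonicalize(item); }).join(',') + ']';
  }
  if (value !== null && typeof value === 'object') {
    return '{' + Object.keys(value).sort().map(function (key) {
      return JSON.stringify(key) + ':' + canonicalize(value[key]);
    }).join(',') + '}';
  }
  return JSON.stringify(value);
}

// Hash every field except _id and the stored hash itself.
function hashDocument(doc) {
  const copy = Object.assign({}, doc);
  delete copy._id;
  delete copy.hash;
  return crypto.createHash('sha256').update(canonicalize(copy)).digest('hex');
}
```

With this, hashDocument({a: 1, b: 2}) and hashDocument({b: 2, a: 1}) give the same hex string, so the value can be stored in doc.hash on insert and recomputed from the candidate document at query time.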

  • 2021-01-21 02:19

    I really don't understand your question. Can you explain it?

    If you want documents that don't have some field, you can use $exists.

    For example, if you have...

    {a: 1 , b: "1", c: true }
    {a: 2, b: "2", c: false}
    {b: "3" }
    

    Then db.my_collection.find({a: {$exists: true}}); finds

    {a: 1 , b: "1", c: true }
    {a: 2, b: "2", c: false}
    

    And db.my_collection.find({a: {$exists: false}}); finds

    {b: "3" }
    

    Note that a field explicitly set to null still exists, so a document like {a: null, b: "3"} would be matched by $exists: true, not by $exists: false.
    
  • 2021-01-21 02:22

    Not an ideal way, but really the only way to filter this out on the server is JavaScript evaluation with the $where operator. Make sure to combine it with a regular query, though, to at least get some performance benefit from index selection, as the JavaScript by itself cannot use an index.

    Consider the following:

    { "a" : 1 }
    { "a" : 1, "b" : 2 }
    { "a" : 1, "b" : 2, "c" : 3 }
    { "a" : 1, "b" : 2, "c" : 3, "d" : 4 }
    

    So now you need to match the "third" document only. Here's the basic code concept:

    var query = { "a": 1, "b": 2, "c": 3 };
    var string =  "";
    
    Object.keys(query).forEach(function(key) {
        if (query[key].constructor.toString().match(/(Array|Object)/) == null) 
            string += key + query[key].valueOf().toString();
    });
    
    query['$where'] = 'function() { ' +
        'var compare =  ""; ' +
        'var string = "' + string + '"; ' +
    
        'var doc = this; ' +
        'delete doc._id; ' +
    
        'Object.keys(doc).forEach(function(key) { ' +
            'if (doc[key].constructor.toString().match(/(Array|Object)/) == null) ' +
              'compare += key + doc[key].valueOf().toString(); ' +
        '}); ' +
        'return compare == string; ' +
    '};';
    
    db.test.find(query);
    

    Some drivers have better ways of passing an external variable into the code, but this gives the basic idea.

    You need to compute an external signature or hash from the required exact fields and values, and then use the same method on the server to compute one from the current document's fields. Naturally, _id is always excluded because it is unique.

    You don't need the signature for sub-elements because as you said you can "exact match" those purely in the query. So it's just a matter of excluding those from the comparison generation.

    The general query arguments will do most of the work, and in this case narrow the candidates down to two documents, ideally using an index to do so. The rest of the matching is done by "brute force" JavaScript evaluation, so that only the documents whose signature matches the fields in your query are returned.
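To see the signature idea without a running server, here is a rough sketch in plain JavaScript against the four sample documents. The _id values are made up for the illustration, the helper name signature is hypothetical, and the Array filter only stands in for what db.test.find would do with the regular query plus $where.

```javascript
// Build a signature from top-level scalar fields only. Arrays and
// subdocuments are skipped, and _id is always ignored.
function signature(doc) {
  return Object.keys(doc)
    .filter(function (key) {
      var value = doc[key];
      return key !== '_id' && (value === null || typeof value !== 'object');
    })
    .map(function (key) { return key + String(doc[key]); })
    .join('');
}

// The sample collection, with hypothetical _id values added.
var docs = [
  { _id: 1, a: 1 },
  { _id: 2, a: 1, b: 2 },
  { _id: 3, a: 1, b: 2, c: 3 },
  { _id: 4, a: 1, b: 2, c: 3, d: 4 }
];

var query = { a: 1, b: 2, c: 3 };
var wanted = signature(query);

var matches = docs.filter(function (doc) {
  // Stand-in for the regular query part: every queried field must be equal.
  var plainMatch = Object.keys(query).every(function (key) {
    return doc[key] === query[key];
  });
  // Stand-in for the $where part: the signatures must be identical.
  return plainMatch && signature(doc) === wanted;
});
```

The regular query part keeps the third and fourth documents, and the signature check then drops the fourth, so matches holds only the document with _id 3. Like the $where version, this signature depends on the order of the keys, so a document storing the same fields in a different order would not match.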
