Exactly matching subdocuments is easy, but is there a way to exactly match an entire document in a collection?
I have a lot of documents with similar data, and I only need e
It's not ideal, but really the only way to filter this on the server is JavaScript evaluation with the $where operator. Make sure you combine it with a traditional query, though, so you at least get some performance benefit from index selection, since the JavaScript by itself cannot use an index.
Consider the following:
{ "a" : 1 }
{ "a" : 1, "b" : 2 }
{ "a" : 1, "b" : 2, "c" : 3 }
{ "a" : 1, "b" : 2, "c" : 3, "d" : 4 }
So now you need to match the "third" document only. Here's the basic code concept:
var query = { "a": 1, "b": 2, "c": 3 };

// Build the signature from the top-level scalar fields of the query
var string = "";
Object.keys(query).forEach(function(key) {
    if (query[key].constructor.toString().match(/(Array|Object)/) == null)
        string += key + query[key].valueOf().toString();
});

// Compute the same signature on the server for each candidate document
query['$where'] = 'function() { ' +
    'var compare = ""; ' +
    'var string = "' + string + '"; ' +
    'var doc = this; ' +
    'delete doc._id; ' +
    'Object.keys(doc).forEach(function(key) { ' +
        'if (doc[key].constructor.toString().match(/(Array|Object)/) == null) ' +
            'compare += key + doc[key].valueOf().toString(); ' +
    '}); ' +
    'return compare == string; ' +
'};';

db.test.find(query);
Some drivers have better mechanisms for passing an external variable into the code, but this gives the basic idea.
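If you stay with plain string building, one way to make the interpolation less fragile is to let JSON.stringify produce the string literal, so a quote or backslash in a value cannot break the generated function. This is only a sketch: buildWhere is a hypothetical helper name, not part of any driver, and it skips _id by name rather than deleting it.

```javascript
// Hypothetical helper: returns the source of a $where function that
// compares a document's signature with the one passed in.
function buildWhere(signature) {
  return 'function() { ' +
    // JSON.stringify emits a quoted, escaped string literal
    'var expected = ' + JSON.stringify(signature) + '; ' +
    'var doc = this; ' +
    'var compare = ""; ' +
    'Object.keys(doc).forEach(function(key) { ' +
      'if (key === "_id") return; ' +
      'if (doc[key].constructor.toString().match(/(Array|Object)/) == null) ' +
        'compare += key + doc[key].valueOf().toString(); ' +
    '}); ' +
    'return compare == expected; ' +
  '}';
}

// The resulting string would be assigned to query['$where']
var where = buildWhere('a1b2c3');
```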
You need to compute an external signature or hash from the required exact fields and values, and then use the same method on the server to compute it from the fields of the current document. Naturally, _id is always excluded because it is unique.
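To make the signature idea concrete, here is the same computation pulled out into a standalone function you can run outside the database. The name signatureOf and the _id values are made up for illustration; the logic follows the code above.

```javascript
// Builds the comparison signature for a document: top-level scalar
// fields only, in key order, with _id left out.
function signatureOf(doc) {
  return Object.keys(doc)
    .filter(function (key) { return key !== '_id'; })
    .filter(function (key) {
      // Skip arrays and sub-documents, as in the code above
      return doc[key].constructor.toString().match(/(Array|Object)/) == null;
    })
    .map(function (key) { return key + doc[key].valueOf().toString(); })
    .join('');
}

var wanted = signatureOf({ a: 1, b: 2, c: 3 });
var stored = signatureOf({ _id: 'abc123', a: 1, b: 2, c: 3 });
var larger = signatureOf({ _id: 'def456', a: 1, b: 2, c: 3, d: 4 });

console.log(wanted === stored); // true
console.log(wanted === larger); // false
```

Note that the signature depends on field order, so this assumes the stored documents keep their fields in the same order as the query object.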
You don't need the signature to cover sub-elements because, as you said, you can "exact match" those purely in the query. So it's just a matter of excluding them when generating the comparison string.
The general query arguments do most of the work, in this case narrowing the candidates down to two documents, ideally using an index to do so. The rest of the matching is done by "brute force" JavaScript evaluation, so that only the documents whose signature matches the fields in your query are returned.
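The two stages can be simulated in plain JavaScript against the four sample documents. This is only an in-memory sketch of what the server does, with made-up _id values and a simplified signature that assumes every field is a scalar.

```javascript
// The four sample documents, with made-up _id values
var docs = [
  { _id: 1, a: 1 },
  { _id: 2, a: 1, b: 2 },
  { _id: 3, a: 1, b: 2, c: 3 },
  { _id: 4, a: 1, b: 2, c: 3, d: 4 }
];
var fields = { a: 1, b: 2, c: 3 };

// Stage 1: the ordinary query conditions (equality on each listed field)
var candidates = docs.filter(function (doc) {
  return Object.keys(fields).every(function (key) {
    return doc[key] === fields[key];
  });
});

// Simplified signature: every field except _id, assuming scalars only
function sig(doc) {
  return Object.keys(doc)
    .filter(function (key) { return key !== '_id'; })
    .map(function (key) { return key + String(doc[key]); })
    .join('');
}

// Stage 2: the $where comparison rejects documents with extra fields
var exact = candidates.filter(function (doc) {
  return sig(doc) === sig(fields);
});

console.log(candidates.length); // 2
console.log(exact.map(function (doc) { return doc._id; })); // [ 3 ]
```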