It is basically what the title says.
Input: myArray
= an array of words
I have an model that have field
wordsCollection
, which is
Let say we have the following documents in our collection:
{ "_id" : ObjectId("5759658e654456bf4a014d01"), "a" : [ 1, 3, 9, 2, 9, 0 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d02"), "a" : [ 0, 8, 1 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d03"), "a" : [ 0, 8, 432, 9, 34, -3 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d04"), "a" : [ 0, 0, 4, 3, 2, 7 ] }
and the following input array and n = 2
var inputArray = [1, 3, 0];
We can return those documents where the array field contains at least n elements of a given array using the aggregation framework.
The $match selects only those documents with the array's length greater or equals to n
. This reduce the amount of data to be processed in down in the pipeline.
The $redact pipeline operator use a logical condition processing using the $cond operator and the special operations $$KEEP to "keep" the document where the logical condition is true or $$PRUNE to "discard" the document where the condition is false.
In our case, the condition is $gte which returns true if the $size of the intersection of the two arrays, which we compute using the $setIntersection operator is greater than or equal 2
.
db.collection.aggregate(
[
{ "$match": { "a.1": { "$exists": true } } },
{ "$redact": {
"$cond": [
{ "$gte": [
{ "$size": { "$setIntersection": [ "$a", inputArray ] } },
2
]},
"$$KEEP",
"$$PRUNE"
]
}}
]
)
which produces:
{ "_id" : ObjectId("5759658e654456bf4a014d01"), "a" : [ 1, 3, 9, 2, 9, 0 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d02"), "a" : [ 0, 8, 1 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d04"), "a" : [ 0, 0, 4, 3, 2, 7 ] }
Use aggregation.
In $match aggregation pipeline, you can use $size and $gte