I have a collection called article_category which stores every article_id that belongs to the category with category_id, with a data format like:
You have a couple of things incorrect here. category_id should be all_category_id. Use the join condition in $lookup and move the $match outside of the $lookup stage, after an $unwind, for an optimized lookup.
Use $project with exclusion to drop the looked-up field from the final response, something like {$project:{article_category:0}}.
Try:
db.article.aggregate([
  {"$match":{"title":{"$regex":/example/}}},
  {"$lookup":{
    "from":"article_category",
    "localField":"article_id",
    "foreignField":"article_id",
    "as":"article_category"
  }},
  {"$unwind":"$article_category"},
  {"$match":{"article_category.all_category_id":8}}
])
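The $project exclusion mentioned above can be appended as a final stage. The following is a minimal sketch, not a tested query against your data: the pipeline is built as a plain array (the variable name pipeline is just an illustration), and it assumes the collection and field names from the question.

```javascript
// Same stages as the pipeline above, plus a trailing $project exclusion.
// The variable name "pipeline" is illustrative only.
const pipeline = [
  // Keep only articles whose title matches the regex.
  { $match: { title: { $regex: /example/ } } },
  // Join each article with its article_category documents by article_id.
  { $lookup: {
      from: "article_category",
      localField: "article_id",
      foreignField: "article_id",
      as: "article_category"
  } },
  // One output document per joined article_category entry.
  { $unwind: "$article_category" },
  // Keep only entries whose all_category_id array contains 8.
  { $match: { "article_category.all_category_id": 8 } },
  // Drop the looked-up field from the final response.
  { $project: { article_category: 0 } }
];

// In the mongo shell this would be run as: db.article.aggregate(pipeline)
```

Because the $project stage comes last, the joined field is still available to the $match before it and is only removed from the returned documents.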
For an uncorrelated subquery, try:
db.article.aggregate([
  {"$match":{"title":{"$regex":/example/}}},
  {"$lookup":{
    "from":"article_category",
    "pipeline":[{"$match":{"all_category_id":8}}],
    "as":"categories"
  }},
  {"$match":{"categories":{"$ne":[]}}}
])
First of all, it is all_category_id, not category_id. Secondly, you don't link articles to their categories, so all documents will have exactly the same article_category array. Lastly, you probably want to filter out articles that don't have a matching category. The conditional pipeline should look more like this:
db.article.aggregate([
  { $match: {
    title: { $regex: /example/ }
  } },
  { $lookup: {
    from: "article_category",
    let: {
      article_id: "$article_id"
    },
    pipeline: [
      { $match: {
        $expr: { $and: [
          { $in: [ 8, "$all_category_id" ] },
          { $eq: [ "$article_id", "$$article_id" ] }
        ] }
      } }
    ],
    as: "article_category"
  } },
  { $match: {
    $expr: { $gt: [
      { $size: "$article_category"},
      0
    ] }
  } }
] )
UPDATE:
If you don't match on article_id, the $lookup will give every article an identical article_category array.
Let's say your article_category collection has another document:
{
  "article_id": 0,
  "all_category_id": [5,8,10]
}
With { $eq: [ "$article_id", "$$article_id" ] } in the pipeline, the resulting article_category is:
[
  {
    "article_id" : 2015110920343902,
    "all_category_id" : [ 5, 8, 10 ]
  }
]
Without it:
[
  {
    "article_id" : 2015110920343902,
    "all_category_id" : [ 5, 8, 10 ]
  },
  {
    "article_id": 0,
    "all_category_id": [ 5, 8, 10 ]
  }
]
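To see why the article_id condition matters, here is a rough in-memory emulation in plain JavaScript. It is not MongoDB code, only a sketch of the filtering that the $lookup sub-pipeline performs. The two article_category documents are the ones shown above; the article title and the helper name lookup are made up for illustration.

```javascript
// Sample data: the article_category documents come from the answer above;
// the article title is a hypothetical placeholder.
const articles = [
  { article_id: 2015110920343902, title: "an example article" }
];
const articleCategory = [
  { article_id: 2015110920343902, all_category_id: [5, 8, 10] },
  { article_id: 0, all_category_id: [5, 8, 10] }
];

// Emulates the $lookup sub-pipeline: keep category documents whose
// all_category_id contains categoryId and, when correlate is true,
// whose article_id also equals the article's article_id.
function lookup(article, categoryId, correlate) {
  return articleCategory.filter(c =>
    c.all_category_id.includes(categoryId) &&
    (!correlate || c.article_id === article.article_id)
  );
}

// With the $eq condition: only the article's own category document is joined.
const correlated = articles.map(a => ({ ...a, article_category: lookup(a, 8, true) }));
// Without it: every category document containing 8 is joined to every article.
const uncorrelated = articles.map(a => ({ ...a, article_category: lookup(a, 8, false) }));

console.log(correlated[0].article_category.length);   // 1
console.log(uncorrelated[0].article_category.length); // 2
```

In the uncorrelated case the joined array does not depend on the article at all, which is why it comes out the same for every document.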
If the latter is what you need, it would be much simpler to make two find requests:
db.article.find({ title: { $regex: /example/ } })
and
db.article_category.find({ all_category_id: 8 })