Question
Objective
Find out possible differences in the following MongoDB queries and understand why one of them works and the other doesn't.
Background
A while ago I posted a question asking for help with a MongoDB query:
- Using $push with $group with pymongo
In that question my query didn't work, and I was looking for a way to fix it. I got a lot of help in the comments and eventually found the solution, but no one seems able to explain to me why my first, incorrect query doesn't work while the second one does.
Code
1st (incorrect) query:
pipeline = [
    {"$group": {"_id": "$user.screen_name", "tweet_texts": {"$push": "$text"}, "count": {"$sum": 1}}},
    {"$project": {"_id": "$user.screen_name", "count": 1, "tweet_texts": 1}},
    {"$sort": {"count": -1}},
    {"$limit": 5}
]
2nd (working) query:
pipeline = [
    {"$group": {"_id": "$user.screen_name", "tweet_texts": {"$push": "$text"}, "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
    {"$limit": 5}
]
An attentive reader will see that the only difference between the two queries is the $project stage {"$project": {"_id": "$user.screen_name", "count": 1, "tweet_texts": 1}}.
At the time I thought this stage was necessary, but since I am already selecting the fields I need in the $group stage, I don't really need it. In fact, this additional and unnecessary stage was causing the tests to fail.
Question
If the $project stage in the first example is useless and does the same thing as the $group stage, why was my code failing? Shouldn't it make no difference at all, since the change is idempotent?
Answer 1:
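In the first query, after the $group stage, the user's screen name is stored under the _id key, not under user.screen_name. The documents leaving $group contain only _id, tweet_texts and count, so "$user.screen_name" in the $project stage refers to a field that no longer exists, and the screen name is not projected.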
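To make the document shape concrete, here is a minimal pure-Python sketch that mimics what the $group stage does. It is not MongoDB itself: group_stage and resolve_path are hypothetical helpers written only for illustration, the sample tweets are made up, and None stands in for a missing field.

```python
from collections import defaultdict

# Made-up sample documents shaped like the tweets in the question.
tweets = [
    {"user": {"screen_name": "alice"}, "text": "hello"},
    {"user": {"screen_name": "alice"}, "text": "world"},
    {"user": {"screen_name": "bob"}, "text": "hi"},
]

def resolve_path(doc, path):
    """Follow a dotted field path such as 'user.screen_name'.

    Returns None when the path does not exist (our stand-in for 'missing').
    """
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current

def group_stage(docs):
    """Mimic the $group stage: key on user.screen_name, push texts, count."""
    groups = defaultdict(lambda: {"tweet_texts": [], "count": 0})
    for doc in docs:
        bucket = groups[resolve_path(doc, "user.screen_name")]
        bucket["tweet_texts"].append(doc["text"])
        bucket["count"] += 1
    return [{"_id": key, **fields} for key, fields in groups.items()]

grouped = group_stage(tweets)
first = grouped[0]

print(sorted(first.keys()))                      # only _id, count, tweet_texts remain
print(resolve_path(first, "user.screen_name"))   # the old path is gone
print(resolve_path(first, "_id"))                # the screen name lives here now
```

The grouped documents no longer have a user sub-document, which is why a later stage has to read the screen name from _id.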
If you modify your projection, using
{"$project": {"_id": "$_id", "count": 1, "tweet_texts": 1}},
or
{"$project": {"_id": 1, "count": 1, "tweet_texts": 1}},
or
{"$project": {"count": 1, "tweet_texts": 1}},
the first pipeline will behave like the second one.
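Putting one of these projections back into the full pipeline might look like the sketch below. The commented-out usage line is an assumption: db and the tweets collection name are hypothetical and do not come from the question.

```python
# Corrected version of the first pipeline: the $project stage only refers
# to fields that exist after $group (_id, tweet_texts, count).
pipeline = [
    {"$group": {
        "_id": "$user.screen_name",
        "tweet_texts": {"$push": "$text"},
        "count": {"$sum": 1},
    }},
    {"$project": {"_id": 1, "count": 1, "tweet_texts": 1}},
    {"$sort": {"count": -1}},
    {"$limit": 5},
]

# Assumed usage with pymongo (hypothetical database handle and collection):
# results = list(db.tweets.aggregate(pipeline))
```

Since $group already emits exactly these three fields, the $project stage here is redundant and can simply be dropped, as in the second query.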
Source: https://stackoverflow.com/questions/40970583/what-is-the-difference-between-these-two-mongodb-queries