What is the difference between these two MongoDB queries?

Posted by 混江龙づ霸主 on 2019-12-10 22:59:07

Question


Objective

Find out possible differences in the following MongoDB queries and understand why one of them works and the other doesn't.

Background

A while ago I posted a question asking for help with a MongoDB query:

  • Using $push with $group with pymongo

In that question my query didn't work, and I was looking for a way to fix it. I got a lot of help in the comments and eventually found the solution, but no one seems to be able to explain why my first, incorrect query doesn't work while the second one does.

Code

1st (incorrect) query:

pipeline = [
    {"$group": {"_id": "$user.screen_name", "tweet_texts": {"$push": "$text"}, "count": {"$sum": 1}}},
    {"$project": {"_id": "$user.screen_name", "count": 1, "tweet_texts": 1}},
    {"$sort": {"count": -1}},
    {"$limit": 5}
]

2nd query:

pipeline = [
    {"$group": {"_id": "$user.screen_name", "tweet_texts": {"$push": "$text"}, "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
    {"$limit": 5}
]

An attentive reader will notice that the only difference between the two queries is the $project stage {"$project": {"_id": "$user.screen_name", "count": 1, "tweet_texts": 1}}.

At the time I thought this stage was necessary, but since I am already selecting the fields I need in the $group stage, I don't really need it. In fact, this additional and unnecessary stage was causing the tests to fail.

Question

If the $project stage in the first example is useless and does the same thing as the $group stage, why was my code failing? Shouldn't it make no difference at all (since the change is idempotent)?


Answer 1:


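In the first query, after the $group stage, the user's screen name is stored under the _id key, not under user.screen_name. The documents leaving $group contain only _id, tweet_texts, and count, so "$user.screen_name" in the $project stage refers to a field that no longer exists, and the screen name is not projected.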
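To make the reshaping concrete, here is a minimal sketch that emulates the $group stage in plain Python. It is not MongoDB or pymongo, only an illustration of how the document shape changes; the helper names `get_path` and `group_by_screen_name` and the sample tweets are made up for this example.

```python
# Plain-Python emulation for illustration only; this does not call MongoDB.

def get_path(doc, path):
    """Resolve a dotted field path such as 'user.screen_name'; None if missing."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def group_by_screen_name(tweets):
    """Mimic grouping on "$user.screen_name" with $push of text and $sum of 1."""
    groups = {}
    for tweet in tweets:
        key = get_path(tweet, "user.screen_name")
        group = groups.setdefault(key, {"_id": key, "tweet_texts": [], "count": 0})
        group["tweet_texts"].append(tweet["text"])
        group["count"] += 1
    return list(groups.values())


# Hypothetical sample data.
tweets = [
    {"user": {"screen_name": "alice"}, "text": "hello"},
    {"user": {"screen_name": "alice"}, "text": "again"},
    {"user": {"screen_name": "bob"}, "text": "hi"},
]

grouped = group_by_screen_name(tweets)

# What a later stage finds when it looks up each path in the grouped documents:
wrong_lookup = [get_path(doc, "user.screen_name") for doc in grouped]  # path is gone
right_lookup = [get_path(doc, "_id") for doc in grouped]               # name lives here
```

After grouping, each document has only the keys `_id`, `tweet_texts`, and `count`, so a lookup of `user.screen_name` finds nothing while `_id` holds the screen name.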

If you modify your projection, using

{"$project": {"_id": "$_id", "count": 1, "tweet_texts": 1}},

or

{"$project": {"_id": 1, "count": 1, "tweet_texts": 1}},

or

{"$project": {"count": 1, "tweet_texts": 1}},

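With any of these, the first pipeline produces the same results as the second, because _id is included in a $project stage by default unless it is explicitly excluded.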
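As a rough sketch, the corrected pipeline could be wrapped in a small helper like the one below. The function name `top_tweeters` and its `limit` parameter are hypothetical; `collection` is assumed to be a pymongo `Collection` (or anything exposing a compatible `aggregate` method).

```python
def top_tweeters(collection, limit=5):
    """Return the `limit` users with the most tweets, along with their tweet texts.

    `collection` is assumed to behave like a pymongo Collection.
    """
    pipeline = [
        {"$group": {
            "_id": "$user.screen_name",
            "tweet_texts": {"$push": "$text"},
            "count": {"$sum": 1},
        }},
        # Optional stage: _id is kept by default, so it is not re-declared here.
        {"$project": {"count": 1, "tweet_texts": 1}},
        {"$sort": {"count": -1}},
        {"$limit": limit},
    ]
    return list(collection.aggregate(pipeline))
```

The $project stage can be dropped entirely, as in the second query from the question; it is kept here only to show a projection that does not discard the screen name.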



Source: https://stackoverflow.com/questions/40970583/what-is-the-difference-between-these-two-mongodb-queries
