How to query for distinct results in mongodb with python?

Submitted by a 夏天 on 2019-11-28 04:03:54

Question


I have a mongo collection with multiple documents. Suppose the following (assume Tom had two teachers for History in 2012, for whatever reason):

{
"name" : "Tom",
"year" : 2012,
"class" : "History",
"Teacher" : "Forester"
}

{
"name" : "Tom",
"year" : 2011,
"class" : "Math",
"Teacher" : "Sumpra"
}


{
"name" : "Tom",
"year" : 2012,
"class" : "History",
"Teacher" : "Reiser"
}

I want to query for all the distinct classes "Tom" has ever had. Even though Tom has had multiple "History" classes with multiple teachers, I want the query to return the minimal set of documents such that Tom is in all of them and "History" shows up only once, rather than a result that contains multiple documents with "History" repeated.

I took a look at: http://mongoengine-odm.readthedocs.org/en/latest/guide/querying.html

and want to be able to try something like:

student_users = Students.objects(name = "Tom", class = "some way to say distinct?")

However, this does not appear to be documented. If this is not the syntactically correct way to do it, is it possible in mongoengine, or is there some way to accomplish it with another library such as pymongo? Or do I have to query for all documents with Tom and then do some post-processing to get the unique values? Syntax would be appreciated for any case.


Answer 1:


First of all, distinct values can only be retrieved for a single field at a time, as explained in the MongoDB documentation on Distinct.

Mongoengine's QuerySet class supports a distinct() method that does the job.

So you might try something like this to get results:

Students.objects(name="Tom").distinct(field="class")

This query returns the list of distinct classes Tom attends. On the server side, the distinct command delivers that list inside a single BSON document.
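To make the result shape concrete, here is a minimal sketch that mimics the filter-then-distinct behavior on the sample documents from the question, using plain Python so it runs without a MongoDB server. The helper name emulate_distinct is hypothetical and not part of mongoengine or pymongo; it only illustrates what the query above is expected to compute.

```python
# Sample documents copied by hand from the question.
documents = [
    {"name": "Tom", "year": 2012, "class": "History", "Teacher": "Forester"},
    {"name": "Tom", "year": 2011, "class": "Math", "Teacher": "Sumpra"},
    {"name": "Tom", "year": 2012, "class": "History", "Teacher": "Reiser"},
]


def emulate_distinct(docs, field, **filters):
    """Hypothetical helper: filter a list of dicts, then collect unique values of `field`."""
    seen = []
    for doc in docs:
        # Skip documents that do not match every filter, e.g. name="Tom".
        if any(doc.get(key) != value for key, value in filters.items()):
            continue
        # Keep each value once, in first-seen order.
        if field in doc and doc[field] not in seen:
            seen.append(doc[field])
    return seen


tom_classes = emulate_distinct(documents, "class", name="Tom")
print(tom_classes)  # ['History', 'Math']
```

The two "History" documents collapse into a single entry, which is the behavior the question asks for. Note that the result is a list of values, not a list of documents.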

Note that the returned value is a single document, so if it exceeds the maximum document size (16 MB) you will get an error. In that case you have to switch to a map/reduce approach to solve this kind of problem.
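As an alternative to map/reduce for the oversized-result case, an aggregation pipeline with a $group stage is another option. This is a swapped-in technique, not the one the answer names: it yields one small document per distinct value instead of one large document. The sketch below only builds the pipeline; build_distinct_pipeline is a hypothetical helper, and the collection object students in the comment is an assumption.

```python
def build_distinct_pipeline(field, match):
    """Hypothetical helper: pipeline that emits one document per distinct value of `field`."""
    return [
        {"$match": match},                 # keep only the matching documents
        {"$group": {"_id": "$" + field}},  # one output document per distinct value
    ]


pipeline = build_distinct_pipeline("class", {"name": "Tom"})
print(pipeline)

# Assumed usage with PyMongo and a running server, where `students`
# is a placeholder for a pymongo Collection:
#     classes = [doc["_id"] for doc in students.aggregate(pipeline)]
```

Because each distinct value arrives as its own document in a cursor, the 16 MB limit applies per value rather than to the whole result.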




Answer 2:


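import pymongo

# 'db' and 'collection' are placeholder names for your database and collection
posts = pymongo.MongoClient('localhost', 27017)['db']['collection']

# Filter first, then collect the distinct values of 'geography'.
# In PyMongo the pattern is a plain string, without the /.../ delimiters of the mongo shell.
res = posts.find({"geography": {"$regex": "europe", "$options": "i"}}).distinct("geography")
print(type(res))  # distinct() returns a list
res.sort()
for line in res:
    print(line)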

Refer to http://docs.mongodb.org/manual/reference/method/db.collection.distinct/ for details. distinct returns a list, as the printed type of res shows. You can sort that list with res.sort(), and the loop then prints the sorted values.

You can also filter the posts with a query before selecting the distinct values.
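On filtering before taking distinct values: PyMongo's Collection.distinct accepts a filter as its second argument, so the filter and the distinct step can be expressed in one call. Below is a minimal sketch; distinct_values is a hypothetical wrapper, and the database and collection names in the usage comment are placeholders.

```python
def distinct_values(collection, key, query=None):
    """Hypothetical helper: sorted distinct values of `key` among documents matching `query`.

    `collection` is assumed to be a pymongo Collection, or any object
    exposing a compatible distinct(key, filter) method.
    """
    return sorted(collection.distinct(key, query))


# Assumed usage with a running server (names are placeholders):
#     import pymongo
#     posts = pymongo.MongoClient("localhost", 27017)["db"]["collection"]
#     europe = {"geography": {"$regex": "europe", "$options": "i"}}
#     for value in distinct_values(posts, "geography", europe):
#         print(value)
```

Passing the collection in as a parameter keeps the helper independent of any particular connection.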




Answer 3:


student_users = Students.objects(name = "Tom").distinct('class')


Source: https://stackoverflow.com/questions/12007726/how-to-query-for-distinct-results-in-mongodb-with-python
