问题
the data structure is like:
way: {
_id:'9762264'
node: ['253333910', '3304026514']
}
and I'm trying to count the frequency of nodes' appearance in ways. Here is my code using pymongo:
node = db.way.aggregate([
{'$unwind': '$node'},
{
'$group': {
'_id': '$node',
'appear_count': {'$sum': 1}
}
},
{'$sort': {'appear_count': -1}},
{'$limit': 10}
],
{'allowDiskUse': True}
)
it will report an error:
Traceback (most recent call last):
File "<input>", line 1, in <module>
File ".../OSM Wrangling/explore.py", line 78, in most_passed_node
{'allowDiskUse': True}
File ".../pymongo/collection.py", line 2181, in aggregate
**kwargs)
File ".../pymongo/collection.py", line 2088, in _aggregate
client=self.__database.client)
File ".../pymongo/pool.py", line 464, in command
self.validate_session(client, session)
File ".../pymongo/pool.py", line 609, in validate_session
if session._client is not client:
AttributeError: 'dict' object has no attribute '_client'
However, if I removed the {'allowDiskUse': True}
and test it on a smaller set of data, it works well. It seems that the allowDiskUse statement brings something wrong? And there is no information about this mistake in the docs of MongoDB
How should I solve this problem and get the answer I want?
回答1:
How should I solve this problem and get the answer I want?
This is because in PyMongo v3.6 the method signature for collection.aggregate() has been changed. An optional parameter for session
has been added.
The method signature now is :
aggregate(pipeline, session=None, **kwargs)
Applying this to your code example, you can specify allowDiskUse
as below:
node = db.way.aggregate(pipeline=[
{'$unwind': '$node'},
{'$group': {
'_id': '$node',
'appear_count': {'$sum': 1}
}
},
{'$sort': {'appear_count': -1}},
{'$limit': 10}
],
allowDiskUse=True
)
See also pymongo.client_session if you would like to know more about session
.
回答2:
js is case sensitive, please use lowercase boolean true
{'allowDiskUse': true}
来源:https://stackoverflow.com/questions/48616625/mongodb-doesnt-handle-aggregation-with-allowdiskusagetrue