How to iterate through every other document from a mongo db cursor

Posted by 十年热恋 on 2019-12-25 01:44:20

Question


I have a MongoDB cursor whose documents I want to turn into DataFrames. However, consecutive documents in that cursor can have runTime values that are too close together, so I'd like to take every other document and build a DataFrame from each of those.

Attempt 1.
all_df_forecast = []
for doc in cursor[::2]:
    single_fc_df = pd.DataFrame(doc['data']['PRICES SPOT'])
    all_df_forecast.append(single_fc_df)

Results in: IndexError: Cursor instances do not support slice steps

Attempt 2.
all_df_forecast = []
for doc in range(0, cursor.count(), 2):
    single_fc_df = pd.DataFrame(doc['data']['PRICES SPOT'])
    all_df_forecast.append(single_fc_df)

Results in TypeError: 'int' object is not subscriptable

This is how the cursor with the documents that hold the data is currently created:

 cursor = self._collection.find({
   "Type": "f", 
   "runTime": { "$gte": model_dt_from, "$lte": model_dt_till },
   "data.PRICES SPOT.0": { "$exists": True }
 })

Ideally the cursor itself would return only every other document matching my query. I came across skip, but from my understanding it only skips the given number of documents at the beginning. That is why I am tackling this after I have the cursor, creating DataFrames from every other document.


Answer 1:


Use cursor.next() to skip over each alternate cursor result.

As a demonstration:

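from pymongo import MongoClient

client = MongoClient()
db = client.test

# Reset the demo collection and insert ten documents with values 1..10
db.pytest.delete_many({})
db.pytest.insert_many([{'value': i + 1} for i in range(10)])

cursor = db.pytest.find({}, {'_id': 0})

# Cursor.count() was removed in PyMongo 4; count with the same filter instead
count = db.pytest.count_documents({})
print(count)
cursor.next()  # discard the first document

for doc in cursor:
    print(doc)
    count -= 2
    print(count)
    if count > 0:
        cursor.next()  # discard the next document, if one remains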

Would return (shown as printed by Python 2; Python 3 omits the u prefix):

10
{u'value': 2}
8
{u'value': 4}
6
{u'value': 6}
4
{u'value': 8}
2
{u'value': 10}
0

The only thing you need to be aware of when calling cursor.next() is that the cursor must actually have remaining results before you call it; otherwise the depleted cursor raises an exception. For this reason, obtain the number of matching documents up front (cursor.count() in older PyMongo releases), then decrement it to track how many remain before deciding whether to advance the cursor again.

Note that with an odd number of results the last cursor.next() call consumes the final document and the loop simply ends, so the check is really there to make sure you don't advance the cursor on an even number of results when the remaining count is 0.
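
The same guard can also be written without tracking a count. The sketch below is not from the original answer: every_second is a hypothetical helper name, and a plain generator stands in for the PyMongo cursor so the example runs without a database. It relies only on the built-in next() with a default value, which returns that default instead of raising once the iterator is depleted.

```python
_DONE = object()  # sentinel meaning "the iterator is depleted"


def every_second(cursor):
    """Yield the 2nd, 4th, 6th, ... item of any iterator-like cursor."""
    it = iter(cursor)
    while True:
        # Discard one item; stop quietly if nothing is left
        if next(it, _DONE) is _DONE:
            return
        # Fetch the item to keep; stop quietly if nothing is left
        doc = next(it, _DONE)
        if doc is _DONE:
            return
        yield doc


# Stand-in for a cursor: ten documents with values 1..10
docs = ({'value': i + 1} for i in range(10))
kept = [doc['value'] for doc in every_second(docs)]
print(kept)  # [2, 4, 6, 8, 10]
```

Because the helper never asks how many documents exist, it behaves the same for odd, even, and empty result sets.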

An alternative, close to what you partly attempted, is to convert the cursor to a list and then take slices of it. That means loading all results into memory, though, which is probably impractical for any result set of significant size.
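
A possible middle ground, not part of the original answer, is itertools.islice from the standard library. It applies a slice step lazily to any iterator, so nothing is loaded up front. This is a sketch under the assumption that the cursor is consumed like an ordinary Python iterator; a generator stands in for it here so the example runs without a database.

```python
from itertools import islice

# Stand-in for a cursor: ten documents with values 1..10
docs = ({'value': i + 1} for i in range(10))

# start=1, stop=None, step=2 keeps the 2nd, 4th, 6th, ... documents;
# use start=0 to keep the 1st, 3rd, 5th, ... instead
every_other = [doc['value'] for doc in islice(docs, 1, None, 2)]
print(every_other)  # [2, 4, 6, 8, 10]
```

In the question's loop this would read for doc in islice(cursor, 0, None, 2): with the DataFrame construction left unchanged.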



Source: https://stackoverflow.com/questions/50509896/how-to-iterate-through-every-other-document-from-a-mongo-db-cursor
