Question
I have a MongoDB cursor with documents that I want to turn into DataFrames. However, consecutive documents in that cursor can have runTime values
that are too close together. Therefore I'd like to take every other document and make a DataFrame out of each of those.
all_df_forecast = []
for doc in cursor[::2]:
    single_fc_df = pd.DataFrame(doc['data']['PRICES SPOT'])
    all_df_forecast.append(single_fc_df)
Results in: IndexError: Cursor instances do not support slice steps
all_df_forecast = []
for doc in range(0, cursor.count(), 2):
    single_fc_df = pd.DataFrame(doc['data']['PRICES SPOT'])
    all_df_forecast.append(single_fc_df)
Results in TypeError: 'int' object is not subscriptable
This is how I currently obtain the cursor with the documents that hold the data:
cursor = self._collection.find({
    "Type": "f",
    "runTime": { "$gte": model_dt_from, "$lte": model_dt_till },
    "data.PRICES SPOT.0": { "$exists": True }
})
Ideally, the cursor would return only every other document matching the query I give it. I came across skip, but from my understanding it only skips the given number of documents at the beginning. That is why I am now tackling this after I have the cursor, creating DataFrames for every other document.
Answer 1:
Use cursor.next() to skip over each alternate cursor result.
As a demonstration:
from pymongo import MongoClient
client = MongoClient()
db = client.test
db.pytest.delete_many({})
db.pytest.insert_many([{ 'value': i+1 } for i,x in enumerate([1] * 10)])
cursor = db.pytest.find({},{ '_id': 0 })
count = cursor.count()
print count
cursor.next()
for doc in cursor:
    print doc
    count -= 2
    print count
    if (count > 0):
        cursor.next()
Would return:
10
{u'value': 2}
8
{u'value': 4}
6
{u'value': 6}
4
{u'value': 8}
2
{u'value': 10}
0
The only thing you need to be aware of when calling cursor.next() is that the cursor must actually have remaining results before you call it; otherwise the depleted cursor raises an exception (StopIteration). For this reason you obtain cursor.count() up front, then decrement it to track how many documents remain before deciding whether to call cursor.next() again.
Note that with an odd number of results the cursor is depleted by the last cursor.next() call and the loop simply ends, so the check is really there to make sure you don't advance the cursor when the number of results is even and none remain.
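The count bookkeeping described above can also be avoided by passing a default to Python's built-in next(), which returns that default instead of raising once the iterator is exhausted. This matters on newer PyMongo releases, where Cursor.count() is no longer available. The following is only a minimal sketch: every_other is a hypothetical helper name, and a plain iterator over a list of dicts stands in for the PyMongo cursor, on the assumption that the cursor behaves like any other Python iterator.

```python
def every_other(cursor):
    """Yield the 2nd, 4th, 6th, ... item of an iterator (hypothetical helper)."""
    missing = object()  # sentinel that can never be a real document
    it = iter(cursor)
    while True:
        # Discard one item; a default avoids StopIteration on a depleted iterator.
        next(it, missing)
        doc = next(it, missing)
        if doc is missing:
            return  # nothing left to yield
        yield doc


# Stand-in for the cursor: ten documents with values 1..10.
docs = [{'value': i + 1} for i in range(10)]
kept = [d['value'] for d in every_other(iter(docs))]
print(kept)
```

With the ten stand-in documents this keeps the same values as the demonstration above (2, 4, 6, 8, 10), and it also stops cleanly when the number of documents is odd or zero.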
An alternative, like the approach you partly attempted, is to convert the cursor to a list and then take slices, but that means loading all results into memory, which is probably impractical for most result sets of a reasonable size.
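A lazy middle ground between list slicing and manual cursor.next() calls is itertools.islice from the standard library. It accepts a step and consumes the iterator one item at a time, so the full result set is never held in memory. This is a different technique from the one demonstrated in this answer, shown here only as a sketch. fake_cursor is a hypothetical generator standing in for the real cursor, on the assumption that the cursor can be iterated like any Python iterator.

```python
from itertools import islice


def fake_cursor(n):
    """Hypothetical stand-in for a MongoDB cursor: lazily yields n documents."""
    for i in range(n):
        yield {'value': i + 1}


# Start at index 0 with step 2: keeps the 1st, 3rd, 5th, ... documents.
first_of_each_pair = [d['value'] for d in islice(fake_cursor(10), 0, None, 2)]

# Start at index 1 with step 2: keeps the 2nd, 4th, 6th, ... documents.
second_of_each_pair = [d['value'] for d in islice(fake_cursor(10), 1, None, 2)]

print(first_of_each_pair)
print(second_of_each_pair)
```

In the loop from the question, this would amount to iterating over islice(cursor, 0, None, 2) in place of cursor[::2], again assuming the cursor is consumed as an ordinary iterator.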
Source: https://stackoverflow.com/questions/50509896/how-to-iterate-through-every-other-document-from-a-mongo-db-cursor