Read BSON file in Python?

北城余情 submitted on 2020-07-18 10:00:09

Question


I want to read a BSON format Mongo dump in Python and process the data. I am using the Python bson package (which I'd prefer to use rather than have a pymongo dependency), but it doesn't explain how to read from a file.

This is what I'm trying:

bson_file = open('statistics.bson', 'rb')
b = bson.loads(bson_file)
print b[0]

But I get:

Traceback (most recent call last):
  File "test.py", line 11, in <module>
    b = bson.loads(bson_file)
  File "/Library/Python/2.7/site-packages/bson/__init__.py", line 75, in loads
    return decode_document(data, 0)[1]
  File "/Library/Python/2.7/site-packages/bson/codec.py", line 235, in decode_document
    length = struct.unpack("<i", data[base:base + 4])[0]
TypeError: 'file' object has no attribute '__getitem__'

What am I doing wrong?


Answer 1:


The documentation states:

> help(bson.loads)
Given a BSON string, outputs a dict.

You need to pass the file's contents as a string, not the file object itself. For example:

> b = bson.loads(bson_file.read())



Answer 2:


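This worked for me with a MongoDB 2.4 BSON file and Python's bson module (as far as I can tell, decode_all comes from the bson module that ships with PyMongo, not the standalone bson package used in the question):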

import bson
with open('survey.bson','rb') as f:
    data = bson.decode_all(f.read())

That returned a list of dictionaries, one per document stored in that MongoDB collection.

The raw bytes returned by f.read() (stored here in rawdata) look like this:

>>> rawdata[:100]
'\x04\x01\x00\x00\x12_id\x00\x01\x00\x00\x00\x00\x00\x00\x00\x02_type\x00\x07\x00\x00\x00simple\x00\tchanged\x00\xd0\xbb\xb2\x9eI\x01\x00\x00\tcreated\x00\xd0L\xdcfI\x01\x00\x00\x02description\x00\x14\x00\x00\x00testing the bu'
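To make the raw bytes above easier to read, here is a minimal sketch (Python 3, standard library only) that decodes the first few bytes of that excerpt by hand. It relies on the BSON layout in which a document begins with a little-endian int32 holding its total length, the same field the traceback in the question shows being unpacked with "<i". The variable names are illustrative and not part of any library.

```python
import struct

# First nine bytes of the excerpt shown above, written as a Python 3 bytes literal.
raw_prefix = b"\x04\x01\x00\x00\x12_id\x00"

# A BSON document starts with a little-endian int32: its total size in bytes.
(doc_length,) = struct.unpack("<i", raw_prefix[:4])

# The next byte is the type tag of the first element (0x12 is a 64-bit integer),
# followed by the field name as a NUL-terminated string.
element_type = raw_prefix[4]
name_end = raw_prefix.index(b"\x00", 5)
field_name = raw_prefix[5:name_end].decode("utf-8")

print(doc_length, hex(element_type), field_name)
```

So the first document in that dump is 260 bytes long and opens with an integer _id field.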



Answer 3:


loads expects a string (that's what the 's' stands for), not a file. Try reading from the file, and passing the result to loads.
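One thing to keep in mind: a mongodump .bson file normally holds many documents written back to back, while loads decodes a single document. The following is a rough sketch (Python 3, standard library only) of one way to cut the file contents into per-document byte strings using each document's length prefix. split_bson_documents is a hypothetical helper name, not a function from the bson package, and the sample bytes are hand-built for illustration.

```python
import struct


def split_bson_documents(data):
    """Yield the raw bytes of each BSON document in a concatenated dump.

    Hypothetical helper: it only reads each document's int32 length prefix
    and does not decode the document contents.
    """
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            raise ValueError("truncated length prefix at offset %d" % offset)
        (length,) = struct.unpack_from("<i", data, offset)
        # The smallest valid document is 5 bytes: the int32 length plus a trailing NUL.
        if length < 5 or offset + length > len(data):
            raise ValueError("truncated or corrupt BSON data at offset %d" % offset)
        yield data[offset:offset + length]
        offset += length


# Two hand-built documents: an empty one and {"a": 1} (int32 value).
empty_doc = b"\x05\x00\x00\x00\x00"
small_doc = b"\x0c\x00\x00\x00\x10a\x00\x01\x00\x00\x00\x00"
dump = empty_doc + small_doc

chunks = list(split_bson_documents(dump))
print(len(chunks))
```

Each chunk could then be handed to bson.loads on its own, for example after reading the whole file with open('statistics.bson', 'rb').read().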



Source: https://stackoverflow.com/questions/27527982/read-bson-file-in-python
