Saving numpy array in mongodb

醉话见心 2020-12-23 12:01

I have a couple of MongoDB documents in which one of the fields is best represented as a matrix (a numpy array). I would like to save these documents to MongoDB. How do I do this?

5 answers
  • 2020-12-23 12:30

    Have you tried MongoWrapper? I think it is simple to use.

    Declare a connection to the MongoDB server and the collection in which to save your numpy array:

    import numpy as np
    import mongowrapper as mdb

    db = mdb.MongoWrapper(dbName='test',
                          collectionName='test_collection',
                          hostname="localhost",
                          port="27017")
    my_dict = {"name": "Important experiment",
               "data": np.random.random((100, 100))}
    

    The dictionary's just as you'd expect it to be:

    print(my_dict)
    {'data': array([[ 0.773217,  0.517796,  0.209353, ...,  0.042116,  0.845194,
             0.733732],
           [ 0.281073,  0.182046,  0.453265, ...,  0.873993,  0.361292,
             0.551493],
           [ 0.678787,  0.650591,  0.370826, ...,  0.494303,  0.39029 ,
             0.521739],
           ..., 
           [ 0.854548,  0.075026,  0.498936, ...,  0.043457,  0.282203,
             0.359131],
           [ 0.099201,  0.211464,  0.739155, ...,  0.796278,  0.645168,
             0.975352],
           [ 0.94907 ,  0.363454,  0.912208, ...,  0.480943,  0.810243,
             0.217947]]),
     'name': 'Important experiment'}
    

    Save the data to MongoDB:

    db.save(my_dict)
    

    To load the data back:

    my_loaded_dict = db.load({"name":"Important experiment"})
    
  • 2020-12-23 12:35

    For a 1D numpy array, you can use lists:

    # serialize 1D array x
    record['feature1'] = x.tolist()
    
    # deserialize 1D array x
    x = np.fromiter(record['feature1'], dtype=float)  # fromiter requires an explicit dtype
    

    For multidimensional data, I believe you'll need to use pickle and pymongo.binary.Binary:

    # serialize 2D array y
    # (in newer PyMongo releases, Binary is imported from bson.binary instead)
    record['feature2'] = pymongo.binary.Binary(pickle.dumps(y, protocol=2))
    
    # deserialize 2D array y
    y = pickle.loads( record['feature2'] )
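
As a possible alternative to pickling, `tolist()` also works on multidimensional arrays and yields nested lists, which map onto BSON arrays. The following is only a minimal sketch: the helper names `array_to_doc` and `doc_to_array` and the sub-document keys are hypothetical, and it assumes the array fits within MongoDB's 16 MB document limit.

```python
import numpy as np

def array_to_doc(arr):
    """Encode an array of any dimension as plain Python values (hypothetical helper)."""
    return {
        "values": arr.tolist(),    # nested lists of Python scalars
        "dtype": str(arr.dtype),   # e.g. "float64", so the type can be restored
        "shape": list(arr.shape),  # kept so empty dimensions survive the round trip
    }

def doc_to_array(doc):
    """Rebuild the array from a sub-document produced by array_to_doc."""
    return np.array(doc["values"], dtype=doc["dtype"]).reshape(doc["shape"])

# serialize 2D array y into a record that can be inserted as-is
y = np.arange(6, dtype=np.float64).reshape(2, 3)
record = {"name": "Important experiment", "feature2": array_to_doc(y)}

# deserialize 2D array y
restored = doc_to_array(record["feature2"])
```

Nested lists stay readable and queryable inside MongoDB, but they take more space than a binary blob, so this is probably best kept for small arrays.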
    
  • 2020-12-23 12:47

    The code pymongo.binary.Binary(...) didn't work for me; maybe we need to use bson, as @tcaswell suggested.

    Anyway, here is one solution for a multidimensional numpy array:

    from bson.binary import Binary
    import pickle

    # convert numpy array to Binary, store record in mongodb
    record['feature2'] = Binary(pickle.dumps(npArray, protocol=2), subtype=128)

    # get record from mongodb, convert Binary to numpy array
    npArray = pickle.loads(record['feature2'])
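
If unpickling data read from a database is a concern, the bytes could instead be produced with numpy's own `.npy` format. This is a rough sketch rather than part of the answer above: the helper names `array_to_bytes` and `bytes_to_array` are hypothetical, and wrapping the result in `Binary` is shown only as a comment so the snippet runs without a MongoDB driver installed.

```python
import io
import numpy as np

def array_to_bytes(arr):
    """Serialize an array to .npy bytes without pickle (hypothetical helper)."""
    buf = io.BytesIO()
    np.save(buf, arr, allow_pickle=False)
    return buf.getvalue()

def bytes_to_array(raw):
    """Rebuild an array from bytes produced by array_to_bytes."""
    return np.load(io.BytesIO(raw), allow_pickle=False)

np_array = np.random.random((100, 100))

# convert numpy array to bytes; before storing, wrap them as in the answer:
# record['feature2'] = Binary(raw)
raw = array_to_bytes(np_array)

# get bytes back from the record, convert to numpy array
restored = bytes_to_array(raw)
```

The `.npy` header stores dtype and shape, so nothing else needs to be saved alongside the bytes.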
    

    That said, the credit goes to MongoWrapper; this uses the code written by them.

  • 2020-12-23 12:49

    We've built an open source library for storing numeric data (Pandas, numpy, etc.) in MongoDB:

    https://github.com/manahl/arctic

    Best of all it's really easy to use, pretty fast and supports data versioning, multiple data libraries and more.

  • 2020-12-23 12:51

    Have you tried Monary?

    There are examples on their site:

    http://djcinnovations.com/index.php/archives/103
