How to store non-English data in a MongoDB field and retrieve the same data?

好久不见. Submitted on 2019-12-22 06:16:52

Question


I am trying to store non-English data (e.g. Bengali, Hindi) in a MongoDB field.

This is my approach:

import pymongo
from pymongo import MongoClient
client = MongoClient()
db = client.testdb

db['testing'].save({'data':'শুভ নববর্ষ'})

I got an Exception. Exception Value: Non-ASCII character '\xe0' in file /test/views.py on line 5, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details (views.py, line 5)

After that, I tried this:

from bson import BSON
bson_string = BSON.encode({'data':'শুভ নববর্ষ'})
db['testing'].save({'data':'শুভ নববর্ষ'})

This time I got the same error.

Edit: basically, I am not able to print 'শুভ নববর্ষ' in IDLE:

>>> print 'শুভ নববর্ষ'
Unsupported characters in input

First edit:

I added # -*- coding: utf-8 -*- to my views.py and was then able to store the data. But the stored object does not look like a normal document when I query it in the mongo shell:

> db['testing'].find()
{ষ" } : ObjectId("52d65a50012bad0b23c13a65"), "data" : "শà§à¦­ নববরà


I then added another record:
> db['testing'].save({'data':'kousik chowdhury'})

Now the collection looks odd:
> db['testing'].find()                                                           ষ" }
{ "_id" : ObjectId("52d65e6a012bad0a39a2685b"), "data" : "kousik chowdhury" }¦°à§

> db['testing'].find().length()
2

Data retrieval:

Note: I am using PuTTY as an editor.

>>> a = db['testing'].find()[0]
>>> a
{u'_id': ObjectId('52d65a50012bad0b23c13a65'), u'data': u'\u09b6\u09c1\u09ad \u09a8\u09ac\u09ac\u09b0\u09cd\u09b7'}
>>> mydata = a['data']
>>> mydata
u'\u09b6\u09c1\u09ad \u09a8\u09ac\u09ac\u09b0\u09cd\u09b7'
>>> mydata.encode('utf-8')
'\xe0\xa6\xb6\xe0\xa7\x81\xe0\xa6\xad \xe0\xa6\xa8\xe0\xa6\xac\xe0\xa6\xac\xe0\xa6\xb0\xe0\xa7\x8d\xe0\xa6\xb7'

Is there a standard way to store this data in MongoDB in the proper format and get the same data back?


Answer 1:


Do you have this line:

# -*- coding: <encoding name> -*-

at the beginning of your file? For example:

# -*- coding: utf-8 -*-

PART 2:

  • When saving data, use the unicode prefix (u'') on the string literal.

  • Assuming you wanted to do a['data'].encode('utf-8'): that works correctly. Just print it:

    print a['data'].encode('utf-8')

HINT: There is never a good reason to override a built-in type name with some value... (I mean str='')
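The two points in Part 2 can be sketched without a database. This is only an illustration, not the asker's code: the names greeting, escaped, utf8_bytes and round_tripped are made up here, and the snippet is written so that it should run on both Python 2.7 and Python 3.3+ (where the u'' prefix is accepted but has no effect).

```python
# -*- coding: utf-8 -*-
# PEP 263: the declaration above has to sit on the first or second line.
from __future__ import print_function

# u'' makes this a unicode object on Python 2 instead of a raw byte string.
greeting = u'শুভ নববর্ষ'

# The same text spelled with escapes, as the Python 2 shell displays it.
escaped = u'\u09b6\u09c1\u09ad \u09a8\u09ac\u09ac\u09b0\u09cd\u09b7'

# Encoding to UTF-8 yields bytes; decoding those bytes restores the text.
utf8_bytes = greeting.encode('utf-8')
round_tripped = utf8_bytes.decode('utf-8')

print(greeting == escaped)
print(round_tripped == greeting)
```

The escaped form and the readable form are the same string, so the u'\u09b6...' value shown in the question's shell session is the stored text itself, just displayed as escapes.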




Answer 2:


This works for me in iTerm on Mac:

# -*- coding: utf-8 -*-
from pymongo import MongoClient

db = MongoClient().test
db.test_collection.drop()
db.test_collection.save({'data': 'শুভ নববর্ষ'})
document = db.test_collection.find_one()
print document['data']

The printed output matches the input: শুভ নববর্ষ.
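Note that Collection.save(), used in the snippets in this thread, was deprecated in PyMongo 3.0 and removed in PyMongo 4.0. A rough equivalent with insert_one() and find_one() might look like the sketch below. The helper name store_and_fetch is hypothetical, and the commented usage assumes a MongoDB server on the default local port, as in the code above.

```python
# -*- coding: utf-8 -*-

def store_and_fetch(collection, text):
    """Insert one document holding `text`, then read it back by its _id.

    `collection` is expected to behave like a pymongo Collection,
    i.e. to provide insert_one() and find_one().
    """
    result = collection.insert_one({'data': text})
    document = collection.find_one({'_id': result.inserted_id})
    return document['data']

# Hypothetical usage (needs pymongo installed and a running mongod):
#   from pymongo import MongoClient
#   collection = MongoClient().test.test_collection
#   print(store_and_fetch(collection, u'শুভ নববর্ষ'))
```

Looking the document up by the _id that insert_one() returns avoids relying on find_one() happening to return the most recently inserted document.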

MongoDB itself expects all text to be encoded as UTF-8, so it supports all Unicode text. The trouble you're having is not with storage but with finding a way to print the output when you retrieve it, in IDLE or anywhere else. Try running your script in the Windows command prompt and see if the output renders correctly there.
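To illustrate why correctly stored text can still look broken on screen, here is a small sketch of what happens when UTF-8 bytes are displayed by something that assumes one byte per character. Latin-1 is used purely as an assumption. The encoding the asker's PuTTY session actually used is not known, and the variable names are made up for this example.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

text = u'শুভ নববর্ষ'

# The UTF-8 byte sequence for this string, three bytes per Bengali character.
stored_bytes = text.encode('utf-8')

# A display that treats every byte as a separate Latin-1 character turns
# each Bengali character into three unrelated symbols.
garbled = stored_bytes.decode('latin-1')

# Nothing was lost: reversing the wrong decoding restores the original text.
recovered = garbled.encode('latin-1').decode('utf-8')

print(len(text), len(garbled))
print(recovered == text)
```

The garbled form begins with characters such as à and ¦, which resembles the output pasted in the question and suggests the stored bytes were fine while the terminal was misreading them.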



Source: https://stackoverflow.com/questions/21085939/how-to-store-different-languagenon-english-data-in-mongodb-field-and-retrive-t
