Django dumpdata UTF-8 (Unicode)

Posted by 随声附和 on 2019-12-03 11:31:37

Question


Is there an easy way to dump UTF-8 data from a database?

I know this command:

manage.py dumpdata > mydata.json

But in the resulting mydata.json file, the Unicode data looks like this:

"name": "\u4e1c\u6cf0\u9999\u6e2f\u4e94\u91d1\u6709\u9650\u516c\u53f8"

I would like to see a real Unicode string like 全球卫星定位系统 (Chinese).


Answer 1:


django-admin.py dumpdata yourapp dumps the data of a single app.

Or if you use MySQL, you could use the mysqldump command to dump the whole database.

And this thread has many ways to dump data, including manual methods.

UPDATE (since the OP edited the question):

To convert the \uXXXX escapes in the JSON file into human-readable characters, you could use this (Python 2):

open("mydata-new.json","wb").write(open("mydata.json").read().decode("unicode_escape").encode("utf8"))
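The one-liner above relies on Python 2 string semantics. A rough Python 3 port might look like the sketch below; the function name unescape_json_file is made up for illustration. It assumes the input is the ASCII-only output of dumpdata. Keep in mind that unicode_escape decodes every backslash escape, not just \uXXXX, so values containing \" or \n come out as invalid JSON, and characters stored as surrogate pairs will fail to encode.

```python
def unescape_json_file(src_path, dst_path):
    """Decode backslash escapes in an ASCII-only dump and save it as UTF-8.

    Hypothetical helper, not part of Django. Caveat: unicode_escape also
    decodes escapes such as backslash-quote and backslash-n, so the result
    may no longer be valid JSON if string values contain them.
    """
    # Read raw bytes: in Python 3, unicode_escape decoding works on bytes.
    with open(src_path, "rb") as src:
        text = src.read().decode("unicode_escape")
    # Write the decoded text back out as UTF-8.
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(text)
```

Usage would be unescape_json_file("mydata.json", "mydata-new.json"). If the result has to be loaded again with loaddata, re-serializing through the json module is the safer route.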



Answer 2:


After struggling with similar issues, I found that the XML serializer handles UTF-8 properly.

manage.py dumpdata --format=xml > output.xml

I had to transfer data from Django 0.96 to Django 1.3. After numerous attempts with dump/load data, I finally succeeded using XML. No side effects so far.

Hope this helps someone, as I landed on this thread while looking for a solution.




Answer 3:


You need to either find the call to json.dump*() in the Django code and pass the additional option ensure_ascii=False and then encode the result after, or you need to use json.load*() to load the JSON and then dump it with that option.
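The second option (load the JSON, then dump it again with ensure_ascii=False) can be sketched roughly as follows. The function name and the indent parameter are made up for illustration; the file names are whatever your dump uses. This assumes Python 3 and a dump small enough to fit in memory.

```python
import json


def rewrite_json_as_utf8(src_path, dst_path, indent=None):
    """Re-serialize a JSON file so non-ASCII text is written literally."""
    # Parsing turns every \uXXXX escape into the real character.
    with open(src_path, encoding="utf-8") as src:
        data = json.load(src)
    # ensure_ascii=False tells json.dump not to re-escape non-ASCII characters.
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(data, dst, ensure_ascii=False, indent=indent)
```

For example, rewrite_json_as_utf8("mydata.json", "mydata-new.json", indent=2). Because the data goes through a real JSON parser and serializer, escaped quotes and newlines inside string values stay valid, which is not guaranteed when the file is decoded with unicode_escape.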




Answer 4:


Here I wrote a snippet for that. Works for me!




Answer 5:


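import codecs
src = "/categories.json"
dst = "/categories-new.json"
# 'string-escape' leaves \uXXXX untouched; 'unicode_escape' decodes it.
with open(src, "rb") as f:
    source = f.read().decode("unicode_escape")
# Write the decoded text back out as UTF-8.
with codecs.open(dst, "w", encoding="utf-8") as f:
    f.write(source)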



Answer 6:


You can create your own serializer that passes the ensure_ascii=False argument to json.dumps:

# serializers/json_no_uescape.py
from django.core.serializers.json import *


class Serializer(Serializer):

    def _init_options(self):
        super(Serializer, self)._init_options()
        self.json_kwargs['ensure_ascii'] = False

Then register the new serializer (for example, in your app's __init__.py file):

from django.core.serializers import register_serializer

register_serializer('json-no-uescape', 'serializers.json_no_uescape')

Then you can run:

manage.py dumpdata --format=json-no-uescape > output.json




Answer 7:


Just leaving this here:

./manage.py dumpdata --indent=2 core.item | python3 -c "import sys; sys.stdout.write(sys.stdin.read().encode().decode('unicode_escape'))" > core/fixtures/item.json


Source: https://stackoverflow.com/questions/2137501/django-dumpdata-utf-8-unicode
