I'm using Python 2 to parse JSON from ASCII-encoded text files.
When loading these files with either json or simplejson, all my string values come back as unicode objects instead of str objects.
That's because JSON makes no distinction between string objects and unicode objects. They're all just strings in JavaScript.
I think the json module is right to return unicode objects. In fact, I wouldn't accept anything less: JavaScript strings are unicode objects (i.e. JSON strings can store any Unicode character), so it makes sense to create unicode
objects when translating strings from JSON. Plain byte strings just wouldn't fit, since the library would have to guess the encoding you want.
It's better to use unicode string objects everywhere, so your best option is to update your libraries so they can deal with unicode objects.
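To make the "guess the encoding" point concrete, here is a small sketch. The sample string and variable names are made up for illustration; it only shows that one decoded JSON string maps to different byte sequences depending on the encoding chosen, which is why the parser cannot pick one for you.

```python
import json

# A JSON string containing a non-ASCII character, written with a \u escape
# so the source text itself stays pure ASCII.
text = json.loads('"caf\\u00e9"')

# The same four-character text becomes different bytes under different encodings.
utf8_bytes = text.encode('utf-8')      # the accented letter takes two bytes
latin1_bytes = text.encode('latin-1')  # the accented letter takes one byte

print(len(text), len(utf8_bytes), len(latin1_bytes))
```

Neither byte string is more "correct" than the other, so returning text and letting the caller encode it is the safer default.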
But if you really want bytestrings, just encode the results to the encoding of your choice:
>>> import json
>>> js = '["a", "b"]'
>>> nl = json.loads(js)
>>> nl
[u'a', u'b']
>>> nl = [s.encode('utf-8') for s in nl]
>>> nl
['a', 'b']
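The list comprehension above only handles a flat list of strings. For nested JSON (objects inside lists, and so on), one possible approach is a recursive helper. This is a rough sketch rather than anything provided by the json module: `encode_strings` and `text_type` are hypothetical names, and UTF-8 is just an assumed default.

```python
import json

# Python 2 has a separate `unicode` type; fall back to `str` so the
# sketch also runs on Python 3, where all text strings are unicode.
try:
    text_type = unicode
except NameError:
    text_type = str


def encode_strings(value, encoding='utf-8'):
    """Recursively encode every text string in decoded JSON data."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    if isinstance(value, list):
        return [encode_strings(item, encoding) for item in value]
    if isinstance(value, dict):
        # Both keys and values may be text, so encode each of them.
        return dict((encode_strings(k, encoding), encode_strings(v, encoding))
                    for k, v in value.items())
    # Numbers, booleans and None pass through unchanged.
    return value


data = encode_strings(json.loads('{"names": ["a", "b"], "count": 2}'))
print(data)
```

On Python 2 the result contains plain str objects throughout. Keep in mind that this walks the whole structure after parsing, so it adds a full extra pass over large documents.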