Is it possible in Python to use Unicode characters as keys for a dictionary? I have Cyrillic words in Unicode that I used as keys. When trying to get a value by a key, I get the
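Python 2.x treats a bytestring and a Unicode string as the same key whenever they compare equal. That happens when the bytestring is pure ASCII and decodes to the same text as the Unicode string: the comparison implicitly decodes the bytestring with the default ASCII codec, and such pairs also hash identically. This applies when testing whether a key already exists, accessing a value, or overwriting a value. A key can be stored as Unicode, but a Unicode string and a bytestring that compare equal cannot both be used as keys in the same dictionary.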
In []: d = {'a': 1, u'a': 2}
In []: d
Out[]: {'a': 2}
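This collapse is specific to Python 2. As a rough sketch (the name expected_len is only illustrative), the script below builds the same kind of dictionary with an explicit bytestring and an explicit Unicode string, then checks how many keys survive. On Python 2 the two keys merge into one. On Python 3, bytes and str never compare equal, so both remain.

```python
import sys

# b'a' is a bytestring and u'a' is a Unicode string on both Python 2 and 3.
d = {b'a': 1, u'a': 2}

if sys.version_info[0] == 2:
    # Equal keys with equal hashes: the first key is kept, its value overwritten.
    expected_len = 1
else:
    # bytes and str are never equal on Python 3, so both keys are stored.
    expected_len = 2

print(len(d) == expected_len)
```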
You can use Unicode keys, in some sense.
Unicode keys are retained in Unicode:
In []: d2 = {u'a': 1}
In []: d2
Out[]: {u'a': 1}
You can access the value with any Unicode string or bytestring that "equals" the existing key:
In []: d2[u'a']
Out[]: 1
In []: d2['a']
Out[]: 1
Using the key or anything that "equals" the key to write a new value will succeed and retain the existing key:
In []: d2['a'] = 5
In []: d2
Out[]: {u'a': 5}
Because comparing 'a' to the existing key returned True, the value stored under that existing Unicode key was replaced with 5. In the initial example, the second key u'a' in the literal for d compares equal to the previously inserted key 'a', so the bytestring 'a' was retained as the key but its value was overwritten with 2.
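The equivalence above only covers ASCII. For non-ASCII keys such as the Cyrillic words in the question, a UTF-8 bytestring and the corresponding Unicode string do not compare equal, so looking up one with the other fails. One possible approach, sketched here under the assumption that incoming bytestrings are UTF-8 encoded, is to normalize every key to Unicode before storing or looking it up. The helper name to_text is hypothetical and not part of any library.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals


def to_text(key, encoding="utf-8"):
    """Return key as a Unicode string, decoding it if it is a bytestring.

    Hypothetical helper; the UTF-8 default is an assumption about the input.
    """
    if isinstance(key, bytes):  # bytes is an alias of str on Python 2
        return key.decode(encoding)
    return key


d = {to_text("слово"): 1}

# Bytes as they might arrive from a file or a socket.
raw = "слово".encode("utf-8")

print(raw in d)          # the raw bytestring does not match the Unicode key
print(d[to_text(raw)])   # the normalized key does
```

Normalizing at the boundary, where the data enters the program, keeps every dictionary key in one type, so the ASCII-only equivalence never comes into play.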