The following code ignores the locale, and Égypt sorts at the end. What's wrong?
dict = {"United States": "United States", "Spain" : "Spain", "Englan
Here's a work-around.
Use Unicode's canonical decomposition normalization form (NFD): http://en.wikipedia.org/wiki/Unicode_equivalence#Normal_forms
import unicodedata
# Decoding UTF-8 to unicode (and encoding back) is left as an exercise to the reader.
egypt = unicodedata.normalize("NFD", egypt)
sorted(['Egypt', 'E\xcc\x81gypt', 'US'])
['Egypt', 'E\xcc\x81gypt', 'US']
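Note that this doesn't actually take the locale into consideration. It only works because NFD splits É into a plain E followed by a combining accent, so the ordinary code-point comparison sees the base letter first.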
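For illustration, here is a minimal sketch of the same work-around in Python 3, where strings are already Unicode and no UTF-8 round trip is needed. The helper name nfd_key and the sample list are assumptions made for this example, not part of the original code.

```python
import unicodedata

def nfd_key(s):
    # Hypothetical helper: canonical decomposition turns "\u00c9" (É)
    # into "E" followed by U+0301 (combining acute accent).
    return unicodedata.normalize("NFD", s)

countries = ["US", "\u00c9gypt", "Egypt"]

# Plain code-point sort: É (U+00C9) compares greater than "U", so it lands last.
plain = sorted(countries)

# Sorting on the decomposed form compares the base letter "E" first.
result = sorted(countries, key=nfd_key)

print(plain)   # ['Egypt', 'US', 'Égypt']
print(result)  # ['Egypt', 'Égypt', 'US']
```

Using key= leaves the original strings untouched and only compares their decomposed forms. This is still a code-point comparison, not locale-aware collation.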
Beyond this, try a newer Python (yes, I know) or the ICU library from Martijn's linked question and its answers.