I'm wondering how Python does string comparison, more specifically how it determines the outcome when a less than (<) or greater than (>) op
A pure Python equivalent for string comparisons would be:
def less(string1, string2):
    # Compare character by character
    for idx in range(min(len(string1), len(string2))):
        # Get the "value" of the character
        ordinal1, ordinal2 = ord(string1[idx]), ord(string2[idx])
        # If the "value" is identical check the next characters
        if ordinal1 == ordinal2:
            continue
        # It's not equal so we're finished at this index and can evaluate which is smaller.
        else:
            return ordinal1 < ordinal2
    # We're out of characters and all were equal, so the result depends on the length
    # of the strings.
    return len(string1) < len(string2)
This function does the equivalent of the real method (Python 3.6 and Python 2.7), just a lot slower. Also note that the implementation isn't exactly "pythonic" and only works for < comparisons. It's just to illustrate how it works. I haven't checked whether it behaves like Python's comparison for combining Unicode characters.
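To make the rule concrete, here is a small sketch of what the built-in operators do on Python 3 (the `\u` escapes assume Python 3 `str` literals). It also touches on the combining-character case: as far as I can tell, the comparison looks only at code points and does no normalization, so a precomposed "é" and "e" plus a combining accent are different strings. The variable names are just for illustration.

```python
import unicodedata

# Plain comparisons go by code point, left to right.
assert "a" < "b"            # ord("a") == 97, ord("b") == 98
assert "Z" < "a"            # ord("Z") == 90, so uppercase sorts before lowercase
assert "apple" < "apples"   # equal prefix, so the shorter string is smaller

# Two ways to write the same visible character.
precomposed = "\u00e9"      # single code point U+00E9
decomposed = "e\u0301"      # U+0065 followed by combining accent U+0301

# They render alike but are different code point sequences.
assert precomposed != decomposed
assert decomposed < precomposed   # ord("e") == 101 < ord("\u00e9") == 233

# Normalizing to the same form first makes them compare equal.
nfc = unicodedata.normalize("NFC", decomposed)
assert nfc == precomposed
```

If the two spellings should be treated as the same text, normalizing both sides with `unicodedata.normalize` before comparing is one option.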
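A more general variant of the `less` function, which also handles >, would be:
from operator import lt, gt

def compare(string1, string2, less=True):
    # Pick the operator once: < when less is True, > otherwise
    op = lt if less else gt
    for char1, char2 in zip(string1, string2):
        ordinal1, ordinal2 = ord(char1), ord(char2)
        if ordinal1 == ordinal2:
            continue
        else:
            return op(ordinal1, ordinal2)
    # All shared positions were equal, so the lengths decide
    return op(len(string1), len(string2))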
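Another way to see the same rule, as a rough sketch and not how CPython implements it: lists also compare lexicographically, so comparing lists of ordinals should agree with comparing the strings themselves. The helper name `as_ordinals` and the sample pairs are made up for this illustration.

```python
def as_ordinals(string):
    # Hypothetical helper: turn a string into its list of code points.
    return [ord(char) for char in string]

# Sample pairs covering a differing character, a prefix, an empty string,
# mixed case, and two equal strings.
pairs = [
    ("abc", "abd"),
    ("abc", "ab"),
    ("", "a"),
    ("Zebra", "apple"),
    ("same", "same"),
]

for left, right in pairs:
    # The string operators and the list-of-ordinals operators should agree.
    assert (left < right) == (as_ordinals(left) < as_ordinals(right))
    assert (left > right) == (as_ordinals(left) > as_ordinals(right))
```

The loop only checks a handful of cases, so it illustrates the idea without proving it.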