Here's a function that, given a string with a mixture of alphabetical and numeric parts, returns a tuple that will sort in a "natural" way.
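def naturalkey(key, convert=int):
    if not key:
        return ()
    keys = []
    start = 0
    extra = ""
    in_num = key[0].isdigit()
    # Run one step past the end (char == "" there) so that the final
    # character is also tested as part of a trailing number.
    for i in range(len(key) + 1):
        char = key[i:i+1]
        if start < i:
            if in_num:
                try:
                    # Extend the run while it still parses as a number.
                    last_num = convert(key[start:i])
                except Exception:
                    # key[i-1] broke the number; emit the last good value.
                    in_num = False
                    if i > 2 and key[i-2] == ".":
                        extra = "."    # restore a trailing dot the number absorbed
                    keys.append(last_num)
                    start = i-1
            if not in_num:    # this is NOT equivalent to `else`!
                if char.isdigit():
                    keys.append(extra + key[start:i])
                    in_num = True
                    start = i
                    extra = ""
                    last_num = convert(char)
    keys.append(last_num if in_num else (extra + key[start:]))
    return tuple(keys)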
The basic approach: when it sees a digit, it gathers additional characters and keeps trying to convert the accumulated run to a number until the conversion fails (i.e. raises an exception). By default it converts runs of characters to integers, but you can pass in convert=float to have it accept decimal points. (It won't accept scientific notation, unfortunately, since to get to something like '1e3' it would first have to parse '1e', which is invalid. This, along with a leading + or - sign, could be special-cased, but it doesn't look like that is necessary for your use case.)
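The grow-until-it-fails idea described above can be isolated in a small helper. This is only a sketch: `leading_number` is a hypothetical name, not part of the function above, and it only handles a number at the very start of the string.

```python
def leading_number(text, convert=int):
    """Return (value, rest): the longest prefix that convert() accepts.

    Hypothetical helper for illustration only.
    """
    value = None
    end = 0
    for i in range(1, len(text) + 1):
        try:
            # Try a prefix that is one character longer than the last one.
            value = convert(text[:i])
        except ValueError:
            # The newest character broke the number; keep the last good value.
            break
        end = i
    return value, text[end:]


print(leading_number("2000.exe"))         # (2000, '.exe')
print(leading_number("2000.exe", float))  # (2000.0, 'exe')
print(leading_number("1e3", float))       # (1.0, 'e3')
```

Note that with float the trailing dot is absorbed into the number ("2000." is a valid float literal), which is why naturalkey keeps track of `extra` to put the dot back. The last call shows the scientific-notation limitation: the attempt stops at '1e'.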
The function returns a tuple containing strings and numbers in the order they were found in the string, with the numbers parsed to the specified numeric type. For example:
>>> naturalkey("foobar2000.exe")
('foobar', 2000, '.exe')
This tuple can be used as a key for sorting a list of strings:
my_list.sort(key=lambda i: naturalkey(i, float))
Or you can use it to implement a comparison function:
def __lt__(self, other):
    return naturalkey(self.value, float) < naturalkey(other.value, float)
It would be better (faster) to generate the natural key in the object's __init__() method, store it in the instance, and write your comparison function(s) to use the stored value instead. If the value from which the key is derived is mutable, you could write a property that updates the key when the underlying value is updated.
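A minimal sketch of that caching idea might look like the following. The class name `Track` is hypothetical, and the `naturalkey` here is a compact regex-based stand-in (plain digit runs only, no decimal points) so that the block runs on its own; in practice you would use the full function from the top of this answer.

```python
import re


def naturalkey(key, convert=int):
    # Stand-in for illustration: re.split with one capture group returns
    # alternating text / digit-run pieces, so odd indices are the numbers.
    parts = re.split(r"([0-9]+)", key)
    return tuple(convert(p) if i % 2 else p
                 for i, p in enumerate(parts) if p)


class Track:  # hypothetical class name
    def __init__(self, value):
        self.value = value  # goes through the property setter below

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        # Recompute the cached key whenever the underlying value changes.
        self._value = new_value
        self._key = naturalkey(new_value)

    def __lt__(self, other):
        # Compare the stored keys instead of re-parsing on every comparison.
        return self._key < other._key


tracks = [Track("file10.txt"), Track("file2.txt"), Track("file1.txt")]
print([t.value for t in sorted(tracks)])
# ['file1.txt', 'file2.txt', 'file10.txt']
```

Because the key is rebuilt in the setter, assigning a new `value` keeps the comparison consistent without any extra bookkeeping at the call site.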