Question
I'm using Python to merge two files into a new one. Every line in both files starts with an id, and I want to sort on it so that both files are in the same order and can be merged. To do this I've used .sort() on each list, so that the comments match the details. However, I'd now like to reorder them so that they go 1, 2, 3, 4... instead of 1, 10, 100, 1000, 1001, 1002 etc. I'm having difficulty because the number is at the start of a string, and Python won't convert the first four characters of a string to an integer. If it helps, the file is tab-delimited and the next piece of information after the id is the date.
Any ideas would be appreciated. Ideally I would like to avoid importing any libraries.
My code is:
comments = R'C:\Pythonfile\UFOGB_Comments.txt'
details = R'C:\Pythonfile\UFOGB_Details.txt'
mydest = R'C:\Pythonfile\UFOGB_sorted.txt'
with open(details,'rt') as src:
    readdetails = src.readlines()
readdetails.sort()
with open(comments,'rt') as src:
    readcomments = src.readlines()
readcomments.sort()
with open(mydest, 'w') as dest:
    for i in range(len(readdetails)):
        cutcomm = readcomments[i][readcomments[i].find('"'):]
        dest.write('{}\t{}'.format(readdetails[i].strip('\n'),cutcomm))
Answer 1:
You could parse the first field as an int in the key function:
readdetails.sort(key=lambda x: int(x.split()[0]))
This works as long as every line starts with an integer id followed by whitespace (a tab, in your case).
Otherwise, use a more elaborate key function with list.sort(), e.g.:
def extract_id(line):
    # Take the text before the first tab and convert it to an integer.
    # Adapt this if your lines use a different layout.
    return int(line.split('\t', 1)[0])
and pass it to the sort call:
readdetails.sort(key=extract_id)
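One way such a key function could be written defensively is sketched below. It assumes the id is the first tab-separated field, as described in the question. Sorting lines without a numeric id to the end is an assumption of this sketch, not something the question asks for, and the sample lines are made up.

```python
def extract_id(line):
    """Return a sort key based on the integer id in the first tab-separated field."""
    first_field = line.split('\t', 1)[0].strip()
    try:
        # (0, n) keeps well-formed lines in numeric order.
        return (0, int(first_field))
    except ValueError:
        # Assumption: lines without a numeric id (blank lines, headers) go last.
        return (1, 0)


# Made-up sample lines, only for illustration.
sample = ['10\t01/01/2000\tA\n', '2\t02/01/2000\tB\n', 'id\tdate\tname\n', '1\t03/01/2000\tC\n']
sample.sort(key=extract_id)
print([line.split('\t', 1)[0] for line in sample])
```

Returning a tuple keeps the two groups apart without raising an error halfway through the sort, and no imports are needed.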
Answer 2:
I tried to recreate your data according to your explanation. Tell me if this is correct:
lines = """
123 foobar
1000 foobar
432 foobar
22 foobar
987 foobar
""".strip().split('\n')
print(lines)
# Take the first four characters, then keep only the id part before any whitespace.
lines.sort(key=lambda s: int(s[:4].split()[0]))
print(lines)
Result:
['123 foobar', '1000 foobar', '432 foobar', '22 foobar', '987 foobar'] # initial
['22 foobar', '123 foobar', '432 foobar', '987 foobar', '1000 foobar'] # final
This assumes that the id is at most four digits long, as in your question. If the id length varies, split off the first field instead:
lines.sort(key=lambda s: int(s.split()[0]))
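Putting the numeric key together with the merge step from the question might look like the sketch below. The two lists are made-up stand-ins for the result of readlines(), and id_key is a hypothetical helper name. Only the write format and the quote-based cut come from the question.

```python
def id_key(line):
    # Assumption: every line starts with an integer id followed by a tab.
    return int(line.split('\t', 1)[0])


# Made-up stand-ins for src.readlines() on the details and comments files.
readdetails = ['10\t05/03/1999\tLondon\n', '2\t01/01/1998\tLeeds\n', '1\t12/12/1997\tBath\n']
readcomments = ['2\t"bright light"\n', '1\t"hovering disc"\n', '10\t"three orbs"\n']

readdetails.sort(key=id_key)
readcomments.sort(key=id_key)

merged = []
for detail, comment in zip(readdetails, readcomments):
    # Keep everything from the first double quote onwards, as in the question.
    cutcomm = comment[comment.find('"'):]
    merged.append('{}\t{}'.format(detail.strip('\n'), cutcomm))

print(''.join(merged), end='')
```

Both lists are sorted with the same key, so the positional pairing still lines up as long as both files contain the same set of ids.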
Answer 3:
If your difficulty is sorting a list by the first four characters of each entry, try this method from https://wiki.python.org/moin/HowTo/Sorting:
with open(details,'rt') as src:
    read_details = src.readlines()
read_details = sorted(read_details, key=lambda detail: detail[:4])
with open(comments,'rt') as src:
    read_comments = src.readlines()
read_comments = sorted(read_comments, key=lambda comment: comment[:4])
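Note that a slice such as detail[:4] is still a string, so a key like this compares entries character by character. To get the 1, 2, 3, ... order asked for in the question, the key has to return a number. A small sketch of the difference, using made-up sample lines:

```python
# Made-up sample lines: id, then a tab, then the rest of the record.
entries = ['100\tc\n', '2\tb\n', '10\td\n', '1\ta\n']

# A slice is still a string, so entries are compared character by character.
by_text = sorted(entries, key=lambda entry: entry[:4])

# Converting the first tab-separated field to int gives numeric order.
by_number = sorted(entries, key=lambda entry: int(entry.split('\t', 1)[0]))

print([e.split('\t')[0] for e in by_text])
print([e.split('\t')[0] for e in by_number])
```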
I'm not entirely sure what you're trying to achieve with the last part. An example of what the comments and details files contain, along with what you want an entry in the destination file to look like, would be useful.
Source: https://stackoverflow.com/questions/49895423/python-sorting-strings-numerically