Question
I have a list that contains a lot of sublists, e.g.:
mylst = [[1, 343, 407, 433, 27],
         [1, 344, 413, 744, 302],
         [1, 344, 500, 600, 100],
         [1, 344, 752, 1114, 363],
         [1, 345, 755, 922, 168],
         [2, 345, 188, 1093, 906],
         [2, 346, 4, 950, 947],
         [2, 346, 953, 995, 43],
         [3, 346, 967, 1084, 118],
         [3, 347, 4, 951, 948],
         [3, 347, 1053, 1086, 34],
         [3, 349, 1049, 1125, 77],
         [3, 349, 1004, 1124, 120],
         [3, 350, 185, 986, 802],
         [3, 352, 1018, 1055, 38]]
I want to categorize this list and build another list in three steps. First, I only want to compare sublists whose first item is the same, e.g. mylist[a][0] == 1. Second, I compare the second items: if the difference between the second item of a sublist and the second item of the following sublist is under 2, I then calculate the difference between their third items and between their fourth items. Third, if either of those two differences is under 10, I want to append the index of the sublist.
The result I want should look like this: [0, 1, 3, 4, 6, 7, 10, 11, 12]
The following are my naive attempts to do this.
def seg(mylist) :
    Segments = []
    for a in range(len(mylist)-1) :
        for index, value in enumerate (mylist) :
            if mylist[a][0] == 1 :
                if abs(mylist[a][1] - mylist[a+1][1]) <= 2 :
                    if (abs(mylist[a][2] - mylist[a+1][2]) <= 10 or
                        abs(mylist[a][3] - mylist[a+1][3]) <= 10) :
                        Segments.append(index)
    return Segments
or
def seg(mylist) :
    Segments= []
    for index, value in enumerate(mylist) :
        for a in range(len(mylist)-1) :
            if mylist[a][0] == 1 :
                try :
                    if abs(mylist[a][1]-mylist[a+1][1]) <= 2 :
                        if (abs(mylist[a][2]-mylist[a+1][2]) <= 10 or
                            abs(mylist[a][3] - mylist[a+1][3]) <= 10) :
                            Segments.append(index)
                except IndexError :
                    if abs(mylist[a][1]-mylist[a+1][1]) <= 2 :
                        if (abs(mylist[a][2]-mylist[a+1][2]) <= 10 or
                            abs(mylist[a][3] - mylist[a+1][3]) <= 10):
                            Segments.append(index)
    return Segments
This code doesn't look nice at all, and the results are not what I intended. In the second version I wrote try and except to handle the IndexError (list index out of range); initially I used a 'while' loop instead of a 'for' loop.
What should I do to get the result I want? How can I rewrite this code in a more 'pythonic' way? Any ideas would be great, and many thanks in advance.
Answer 1:
You will have to catch the duplicate indexes but this should be a lot more efficient:
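gr = []
it = iter(mylst)
# take the first sublist; the loop below then starts at the second one
prev = next(it)
# ind is the index of prev, so ele sits at ind + 1
for ind, ele in enumerate(it):
    # same first item and second items at most 2 apart
    if ele[0] == prev[0] and abs(ele[1] - prev[1]) <= 2:
        # third or fourth items less than 10 apart
        if any(abs(ele[i] - prev[i]) < 10 for i in (2, 3)):
            gr.extend((ind, ind+1))
    prev = ele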
Based on your logic, 6 and 7 should not appear, as they don't meet the criteria (their third items differ by 949 and their fourth items by 45):
[2, 346, 4, 950, 947],
[2, 346, 953, 995, 43],
Also, for 10 to appear the check on the second item should be <= 2, not < 2 as per your description.
You could use an OrderedDict to remove the dupes and keep the order:
from collections import OrderedDict
print(list(OrderedDict.fromkeys(gr)))
[0, 1, 3, 4, 10, 11, 12]
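For reference, here is a minimal self-contained sketch of the same adjacent-pair idea wrapped in a function. The names adjacent_matches, second_tol and span_tol are made up for illustration and are not part of the original answer. It assumes Python 3.7+, where a plain dict keeps insertion order and so can stand in for OrderedDict when dropping duplicates.

```python
mylst = [[1, 343, 407, 433, 27],
         [1, 344, 413, 744, 302],
         [1, 344, 500, 600, 100],
         [1, 344, 752, 1114, 363],
         [1, 345, 755, 922, 168],
         [2, 345, 188, 1093, 906],
         [2, 346, 4, 950, 947],
         [2, 346, 953, 995, 43],
         [3, 346, 967, 1084, 118],
         [3, 347, 4, 951, 948],
         [3, 347, 1053, 1086, 34],
         [3, 349, 1049, 1125, 77],
         [3, 349, 1004, 1124, 120],
         [3, 350, 185, 986, 802],
         [3, 352, 1018, 1055, 38]]


def adjacent_matches(rows, second_tol=2, span_tol=10):
    """Return indexes of neighbouring rows that pass all three checks.

    Hypothetical helper: second_tol is inclusive (<=), span_tol is exclusive (<).
    """
    hits = []
    # zip(rows, rows[1:]) pairs every row with the row that follows it
    for i, (prev, cur) in enumerate(zip(rows, rows[1:])):
        same_group = prev[0] == cur[0]
        close_second = abs(cur[1] - prev[1]) <= second_tol
        close_span = any(abs(cur[k] - prev[k]) < span_tol for k in (2, 3))
        if same_group and close_second and close_span:
            hits.extend((i, i + 1))
    # dict.fromkeys drops duplicates while keeping first-seen order
    return list(dict.fromkeys(hits))


print(adjacent_matches(mylst))                # [0, 1, 3, 4, 10, 11, 12]
print(adjacent_matches(mylst, second_tol=1))  # [0, 1, 3, 4, 11, 12]
```

The second call shows the effect of the threshold on the second item: with integers, second_tol=1 behaves like "< 2", and index 10 drops out because rows 10 and 11 differ by exactly 2.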
Answer 2:
This seems to have worked for me. I'm not sure if it's more Pythonic in any way, though, and you'll be looping through the list multiple times, so there are some things you can definitely do to optimize it further.
def seg(mylist):
    # converted list to set in case there are any duplicates
    segments = set()
    for entry_index in range(len(mylist)):
        for c in range(len(mylist)):
            first = mylist[entry_index]
            comparison = mylist[c]
            # ignore comparing the same items
            if entry_index == c:
                continue
            # ignore cases where the first item does not match
            if first[0] != comparison[0]:
                continue
            # ignore cases where the second item differs by more than 2
            if abs(first[1] - comparison[1]) > 2:
                continue
            # add cases where the third and fourth items differ by less than 10
            if abs(first[2] - comparison[2]) < 10 or abs(first[3] - comparison[3]) < 10:
                segments.add(entry_index)
            elif abs(first[2] - comparison[3]) < 10 or abs(first[3] - comparison[2]) < 10:
                segments.add(entry_index)
    return segments
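As a rough sketch of one possible optimization, the double loop can be replaced by itertools.combinations, which visits each pair only once. The names close, seg_pairs and adjacent_only are invented here for illustration. The crossed comparison (third item against fourth item) is carried over from the function above. The adjacent_only switch is an extra assumption, added to show what happens when only neighbouring sublists are compared.

```python
from itertools import combinations

mylst = [[1, 343, 407, 433, 27],
         [1, 344, 413, 744, 302],
         [1, 344, 500, 600, 100],
         [1, 344, 752, 1114, 363],
         [1, 345, 755, 922, 168],
         [2, 345, 188, 1093, 906],
         [2, 346, 4, 950, 947],
         [2, 346, 953, 995, 43],
         [3, 346, 967, 1084, 118],
         [3, 347, 4, 951, 948],
         [3, 347, 1053, 1086, 34],
         [3, 349, 1049, 1125, 77],
         [3, 349, 1004, 1124, 120],
         [3, 350, 185, 986, 802],
         [3, 352, 1018, 1055, 38]]


def close(a, b, tol=10):
    """True if the third/fourth items are within tol, directly or crossed."""
    direct = abs(a[2] - b[2]) < tol or abs(a[3] - b[3]) < tol
    crossed = abs(a[2] - b[3]) < tol or abs(a[3] - b[2]) < tol
    return direct or crossed


def seg_pairs(rows, adjacent_only=False):
    """Hypothetical variant of seg(): each pair is tested once, both indexes kept."""
    found = set()
    # combinations yields ((i, a), (j, b)) with i < j, so no self-comparison
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if adjacent_only and j != i + 1:
            continue
        if a[0] == b[0] and abs(a[1] - b[1]) <= 2 and close(a, b):
            found.update((i, j))
    # a set has no guaranteed order, so sort before returning
    return sorted(found)


print(seg_pairs(mylst))                      # [0, 1, 3, 4, 6, 7, 8, 10, 11, 12]
print(seg_pairs(mylst, adjacent_only=True))  # [0, 1, 3, 4, 6, 7, 10, 11, 12]
```

On the sample data, comparing every pair also picks up index 8, because rows 8 and 10 have fourth items only 2 apart. Restricting the crossed comparison to neighbouring sublists gives the list from the question, with 6 and 7 matching because the fourth item of row 6 (950) is within 10 of the third item of row 7 (953).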
Source: https://stackoverflow.com/questions/30674108/python-comparison-sublists-and-making-a-list