Identifying lists that have 3 elements in common in a list of lists


I have a list of lists. If there are sublists that have the first three elements in common, merge them into one list and add up all the fourth elements.

The problem i


3 Answers

眼角桃花

2021-01-16 09:28

I'd do something like this:

>>> a_list = [['apple', 50, 60, 7],
...           ['orange', 70, 50, 8],
...           ['apple', 50, 60, 12]]
>>> 
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> from operator import itemgetter
>>> getter = itemgetter(0,1,2)
>>> for lst in a_list:
...     d[getter(lst)].extend(lst[3:])
... 
>>> d
defaultdict(<type 'list'>, {('apple', 50, 60): [7, 12], ('orange', 70, 50): [8]})
>>> print [list(k)+v for k,v in d.items()]
[['apple', 50, 60, 7, 12], ['orange', 70, 50, 8]]

This doesn't give the sum, however. That can easily be fixed by doing:

print [list(k)+[sum(v)] for k,v in d.items()]

There isn't much of a reason to prefer this over the slightly more elegant solution by Martijn, other than that it allows the input list to contain sublists with more than 4 items (with every element after the third being summed, as expected). In other words, this would accept the list:

a_list = [['apple', 50, 60, 7, 12],
          ['orange', 70, 50, 8]]

as well.
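
For readers on Python 3, the same idea could be sketched as follows (the session above uses Python 2 `print` syntax). The function name `merge_rows` and the extra third row are assumptions made for illustration, not part of the original answer.

```python
from collections import defaultdict
from operator import itemgetter


def merge_rows(rows):
    """Group rows by their first three items and sum everything after them."""
    first_three = itemgetter(0, 1, 2)
    grouped = defaultdict(list)
    for row in rows:
        # Collect every trailing value, however many the row carries.
        grouped[first_three(row)].extend(row[3:])
    # Rebuild one list per key, with the collected values summed.
    return [list(key) + [sum(values)] for key, values in grouped.items()]


a_list = [['apple', 50, 60, 7, 12],
          ['orange', 70, 50, 8],
          ['apple', 50, 60, 1]]  # hypothetical extra row

print(merge_rows(a_list))
```

On Python 3.7 or later, where dicts keep insertion order, the groups come out in the order their keys first appear in the input.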


我在风中等你

2021-01-16 09:31

Form the key from [:3] so that you get the first 3 elements.
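
A minimal sketch of that suggestion, assuming each sublist has exactly four items and using a plain dict (the name `totals` is made up for this example):

```python
a_list = [['apple', 50, 60, 7],
          ['orange', 70, 50, 8],
          ['apple', 50, 60, 12]]

totals = {}
for row in a_list:
    # A list is unhashable, so convert the slice to a tuple
    # before using it as a dictionary key.
    key = tuple(row[:3])
    totals[key] = totals.get(key, 0) + row[3]

print(totals)
```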

忘掉有多难

2021-01-16 09:35
You can use the same principle, by using the first three elements as a key, and using int as the default value factory for the defaultdict (so you get 0 as the initial value):
```
from collections import defaultdict

a_list = [['apple', 50, 60, 7],
          ['orange', 70, 50, 8],
          ['apple', 50, 60, 12]]

d = defaultdict(int)
for sub_list in a_list:
    key = tuple(sub_list[:3])
    d[key] += sub_list[-1]

new_data = [list(k) + [v] for k, v in d.iteritems()]
```
If you are using Python 3, you can simplify this to:
```
d = defaultdict(int)
for *key, v in a_list:
    d[tuple(key)] += v

new_data = [list(k) + [v] for k, v in d.items()]
```
This works because a starred target collects all 'remaining' values from a list: each sublist's leading values are assigned to key and the last value is assigned to v, making the loop just that little simpler. (There is no .iteritems() method on a dict in Python 3; .items() already returns a lightweight view instead of building a list.)
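
To see the starred target in isolation, here is a small Python 3 illustration using one row from the example data:

```python
row = ['apple', 50, 60, 7]

# The starred name collects everything except the final item.
*key, v = row
print(key)  # ['apple', 50, 60]
print(v)    # 7

# A starred target always produces a list, so it has to be converted
# to a tuple before it can serve as a dictionary key.
print(tuple(key))  # ('apple', 50, 60)
```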

So, we use a defaultdict that uses 0 as the default value, then for each key generated from the first 3 values (as a tuple so you can use it as a dictionary key) sum the last value.
- So for the first item ['apple', 50, 60, 7] we create a key ('apple', 50, 60), look that up in d (where it doesn't exist, but defaultdict will then use int() to create a new value of 0), and add the 7 from that first item.
- Do the same for the ('orange', 70, 50) key and value 8.
- For the 3rd item we get the ('apple', 50, 60) key again and add 12 to the pre-existing 7 in d[('apple', 50, 60)], for a total of 19.
Then we turn the (key, value) pairs back into lists and you are done. This results in:
```
>>> new_data
[['apple', 50, 60, 19], ['orange', 70, 50, 8]]
```
An alternative implementation that requires sorting the data uses itertools.groupby:
```
from itertools import groupby
from operator import itemgetter

a_list = [['apple', 50, 60, 7],
          ['orange', 70, 50, 8],
          ['apple', 50, 60, 12]]

newlist = [list(key) + [sum(i[-1] for i in sublists)] 
    for key, sublists in groupby(sorted(a_list), key=itemgetter(0, 1, 2))]
```
This produces the same output. Because of the sort, it is going to be slower than the defaultdict approach unless your data is already sorted, but it's good to know of different approaches.
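
The sort matters because groupby only merges consecutive items that share a key. A small sketch of the difference, where the helper name `merge` is an assumption for illustration and the sort uses only the first three items as its key:

```python
from itertools import groupby
from operator import itemgetter

a_list = [['apple', 50, 60, 7],
          ['orange', 70, 50, 8],
          ['apple', 50, 60, 12]]

first_three = itemgetter(0, 1, 2)


def merge(rows):
    # groupby starts a new group every time the key changes.
    return [list(key) + [sum(row[-1] for row in group)]
            for key, group in groupby(rows, key=first_three)]


# Without sorting, the two 'apple' rows are not adjacent and stay separate.
unsorted_result = merge(a_list)

# Sorting by the same key first brings equal keys together.
sorted_result = merge(sorted(a_list, key=first_three))

print(unsorted_result)
print(sorted_result)
```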