I have a list
[[0.5, 2], [0.5, 5], [2, 3], [2, 6], [2, 0.6], [7, 1]]
I require the output from summing the second element in each sublist f
Will this work?
L = [[0.5, 2], [0.5, 5], [2, 3], [2, 6], [2, 0.6], [7, 1]]
nums = []  # keys in first-seen order
d = {}     # key -> list of values seen for that key
for lst in L:
    if lst[0] not in d:
        d[lst[0]] = []
        nums.append(lst[0])
    d[lst[0]].append(lst[1])
for key in nums:
    print([key, sum(d[key])])
Output:
[0.5, 7]
[2, 9.6]
[7, 1]
Using Pandas, you can retain the original 'order' of the data:
import pandas as pd

pairs = [[0.5, 2], [0.5, 5], [2, 3], [2, 6], [2, 0.6], [7, 1]]
df = pd.DataFrame(pairs)
>>> df.groupby(0, sort=False, as_index=False).sum().values.tolist()
[[0.5, 7.0], [2.0, 9.6], [7.0, 1.0]]
Accumulate with a defaultdict:
>>> from collections import defaultdict
>>> data = defaultdict(int)
>>> L = [[0.5, 2], [0.5, 5], [2, 3], [2, 6], [2, 0.6], [7, 1]]
>>> for k, v in L:
...     data[k] += v
...
>>> [[k,v] for (k,v) in data.items()]
[[0.5, 7], [2, 9.6], [7, 1]]
Note that the value for 2 was automatically "promoted" to a float by addition, even though this is a defaultdict of int. This matches the desired output posted in the question, but I think you should consider using homogeneous output types rather than a mix of int and float.
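If homogeneous totals are preferred, one possible sketch is to start every total at 0.0 by using defaultdict(float). The function name sum_by_key is made up for illustration, and the first-seen ordering relies on dicts preserving insertion order (Python 3.7+):

```python
from collections import defaultdict

def sum_by_key(pairs):
    """Sum the second element of each pair, grouped by the first element."""
    totals = defaultdict(float)  # missing keys start at 0.0, so every total is a float
    for key, value in pairs:
        totals[key] += value
    # Assumes Python 3.7+, where dicts keep insertion (first-seen) order.
    return [[key, total] for key, total in totals.items()]

L = [[0.5, 2], [0.5, 5], [2, 3], [2, 6], [2, 0.6], [7, 1]]
print(sum_by_key(L))
```

Only the totals become floats here; the keys keep whatever type they had in the input.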
You can get away with sorting and itertools.groupby:
from operator import itemgetter
from itertools import groupby
data = [[0.5, 2], [0.5, 5], [2, 3], [2, 6], [2, 0.6], [7, 1]]
key = itemgetter(0)
data.sort(key=key)  # Use data = sorted(data, key=key) to avoid clobbering the original list
# Each group yields whole [key, value] sublists, so sum only the second elements
result = [[k, sum(v for _, v in group)] for k, group in groupby(data, key)]
This will not preserve the original order of the keys.
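If the first-seen order matters, one way to recover it with the groupby approach is to record where each key first appears and re-sort the grouped result by that position. This is only a sketch: the input below is a reshuffled copy of the question's list, made up so that the order difference is visible, and first_seen and grouped are illustrative names.

```python
from itertools import groupby
from operator import itemgetter

# Reordered copy of the question's data (hypothetical, for illustration only).
data = [[7, 1], [0.5, 2], [2, 3], [0.5, 5], [2, 6], [2, 0.6]]

# Remember the index at which each key first appears.
first_seen = {}
for index, (k, _) in enumerate(data):
    first_seen.setdefault(k, index)

key = itemgetter(0)
# sorted() leaves the original list untouched; groupby needs sorted input.
grouped = [[k, sum(v for _, v in group)]
           for k, group in groupby(sorted(data, key=key), key)]

# Put the groups back into first-seen order.
grouped.sort(key=lambda pair: first_seen[pair[0]])
print(grouped)
```

For plain order-preserving sums, the dictionary-based answers are simpler; this only shows that sorting does not have to cost you the original key order.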