I have a CSV file that goes something like this:
[\'Name1\', \'\', \'\', \'\', \'\', \'\', \'\', \'\', \'\', \'\', \'\', \'\', \'\', \'\', \'\', \'\',
You should use itertools.groupby:
t = [
['Name1', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '+'],
['Name1', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', ''],
['Name2', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'a', ''],
['Name3', '', '', '', '', '+', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
]
from itertools import groupby
# TODO: if you need to speed things up you can use operator.itemgetter
# for both sorting and grouping
for name, rows in groupby(sorted(t), lambda x:x[0]):
print join_rows(rows)
It's obvious that you'd implement the merging in a separate function. For example like this:
def join_rows(rows):
def join_tuple(tup):
for x in tup:
if x:
return x
else:
return ''
return [join_tuple(x) for x in zip(*rows)]
You can also use defaultdict
:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> _ = [d[i[0]].append(z) for i in t for z in i[1:]]
>>> d['Name1']
['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '+', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
Then do your column joining
def merge_rows(row1, row2):
# merge two rows with the same name
merged_row = ...
return merged_row
r1 = ['Name1', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '+']
r2 = ['Name1', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
r3 = ['Name2', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'a', '']
r4 = ['Name3', '', '', '', '', '+', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
rows = [r1, r2, r3, r4]
data = {}
for row in rows:
name = row[0]
if name in data:
data[name] = merge_rows(row, data[name])
else:
data[name] = row
You now have all the rows in data
where each key of this dictionary is the name and the corresponding value is that row. You can now write this data to a CSV file.