I have a list of tuples like this:
[
(379146591, 'it', 55, 1, 1, 'NON ENTRARE', 'NonEntrate', 55, 1),
(4746004, 'it', 28, 2, 2, 'NON ENTRARE', 'NonEntrate
Use itertools.groupby (with operator.itemgetter to pick out the first item as the key). The one catch is that groupby only groups consecutive items, so your data needs to be sorted already so that the members of each group sit next to each other (the same idea as running sort before uniq in bash). You can use sorted() for this.
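To see why the sort matters, here is a small sketch with made-up rows (the `rows` list and the variable names are illustrative, not from the question). Since `groupby` only merges neighbouring items with equal keys, an unsorted input can yield the same key more than once:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows; only the first field (the key) matters here.
rows = [(2, "a"), (1, "b"), (2, "c")]
first = itemgetter(0)

# Without sorting, key 2 shows up as two separate groups.
unsorted_keys = [k for k, _ in groupby(rows, key=first)]
print(unsorted_keys)  # [2, 1, 2]

# After sorting by the same key, each key appears exactly once.
sorted_keys = [k for k, _ in groupby(sorted(rows, key=first), key=first)]
print(sorted_keys)  # [1, 2]
```

Applied to the data from the question, the full version looks like this: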
import operator
from itertools import groupby
data = [
(379146591, "it", 55, 1, 1, "NON ENTRARE", "NonEntrate", 55, 1),
(4746004, "it", 28, 2, 2, "NON ENTRARE", "NonEntrate", 26, 2),
(4746004, "it", 28, 2, 2, "TheBestTroll Group", "TheBestTrollGroup", 2, 3),
]
data = sorted(data, key=operator.itemgetter(0))  # can be skipped if the data is already sorted by the first item
for k, g in groupby(data, operator.itemgetter(0)):
    print(k, list(g))
This will output:
4746004 [(4746004, 'it', 28, 2, 2, 'NON ENTRARE', 'NonEntrate', 26, 2), (4746004, 'it', 28, 2, 2, 'TheBestTroll Group', 'TheBestTrollGroup', 2, 3)]
379146591 [(379146591, 'it', 55, 1, 1, 'NON ENTRARE', 'NonEntrate', 55, 1)]
In your case, you also need to remove the first element from your lists of values. Change the last two lines of the above to:
for k, g in groupby(data, operator.itemgetter(0)):
    print(k, [item[1:] for item in g])
Output:
4746004 [('it', 28, 2, 2, 'NON ENTRARE', 'NonEntrate', 26, 2), ('it', 28, 2, 2, 'TheBestTroll Group', 'TheBestTrollGroup', 2, 3)]
379146591 [('it', 55, 1, 1, 'NON ENTRARE', 'NonEntrate', 55, 1)]
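If you want the result as a dictionary rather than printed lines, the same loop can feed a dict comprehension. This is only a sketch: it assumes the goal is a mapping from the first element to the list of remaining fields, and the names `first` and `grouped` are made up for illustration.

```python
import operator
from itertools import groupby

data = [
    (379146591, "it", 55, 1, 1, "NON ENTRARE", "NonEntrate", 55, 1),
    (4746004, "it", 28, 2, 2, "NON ENTRARE", "NonEntrate", 26, 2),
    (4746004, "it", 28, 2, 2, "TheBestTroll Group", "TheBestTrollGroup", 2, 3),
]

first = operator.itemgetter(0)

# Sort by the key first, then map each key to its rows minus the key itself.
grouped = {
    k: [item[1:] for item in g]
    for k, g in groupby(sorted(data, key=first), key=first)
}

print(len(grouped[4746004]))  # 2
print(grouped[379146591])     # [('it', 55, 1, 1, 'NON ENTRARE', 'NonEntrate', 55, 1)]
```

Building the dict in one pass means each group iterator is consumed right away, which matters because groupby invalidates a group as soon as you advance to the next key.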