I have the following data structure (a list of lists)
[
['4', '21', '1', '14', '2008-10-24 15:42:58'],
['3', '22', '4', '2somename', '2
For the first question, the first thing you should do is sort the list by the second field using itemgetter from the operator module:
x = [
['4', '21', '1', '14', '2008-10-24 15:42:58'],
['3', '22', '4', '2somename', '2008-10-24 15:22:03'],
['5', '21', '3', '19', '2008-10-24 15:45:45'],
['6', '21', '1', '1somename', '2008-10-24 15:45:49'],
['7', '22', '3', '2somename', '2008-10-24 15:45:51']
]
from operator import itemgetter
x.sort(key=itemgetter(1))
Then you can use itertools' groupby function, which only groups consecutive items that share a key (hence the sort):
from itertools import groupby
y = groupby(x, itemgetter(1))
Now y is an iterator that yields (key, group iterator) tuples. These tuples are easier to show in code than to explain:
for elt, items in groupby(x, itemgetter(1)):
    print(elt, items)
    for i in items:
        print(i)
Which prints:
21 <itertools._grouper object at 0x511a0>
['4', '21', '1', '14', '2008-10-24 15:42:58']
['5', '21', '3', '19', '2008-10-24 15:45:45']
['6', '21', '1', '1somename', '2008-10-24 15:45:49']
22 <itertools._grouper object at 0x51170>
['3', '22', '4', '2somename', '2008-10-24 15:22:03']
['7', '22', '3', '2somename', '2008-10-24 15:45:51']
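If you need the groups after the loop has finished, keep in mind that each group iterator is used up as soon as groupby moves on to the next key, so it has to be copied out first. Here is a minimal sketch, assuming the same data as above; the name `groups` is only for illustration:

```python
from itertools import groupby
from operator import itemgetter

x = [
    ['4', '21', '1', '14', '2008-10-24 15:42:58'],
    ['3', '22', '4', '2somename', '2008-10-24 15:22:03'],
    ['5', '21', '3', '19', '2008-10-24 15:45:45'],
    ['6', '21', '1', '1somename', '2008-10-24 15:45:49'],
    ['7', '22', '3', '2somename', '2008-10-24 15:45:51'],
]

# groupby only merges adjacent rows with equal keys, hence the sort first.
x.sort(key=itemgetter(1))

# Copy each group into a list, because the group iterator is no longer
# usable once groupby advances to the next key.
groups = {key: list(rows) for key, rows in groupby(x, itemgetter(1))}

print(sorted(groups))      # ['21', '22']
print(len(groups['21']))   # 3
print(len(groups['22']))   # 2
```

A dict keyed on the second field is one possible shape for the result; a list of (key, rows) pairs would work just as well if you care about the group order.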
For the second part, use a list comprehension, as other answers here already mention:
from pprint import pprint as pp
pp([y for y in x if y[3] == '2somename'])
Which prints:
[['3', '22', '4', '2somename', '2008-10-24 15:22:03'],
 ['7', '22', '3', '2somename', '2008-10-24 15:45:51']]
If you'll be doing a lot of sorting and filtering, you may like some helper functions.
m = [
['4', '21', '1', '14', '2008-10-24 15:42:58'],
['3', '22', '4', '2somename', '2008-10-24 15:22:03'],
['5', '21', '3', '19', '2008-10-24 15:45:45'],
['6', '21', '1', '1somename', '2008-10-24 15:45:49'],
['7', '22', '3', '2somename', '2008-10-24 15:45:51']
]
# Sort and filter helpers.
sort_on = lambda pos: lambda x: x[pos]
filter_on = lambda pos,val: lambda l: l[pos] == val
# Sort by second column
m = sorted(m, key=sort_on(1))
# Filter on 4th column, where value == '2somename'
# (list() is needed in Python 3, where filter returns a lazy iterator)
m = list(filter(filter_on(3, '2somename'), m))
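To see what the helpers actually return, here is a small self-contained sketch. It assumes Python 3, where filter returns a lazy iterator, so the result is wrapped in list(). The helpers are written with def instead of lambda purely for readability, and the name `rows` is illustrative:

```python
def sort_on(pos):
    """Return a key function that picks column `pos` from a row."""
    return lambda row: row[pos]


def filter_on(pos, val):
    """Return a predicate that is true when column `pos` equals `val`."""
    return lambda row: row[pos] == val


m = [
    ['4', '21', '1', '14', '2008-10-24 15:42:58'],
    ['3', '22', '4', '2somename', '2008-10-24 15:22:03'],
    ['5', '21', '3', '19', '2008-10-24 15:45:45'],
    ['6', '21', '1', '1somename', '2008-10-24 15:45:49'],
    ['7', '22', '3', '2somename', '2008-10-24 15:45:51'],
]

# Sort by the second column, then keep rows whose 4th column is '2somename'.
rows = sorted(m, key=sort_on(1))
rows = list(filter(filter_on(3, '2somename'), rows))

for row in rows:
    print(row)
# ['3', '22', '4', '2somename', '2008-10-24 15:22:03']
# ['7', '22', '3', '2somename', '2008-10-24 15:45:51']
```

Note that the values are strings, so sorting on a column compares them as text rather than as numbers.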