I'd like a way to summarise a database table so that rows sharing a common ID are combined into one row of output.
My tools are SQLite and Python 2.x.
For
The pandas package can handle this very nicely.
>>> import pandas
>>> df = pandas.DataFrame(data, columns=['Fruit', 'Shop', 'Price'])
>>> df.pivot(index='Fruit', columns='Shop', values='Price')
Shop        Coles   IGA  Woolworths
Fruit
Apple         1.5   1.7         1.6
Banana        0.5   0.7         0.6
Cherry        5.0   NaN         NaN
Date          2.0   NaN         2.1
Elderberry    NaN  10.0         NaN
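The `data` passed to `pandas.DataFrame` above is assumed to be a list of (fruit, shop, price) tuples. Since the question mentions SQLite, here is a minimal sketch of fetching such rows with the standard `sqlite3` module. The table name `prices`, its column names, and the helper `fetch_rows` are hypothetical, since the original post does not show the schema.

```python
import sqlite3

def fetch_rows(conn):
    """Return every (fruit, shop, price) row as a list of tuples."""
    # Hypothetical schema: prices(fruit TEXT, shop TEXT, price REAL)
    cur = conn.execute(
        "SELECT fruit, shop, price FROM prices ORDER BY fruit, shop")
    return cur.fetchall()

# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (fruit TEXT, shop TEXT, price REAL)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [("Apple", "Coles", 1.50),
     ("Apple", "IGA", 1.70),
     ("Cherry", "Coles", 5.00),
     ("Elderberry", "IGA", 10.00)])

data = fetch_rows(conn)
```

The resulting list can be handed to `pandas.DataFrame(data, columns=[...])` as in the session above.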
The documentation: http://pandas.pydata.org/pandas-docs/stable/reshaping.html
Some IPython Notebooks to learn pandas: https://bitbucket.org/hrojas/learn-pandas
Hope that will help.
Regards
Patrick Brockmann
On the Python side, you could use some itertools magic to rearrange your data:
data = [('Apple', 'Coles', 1.50),
        ('Apple', 'Woolworths', 1.60),
        ('Apple', 'IGA', 1.70),
        ('Banana', 'Coles', 0.50),
        ('Banana', 'Woolworths', 0.60),
        ('Banana', 'IGA', 0.70),
        ('Cherry', 'Coles', 5.00),
        ('Date', 'Coles', 2.00),
        ('Date', 'Woolworths', 2.10),
        ('Elderberry', 'IGA', 10.00)]
from itertools import groupby, islice
from operator import itemgetter
from collections import defaultdict
stores = sorted(set(row[1] for row in data))
# group the sorted rows by fruit; for each group, map shop -> price,
# with None as the default for shops that do not stock the fruit
pivot = ((fruit, defaultdict(lambda: None,
                             (islice(row, 1, None) for row in rows)))
         for fruit, rows in groupby(sorted(data), itemgetter(0)))
print 'Fruit'.ljust(12), '\t'.join(stores)
for fruit, prices in pivot:
    print fruit.ljust(12), '\t'.join(str(prices[s]) for s in stores)
Output:
Fruit        Coles      IGA     Woolworths
Apple        1.5        1.7     1.6
Banana       0.5        0.7     0.6
Cherry       5.0        None    None
Date         2.0        None    2.1
Elderberry   None       10.0    None
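If the generator expression above feels too dense, the same rearrangement can be sketched with plain dictionaries. This is an alternative to the groupby approach, not a line-for-line rewrite of it. The helper name `pivot_rows` is made up for illustration, and each `print` takes a single string so that the code should run under both Python 2 and Python 3.

```python
def pivot_rows(rows):
    """Turn (fruit, shop, price) tuples into (shops, {fruit: {shop: price}})."""
    shops = sorted(set(shop for _, shop, _ in rows))
    table = {}
    for fruit, shop, price in rows:
        # create the per-fruit dict on first sight, then record the price
        table.setdefault(fruit, {})[shop] = price
    return shops, table

rows = [('Apple', 'Coles', 1.50),
        ('Apple', 'Woolworths', 1.60),
        ('Apple', 'IGA', 1.70),
        ('Cherry', 'Coles', 5.00),
        ('Date', 'Coles', 2.00),
        ('Date', 'Woolworths', 2.10),
        ('Elderberry', 'IGA', 10.00)]

shops, table = pivot_rows(rows)

print('Fruit'.ljust(12) + ' ' + '\t'.join(shops))
for fruit in sorted(table):
    # dict.get returns None for shops that do not sell this fruit
    cells = [str(table[fruit].get(shop)) for shop in shops]
    print(fruit.ljust(12) + ' ' + '\t'.join(cells))
```

Unlike groupby, this version does not need the input sorted first, at the cost of holding the whole table in memory.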