I stumbled across pandas and it looks ideal for simple calculations that I'd like to do. I have a SAS background and was thinking it'd replace proc freq -- it looks like it'l
Assuming that you have a file called 2010.csv with these contents:
category,value
AB,100.00
AB,200.00
AC,150.00
AD,500.00
Then, using the ability to apply multiple aggregation functions following a groupby, you can say:
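import pandas
# Load the CSV into a DataFrame (adjust the path to wherever 2010.csv lives)
data_2010 = pandas.read_csv("/path/to/2010.csv")
# Group rows by category, then compute the row count (len) and the total (sum) per group
data_2010.groupby("category").agg([len, sum])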
You should get a result that looks something like:
          value
            len  sum
category
AB            2  300
AC            1  150
AD            1  500
Note that Wes (McKinney, the author of pandas) will likely come by to point out that sum is optimized and that you should probably use np.sum (from numpy, imported as np).
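If you want to try the groupby-then-aggregate idea without a file on disk, here is a minimal sketch that reads the same rows from an in-memory string. The names CSV_TEXT and summarize are just mine for illustration, and I'm assuming a reasonably recent pandas that accepts aggregation names as strings ("count" and "sum") in place of the len and sum callables. Note that "count" counts non-missing values, which matches len only when the column has no missing entries, as is the case here.

```python
import io

import pandas as pd

# Same rows as the 2010.csv example, held in memory so the sketch is self-contained.
CSV_TEXT = """category,value
AB,100.00
AB,200.00
AC,150.00
AD,500.00
"""


def summarize(csv_text: str) -> pd.DataFrame:
    """Return the per-category count and total of the value column."""
    frame = pd.read_csv(io.StringIO(csv_text))
    # Selecting the "value" column first gives flat result columns
    # ("count", "sum") instead of a two-level column header.
    return frame.groupby("category")["value"].agg(["count", "sum"])


summary = summarize(CSV_TEXT)
print(summary)
```

Selecting the column before aggregating is a matter of taste; leaving it out, as in the answer above, aggregates every non-grouping column and gives you the two-level header shown in the output.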