A really simple task in pandas is throwing an error I don't understand. With a simple dataset like this:
test=pd.DataFrame([[1,3],[1,6],[2,4],[3,9],[3,2]],column
You didn't select a column to perform the aggregation on, so it was applied to all the remaining columns, of which there are two. If you select one of the columns, you get the desired result:
In [6]:
newtest['blocks'] = newtest.groupby(['CountyFP','District','state_x'])['BlockID'].transform('count')
newtest
Out[6]:
           BlockID  CountyFP  District  state_x  HD  blocks
0  010010201001000       001      0220       AL   0       3
1  010010201001001       001      0220       AL   0       3
2  010010201001002       001      0220       AL   0       3
3  010010201001003       001      0160       AL   0       2
4  010010201001004       001      0160       AL   0       2
Output of your attempt:
In [9]:
newtest.groupby(['CountyFP','District','state_x']).transform('count')
Out[9]:
   BlockID  HD
0        3   3
1        3   3
2        3   3
3        2   2
4        2   2
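You can see that it generates two columns, because those are the columns left over once the grouping keys are excluded. Assigning that two-column result to the single column `blocks` is presumably what produced the error message you observed.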
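To make the difference concrete, here is a minimal sketch that rebuilds a frame shaped like the one above and compares the two calls. The data is retyped by hand from the output shown and the variable names `keys`, `all_cols` and `blocks` are only illustrative, so treat it as an assumption about your real data rather than a copy of it.

```python
import pandas as pd

# Hypothetical data, retyped from the output shown in this answer
newtest = pd.DataFrame({
    "BlockID": ["010010201001000", "010010201001001", "010010201001002",
                "010010201001003", "010010201001004"],
    "CountyFP": ["001"] * 5,
    "District": ["0220", "0220", "0220", "0160", "0160"],
    "state_x": ["AL"] * 5,
    "HD": [0] * 5,
})

keys = ["CountyFP", "District", "state_x"]

# No column selected: the result is a DataFrame with one column
# per non-key column (here BlockID and HD)
all_cols = newtest.groupby(keys).transform("count")

# One column selected: the result is a Series aligned to the
# original index, which can be assigned to a single new column
blocks = newtest.groupby(keys)["BlockID"].transform("count")
newtest["blocks"] = blocks

print(type(all_cols).__name__, list(all_cols.columns))
print(type(blocks).__name__, blocks.tolist())
```

Checking the type and shape of the right-hand side like this before assigning it is a quick way to spot this kind of mismatch.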