I have the following dataframe:
   fsq  digits digits_type
0    1       1         odd
1    2       1         odd
2    3       1         odd
3   11       2        even
4   22       2        even
5  101       3         odd
6  111       3         odd
In [395]: df['count'] = df.groupby('digits')['fsq'].transform(len)
In [396]: df
Out[396]:
   fsq  digits digits_type  count
0    1       1         odd      3
1    2       1         odd      3
2    3       1         odd      3
3   11       2        even      2
4   22       2        even      2
5  101       3         odd      2
6  111       3         odd      2
[7 rows x 4 columns]
In general, you should use pandas-defined methods where possible; they are often more efficient than passing a Python function such as len.
In this case you can use 'size' with transform, in the same vein as df.groupby('digits')['fsq'].size():
df = pd.concat([df]*10000)
%timeit df.groupby('digits')['fsq'].transform('size') # 3.44 ms per loop
%timeit df.groupby('digits')['fsq'].transform(len) # 11.6 ms per loop
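For reference, here is a minimal self-contained sketch of the two approaches on the question's data. The frame is rebuilt by hand from the output shown above, and the variable names `grouped`, `per_group` and `count_via_len` are just illustrative. It is only meant to show that `transform('size')` and `transform(len)` give the same column, and how both differ from a plain `.size()`; timings will vary with your machine and pandas version.

```python
import pandas as pd

# Rebuild the example frame from the question (values copied from the output above).
df = pd.DataFrame({
    "fsq": [1, 2, 3, 11, 22, 101, 111],
    "digits": [1, 1, 1, 2, 2, 3, 3],
    "digits_type": ["odd", "odd", "odd", "even", "even", "odd", "odd"],
})

grouped = df.groupby("digits")["fsq"]

# size() aggregates: one value per group, indexed by the group key.
per_group = grouped.size()

# transform('size') broadcasts each group's size back onto its rows,
# so the result is aligned with df's index and can be assigned as a column.
df["count"] = grouped.transform("size")

# transform(len) computes the same values by calling Python's len per group.
count_via_len = grouped.transform(len)

print(per_group)
print(df)
```

Here `per_group` has one entry per distinct value of `digits`, while `df["count"]` has one entry per row, which is why `transform` is the right tool for adding a count column.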