I am trying to create a new variable that counts how many times the same id has been seen over time.
I need to go from this dataframe:
id clae6 y
You can use cumcount. It numbers the rows within each id group starting from 0, so add 1 to start the count at 1:
df.groupby('id').cumcount().add(1)
Out[1574]:
0 1
1 2
2 3
3 4
4 5
5 6
6 1
7 2
8 3
9 4
10 5
11 1
12 2
13 3
14 4
dtype: int64
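To try this end to end, here is a minimal sketch that rebuilds the frame from the values printed in this thread and stores the counter in a column. The column name new is just the one used in the other answer; everything else is plain pandas.

```python
import pandas as pd

# Rebuild the example frame from the values shown in this thread.
df = pd.DataFrame({
    "id": [1] * 6 + [2] * 5 + [3] * 4,
    "clae6": [475230.0] * 15,
    "year": [2007] * 4 + [2008] * 2 + [2007] * 4 + [2008] + [2010] * 4,
    "quarter": [1, 2, 3, 4, 1, 2, 1, 2, 3, 4, 1, 1, 2, 3, 4],
})

# cumcount() numbers rows within each id from 0, in their current order;
# add(1) shifts the counter so the first sighting is 1.
df["new"] = df.groupby("id").cumcount().add(1)

print(df)
```

Note that cumcount follows the current row order, so the counter only means "over time" if the rows are already sorted chronologically within each id.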
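You can also use rank on a single column such as year. With method='first', ties are broken by order of appearance, so this gives a running count as long as the rows are already in chronological order within each id:
df['new'] = df.groupby('id')['year'].rank(method = 'first').astype(int)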
id clae6 year quarter new
0 1 475230.0 2007 1 1
1 1 475230.0 2007 2 2
2 1 475230.0 2007 3 3
3 1 475230.0 2007 4 4
4 1 475230.0 2008 1 5
5 1 475230.0 2008 2 6
6 2 475230.0 2007 1 1
7 2 475230.0 2007 2 2
8 2 475230.0 2007 3 3
9 2 475230.0 2007 4 4
10 2 475230.0 2008 1 5
11 3 475230.0 2010 1 1
12 3 475230.0 2010 2 2
13 3 475230.0 2010 3 3
14 3 475230.0 2010 4 4
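Both approaches depend on how the rows are ordered, which matters if the data is not already sorted by time. The sketch below uses a small, deliberately shuffled frame (illustrative values, not the original data) and shows two ways to get a chronological counter anyway. The names period, seen_cumcount and seen_rank are made up for this example.

```python
import pandas as pd

# Small shuffled sample; values are illustrative only.
df = pd.DataFrame({
    "id": [2, 1, 1, 2, 1],
    "year": [2008, 2007, 2008, 2007, 2007],
    "quarter": [1, 2, 1, 4, 1],
})

# Option 1: sort chronologically, then number rows within each id.
# The assignment aligns on the index, so df keeps its original row order.
ordered = df.sort_values(["id", "year", "quarter"])
df["seen_cumcount"] = ordered.groupby("id").cumcount().add(1)

# Option 2: rank one chronological key within each id.
# year * 4 + quarter is unique per (year, quarter) and increases with time.
period = df["year"] * 4 + df["quarter"]
df["seen_rank"] = period.groupby(df["id"]).rank(method="first").astype(int)

print(df)
```

If the rows are already sorted by id, year and quarter, the plain cumcount version is enough.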