Take the following data-frame:
import numpy as np
import pandas as pd

x = np.tile(np.arange(3), 3)
y = np.repeat(np.arange(3), 3)
df = pd.DataFrame({"x": x, "y": y})
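Given a sorted frame df2 (for example df2 = df.sort_values(by=['x', 'y'])), you can set a new index with set_index. It returns a new DataFrame rather than modifying df2 in place, so assign the result:
df2 = df2.set_index(np.arange(len(df2.index)))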
Output:
x y
0 0 0
1 0 1
2 0 2
3 1 0
4 1 1
5 1 2
6 2 0
7 2 1
8 2 2
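A minimal end-to-end sketch of the set_index approach, assuming df2 is the frame obtained by sorting df on both columns. The name relabelled is only illustrative; it is there to show that set_index leaves df2 untouched unless you reassign.

```python
import numpy as np
import pandas as pd

x = np.tile(np.arange(3), 3)
y = np.repeat(np.arange(3), 3)
df = pd.DataFrame({"x": x, "y": y})

# Sorting keeps the original row labels, so the index comes out shuffled.
df2 = df.sort_values(by=["x", "y"])
print(df2.index.tolist())

# set_index returns a new DataFrame; df2 keeps its shuffled labels
# until the result is assigned back.
relabelled = df2.set_index(np.arange(len(df2.index)))
print(relabelled.index.tolist())
```
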
Since pandas 1.0.0, df.sort_values has a new parameter, ignore_index,
which does exactly what you need:
In [1]: df2 = df.sort_values(by=['x', 'y'], ignore_index=True)
In [2]: df2
Out[2]:
x y
0 0 0
1 0 1
2 0 2
3 1 0
4 1 1
5 1 2
6 2 0
7 2 1
8 2 2
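As a rough check, the sketch below compares ignore_index=True with sorting first and then calling reset_index(drop=True). It assumes pandas 1.0.0 or later; the names one_step and two_step are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "x": np.tile(np.arange(3), 3),
    "y": np.repeat(np.arange(3), 3),
})

# One step: sort and relabel the rows 0..n-1 in the same call.
one_step = df.sort_values(by=["x", "y"], ignore_index=True)

# Two steps: sort, then discard the shuffled labels.
two_step = df.sort_values(by=["x", "y"]).reset_index(drop=True)

# Both frames should carry the same labels and the same values.
print(one_step.index.tolist() == two_step.index.tolist())
print((one_step.values == two_step.values).all())
```
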
df.sort() is deprecated (and has since been removed from pandas); use df.sort_values(...) instead: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sort_values.html
Then follow joris' answer by doing df = df.reset_index(drop=True)
You can reset the index using reset_index to get back the default index 0, 1, 2, ..., n-1. Pass drop=True
to discard the existing index instead of adding it as an additional column of your dataframe:
In [19]: df2 = df2.reset_index(drop=True)
In [20]: df2
Out[20]:
x y
0 0 0
1 0 1
2 0 2
3 1 0
4 1 1
5 1 2
6 2 0
7 2 1
8 2 2
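To see what drop=True changes, here is a small sketch on a made-up three-row frame (the data and the names kept and dropped are illustrative, not from the question):

```python
import pandas as pd

df = pd.DataFrame({"x": [2, 0, 1], "y": [5, 3, 4]})
df2 = df.sort_values(by="x")

# Default: the old row labels are preserved as a new column named "index".
kept = df2.reset_index()
print(kept.columns.tolist())

# drop=True: the old row labels are discarded entirely.
dropped = df2.reset_index(drop=True)
print(dropped.columns.tolist())
```

In both cases the resulting frame has the default 0, 1, 2 index; the only difference is whether the previous labels survive as a column.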