I am trying to update the first N rows in a multi-index dataframe but am having a bit of trouble finding a solution, so I thought I'd create a post for it.
The exampl
How about this - first define a function that takes a dataframe, and replaces the first x records with a specified value.
def replace_first_x(group_df, x, value):
    group_df.iloc[:x, :] = value
    return group_df
Then pass that to the groupby object with apply.
In [97]: df.groupby(level=0).apply(lambda df: replace_first_x(df, 2, 9999))
Out[97]:
                               A            B            C            D
CATEGORY DATE
A        2000-01-01  9999.000000  9999.000000  9999.000000  9999.000000
         2000-01-03  9999.000000  9999.000000  9999.000000  9999.000000
         2000-01-05     1.590503     0.948911    -0.268071     0.622280
         2000-01-07    -0.493866     1.222231     0.125037     0.071064
B        2000-01-02  9999.000000  9999.000000  9999.000000  9999.000000
         2000-01-04  9999.000000  9999.000000  9999.000000  9999.000000
         2000-01-06     1.663430    -1.170716     2.044815    -2.081035
         2000-01-08     1.593104     0.108531    -1.381218    -0.517312
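For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample frame is an assumption (the original data is not shown), built only to resemble the output above. Two details are my additions rather than part of the answer: the function copies the group before assigning, and group_keys=False is passed because recent pandas versions may otherwise prepend the group key as an extra index level.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two categories on alternating dates, columns A-D.
dates = pd.date_range("2000-01-01", periods=8)
categories = ["A", "B"] * 4
index = pd.MultiIndex.from_arrays([categories, dates], names=["CATEGORY", "DATE"])
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.standard_normal((8, 4)), index=index, columns=list("ABCD")
).sort_index()


def replace_first_x(group_df, x, value):
    # Copy first so the frame handed in by groupby is not modified in place.
    group_df = group_df.copy()
    group_df.iloc[:x, :] = value
    return group_df


# group_keys=False keeps the original two-level (CATEGORY, DATE) index.
result = df.groupby(level=0, group_keys=False).apply(
    lambda g: replace_first_x(g, 2, 9999)
)
print(result)
```

The original df is left unchanged here; assign the result back (df = result) if you want to overwrite it.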
Typically, whenever you have to change values, rather than just pick them, a lambda function on its own is not enough: a lambda body must be a single expression, so it can select values but cannot contain an assignment statement.
A very boiled down way to proceed is
def replace_first(group):
    group.iloc[0:2] = 99
    return group
and then just do
In[144]: df.groupby(level=0).apply(replace_first)
Out[144]:
                             A          B          C          D
CATEGORY DATE
A        2000-01-01  99.000000  99.000000  99.000000  99.000000
         2000-01-03  99.000000  99.000000  99.000000  99.000000
         2000-01-05   0.458031   1.959409   0.622295   0.959019
         2000-01-07   0.934521  -2.016685   1.046456   1.489070
B        2000-01-02  99.000000  99.000000  99.000000  99.000000
         2000-01-04  99.000000  99.000000  99.000000  99.000000
         2000-01-06  -0.117322  -1.664436   1.582124   0.486796
         2000-01-08  -0.225379   0.794846  -0.021214  -0.510768
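As a possible alternative that avoids apply altogether (this is a different technique, not what either answer above does), you could number the rows within each group using cumcount and assign through a boolean mask. A rough sketch, assuming a hypothetical frame with the same index layout and n = 2:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with the same (CATEGORY, DATE) index layout.
index = pd.MultiIndex.from_product(
    [["A", "B"], pd.date_range("2000-01-01", periods=4)],
    names=["CATEGORY", "DATE"],
)
df = pd.DataFrame(
    np.arange(32, dtype=float).reshape(8, 4), index=index, columns=list("ABCD")
)

n = 2  # number of leading rows to overwrite in each group

# cumcount() numbers the rows inside each group as 0, 1, 2, ...
position_in_group = df.groupby(level="CATEGORY").cumcount()
mask = position_in_group < n

updated = df.copy()
updated.loc[mask] = 99  # overwrite every column of the selected rows
print(updated)
```

Because the mask is an ordinary boolean Series aligned to the index, the same idea works for updating a single column, e.g. updated.loc[mask, "A"] = 99.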