Question
I'm trying to add column headers with empty values to my dataframe (just like this answer), but within a function that is already modifying it, like so:
import numpy as np
import pandas as pd

mydf = pd.DataFrame()

def myfunc(df):
    df['newcol1'] = np.nan  # this works
    list_of_newcols = ['newcol2', 'newcol3']
    df = df.reindex(columns=df.columns.tolist() + list_of_newcols)  # this does not
    return

myfunc(mydf)
If I run the lines individually in an IPython console, the columns are added. But when run as a script, newcol1 is added while newcol2 and newcol3 are not. Setting copy=False does not work either. What am I doing wrong here?
Answer 1:
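Pandas df.reindex() produces a new object unless the indexes are equivalent. Assigning that result to df inside the function only rebinds the function's local name; the caller's mydf still points to the original DataFrame. You therefore need to return the new object from your function and assign it in the caller.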
def myfunc(df):
    df['newcol1'] = np.nan  # modifies the caller's DataFrame in place
    list_of_newcols = ['newcol2', 'newcol3']
    df = df.reindex(columns=df.columns.tolist() + list_of_newcols)  # creates a new DataFrame
    return df

mydf = myfunc(mydf)
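The difference between rebinding a local name and returning the new object can be checked directly. The following is only an illustrative sketch: the function names rebind_only and rebind_and_return are made up for this example and do not come from the question, and it assumes numpy and pandas are installed.

```python
import numpy as np
import pandas as pd


def rebind_only(df):
    # reindex builds a new DataFrame; this assignment only rebinds the
    # local name df, so the caller's object is left unchanged.
    df = df.reindex(columns=df.columns.tolist() + ["newcol2", "newcol3"])


def rebind_and_return(df):
    # Same reindex call, but the new object is handed back to the caller.
    df = df.reindex(columns=df.columns.tolist() + ["newcol2", "newcol3"])
    return df


original = pd.DataFrame({"newcol1": [np.nan, np.nan]})

rebind_only(original)
cols_after_rebind = original.columns.tolist()  # still only newcol1

result = rebind_and_return(original)
cols_returned = result.columns.tolist()        # all three columns
is_new_object = result is not original         # reindex did not modify original
```

Checking `result is not original` makes the behaviour visible: the reindexed frame is a separate object, which is why the caller has to capture the return value.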
Answer 2:
Not sure if this is a mistake in your actual code or a typo made while posting it here, but tolist() is a method, so you must include the parentheses.
df = df.reindex(columns=df.columns.tolist() + list_of_newcols)
Answer 3:
You don't need to set NaN values and then specify the new column labels again. You can reindex with an arbitrary list of strings; NaN is the default value where data is not specified.
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})
df = df.reindex(columns=['A', 'B', 'C'])
print(df)

   A   B   C
0  1 NaN NaN
1  2 NaN NaN
2  3 NaN NaN
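If the caller's DataFrame has to be modified in place rather than replaced, one possible alternative (not taken from the answers above, just a sketch) is to assign each missing column individually, since column assignment changes the existing object instead of creating a new one. The helper name add_empty_columns is hypothetical.

```python
import numpy as np
import pandas as pd


def add_empty_columns(df, new_cols):
    # Column assignment mutates the DataFrame passed in, so nothing
    # needs to be returned. Existing columns are left untouched.
    for col in new_cols:
        if col not in df.columns:
            df[col] = np.nan


mydf = pd.DataFrame({"A": [1, 2, 3]})
add_empty_columns(mydf, ["B", "C"])
columns_after = mydf.columns.tolist()
```

This trades the single reindex call for a loop, so returning the reindexed frame as in the first answer is usually the simpler choice when the caller can reassign the variable.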
Source: https://stackoverflow.com/questions/54220501/how-to-reindex-a-pandas-dataframe-within-a-function