问题
I have some data frames I am trying to upload to a database. They are lists of values but some of the columns have the string 'null' in them and so this is causing errors.
so I would like to use a function to remove these 'null' strings and am trying to use replace to back fill them below:
df.replace("null", method = bfill)
but it is giving me the error message:
ParserError: Error tokenizing data. C error: Expected 1 fields in line 4, saw 2
I have also tried putting "bfill" instead and it just replaced "null" with the string "bfill."
Any help appreciated.
Thanks.
Sorry should have provided an example:
1 6 11
2 7 12
null null null
4 9 14
5 10 15
回答1:
I think you need replace strings null
to NaN
s and then call bfill
(fillna with method='bfill'
) and if some NaN
s in the end of data add ffill
for forward filling:
df = df.replace("null",np.nan).bfill().ffill()
But your error is obviously in read_csv function, check line 4 - parser need only one value and for some reason there are 2 values.
Sample:
df = pd.DataFrame({'A':['k','null','n','null','null','m'],
'B':['t','null','null','f','null','s'],
'C':['r','t','null','s','null','null']})
print (df)
A B C
0 k t r
1 null null t
2 n null null
3 null f s
4 null null null
5 m s null
print (df.replace("null",np.nan))
A B C
0 k t r
1 NaN NaN t
2 n NaN NaN
3 NaN f s
4 NaN NaN NaN
5 m s NaN
df1 = df.replace("null",np.nan).bfill()
print (df1)
A B C
0 k t r
1 n f t
2 n f s
3 m f s
4 m s NaN
5 m s NaN
#if some `NaN`s in last row is necessary `ffill`
df2 = df.replace("null",np.nan).bfill().ffill()
print (df2)
A B C
0 k t r
1 n f t
2 n f s
3 m f s
4 m s s
5 m s s
回答2:
Borrowing @jezrael's sample data set:
In [11]: df[df.ne('null')].bfill().ffill()
Out[11]:
A B C
0 k t r
1 n f t
2 n f s
3 m f s
4 m s s
5 m s s
来源:https://stackoverflow.com/questions/45649492/replacing-an-unknown-number-in-pandas-data-frame-with-previous-number