Replacing an unknown number in Pandas data frame with previous number

非 Y 不嫁゛ 提交于 2021-02-16 20:22:25

问题


I have some data frames I am trying to upload to a database. They are lists of values but some of the columns have the string 'null' in them and so this is causing errors.

so I would like to use a function to remove these 'null' strings and am trying to use replace to back fill them below:

df.replace("null", method = bfill)

but it is giving me the error message:

ParserError: Error tokenizing data. C error: Expected 1 fields in line 4, saw 2

I have also tried putting "bfill" instead and it just replaced "null" with the string "bfill."

Any help appreciated.

Thanks.

Sorry should have provided an example:

1     6     11
2     7     12
null  null  null
4     9     14
5     10    15

回答1:


I think you need replace strings null to NaNs and then call bfill (fillna with method='bfill') and if some NaNs in the end of data add ffill for forward filling:

df = df.replace("null",np.nan).bfill().ffill()

But your error is obviously in read_csv function, check line 4 - parser need only one value and for some reason there are 2 values.

Sample:

df = pd.DataFrame({'A':['k','null','n','null','null','m'],
                   'B':['t','null','null','f','null','s'],
                   'C':['r','t','null','s','null','null']})

print (df)
      A     B     C
0     k     t     r
1  null  null     t
2     n  null  null
3  null     f     s
4  null  null  null
5     m     s  null

print (df.replace("null",np.nan))
     A    B    C
0    k    t    r
1  NaN  NaN    t
2    n  NaN  NaN
3  NaN    f    s
4  NaN  NaN  NaN
5    m    s  NaN
df1 = df.replace("null",np.nan).bfill()
print (df1)
   A  B    C
0  k  t    r
1  n  f    t
2  n  f    s
3  m  f    s
4  m  s  NaN
5  m  s  NaN

#if some `NaN`s in last row is necessary `ffill`
df2 = df.replace("null",np.nan).bfill().ffill()
print (df2)
   A  B  C
0  k  t  r
1  n  f  t
2  n  f  s
3  m  f  s
4  m  s  s
5  m  s  s



回答2:


Borrowing @jezrael's sample data set:

In [11]: df[df.ne('null')].bfill().ffill()
Out[11]:
   A  B  C
0  k  t  r
1  n  f  t
2  n  f  s
3  m  f  s
4  m  s  s
5  m  s  s


来源:https://stackoverflow.com/questions/45649492/replacing-an-unknown-number-in-pandas-data-frame-with-previous-number

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!