Error: float object has no attribute notnull

前端未结


I have a dataframe:

  a     b     c
0 nan   Y     nan
1  23   N      3
2 nan   N      2
3  44   Y     nan

I wish to have this output:

4 answers

无人共我

2021-02-18 19:43

You don't need apply; use np.where:

df['d'] = np.where(df.a.isnull(),
         np.nan,
         np.where((df.b == "N")&(~df.c.isnull()),
                  df.a*df.c,
                  df.a))

Output:

      a  b    c     d
0   NaN  Y  NaN   NaN
1  23.0  N  3.0  69.0
2   NaN  N  2.0   NaN
3  44.0  Y  NaN  44.0
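
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, so the exact dtypes are an assumption; the nested np.where logic is the same as above.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the table in the question (dtypes assumed).
df = pd.DataFrame({
    "a": [np.nan, 23, np.nan, 44],
    "b": ["Y", "N", "N", "Y"],
    "c": [np.nan, 3, 2, np.nan],
})

# Outer np.where: keep NaN wherever a is missing.
# Inner np.where: multiply only when b == "N" and c is present.
df["d"] = np.where(
    df["a"].isnull(),
    np.nan,
    np.where((df["b"] == "N") & df["c"].notnull(), df["a"] * df["c"], df["a"]),
)
print(df)
```

Both conditions are evaluated on whole columns at once, which is why no row-wise apply is needed.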


梦如初夏

2021-02-18 19:46
Use pd.isnull, which also works on a single value:
```
pd.isnull(df['Description'][i])
```

无人共我

2021-02-18 19:51

You can try

df['d'] = np.where((df.b == 'N') & (pd.notnull(df.c)), df.a*df.c, np.where(pd.notnull(df.a), df.a, np.nan))


    a       b   c      d
0   NaN     Y   NaN    NaN
1   23.0    N   3.0    69.0
2   NaN     N   2.0    NaN
3   44.0    Y   NaN    44.0

See the documentation for pandas notnull. In your current code, you just need to change series.notnull() to pd.notnull(series) for it to work, because inside apply each value is a plain float rather than a Series. That said, np.where should be more efficient.

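def f4(row):
    # NaN never compares equal to itself, so test with pd.isnull instead of ==
    if pd.isnull(row['a']):
        return np.nan
    # row values are scalars here, so use pd.notnull(...) and a plain `and`
    elif row['b'] == "N" and pd.notnull(row['c']):
        return row['a'] * row['c']
    else:
        return row['a']
df['d'] = df.apply(f4, axis=1)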
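
The question's original code is not shown here, so the following is only a sketch of what presumably triggers the error in the title: with axis=1, each value pulled out of a row is a scalar float, and a float has no .notnull() method. The single row below is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical single row, similar to what apply(..., axis=1) passes in.
row = pd.Series({"a": 23.0, "b": "N", "c": np.nan})

value = row["c"]          # a scalar float, not a Series
try:
    value.notnull()       # scalars have no .notnull() method
except AttributeError as exc:
    message = str(exc)
    print(message)

is_missing = pd.isnull(value)    # the top-level function accepts scalars
same_as_nan = value == np.nan    # always False: NaN never equals itself
print(is_missing, same_as_nan)
```

The last comparison is also why a check written as `== np.nan` never fires and pd.isnull should be used instead.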


闹比i

2021-02-18 20:03

Since you just want NaNs to be propagated, multiplying the columns takes care of that for you:

>>> df = pd.read_clipboard()
>>> df
      a  b    c
0   NaN  Y  NaN
1  23.0  N  3.0
2   NaN  N  2.0
3  44.0  Y  NaN
>>> df.a * df.c
0     NaN
1    69.0
2     NaN
3     NaN
dtype: float64
>>>

If you want to multiply only when a condition holds, you can use np.where here instead of .apply. All you need is the following:

>>> df
      a  b    c
0   NaN  Y  NaN
1  23.0  N  3.0
2   NaN  N  2.0
3  44.0  Y  NaN
>>> np.where(df.b == 'N', df.a*df.c, df.a)
array([ nan,  69.,  nan,  44.])

NaN propagation is the default behavior for most operations involving NaN. So you can simply assign the result of the above:

>>> df['d'] = np.where(df.b == 'N', df.a*df.c, df.a)
>>> df
      a  b    c     d
0   NaN  Y  NaN   NaN
1  23.0  N  3.0  69.0
2   NaN  N  2.0   NaN
3  44.0  Y  NaN  44.0
>>>
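
As a small illustration of that propagation (a sketch with made-up Series, not the frame from the question): element-wise arithmetic returns NaN whenever either operand is NaN. The "most" matters, though, since reductions such as Series.sum skip NaN by default.

```python
import numpy as np
import pandas as pd

# Made-up values, chosen so each position exercises a different case.
a = pd.Series([np.nan, 23.0, 44.0])
c = pd.Series([2.0, 3.0, np.nan])

product = a * c      # NaN on either side gives NaN
total = a + c        # the same holds for addition
print(product.tolist(), total.tolist())

# Reductions are the exception: sum() ignores NaN unless skipna=False.
print(a.sum(), a.sum(skipna=False))
```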

To elaborate on what this expression does:

np.where(df.b == 'N', df.a*df.c, df.a)

You can read it as "where df.b == 'N', give me the result of df.a * df.c; otherwise, give me just df.a":

>>> np.where(df.b == 'N', df.a*df.c, df.a)
array([ nan,  69.,  nan,  44.])

Also note, if your dataframe were a little different:

>>> df
      a  b    c
0   NaN  Y  NaN
1  23.0  N  3.0
2   NaN  N  2.0
3  44.0  Y  NaN
>>> df.loc[0,'a'] = 99
>>> df.loc[0, 'b']= 'N'
>>> df
      a  b    c
0  99.0  N  NaN
1  23.0  N  3.0
2   NaN  N  2.0
3  44.0  Y  NaN

Then the following two expressions would no longer be equivalent, because row 0 now has b == 'N' but a missing c:

>>> np.where(df.b == 'N', df.a*df.c, df.a)
array([ nan,  69.,  nan,  44.])
>>> np.where((df.b == 'N') & (~df.c.isnull()), df.a*df.c, df.a)
array([ 99.,  69.,  nan,  44.])

So you might want to use the slightly more verbose version, which falls back to df.a when c is missing:

>>> df['d'] = np.where((df.b == 'N') & (~df.c.isnull()), df.a*df.c, df.a)
>>> df
      a  b    c     d
0  99.0  N  NaN  99.0
1  23.0  N  3.0  69.0
2   NaN  N  2.0   NaN
3  44.0  Y  NaN  44.0
>>>
