Pandas: IndexingError: Unalignable boolean Series provided as indexer

攒了一身酷 2020-12-09 08:58

I'm trying to run what I think is simple code to eliminate any columns with all NaNs, but can't get this to work (axis = 1 works just fine when eliminating ro

2 Answers
  • 2020-12-09 09:28

    You can use dropna with axis=1 and thresh=1:

    In[19]:
    df.dropna(axis=1, thresh=1)
    
    Out[19]: 
         a    b    c
    0  1.0  4.0  NaN
    1  2.0  NaN  8.0
    2  NaN  6.0  9.0
    3  NaN  NaN  NaN
    

    This drops every column that doesn't have at least one non-NaN value, so any column that is entirely NaN is removed.

    What you tried failed because the boolean mask:

    In[20]:
    df.notnull().any(axis = 0)
    
    Out[20]: 
    a     True
    b     True
    c     True
    d    False
    dtype: bool
    

    cannot be aligned with the row index, which is what a boolean Series indexer is matched against by default. This mask is indexed by the column labels instead, so none of its labels match.
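
To make the alignment problem concrete, here is a minimal sketch. The DataFrame is reconstructed from the printed output above (column d is assumed to be entirely NaN), and the failing call df[mask] is an assumption about what was attempted, since the question text is cut off. The exception is caught generically because the import location of IndexingError differs between pandas versions.

```python
import numpy as np
import pandas as pd

# Reconstructed from the printed output; column "d" is assumed to be all NaN.
df = pd.DataFrame({
    "a": [1.0, 2.0, np.nan, np.nan],
    "b": [4.0, np.nan, 6.0, np.nan],
    "c": [np.nan, 8.0, 9.0, np.nan],
    "d": [np.nan] * 4,
})

# The mask is indexed by column labels (a, b, c, d), not by row labels (0..3).
mask = df.notnull().any(axis=0)

# Assumed failing attempt: [] with a boolean Series filters rows, so pandas
# tries to align the mask with the row index and finds no matching labels.
error_name = None
try:
    df[mask]
except Exception as exc:
    error_name = type(exc).__name__

print(error_name)

# Applying the same mask along the column axis works.
fixed = df.loc[:, mask]
print(fixed)
```

pandas may also emit a UserWarning about reindexing the boolean key before the error is raised.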

  • 2020-12-09 09:29

    You need loc, because you are filtering by columns:

    print (df.notnull().any(axis = 0))
    a     True
    b     True
    c     True
    d    False
    dtype: bool
    
    df = df.loc[:, df.notnull().any(axis = 0)]
    print (df)
    
         a    b    c
    0  1.0  4.0  NaN
    1  2.0  NaN  8.0
    2  NaN  6.0  9.0
    3  NaN  NaN  NaN
    

    Or filter the column labels first and then select with []:

    print (df.columns[df.notnull().any(axis = 0)])
    Index(['a', 'b', 'c'], dtype='object')
    
    df = df[df.columns[df.notnull().any(axis = 0)]]
    print (df)
    
         a    b    c
    0  1.0  4.0  NaN
    1  2.0  NaN  8.0
    2  NaN  6.0  9.0
    3  NaN  NaN  NaN
    

    Or use dropna with how='all' to remove the columns that contain only NaNs:

    print (df.dropna(axis=1, how='all'))
         a    b    c
    0  1.0  4.0  NaN
    1  2.0  NaN  8.0
    2  NaN  6.0  9.0
    3  NaN  NaN  NaN
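
As a quick sanity check, the sketch below runs the approaches from both answers on the same frame and compares the results. The DataFrame is again reconstructed from the printed output, with column d assumed to be all NaN, and the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Reconstructed sample data; column "d" is assumed to be all NaN.
df = pd.DataFrame({
    "a": [1.0, 2.0, np.nan, np.nan],
    "b": [4.0, np.nan, 6.0, np.nan],
    "c": [np.nan, 8.0, 9.0, np.nan],
    "d": [np.nan] * 4,
})

mask = df.notnull().any(axis=0)

by_thresh = df.dropna(axis=1, thresh=1)    # keep columns with >= 1 non-NaN value
by_how = df.dropna(axis=1, how="all")      # drop columns that are entirely NaN
by_loc = df.loc[:, mask]                   # boolean mask applied to the column axis
by_labels = df[df.columns[mask]]           # select by the filtered column labels

# DataFrame.equals treats NaNs in the same positions as equal.
all_equal = (
    by_thresh.equals(by_how)
    and by_how.equals(by_loc)
    and by_loc.equals(by_labels)
)
print(all_equal)
```

On this data all four calls keep columns a, b and c. The dropna variants state the intent most directly, while the mask-based forms are easier to adapt when the condition is something other than "all NaN".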
    