Pandas - find first non-null value in column

清酒与你 2020-12-14 06:39

I have a Series in which each value is either null or some non-null value. How can I find the first row where the value is not null, so I can report its datatype back to the user? If th

3 Answers
  • 2020-12-14 07:20

    For a Series, the following returns the first non-null value.

    First, create a Series s:

    s = pd.Series(index=[2,4,5,6], data=[None, None, 2, None])
    

    which creates this Series:

    2    NaN
    4    NaN
    5    2.0
    6    NaN
    dtype: float64
    

    You can get the first non-NaN value by using:

    s.loc[~s.isnull()].iloc[0]
    

    which returns

    2.0
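
A minimal sketch of equivalent spellings of the mask above, assuming pandas is imported as pd. The notna() and dropna() methods are standard pandas; the variable names are illustrative only. Each form raises IndexError at .iloc[0] when the Series has no non-null value, so the last lines guard that case explicitly.

```python
import pandas as pd

# Same Series as in the answer above.
s = pd.Series(index=[2, 4, 5, 6], data=[None, None, 2, None])

# Three equivalent ways to keep only the non-null entries,
# then take the first one by position.
via_isnull = s.loc[~s.isnull()].iloc[0]
via_notna = s[s.notna()].iloc[0]
via_dropna = s.dropna().iloc[0]
print(via_isnull, via_notna, via_dropna)

# An all-null Series leaves nothing to index, so check for emptiness first.
empty = pd.Series([None, None], dtype="float64")
non_null = empty.dropna()
first_or_none = non_null.iloc[0] if not non_null.empty else None
print(first_or_none)
```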
    

    If, on the other hand, you have a DataFrame like this one:

    df = pd.DataFrame(index=[2,4,5,6], data=np.asarray([[None, None, 2, None], [1, None, 3, 4]]).transpose(), 
                      columns=['a', 'b'])
    

    which looks like this:

        a       b
    2   None    1
    4   None    None
    5   2       3
    6   None    4
    

    you can select the first non-null value of a single column like this (here for column a):

    df.a.loc[~df.a.isnull()].iloc[0]
    

    or, if you want the first row that contains no null values anywhere, you can use:

    df.loc[~df.isnull().sum(1).astype(bool)].iloc[0]
    

    which returns:

    a    2
    b    3
    Name: 5, dtype: object
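
As a hedged alternative to the sum-and-cast mask, the same row can probably be found more directly with dropna(). The sketch below rebuilds a similar frame from a dict (an assumption made for brevity, so the columns come out as float64 rather than object) and also collects the first valid index label of every column.

```python
import pandas as pd

# A re-creation of the example frame; built from a dict here,
# so the dtypes are float64 instead of object.
df = pd.DataFrame(
    {"a": [None, None, 2, None], "b": [1, None, 3, 4]},
    index=[2, 4, 5, 6],
)

# First row with no null values in any column.
first_full_row = df.dropna().iloc[0]
print(first_full_row)

# The same selection written as an explicit boolean mask.
same_row = df.loc[df.notna().all(axis=1)].iloc[0]

# Index label of the first non-null value in each column.
first_labels = {col: df[col].first_valid_index() for col in df.columns}
print(first_labels)
```

Like the Series version, df.dropna().iloc[0] raises IndexError when no row is completely non-null.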
    
  • 2020-12-14 07:28

    You can use first_valid_index to get the label of the first non-null value, then select that value with loc:

    s = pd.Series([np.nan,2,np.nan])
    print (s)
    0    NaN
    1    2.0
    2    NaN
    dtype: float64
    
    print (s.first_valid_index())
    1
    
    print (s.loc[s.first_valid_index()])
    2.0
    
    # If your Series contains ALL NaNs, you'll need to check as follows:
    
    s = pd.Series([np.nan, np.nan, np.nan])
    idx = s.first_valid_index()  # Will return None
    first_valid_value = s.loc[idx] if idx is not None else None
    print(first_valid_value)
    None
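
Since the question asks for the datatype of the first non-null value, here is a small sketch that wraps the check in a helper. The function name first_non_null and the sample data are hypothetical. It looks the value up by position rather than by label, because s.loc[label] returns a Series instead of a scalar when the index contains duplicate labels.

```python
import numpy as np
import pandas as pd

def first_non_null(s: pd.Series):
    """Return the first non-null value of s, or None if there is none."""
    mask = s.notna().to_numpy()
    if not mask.any():
        return None
    # argmax on a boolean array gives the position of the first True.
    return s.iloc[int(mask.argmax())]

# Hypothetical data: report the type of the first real value.
s = pd.Series([None, None, "clip.ogg", None])
value = first_non_null(s)
type_name = type(value).__name__
print(value, type_name)

# All-null input falls back to None.
all_null = first_non_null(pd.Series([np.nan, np.nan]))

# With duplicate labels, label-based lookup returns a Series,
# while the positional helper still returns a scalar.
dup = pd.Series([np.nan, 1.5], index=["x", "x"])
by_label = dup.loc[dup.first_valid_index()]
by_position = first_non_null(dup)
```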
    
  • 2020-12-14 07:42

    You can also use the get method instead (first_audio_idx below is presumably the value returned by first_valid_index()):

    (Pdb) type(audio_col)
    <class 'pandas.core.series.Series'>
    (Pdb) audio_col.first_valid_index()
    19
    (Pdb) audio_col.get(first_audio_idx)
    'first-not-nan-value.ogg'
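
A self-contained sketch of the same idea outside the debugger. The Series contents and index labels here are made up for illustration; only first_valid_index() and get() come from the answer. Series.get accepts a default that is returned when the label is not found, and the all-null case is guarded explicitly before calling it.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for audio_col.
audio_col = pd.Series(
    [np.nan, "first-not-nan-value.ogg", np.nan],
    index=[17, 19, 23],
)

first_audio_idx = audio_col.first_valid_index()
# Only call get() when a valid label was found.
first_audio = (
    audio_col.get(first_audio_idx) if first_audio_idx is not None else None
)
print(first_audio_idx, first_audio)

# get() returns the default instead of raising for an unknown label.
fallback = audio_col.get(999, default="missing")
print(fallback)
```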
    