How to count nan values in a pandas DataFrame?

天涯浪人 2020-12-18 18:40

What is the best way to account for NaN (not a number) values in a pandas DataFrame?

The following code:

import numpy as np
import pandas as pd
dfd =         


        
7 answers
  • 2020-12-18 19:28

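    This approach worked best for me.

    For a quick summary that shows the non-null count and dtype of every column (handy in data science work for spotting missing values), use info. Note that the null_counts argument was deprecated in pandas 1.2 and removed in 2.0 in favor of show_counts:

    df.info(verbose=True, show_counts=True)  # on pandas < 1.2, pass null_counts=True instead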
    

    Another useful option is value_counts with dropna=False, which includes NaN in the tally for a single column:

    df['<column_name>'].value_counts(dropna=False)
    

    Example:

    df = pd.DataFrame({'a': [1, 2, 1, 2, np.nan],
                       'b': [2, 2, np.nan, 1, np.nan],
                       'c': [np.nan, 3, np.nan, 3, np.nan]})
    

    This is the df:

        a    b    c
    0  1.0  2.0  NaN
    1  2.0  2.0  3.0
    2  1.0  NaN  NaN
    3  2.0  1.0  3.0
    4  NaN  NaN  NaN
    

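    Running info on it (this transcript uses the older null_counts spelling; pandas 1.2 and later call the argument show_counts):

    df.info(verbose=True, null_counts=True)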
    <class 'pandas.core.frame.DataFrame'>
    
    RangeIndex: 5 entries, 0 to 4
    Data columns (total 3 columns):
    a    4 non-null float64
    b    3 non-null float64
    c    2 non-null float64
    dtypes: float64(3)
    

    So for column c, only 2 of the 5 rows are non-null, because rows 0, 2, and 4 hold NaN.
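To get the NaN counts themselves rather than the non-null counts that info reports, something like the following may help. It is only a minimal sketch that assumes the same example df; the variable names nan_per_column, nan_from_counts, and nan_rows_c are illustrative and not part of pandas.

```python
import numpy as np
import pandas as pd

# Same example frame as in the answer.
df = pd.DataFrame({'a': [1, 2, 1, 2, np.nan],
                   'b': [2, 2, np.nan, 1, np.nan],
                   'c': [np.nan, 3, np.nan, 3, np.nan]})

# isna() marks every missing cell as True; sum() counts the True values per column.
nan_per_column = df.isna().sum()

# Same numbers by arithmetic: total rows minus the non-null count of each column.
nan_from_counts = len(df) - df.count()

# Row labels where column 'c' is NaN.
nan_rows_c = df.index[df['c'].isna()].tolist()

print(nan_per_column.to_dict())  # {'a': 1, 'b': 2, 'c': 3}
print(nan_rows_c)                # [0, 2, 4]
```

Both computations should agree here because every row is either null or non-null in each column.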

    And this is what you get using value_counts for each column:

    In [17]: df['a'].value_counts(dropna=False)
    Out[17]:
     2.0    2
     1.0    2
    NaN     1
    Name: a, dtype: int64
    
    In [18]: df['b'].value_counts(dropna=False)
    Out[18]:
    NaN     2
     2.0    2
     1.0    1
    Name: b, dtype: int64
    
    In [19]: df['c'].value_counts(dropna=False)
    Out[19]:
    NaN     3
     3.0    2
    Name: c, dtype: int64
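
If only the NaN tally is wanted from each value_counts result, one possible way to collect it for every column is sketched below. This is an illustration under the assumption that the frame is the example df; nan_counts and total_nan are hypothetical names chosen for this sketch.

```python
import numpy as np
import pandas as pd

# Same example frame as in the answer.
df = pd.DataFrame({'a': [1, 2, 1, 2, np.nan],
                   'b': [2, 2, np.nan, 1, np.nan],
                   'c': [np.nan, 3, np.nan, 3, np.nan]})

nan_counts = {}
for name in df.columns:
    counts = df[name].value_counts(dropna=False)
    # The NaN tally is the entry whose index label is itself NaN;
    # summing the selection yields 0 when a column has no NaN at all.
    nan_counts[name] = int(counts[counts.index.isna()].sum())

# Grand total of missing cells across the whole frame.
total_nan = int(df.isna().sum().sum())

print(nan_counts)  # {'a': 1, 'b': 2, 'c': 3}
print(total_nan)   # 6
```

Selecting by counts.index.isna() rather than looking up a NaN label directly avoids relying on how a particular pandas version handles NaN as an index key.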
    