How to count nan values in a pandas DataFrame?

天涯浪人 2020-12-18 18:40

What is the best way to count NaN (not a number) values in a pandas DataFrame?

The following code:

import numpy as np
import pandas as pd
dfd =         


        
7 answers
  •  有刺的猬
    2020-12-18 19:25

    Yet another way to count all the NaNs in a DataFrame df:

    num_nans = df.size - df.count().sum()
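To see why the subtraction works, here is a minimal sketch on a tiny frame. The data and the names num_nans_mask and nans_per_column are made up for illustration and are not from the question. df.size counts every cell, df.count() counts the non-NaN cells per column, so the difference is the number of NaNs.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (hypothetical data, not from the question).
df = pd.DataFrame({
    "group": [1, 2, 2],
    "value": [np.nan, 12.0, np.nan],
    "value2": [100.0, np.nan, 102.0],
})

# df.size is rows * columns; df.count() gives non-NaN cells per column.
num_nans = df.size - df.count().sum()

# Same total via a boolean mask; isnull() is an alias of isna().
num_nans_mask = df.isna().sum().sum()

# Dropping the second .sum() keeps the per-column breakdown.
nans_per_column = df.isna().sum()

print(num_nans, num_nans_mask)
print(nans_per_column)
```

Both totals should agree; the per-column Series is useful when you need to know where the missing values are rather than only how many there are.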

    Timings:

    import timeit
    
    import numpy as np
    import pandas as pd
    
    df_scale = 100000
    df = pd.DataFrame(
        [[1, np.nan, 100, 63], [2, np.nan, 101, 63], [2, 12, 102, 63],
         [2, 14, 102, 63], [2, 14, 102, 64], [1, np.nan, 200, 63]] * df_scale,
        columns=['group', 'value', 'value2', 'dummy'])
    
    repeat = 3
    numbers = 100
    
    setup = """import pandas as pd
    from __main__ import df
    """
    
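    def timer(statement, _setup=None):
        # Print the fastest of `repeat` trials; each trial runs the
        # statement `numbers` times and reports the total seconds.
        print(min(
            timeit.Timer(statement, setup=_setup or setup).repeat(
                repeat, numbers)))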
    
    timer('df.size - df.count().sum()')
    timer('df.isna().sum().sum()')
    timer('df.isnull().sum().sum()')
    

    prints (total seconds for 100 executions, best of 3 trials):

    3.998805362999999
    3.7503365439999996
    3.689461442999999
    

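    So the three approaches perform about the same. Note that isnull() is an alias of isna(), so the gap between those two is just timing noise.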
