Find out the percentage of missing values in each column in the given dataset

逝去的感伤 2021-01-31 08:38
import pandas as pd
df = pd.read_csv('https://query.data.world/s/Hfu_PsEuD1Z_yJHmGaxWTxvkz7W_b0')
percent= 100*(len(df.loc[:,df.isnull().sum(axis=0)>=1 ].index) / l         


        
11 answers
  •  小鲜肉 (original poster)
    2021-01-31 09:42

    Let's break down what you are asking for:

    1. You want the percentage of missing values in each column.
    2. The result should be sorted in ascending order, with the values rounded to 2 decimal places.

    Explanation (dhr is assumed to be your DataFrame, df in the question, and fill_cols the list of columns to check):

    1. dhr[fill_cols].isnull().sum() - gives the number of missing values in each column
    2. dhr.shape[0] - gives the total number of rows
    3. (100 * dhr[fill_cols].isnull().sum() / dhr.shape[0]) - gives a Series with the percentage of missing values as values and the column names as index; without the factor of 100 you get a fraction between 0 and 1 instead
    4. since the output is a Series, you can round it and sort it by value

    code:

    (100 * dhr[fill_cols].isnull().sum() / dhr.shape[0]).round(2).sort_values()
    

    Reference: sort, round
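
To see the steps above end to end, here is a minimal sketch on a small made-up DataFrame. The data, the column names, and the choice of fill_cols as "every column" are assumptions for illustration only; they are not taken from the CSV in the question. The .isnull().mean() line is an alternative shortcut, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; the CSV from the question is not reproduced here.
dhr = pd.DataFrame({
    "age": [25, np.nan, 31, 40],
    "city": ["Pune", None, None, "Delhi"],
    "salary": [50000, 62000, 58000, 71000],
    "dept": [None, None, None, "HR"],
})

# Assumption: check every column in the frame.
fill_cols = dhr.columns.tolist()

# Step 1: number of missing values per column.
missing_counts = dhr[fill_cols].isnull().sum()

# Step 2: total number of rows.
n_rows = dhr.shape[0]

# Steps 3 and 4: convert to a percentage, round to 2 decimals, sort ascending.
missing_pct = (100 * missing_counts / n_rows).round(2).sort_values()

# Shortcut: the mean of a boolean mask is the fraction of True values.
missing_pct_alt = (dhr[fill_cols].isnull().mean() * 100).round(2).sort_values()

print(missing_pct)
```

Both expressions give the same Series here. The mean() form avoids dividing by the row count explicitly, which some people find easier to read.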
