Find out the percentage of missing values in each column in the given dataset

逝去的感伤 2021-01-31 08:38
import pandas as pd
df = pd.read_csv('https://query.data.world/s/Hfu_PsEuD1Z_yJHmGaxWTxvkz7W_b0')
percent= 100*(len(df.loc[:,df.isnull().sum(axis=0)>=1 ].index) / l         


        
11 Answers
  •  -上瘾入骨i
    2021-01-31 09:32

    This is how I did it:

    import pandas as pd

    def missing_percent(df):
        # Count of missing values in each column
        mis_val = df.isnull().sum()

        # Percentage of missing values in each column
        mis_percent = 100 * mis_val / len(df)

        # Combine both into one table with named columns
        mis_table = pd.concat([mis_val, mis_percent], axis=1)
        mis_table.columns = ['Missing Values', 'Percent of Total Values']

        # Keep only columns with missing values, sorted by percentage descending
        mis_table = mis_table[mis_table['Missing Values'] > 0].sort_values(
            'Percent of Total Values', ascending=False).round(2)

        # Print some summary information
        print(f"Your selected dataframe has {df.shape[1]} columns.\n"
              f"There are {mis_table.shape[0]} columns that have missing values.")

        # Return the table of missing-value statistics
        return mis_table
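
If only the percentages are needed, the same calculation can probably be written more compactly. The sketch below relies on the mean of a boolean column being the fraction of `True` values. The small DataFrame and its column names `a`, `b`, `c` are made up for illustration and stand in for the CSV from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, standing in for the CSV from the question.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0],
    "b": [np.nan, np.nan, "x", "y"],
    "c": [1, 2, 3, 4],
})

# isnull() gives a boolean frame; the column-wise mean of booleans is the
# fraction of missing entries, so multiplying by 100 gives a percentage.
missing_percent = (df.isnull().mean() * 100).round(2)

# Keep only columns with at least one missing value, largest share first.
missing_percent = missing_percent[missing_percent > 0].sort_values(ascending=False)
print(missing_percent)
```

Here `b` has 2 of 4 values missing and `a` has 1 of 4, so the result lists `b` at 50.0 and `a` at 25.0, while `c` is dropped because it has no missing values.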
    
