Create dictionary from results of DataFrame in pandas

旧时难觅i 2021-01-16 16:35

I have a dataframe with results as below. A sample dataframe is shown; the actual one is much larger. I want to get a dictionary (or another structure, if it would be faster) with the

5 answers
  • 2021-01-16 16:55

    You can do boolean indexing on the dataframe columns in a dictionary comprehension.

    >>> {idx: df.columns[row].tolist() for idx, row in df.notnull().iterrows()}
    {1: ['MSFT'], 2: ['GOOG', 'AMZN'], 3: ['AAPL', 'AMZN', 'FB'], 4: ['FB']}
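The question's sample frame is not reproduced on this page, so the sketch below uses a hypothetical one: the column order and the numeric values are assumptions, chosen only so that the NaN pattern yields the output shown above. It spells out what `row` is in the comprehension, namely a boolean Series that selects the labels of the non-null columns.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the values are made up, only the NaN pattern matters.
df = pd.DataFrame(
    {
        "MSFT": [1.0, np.nan, np.nan, np.nan],
        "GOOG": [np.nan, 2.0, np.nan, np.nan],
        "AAPL": [np.nan, np.nan, 3.0, np.nan],
        "AMZN": [np.nan, 4.0, 5.0, np.nan],
        "FB": [np.nan, np.nan, 6.0, 7.0],
    },
    index=[1, 2, 3, 4],
)

# Same shape as df, True wherever a value is present.
mask = df.notnull()

# Each row of the mask is a boolean Series; indexing df.columns with its
# values keeps only the labels of the non-null columns for that row.
result = {idx: df.columns[row.to_numpy()].tolist() for idx, row in mask.iterrows()}
print(result)
```

The explicit `.to_numpy()` is optional here; it only makes clear that the selection is positional rather than label-aligned.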
    
  • 2021-01-16 17:03

    You can take the dot product of the boolean mask and the column names, then use string operations to split the result into lists:

    df.notna().dot(df.columns+',').str.strip(',').str.split(',').to_dict()
    
    {1: ['MSFT'], 2: ['GOOG', 'AMZN'], 3: ['AAPL', 'AMZN', 'FB'], 4: ['FB']}
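To see why the dot product works, it may help to look at the intermediate values. The sketch below uses a hypothetical frame (values and column order are assumptions that merely reproduce the output above). Multiplying `True` by a string keeps the string, multiplying `False` by it gives an empty string, and the row-wise sum concatenates the pieces.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the values are made up, only the NaN pattern matters.
df = pd.DataFrame(
    {
        "MSFT": [1.0, np.nan, np.nan, np.nan],
        "GOOG": [np.nan, 2.0, np.nan, np.nan],
        "AAPL": [np.nan, np.nan, 3.0, np.nan],
        "AMZN": [np.nan, 4.0, 5.0, np.nan],
        "FB": [np.nan, np.nan, 6.0, 7.0],
    },
    index=[1, 2, 3, 4],
)

mask = df.notna()
labels = df.columns + ","  # every column name with a trailing comma

# Row-wise "sum of products": present columns contribute 'NAME,',
# missing ones contribute '', so each row becomes one joined string.
joined = mask.dot(labels)

# Drop the trailing comma, then split back into a list of names.
result = joined.str.rstrip(",").str.split(",").to_dict()
print(result)
```

Two caveats follow from the string round trip: a column name that itself contains a comma would be split apart, and a row with no values at all would map to `['']` rather than an empty list.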
    
  • 2021-01-16 17:03

    You can stack the frame into a long Series and then group the column labels by row label:

    df.stack().reset_index(level=1).groupby(level=0).level_1.apply(list).to_dict()
    Out[764]: {1: ['MSFT'], 2: ['GOOG', 'AMZN'], 3: ['AAPL', 'AMZN', 'FB'], 4: ['FB']}
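The chain above is easier to follow when split into steps. This is a sketch on a hypothetical frame (values and column order are assumptions). The one-liner relies on `stack()` dropping missing values, which was the default when it was written; newer pandas versions are, as far as I know, moving to a `stack` implementation that keeps them, so an explicit `.dropna()` is added here as a precaution.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the values are made up, only the NaN pattern matters.
df = pd.DataFrame(
    {
        "MSFT": [1.0, np.nan, np.nan, np.nan],
        "GOOG": [np.nan, 2.0, np.nan, np.nan],
        "AAPL": [np.nan, np.nan, 3.0, np.nan],
        "AMZN": [np.nan, 4.0, 5.0, np.nan],
        "FB": [np.nan, np.nan, 6.0, 7.0],
    },
    index=[1, 2, 3, 4],
)

# Long format: one entry per (row label, column label) pair that has a value.
# dropna() is a no-op where stack() already drops NaN, and needed where it does not.
stacked = df.stack().dropna()

# Move the column labels out of the index; with unnamed columns the new
# column is called "level_1".
long_form = stacked.reset_index(level=1)

# Collect the column labels belonging to each row label.
result = long_form.groupby(level=0)["level_1"].apply(list).to_dict()
print(result)
```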
    
  • 2021-01-16 17:09

    You can use .apply with axis=1 to process the frame row by row:

    df.apply(lambda x: list(x.dropna().index), axis=1).to_dict()       #Updated answer
    # Or dict(df.apply(lambda x: list(x.index[~x.isnull()]), axis=1))  #Original answer
    

    Output:

    {1: ['MSFT'], 2: ['GOOG', 'AMZN'], 3: ['AAPL', 'AMZN', 'FB'], 4: ['FB']}
    
  • 2021-01-16 17:12

    Maybe not the best in terms of performance, but you could use iterrows:

    import numpy as np
    results = {}
    for i, row in df.iterrows():
        results[i] = list(df.columns[~np.isnan(row)])
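One limitation of the loop above: `np.isnan` only accepts numeric input, so it raises a `TypeError` if a row ends up with object dtype, for example when one column holds strings. A possible variant, sketched on a hypothetical frame whose `FB` column contains made-up string values, uses the Series method `notna()` instead, which handles mixed types.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one non-numeric column ("FB"), so that
# rows produced by iterrows() are not purely numeric.
df = pd.DataFrame(
    {
        "MSFT": [1.0, np.nan, np.nan, np.nan],
        "GOOG": [np.nan, 2.0, np.nan, np.nan],
        "AAPL": [np.nan, np.nan, 3.0, np.nan],
        "AMZN": [np.nan, 4.0, 5.0, np.nan],
        "FB": [np.nan, np.nan, "buy", "hold"],
    },
    index=[1, 2, 3, 4],
)

results = {}
for i, row in df.iterrows():
    # notna() works for numeric and object rows alike.
    present = row.notna().to_numpy()
    results[i] = list(df.columns[present])

print(results)
```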
    