Dictionary column in pandas dataframe

伪装坚强ぢ 2020-12-24 00:51

I've got a CSV that I'm reading into a pandas dataframe. However, one of the columns is in the form of a dictionary. Here is an example:

ColA, ColB, ColC, C         


        
3 Answers
  • 2020-12-24 01:08

    What about something like:

    import pandas as pd
    
    # Create mock dataframe
    df = pd.DataFrame([
        [20, 30, {'ab':1, 'we':2, 'as':3}, 'String1'],
        [21, 31, {'ab':4, 'we':5, 'as':6}, 'String2'],
        [22, 32, {'ab':7, 'we':8, 'as':9}, 'String2'],
    ], columns=['Col A', 'Col B', 'Col C', 'Col D'])
    
    # Create dataframe where you'll store the dictionary values
    ddf = pd.DataFrame(columns=['AB','WE','AS'])
    
    # Populate ddf dataframe
    for (i,r) in df.iterrows():
        e = r['Col C']
        ddf.loc[i] = [e['ab'], e['we'], e['as']]
    
    # Replace df with the output of concat(df, ddf)
    df = pd.concat([df, ddf], axis=1)
    
    # New column order, also drops old Col C column
    df = df[['Col A', 'Col B', 'AB', 'WE', 'AS', 'Col D']]
    
    print(df)
    

    Output:

       Col A  Col B  AB  WE  AS    Col D
    0     20     30   1   2   3  String1
    1     21     31   4   5   6  String2
    2     22     32   7   8   9  String2
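
    The row-by-row loop above can also be written without iterrows. The following is a minimal sketch, assuming every row of Col C holds a dict (ideally with the same keys); the name `expanded` is only illustrative and is not part of the original answer.

```python
import pandas as pd

# Same mock dataframe as in the answer above
df = pd.DataFrame([
    [20, 30, {'ab': 1, 'we': 2, 'as': 3}, 'String1'],
    [21, 31, {'ab': 4, 'we': 5, 'as': 6}, 'String2'],
    [22, 32, {'ab': 7, 'we': 8, 'as': 9}, 'String2'],
], columns=['Col A', 'Col B', 'Col C', 'Col D'])

# Build one column per dictionary key in a single call;
# reusing df.index keeps the new rows aligned with the original ones
expanded = pd.DataFrame(df['Col C'].tolist(), index=df.index)
expanded.columns = [c.upper() for c in expanded.columns]

# Drop the dict column, attach the new columns, and restore the desired order
result = pd.concat([df.drop(columns='Col C'), expanded], axis=1)
result = result[['Col A', 'Col B', 'AB', 'WE', 'AS', 'Col D']]
print(result)
```

    This avoids growing `ddf` one row at a time, which tends to matter once the dataframe has more than a handful of rows.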
    
  • 2020-12-24 01:09

    Starting with your one-row df:

        Col A   Col B   Col C                           Col D
    0   20      30      {u'we': 2, u'ab': 1, u'as': 3}  String1
    

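    EDIT: Based on the OP's comment, I'm assuming the values in Col C are strings, so we need to convert them to dictionaries first:

    import ast
    # Parse each string into a real dict; the column name must match the "Col C" used below
    df["Col C"] = df["Col C"].map(ast.literal_eval)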
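
    If the data comes straight from a CSV, the string-to-dict conversion can also happen while the file is parsed. This is a sketch under assumptions: the question's sample is cut off, so the CSV text below is hypothetical, and it assumes the dictionary cell is quoted so that its commas are not read as field separators.

```python
import ast
import io

import pandas as pd

# Hypothetical CSV content (not the OP's real file); the dict column is quoted
csv_text = '''Col A,Col B,Col C,Col D
20,30,"{'ab': 1, 'we': 2, 'as': 3}",String1
21,31,"{'ab': 4, 'we': 5, 'as': 6}",String2
'''

# converters applies ast.literal_eval to every cell of "Col C" during parsing,
# so the column holds dict objects instead of strings
df = pd.read_csv(io.StringIO(csv_text), converters={'Col C': ast.literal_eval})

print(type(df.loc[0, 'Col C']))
print(df.loc[1, 'Col C']['we'])
```

    `ast.literal_eval` only evaluates Python literals, so it is a safer choice than `eval` for cells that come from a file.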
    

    Then we convert Col C to a dict keyed by row index, build a dataframe from it, transpose it so each row lines up with the original index, and join it to the original df:

    dfNew = df.join(pd.DataFrame(df["Col C"].to_dict()).T)
    dfNew
    

    This gives you:

        Col A   Col B   Col C                           Col D   ab  as  we
    0   20      30      {u'we': 2, u'ab': 1, u'as': 3}  String1 1   3   2
    

    Then we just select the columns we want from dfNew:

    dfNew[["Col A", "Col B", "ab", "we", "as", "Col D"]]
    
        Col A   Col B   ab  we  as  Col D
    0   20      30      1   2   3   String1
    
  • As per https://stackoverflow.com/a/38231651/454773, you can use .apply(pd.Series) to expand the dict-containing column into new columns, and then concatenate these new columns with the original dataframe minus the dict-containing column:

    dw = pd.DataFrame([[20, 30, {"ab": "1", "we": "2", "as": "3"}, "String"]],
                      columns=['ColA', 'ColB', 'ColC', 'ColD'])
    pd.concat([dw.drop(['ColC'], axis=1), dw['ColC'].apply(pd.Series)], axis=1)
    

    Returns:

    ColA    ColB    ColD    ab  as  we
    20      30      String  1   3   2
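
    The same .apply(pd.Series) idea can be wrapped in a small helper. This is only a sketch: `expand_dict_column` is a hypothetical name, not a pandas function, and the second sample row is added here to show what happens when a dictionary lacks a key.

```python
import pandas as pd


def expand_dict_column(frame, column):
    """Return a copy of frame with `column` replaced by one column per dict key."""
    expanded = frame[column].apply(pd.Series)
    return pd.concat([frame.drop(columns=[column]), expanded], axis=1)


# Illustrative data; the second row deliberately has no "as" key
dw = pd.DataFrame(
    [[20, 30, {"ab": "1", "we": "2", "as": "3"}, "String"],
     [21, 31, {"ab": "4", "we": "5"}, "String"]],
    columns=["ColA", "ColB", "ColC", "ColD"],
)

out = expand_dict_column(dw, "ColC")
print(out)
```

    Keys that are missing from a row's dictionary come out as NaN in that row, so the dictionaries do not need identical keys.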
    