pandas DataFrame reshaping/stacking of multiple value variables into separate columns

小蘑菇 2020-12-17 06:01

Hi, I'm trying to reshape a data frame in a certain way.

This is the data frame I have:

         des1 des2 des3 interval1 interval2 interval3
value           


        
3 Answers
  • 2020-12-17 06:27

    This might be a shorter approach:

    In [72]:
    
    # Split each column name into (stub, full name), e.g. 'des1' -> ('des', 'des1')
    df.columns = pd.MultiIndex.from_tuples([(x[:-1], x) for x in df.columns])
    In [73]:
    
    keys = df.columns.get_level_values(0).unique()
    print(pd.DataFrame({key: df[key].stack().values for key in keys},
                       index=df['des'].stack().index.get_level_values(0)))
          des interval
    value             
    aaa     a      ##1
    aaa     b      ##2
    aaa     c      ##3
    bbb     d      ##4
    bbb     e      ##5
    bbb     f      ##6
    ccc     g      ##7
    ccc     h      ##8
    ccc     i      ##9
    

    Or preserve the 1,2,3 info:

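    In [74]:
    
    # Starting again from the original flat column names:
    # split each name into (stub, suffix), e.g. 'des1' -> ('des', '1')
    df.columns = pd.MultiIndex.from_tuples([(x[:-1], x[-1]) for x in df.columns])
    keys = list(df.columns.get_level_values(0).unique())
    df2 = pd.concat([df[key].stack() for key in keys], axis=1)
    df2.columns = keys  # a list, not a set, so the column order is well defined
    print(df2)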
            des interval
    value               
    aaa   1   a      ##1
          2   b      ##2
          3   c      ##3
    bbb   1   d      ##4
          2   e      ##5
          3   f      ##6
    ccc   1   g      ##7
          2   h      ##8
          3   i      ##9
    
  • 2020-12-17 06:35

    I think the solution provided by CT Zhu is very clever, but you can also reshape the frame step by step (which may be the more common way).

    import pandas as pd

    # Rebuild the input; the first row ('value') holds empty placeholders.
    d = {'des1' : ['', 'a', 'd', 'g'],
         'des2' : ['', 'b', 'e', 'h'],
         'des3' : ['', 'c', 'f', 'i'],
         'interval1' : ['', '##1', '##4', '##7'],
         'interval2' : ['', '##2', '##5', '##8'],
         'interval3' : ['', '##3', '##6', '##9']}
    
    df = pd.DataFrame(d, index=['value', 'aaa', 'bbb', 'ccc'],
                      columns=['des1', 'des2', 'des3', 'interval1', 'interval2', 'interval3'])
    
    # Concatenate the des* cells and the interval* cells row by row.
    nd = {'des' : [''] + df.iloc[1, 0:3].tolist() + df.iloc[2, 0:3].tolist() + df.iloc[3, 0:3].tolist(),
          'interval' : [''] + df.iloc[1, 3:6].tolist() + df.iloc[2, 3:6].tolist() + df.iloc[3, 3:6].tolist()}
    
    # Repeat each row label once per des/interval pair.
    ndf = pd.DataFrame(nd, index=['value', 'aaa', 'aaa', 'aaa', 'bbb', 'bbb', 'bbb', 'ccc', 'ccc', 'ccc'], columns=['des', 'interval'])
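
The same step-by-step idea can be written without hard-coding row positions. The sketch below is one possible generalization, not the code above: it assumes the sample data reconstructed from this thread, an index named `value` with no placeholder row, and the suffixes 1 to 3. The names `pieces` and `piece` are illustrative.

```python
import pandas as pd

# Sample data reconstructed from the outputs in this thread (an assumption).
df = pd.DataFrame(
    {
        "des1": ["a", "d", "g"],
        "des2": ["b", "e", "h"],
        "des3": ["c", "f", "i"],
        "interval1": ["##1", "##4", "##7"],
        "interval2": ["##2", "##5", "##8"],
        "interval3": ["##3", "##6", "##9"],
    },
    index=pd.Index(["aaa", "bbb", "ccc"], name="value"),
)

pieces = []
for n in ("1", "2", "3"):
    # Take one des/interval pair and give it the common column names.
    piece = df[["des" + n, "interval" + n]].copy()
    piece.columns = ["des", "interval"]
    pieces.append(piece)

# Stack the pairs vertically; a stable sort groups the rows by label
# while keeping the 1, 2, 3 order within each label.
ndf = pd.concat(pieces).sort_index(kind="mergesort")
print(ndf)
```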
    
  • 2020-12-17 06:44

    This is just a pd.melt; see the pandas documentation for melt. The call below reshapes only the interval columns:

    In [33]: pd.melt(df.reset_index(),
                     id_vars=['values'],
                     value_vars=['interval1','interval2','interval3'])
    Out[33]: 
      values   variable value
    0    aaa  interval1   ##1
    1    bbb  interval1   ##4
    2    ccc  interval1   ##7
    3    aaa  interval2   ##2
    4    bbb  interval2   ##5
    5    ccc  interval2   ##8
    6    aaa  interval3   ##3
    7    bbb  interval3   ##6
    8    ccc  interval3   ##9
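
To reshape both the des and the interval columns in one call, `pd.wide_to_long` is a possible alternative to `melt`. This is a different function from the one used above, and the sketch assumes the sample data reconstructed from this thread, with the index named `value`. The level name `num` and the variable `long_df` are illustrative choices.

```python
import pandas as pd

# Sample data reconstructed from the outputs in this thread (an assumption).
df = pd.DataFrame(
    {
        "des1": ["a", "d", "g"],
        "des2": ["b", "e", "h"],
        "des3": ["c", "f", "i"],
        "interval1": ["##1", "##4", "##7"],
        "interval2": ["##2", "##5", "##8"],
        "interval3": ["##3", "##6", "##9"],
    },
    index=pd.Index(["aaa", "bbb", "ccc"], name="value"),
)

# stubnames are the column prefixes; the numeric suffix becomes the 'num' level.
long_df = pd.wide_to_long(
    df.reset_index(),
    stubnames=["des", "interval"],
    i="value",
    j="num",
).sort_index()
print(long_df)
```

The id column passed as `i` must uniquely identify each row, which holds here because aaa, bbb and ccc each appear once.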
    