Pandas: Multiple columns into one column

眼角桃花 2020-11-29 05:38

I have the following data (2 columns, 4 rows):

Column 1: A, B, C, D

Column 2: E, F, G, H

I am attempting to combine the columns into one c

4 Answers
  • 2020-11-29 05:57

    The trick is to use stack(), which pivots the column labels into the index and leaves a single column of values:

    df.stack().reset_index()
    
       level_0   level_1  0
    0        0  Column 1  A
    1        0  Column 2  E
    2        1  Column 1  B
    3        1  Column 2  F
    4        2  Column 1  C
    5        2  Column 2  G
    6        3  Column 1  D
    7        3  Column 2  H
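
The output above still carries the two index levels, and the values interleave row by row (A, E, B, F, ...). A minimal sketch of how to get a plain single column in either order, assuming the same two-column frame as in the question (the variable names `stacked` and `by_column` are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({"Column 1": ["A", "B", "C", "D"],
                   "Column 2": ["E", "F", "G", "H"]})

# stack() walks the frame row by row, so the two columns interleave.
stacked = df.stack().reset_index(drop=True)
print(stacked.tolist())    # ['A', 'E', 'B', 'F', 'C', 'G', 'D', 'H']

# Transposing first makes stack() walk column by column instead.
by_column = df.T.stack().reset_index(drop=True)
print(by_column.tolist())  # ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']
```

Passing drop=True to reset_index discards the level_0/level_1 labels; leave it out if the source row and column of each value are still needed.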
    
  • 2020-11-29 06:06

    You can flatten the values column by column using ravel('F'), which is much faster than appending the columns.

    In [1238]: df
    Out[1238]:
      Column 1 Column 2
    0        A        E
    1        B        F
    2        C        G
    3        D        H
    
    In [1239]: pd.Series(df.values.ravel('F'))
    Out[1239]:
    0    A
    1    B
    2    C
    3    D
    4    E
    5    F
    6    G
    7    H
    dtype: object
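
The 'F' argument is what puts Column 1 before Column 2. A small sketch of the difference between the two orders, assuming the same frame as above (`to_numpy()` is used here in place of `.values`; both return the underlying array):

```python
import pandas as pd

df = pd.DataFrame({"Column 1": ["A", "B", "C", "D"],
                   "Column 2": ["E", "F", "G", "H"]})

values = df.to_numpy()  # 4x2 array, one row per DataFrame row

# order="F" (column-major): all of Column 1, then all of Column 2.
column_major = pd.Series(values.ravel("F"))
print(column_major.tolist())  # ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']

# Default order="C" (row-major): values interleave row by row.
row_major = pd.Series(values.ravel())
print(row_major.tolist())     # ['A', 'E', 'B', 'F', 'C', 'G', 'D', 'H']
```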
    

    Details

    Medium

    In [1245]: df.shape
    Out[1245]: (4000, 2)
    
    In [1246]: %timeit pd.Series(df.values.ravel('F'))
    10000 loops, best of 3: 86.2 µs per loop
    
    In [1247]: %timeit df['Column 1'].append(df['Column 2']).reset_index(drop=True)
    1000 loops, best of 3: 816 µs per loop
    

    Large

    In [1249]: df.shape
    Out[1249]: (40000, 2)
    
    In [1250]: %timeit pd.Series(df.values.ravel('F'))
    10000 loops, best of 3: 87.5 µs per loop
    
    In [1251]: %timeit df['Column 1'].append(df['Column 2']).reset_index(drop=True)
    100 loops, best of 3: 1.72 ms per loop
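
The timings above were recorded with Series.append, which was removed in pandas 2.0. To repeat the comparison on a current install, something like the following sketch could be used. It assumes pd.concat as the stand-in for append, and the helper names `with_ravel` and `with_concat` are made up for illustration. No timings are quoted here because they depend on the machine and the pandas version.

```python
import timeit

import pandas as pd

# 40,000 rows, matching the "Large" case; the letters themselves are arbitrary.
df = pd.DataFrame({"Column 1": list("ABCD") * 10_000,
                   "Column 2": list("EFGH") * 10_000})

def with_ravel(frame):
    # Flatten the underlying array in column-major order.
    return pd.Series(frame.to_numpy().ravel("F"))

def with_concat(frame):
    # Replacement for Series.append(...).reset_index(drop=True).
    return pd.concat([frame["Column 1"], frame["Column 2"]], ignore_index=True)

# Both approaches should produce the same values in the same order.
assert with_ravel(df).tolist() == with_concat(df).tolist()

for fn in (with_ravel, with_concat):
    seconds = timeit.timeit(lambda: fn(df), number=100)
    print(f"{fn.__name__}: {seconds / 100 * 1e6:.1f} µs per call")
```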
    
  • 2020-11-29 06:07

    Update

    pandas has a built-in method for this, stack, which does what you want; see the other answer.

    This was my original answer, written many years ago before I knew about stack:

    In [227]:
    
    import pandas as pd
    df = pd.DataFrame({'Column 1': ['A', 'B', 'C', 'D'], 'Column 2': ['E', 'F', 'G', 'H']})
    df
    Out[227]:
      Column 1 Column 2
    0        A        E
    1        B        F
    2        C        G
    3        D        H
    
    [4 rows x 2 columns]
    
    In [228]:
    
    # Series.append was removed in pandas 2.0; pd.concat gives the same result
    pd.concat([df['Column 1'], df['Column 2']], ignore_index=True)
    Out[228]:
    0    A
    1    B
    2    C
    3    D
    4    E
    5    F
    6    G
    7    H
    dtype: object
    
  • 2020-11-29 06:17

    What you appear to be asking for is simply another view of your data. If there is no reason for the data to be in two columns in the first place, just create one column. If, however, you need to combine them for presentation in some other tool, you can do something like:

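    import itertools as it
    import pandas as pd

    df = pd.DataFrame({1: ['a', 'b', 'c', 'd'], 2: ['e', 'f', 'g', 'h']})
    # Chain the rows together, then sort; sorting discards the original order
    sorted(it.chain.from_iterable(df.values))
    # -> ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']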
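
Note that the sorted() call gives column order here only because the sample letters happen to be alphabetical by column. A sketch of the same itertools idea without relying on sorting, assuming the same frame (the names `row_order` and `column_order` are illustrative):

```python
import itertools as it

import pandas as pd

df = pd.DataFrame({1: ["a", "b", "c", "d"], 2: ["e", "f", "g", "h"]})

# Chaining the rows of the underlying array interleaves the columns.
row_order = list(it.chain.from_iterable(df.to_numpy()))
print(row_order)     # ['a', 'e', 'b', 'f', 'c', 'g', 'd', 'h']

# Chaining the columns themselves keeps each column's values together.
column_order = list(it.chain.from_iterable(df[col] for col in df.columns))
print(column_order)  # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
```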
    