Pandas version of rbind

野的像风 2021-02-01 12:03

In R, you can combine two data frames by stacking the rows of one beneath the rows of the other using rbind. In pandas, how do you accomplish the same thing? It

4 Answers
  • 2021-02-01 12:13

    Ah, this is to do with how I created the DataFrame, not with how I was combining them. The long and the short of it is, if you are creating a frame using a loop and a statement that looks like this:

    Frame = Frame.append(pandas.DataFrame(data = SomeNewLineOfData))
    

    you must ignore the index by passing ignore_index=True:

    Frame = Frame.append(pandas.DataFrame(data = SomeNewLineOfData), ignore_index=True)
    

    Otherwise the rows keep their original index labels, which can cause issues later when combining data.
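
Note that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so on current versions the loop above needs a different shape. A minimal sketch, assuming each new line of data is a dict of column values (the names `rows`, `pieces`, and `frame` and the sample values are made up for illustration): collect the pieces in a list and call pd.concat once with ignore_index=True.

```python
import pandas as pd

# Hypothetical stand-ins for the "new lines of data" produced by the loop.
rows = [{"col1": 1, "col2": 3}, {"col1": 2, "col2": 4}, {"col1": 5, "col2": 7}]

# Build one single-row DataFrame per line of data; each piece has index [0].
pieces = [pd.DataFrame([row]) for row in rows]

# Concatenate once at the end; ignore_index=True renumbers the rows 0..n-1
# instead of keeping the repeated 0 label from every piece.
frame = pd.concat(pieces, ignore_index=True)
print(frame)
```

Concatenating once after the loop also avoids copying the growing frame on every iteration.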

  • 2021-02-01 12:19
    import pandas as pd 
    import numpy as np
    

    If you have a DataFrame like this:

    array = np.random.randint(0, 10, size=(2, 4))
    df = pd.DataFrame(array, columns=['A', 'B', 'C', 'D'],
                      index=['10aa', '20bb'])  # deliberately unusual index labels
    df
    
          A  B  C  D
    10aa  4  2  4  6
    20bb  5  1  0  2
    

    And you want to add a new row that is a list (or another iterable object):

    List = [i**3 for i in range(df.shape[1]) ]
    List
    [0, 1, 8, 27]
    

    Convert the list to a dictionary whose keys are the DataFrame's column names, using the zip() function:

    Dict = dict(  zip(df.columns, List)  )
    Dict
    {'A': 0, 'B': 1, 'C': 8, 'D': 27}
    

    Then you can use the append() method (available in pandas before 2.0) to add the new dictionary:

    df = df.append(Dict, ignore_index=True)
    df
       A  B  C   D
    0  4  2  4   6
    1  5  1  0   2
    2  0  1  8  27
    

    N.B. the original index labels are dropped.

    And yeah, it's not as simple as rbind() in R :(
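
On pandas 2.0 and later, where append() is no longer available, the same row can be added with pd.concat. A minimal sketch under that assumption; the fixed sample values and the label '30cc' are made up for illustration, and the second option shows one way to keep the existing index labels.

```python
import pandas as pd

df = pd.DataFrame([[4, 2, 4, 6], [5, 1, 0, 2]],
                  columns=["A", "B", "C", "D"],
                  index=["10aa", "20bb"])
new_values = [i**3 for i in range(df.shape[1])]  # [0, 1, 8, 27]

# Wrap the list in a one-row DataFrame that reuses df's column labels.
new_row = pd.DataFrame([new_values], columns=df.columns)

# Option 1: drop the old labels, as append(..., ignore_index=True) did.
renumbered = pd.concat([df, new_row], ignore_index=True)

# Option 2: keep the existing labels by assigning the row under a new label
# ('30cc' is a hypothetical label chosen for this example).
labelled = df.copy()
labelled.loc["30cc"] = new_values

print(renumbered)
print(labelled)
```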

  • 2021-02-01 12:24

    pd.concat will serve the purpose of rbind in R.

    import pandas as pd
    df1 = pd.DataFrame({'col1': [1,2], 'col2':[3,4]})
    df2 = pd.DataFrame({'col1': [5,6], 'col2':[7,8]})
    print(df1)
    print(df2)
    print(pd.concat([df1, df2]))
    

    The output looks like:

       col1  col2
    0     1     3
    1     2     4
       col1  col2
    0     5     7
    1     6     8
       col1  col2
    0     1     3
    1     2     4
    0     5     7
    1     6     8
    

    If you read the documentation carefully, it also explains other operations, such as the equivalent of cbind.
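
Two variations on the example above may be useful; this is a sketch, not taken from the answer. Passing ignore_index=True renumbers the stacked rows instead of repeating 0 and 1, and axis=1 places the frames side by side, roughly what cbind does in R. The `df3` frame with columns `col3` and `col4` is invented for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})
df2 = pd.DataFrame({"col1": [5, 6], "col2": [7, 8]})
df3 = pd.DataFrame({"col3": [9, 10], "col4": [11, 12]})  # hypothetical extra columns

# rbind-like: stack rows and renumber the index 0..3.
stacked = pd.concat([df1, df2], ignore_index=True)

# cbind-like: put the columns of df3 next to those of df1.
side_by_side = pd.concat([df1, df3], axis=1)

print(stacked)
print(side_by_side)
```

Unlike cbind, axis=1 aligns rows by index label rather than by position, so frames with different indexes produce rows filled with NaN.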

  • 2021-02-01 12:34

    This worked for me:

    import numpy as np
    import pandas as pd
    
    dates = np.asarray(pd.date_range('1/1/2000', periods=8))
    df1 = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
    df2 = df1.copy()
    df = df1.append(df2)
    

    Yields:

                       A         B         C         D
    2000-01-01 -0.327208  0.552500  0.862529  0.493109
    2000-01-02  1.039844 -2.141089 -0.781609  1.307600
    2000-01-03 -0.462831  0.066505 -1.698346  1.123174
    2000-01-04 -0.321971 -0.544599 -0.486099 -0.283791
    2000-01-05  0.693749  0.544329 -1.606851  0.527733
    2000-01-06 -2.461177 -0.339378 -0.236275  0.155569
    2000-01-07 -0.597156  0.904511  0.369865  0.862504
    2000-01-08 -0.958300 -0.583621 -2.068273  0.539434
    2000-01-01 -0.327208  0.552500  0.862529  0.493109
    2000-01-02  1.039844 -2.141089 -0.781609  1.307600
    2000-01-03 -0.462831  0.066505 -1.698346  1.123174
    2000-01-04 -0.321971 -0.544599 -0.486099 -0.283791
    2000-01-05  0.693749  0.544329 -1.606851  0.527733
    2000-01-06 -2.461177 -0.339378 -0.236275  0.155569
    2000-01-07 -0.597156  0.904511  0.369865  0.862504
    2000-01-08 -0.958300 -0.583621 -2.068273  0.539434
    

    If you aren't already using the latest version of pandas, I highly recommend upgrading. It is now possible to work with DataFrames that contain duplicate index values.
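
To see what a duplicated index means in practice, here is a small sketch. It uses pd.concat rather than append so that it also runs on pandas 2.x, and the seed and the shortened date range are arbitrary choices. Selecting a duplicated label with .loc returns every matching row, and index.is_unique reports whether any label repeats.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2000-01-01", periods=3)
rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
df1 = pd.DataFrame(rng.standard_normal((3, 4)), index=dates, columns=list("ABCD"))

# Stack the frame on top of a copy of itself; every date now appears twice.
df = pd.concat([df1, df1.copy()])

print(df.index.is_unique)   # False: each date appears twice
print(df.loc[dates[0]])     # a two-row DataFrame, one row per occurrence

# If unique labels are needed later, keep only the first occurrence of each.
deduplicated = df[~df.index.duplicated(keep="first")]
print(deduplicated)
```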
