Pandas version of rbind


In R, you can combine two dataframes by stacking the rows of one beneath the rows of the other using rbind. In pandas, how do you accomplish the same thing?

4 Answers

离开以前

2021-02-01 12:13
Ah, this has to do with how I created the DataFrame, not with how I was combining them. In short, if you are building a frame in a loop with a statement that looks like this:
```
Frame = Frame.append(pandas.DataFrame(data = SomeNewLineOfData))
```
you must tell pandas to ignore the index:
```
Frame = Frame.append(pandas.DataFrame(data = SomeNewLineOfData), ignore_index=True)
```
Otherwise the appended rows keep their own index labels, which typically produces duplicate labels and causes problems later when combining data.
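On pandas 2.0 and later `DataFrame.append` is no longer available, so the same loop pattern can be sketched with `pd.concat`. This is a minimal sketch, assuming each new line of data arrives as a dict; `new_lines` and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the rows produced inside the loop.
new_lines = [
    {"A": 1, "B": 2},
    {"A": 3, "B": 4},
    {"A": 5, "B": 6},
]

# Build one single-row frame per iteration and collect them in a list.
pieces = [pd.DataFrame([line]) for line in new_lines]

# Without ignore_index every row keeps its own label 0.
stacked_raw = pd.concat(pieces)
print(list(stacked_raw.index))  # [0, 0, 0]

# With ignore_index=True the rows are renumbered 0..n-1.
frame = pd.concat(pieces, ignore_index=True)
print(list(frame.index))        # [0, 1, 2]
```

Concatenating once at the end also avoids copying the growing frame on every iteration.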

别跟我提以往

2021-02-01 12:19

```
import pandas as pd
import numpy as np
```

If you have a DataFrame like this:

```
array = np.random.randint(0, 10, size=(2, 4))
# No line-continuation backslash is needed inside parentheses
df = pd.DataFrame(array, columns=['A', 'B', 'C', 'D'],
                  index=['10aa', '20bb'])  # some unusual index labels
df
```
```
      A  B  C  D
10aa  4  2  4  6
20bb  5  1  0  2
```

And you want to add a NEW ROW that is a list (or another iterable object):

```
List = [i**3 for i in range(df.shape[1])]
List
```
```
[0, 1, 8, 27]
```

Convert the list to a dictionary whose keys are the DataFrame's column names, using the zip() function:

```
Dict = dict(zip(df.columns, List))
Dict
```
```
{'A': 0, 'B': 1, 'C': 8, 'D': 27}
```

Then you can use the append() method to add the new dictionary as a row:

```
# DataFrame.append requires pandas < 2.0 (it was removed in 2.0)
df = df.append(Dict, ignore_index=True)
df
```
```
   A  B  C   D
0  4  2  4   6
1  5  1  0   2
2  0  1  8  27
```

N.B. the original index labels are dropped.

And yeah, it's not as simple as rbind() in R :(
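
Since `DataFrame.append` was removed in pandas 2.0, here is a rough sketch of the same row addition using `pd.concat`. The fixed values in `df` replace the random ones purely to keep the example reproducible, and `values`, `new_row` and `result` are names chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[4, 2, 4, 6], [5, 1, 0, 2]],
    columns=["A", "B", "C", "D"],
    index=["10aa", "20bb"],
)

values = [i**3 for i in range(df.shape[1])]  # [0, 1, 8, 27]

# Wrap the list in a one-row DataFrame whose columns match df.
new_row = pd.DataFrame([values], columns=df.columns)

# Stack it under df; ignore_index=True replaces '10aa'/'20bb' with 0, 1, 2.
result = pd.concat([df, new_row], ignore_index=True)
print(result)
```

If the original labels should survive, assigning to a new label with `df.loc["30cc"] = values` (where `"30cc"` is an invented label) is another option worth trying.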


旧时难觅i

2021-02-01 12:24

pd.concat will serve the purpose of rbind in R.

```
import pandas as pd
df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df2 = pd.DataFrame({'col1': [5, 6], 'col2': [7, 8]})
print(df1)
print(df2)
print(pd.concat([df1, df2]))
```

The output looks like this:

```
   col1  col2
0     1     3
1     2     4
   col1  col2
0     5     7
1     6     8
   col1  col2
0     1     3
1     2     4
0     5     7
1     6     8
```

If you read the documentation carefully, it also explains other operations such as cbind.
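
As a small sketch of the two operations mentioned here: the `axis` argument of `pd.concat` selects between row-wise stacking (like rbind) and column-wise joining (like cbind). The frame `df3` and its column name are invented for the example.

```python
import pandas as pd

df1 = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})
df2 = pd.DataFrame({"col1": [5, 6], "col2": [7, 8]})
df3 = pd.DataFrame({"col3": [9, 10]})  # hypothetical extra column

# rbind-like: stack rows; ignore_index=True gives a fresh 0..3 index
# instead of the repeated 0, 1, 0, 1.
rows = pd.concat([df1, df2], ignore_index=True)

# cbind-like: place the frames side by side, aligned on the index.
cols = pd.concat([df1, df3], axis=1)

print(rows)
print(cols)
```

Note that the column-wise form aligns on index labels rather than on position, so frames with different indexes will produce missing values.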


南笙

2021-02-01 12:34

This worked for me:

```
import numpy as np
import pandas as pd

dates = np.asarray(pd.date_range('1/1/2000', periods=8))
df1 = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df2 = df1.copy()
# DataFrame.append requires pandas < 2.0 (it was removed in 2.0)
df = df1.append(df2)
```

Yields:

```
                   A         B         C         D
2000-01-01 -0.327208  0.552500  0.862529  0.493109
2000-01-02  1.039844 -2.141089 -0.781609  1.307600
2000-01-03 -0.462831  0.066505 -1.698346  1.123174
2000-01-04 -0.321971 -0.544599 -0.486099 -0.283791
2000-01-05  0.693749  0.544329 -1.606851  0.527733
2000-01-06 -2.461177 -0.339378 -0.236275  0.155569
2000-01-07 -0.597156  0.904511  0.369865  0.862504
2000-01-08 -0.958300 -0.583621 -2.068273  0.539434
2000-01-01 -0.327208  0.552500  0.862529  0.493109
2000-01-02  1.039844 -2.141089 -0.781609  1.307600
2000-01-03 -0.462831  0.066505 -1.698346  1.123174
2000-01-04 -0.321971 -0.544599 -0.486099 -0.283791
2000-01-05  0.693749  0.544329 -1.606851  0.527733
2000-01-06 -2.461177 -0.339378 -0.236275  0.155569
2000-01-07 -0.597156  0.904511  0.369865  0.862504
2000-01-08 -0.958300 -0.583621 -2.068273  0.539434
```

If you are not already on the latest version of pandas, I highly recommend upgrading: it is now possible to work with DataFrames that contain duplicate index labels.
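
A short sketch of what working with duplicate index labels looks like, using `pd.concat` in place of `append` (which was removed in pandas 2.0). The data is a small fixed frame rather than the random one above, and `flat` is a name chosen for illustration.

```python
import pandas as pd

dates = pd.date_range("2000-01-01", periods=3)
df1 = pd.DataFrame({"A": [1.0, 2.0, 3.0]}, index=dates)
df2 = df1.copy()

# Stack the two frames; every date now appears twice in the index.
df = pd.concat([df1, df2])
print(df.index.is_unique)    # False

# Selecting a duplicated label returns every matching row.
print(df.loc[dates[1]])

# reset_index() moves the dates into a column and restores a unique index.
flat = df.reset_index()
print(flat.index.is_unique)  # True
```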
