I want to be able to append df1 df2, df3 into one df_All , but since each of the dataframe has different column. How could I do this in for loop ( I have others stuff that i
You can also use set comprehension to join all common columns from an arbitrary list of DataFrames:
df_list = [df1, df2, df3]
common_cols = list(set.intersection(*(set(c) for c in df_list)))
df_new = pd.concat([df[common_cols] for df in df_list], ignore_index=True)
>>> df_new
A B
0 1 4
1 2 5
2 3 6
3 8 5
4 9 6
5 10 7
6 1 4
7 2 5
8 3 7
If I understand what you want then you need to select just columns 'A' and 'B' from df3
and then use pd.concat :
In [35]:
df1 = pd.DataFrame.from_items([('A', [1, 2, 3]), ('B', [4, 5, 6])])
df2 = pd.DataFrame.from_items([('B', [5, 6, 7]), ('A', [8, 9, 10])])
df3 = pd.DataFrame.from_items([('C', [5, 6, 7]), ('D', [8, 9, 10]), ('A',[1,2,3]), ('B',[4,5,7])])
df_list = [df1,df2,df3[['A','B']]]
pd.concat(df_list, ignore_index=True)
Out[35]:
A B
0 1 4
1 2 5
2 3 6
3 8 5
4 9 6
5 10 7
6 1 4
7 2 5
8 3 7
Note that in your original code this is poor practice:
list = ['df1','df2','df3']
This shadows the built in type list
plus even if it was actually a valid var name like df_list
you've created a list of strings and not a list of dfs.
If you want to determine the common columns then you can determine this using the np.intersection
method on the columns:
In [39]:
common_cols = df1.columns.intersection(df2.columns).intersection(df3.columns)
common_cols
Out[39]:
Index(['A', 'B'], dtype='object')