I have multiple CSV files that I would like to combine into one DataFrame.
They are all in this general format, with two index columns:
A simple way:
Create a list with the names of the CSV files:
from os import listdir
files = listdir()  # names of everything in the current working directory
csvs = list()
for file in files:
    if file.endswith(".csv"):
        csvs.append(file)
Concatenate the CSVs:
import pandas as pd
data = pd.DataFrame()
for i in csvs:
    table = pd.read_csv(i)
    # Append the rows of this file below the rows collected so far
    data = pd.concat([data, table])
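A variant of the loop above, as a rough sketch: collect the frames in a list and call pd.concat once, which avoids copying the growing DataFrame on every iteration. The helper name combine_csvs, the file names, and the id/value columns are made up for illustration and are not from the question.

```python
from pathlib import Path
import tempfile

import pandas as pd


def combine_csvs(folder):
    """Read every .csv file in `folder` and stack the rows into one DataFrame."""
    # Sorted so the row order does not depend on directory listing order.
    paths = sorted(Path(folder).glob("*.csv"))
    frames = [pd.read_csv(p) for p in paths]
    # A single concat at the end; ignore_index renumbers the rows 0..n-1.
    return pd.concat(frames, ignore_index=True)


# Tiny demonstration with two hypothetical files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.csv").write_text("id,value\n1,10\n2,20\n")
    Path(tmp, "b.csv").write_text("id,value\n3,30\n")
    combined = combine_csvs(tmp)

print(combined)
```

Note that pd.concat raises a ValueError when the list is empty, so a folder without any CSV files would need its own check.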
I think you need concat instead of merge:
df = pd.concat([pd.read_csv(f, index_col=[0,1]) for f in files])
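To see what that one-liner does with two index columns, here is a small self-contained sketch. The CSV contents and the column names date, id and value are assumptions, since the actual file layout is not shown; io.StringIO stands in for real files.

```python
import io

import pandas as pd

# Two hypothetical files sharing the same layout: two index columns, one data column.
csv_a = "date,id,value\n2017-12-05,1,10\n2017-12-05,2,20\n"
csv_b = "date,id,value\n2017-12-06,1,30\n"

# index_col=[0, 1] turns the first two columns into a two-level MultiIndex.
frames = [pd.read_csv(io.StringIO(text), index_col=[0, 1]) for text in (csv_a, csv_b)]

# concat stacks the rows and keeps both index levels.
stacked = pd.concat(frames)
print(stacked)
```

If the files hold the same rows but different columns, pd.concat(frames, axis=1) would align them side by side on the index instead.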
You can try the following. I made some changes to the DataFrame combining logic:
import os
import glob
from functools import reduce  # reduce is not a builtin in Python 3
import pandas as pd
# Folder which contains all the csv files
files = glob.glob(r'2017-12-05\Aggregated\*.csv')
# Outer-merge the frames pairwise on the 'id' key
df = reduce(lambda df1, df2: pd.merge(df1, df2, on='id', how='outer'), [pd.read_csv(f, index_col=[0, 1]) for f in files])
df.to_csv(r'\merged.csv')
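For comparison, a minimal sketch of the reduce-based outer merge on made-up data, joining on the two index levels rather than on an 'id' column. This is only one possible reading of the layout: it assumes each file carries different data columns for an overlapping set of rows, and the names date, id, price and volume are hypothetical.

```python
from functools import reduce
import io

import pandas as pd

# Two hypothetical files: same two index columns, different data columns.
csv_a = "date,id,price\n2017-12-05,1,10\n2017-12-05,2,20\n"
csv_b = "date,id,volume\n2017-12-05,1,100\n2017-12-05,3,300\n"

frames = [pd.read_csv(io.StringIO(text), index_col=[0, 1]) for text in (csv_a, csv_b)]

# Fold the list pairwise; how="outer" keeps rows that appear in only one file
# and fills the columns they are missing with NaN.
merged = reduce(
    lambda left, right: pd.merge(
        left, right, left_index=True, right_index=True, how="outer"
    ),
    frames,
)
print(merged)
```

Unlike concat along the rows, this puts the columns side by side and keeps one row per index pair, so it suits files that describe the same rows rather than files that each add new rows.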