I would like to read several csv files from a directory into pandas and concatenate them into one big DataFrame. I have not been able to figure it out though. Here is what I
Alternative using the pathlib
library (often preferred over os.path
).
This method avoids iterative use of pandas concat()
/apped()
.
From the pandas documentation:
It is worth noting that concat() (and therefore append()) makes a full copy of the data, and that constantly reusing this function can create a significant performance hit. If you need to use the operation over several datasets, use a list comprehension.
import pandas as pd
from pathlib import Path
dir = Path("../relevant_directory")
df = (pd.read_csv(f) for f in dir.glob("*.csv"))
df = pd.concat(df)