I have several csv files in several zip files in on folder, so for example:
Use zip.namelist()
to get list of files inside the zip
Ex:
import glob
import zipfile
import pandas as pd
for zip_file in glob.glob("C/folder/*.zip"):
zf = zipfile.ZipFile(zip_file)
dfs = [pd.read_csv(zf.open(f), header=None, sep=";") for f in zf.namelist()]
df = pd.concat(dfs,ignore_index=True)
print(df)
I would try to tackle it in two passes. First pass, extract the contents of the zipfile onto the filesystem. Second Pass, read all those extracted CSVs using the method you already have above:
import glob
import pandas as pd
import zipfile
def extract_files(file_path):
archive = zipfile.ZipFile(file_path, 'r')
unzipped_path = archive.extractall()
return unzipped_path
zipped_files = glob.glob("C/folder/*.zip")]
file_paths = [extract_files(zf) for zf in zipped_files]
dfs = [pd.read_csv(f, header=None, sep=";") for f in file_paths]
df = pd.concat(dfs,ignore_index=True)