Excel file overwritten instead of concat - Python - Pandas

后端未结

关注

 3  1155

I\'m trying to contact all excel files and worksheets in them into one using the below script. It kinda works but then the excel file c.xlsx is overwritten per file, so only

相关标签:

3条回答

不知归路

2021-01-16 05:50

I got it working using the below script which uses @ryguy72's answer but works on all worksheets as well as the header row.

import pandas as pd
import numpy as np
import glob

all_data = pd.DataFrame()
for f in glob.glob("my_path/*.xlsx"):
    df = pd.read_excel(f, sheet_name=None, ignore_index=True)
    cdf = pd.concat(df.values())
    all_data = all_data.append(cdf,ignore_index=True)
print(all_data)
df = pd.DataFrame(all_data)
df.shape
df.to_excel("my_path/final.xlsx", sheet_name='Sheet1')

0 讨论(0)

情歌与酒

2021-01-16 05:56

I just tested the code below. It merges data from all Excel files in a folder into one, single, Excel file.

import pandas as pd
import numpy as np

import glob
glob.glob("C:\\your_path\\*.xlsx")

all_data = pd.DataFrame()
for f in glob.glob("C:\\your_path\\*.xlsx"):
    df = pd.read_excel(f)
    all_data = all_data.append(df,ignore_index=True)
print(all_data)
df = pd.DataFrame(all_data)
df.shape
df.to_excel("C:\\your_path\\final.xlsx", sheet_name='Sheet1')

0 讨论(0)

生来不讨喜

2021-01-16 06:05

Idea is create list of DataFrames in list comprehension, but because working with orderdict is necessary concat in loop and then again concat for one big final DataFrame:

cdf = [pd.read_excel(excel_names, sheet_name=None, ignore_index=True).values() 
       for excel_names in glob.glob('files/*.xlsx')]

df = pd.concat([pd.concat(x) for x in cdf], ignore_index=True)
#print (df)

df.to_excel("c.xlsx", index=False)

0 讨论(0)