Excel file overwritten instead of concat - Python - Pandas

情话喂你 2021-01-16 05:44

I'm trying to concatenate all Excel files, and all the worksheets in them, into one file using the script below. It kind of works, but the Excel file c.xlsx is overwritten for each input file, so only

3 answers
  • 2021-01-16 05:50

    I got it working with the script below. It builds on @ryguy72's answer but handles every worksheet in each file, as well as the header row.

    import glob

    import pandas as pd

    frames = []
    for f in glob.glob("my_path/*.xlsx"):
        # sheet_name=None returns a dict of {sheet name: DataFrame};
        # ignore_index belongs to concat, not to read_excel
        sheets = pd.read_excel(f, sheet_name=None)
        frames.append(pd.concat(sheets.values(), ignore_index=True))
    # DataFrame.append was removed in pandas 2.0, so concatenate once at the end
    all_data = pd.concat(frames, ignore_index=True)
    print(all_data.shape)
    all_data.to_excel("my_path/final.xlsx", sheet_name='Sheet1')
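One thing to watch with the script above: it writes final.xlsx into the same folder that the glob pattern scans, so a second run would read the previous result back in as input. A minimal sketch of one way to avoid that, assuming the output keeps a fixed name; the helper name input_workbooks and its output_name parameter are hypothetical, not part of the original answer:

```python
import glob
import os


def input_workbooks(folder, output_name="final.xlsx"):
    """Return the .xlsx files in `folder`, skipping the output file and Excel lock files."""
    paths = sorted(glob.glob(os.path.join(folder, "*.xlsx")))
    return [
        p for p in paths
        if os.path.basename(p) != output_name          # skip a previous run's result
        and not os.path.basename(p).startswith("~$")   # skip Excel's temporary lock files
    ]
```

The loop could then iterate over input_workbooks("my_path") instead of calling glob.glob directly. Writing the result to a different folder would work just as well.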
    
  • 2021-01-16 05:56

    I just tested the code below. It merges the data from all Excel files in a folder into a single Excel file.

    import glob

    import pandas as pd

    frames = []
    for f in glob.glob("C:\\your_path\\*.xlsx"):
        # With the default sheet_name=0, only the first sheet of each file is read
        frames.append(pd.read_excel(f))
    # DataFrame.append was removed in pandas 2.0, so concatenate once at the end
    all_data = pd.concat(frames, ignore_index=True)
    print(all_data)
    all_data.to_excel("C:\\your_path\\final.xlsx", sheet_name='Sheet1')
    
  • 2021-01-16 06:05

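    The idea is to build a list of DataFrames in a list comprehension. Because sheet_name=None makes read_excel return a dict of DataFrames for each file (an OrderedDict in older pandas versions), you need to concat the sheets of each file first, then concat those results again into one big final DataFrame: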

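    import glob

    import pandas as pd

    # sheet_name=None returns a dict of DataFrames, one per worksheet;
    # read_excel has no ignore_index parameter, so it is passed to concat only
    cdf = [pd.read_excel(excel_names, sheet_name=None).values()
           for excel_names in glob.glob('files/*.xlsx')]

    df = pd.concat([pd.concat(x) for x in cdf], ignore_index=True)
    #print (df)

    # Write once, after everything has been concatenated
    df.to_excel("c.xlsx", index=False)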
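The pattern shared by the answers (accumulate every sheet first, write the output once at the end) can be sketched without touching any files. This is only an illustration under stated assumptions: combine_workbooks, book_a, and book_b are hypothetical names, and the in-memory dicts stand in for what pd.read_excel(path, sheet_name=None) returns for one file.

```python
import pandas as pd


def combine_workbooks(workbooks):
    """Concatenate every sheet of every workbook into one DataFrame.

    `workbooks` is an iterable of {sheet name: DataFrame} dicts, the shape
    that pd.read_excel(path, sheet_name=None) returns for one file.
    """
    frames = []
    for sheets in workbooks:
        frames.extend(sheets.values())   # accumulate; never write inside the loop
    return pd.concat(frames, ignore_index=True)


# In-memory stand-ins for two workbooks, so the sketch runs without any files.
book_a = {"Sheet1": pd.DataFrame({"id": [1, 2]}), "Sheet2": pd.DataFrame({"id": [3]})}
book_b = {"Sheet1": pd.DataFrame({"id": [4]})}

combined = combine_workbooks([book_a, book_b])
print(combined["id"].tolist())
```

With real files, the same function could be fed a generator such as (pd.read_excel(p, sheet_name=None) for p in glob.glob('files/*.xlsx')), followed by a single to_excel call on the result. Calling to_excel inside the loop is what causes the output to hold only the last file's data.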
    