Pandas reading csv files with partial wildcard

天涯浪人 2021-01-05 13:45

I'm trying to write a script that imports a file, then does something with the file and outputs the result into another file.

df = pd.read_csv('somefile2018.

4 Answers
  • 2021-01-05 14:29

    Loop over the files, build a list of DataFrames, then combine them with pd.concat.
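
    A minimal sketch of that approach. The helper name read_matching_csvs is hypothetical, and sorting the paths and raising on an empty match are assumptions made here for predictability, not something the question asks for:

```python
import glob

import pandas as pd


def read_matching_csvs(pattern):
    """Read every CSV matching `pattern` and stack the rows into one DataFrame."""
    # glob gives no ordering guarantee, so sort for a deterministic row order.
    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r}")

    frames = []
    for path in paths:
        # One DataFrame per file; nothing is combined yet.
        frames.append(pd.read_csv(path))

    # Concatenate once at the end and renumber the index from 0.
    return pd.concat(frames, ignore_index=True)
```

    Calling read_matching_csvs('somefile*.csv') would then return a single DataFrame, provided the matched files share the same columns.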

  • 2021-01-05 14:35

    You can get the list of the CSV files in the script and loop over them.

    import os
    from os import listdir
    from os.path import isfile, join

    import pandas as pd

    mypath = os.getcwd()

    # Keep only regular files whose names end in .csv
    csvfiles = [f for f in listdir(mypath) if isfile(join(mypath, f)) and f.endswith('.csv')]

    for f in csvfiles:
        df = pd.read_csv(join(mypath, f))
        # the rest of your script
    
  • 2021-01-05 14:40

    glob returns a list, not a string. The read_csv function takes a string as the input to find the file. Try this:

    from glob import glob

    import pandas as pd

    for f in glob('somefile*.csv'):
        df = pd.read_csv(f)
        ...
        # the rest of your script
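
    If the pattern is expected to match exactly one file, the list that glob returns can be reduced to a single path string before it is handed to read_csv. The sketch below assumes that anything other than one match should be treated as an error; the helper name resolve_single_match is hypothetical:

```python
import glob


def resolve_single_match(pattern):
    """Return the one path matching `pattern`; raise if there are zero or several."""
    matches = glob.glob(pattern)  # always a list, even when only one file matches
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one match for {pattern!r}, found {len(matches)}"
        )
    return matches[0]
```

    The result is a plain string, so it can be passed straight through, for example pd.read_csv(resolve_single_match('somefile*.csv')).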
    
  • 2021-01-05 14:40

    To read all of the files that follow a certain pattern, so long as they share the same schema, use this function:

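    import glob
    import pandas as pd

    def pd_read_pattern(pattern):
        files = glob.glob(pattern)
        if not files:
            # No match: return an empty frame rather than letting concat raise
            return pd.DataFrame()

        # DataFrame.append was removed in pandas 2.0; read each file, then concatenate once
        frames = [pd.read_csv(f) for f in files]
        return pd.concat(frames, ignore_index=True)

    df = pd_read_pattern('somefile*.csv')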
    
    

    The pattern can be given as either an absolute or a relative path.
