How to add pandas data to an existing csv file?

离开以前 2020-11-22 10:02

I want to know if it is possible to use the pandas to_csv() function to append a DataFrame to an existing CSV file. The csv file has the same structure as the load

6 Answers
  • 2020-11-22 10:10

    You can append to a csv by opening the file in append mode:

    with open('my_csv.csv', 'a') as f:
        df.to_csv(f, header=False)
    

    If this was your csv, foo.csv:

    ,A,B,C
    0,1,2,3
    1,4,5,6
    

    If you read that and then append, for example, df + 6:

    In [1]: df = pd.read_csv('foo.csv', index_col=0)
    
    In [2]: df
    Out[2]:
       A  B  C
    0  1  2  3
    1  4  5  6
    
    In [3]: df + 6
    Out[3]:
        A   B   C
    0   7   8   9
    1  10  11  12
    
    In [4]: with open('foo.csv', 'a') as f:
                 (df + 6).to_csv(f, header=False)
    

    foo.csv becomes:

    ,A,B,C
    0,1,2,3
    1,4,5,6
    0,7,8,9
    1,10,11,12
    
  • with open(filename, 'a') as f:
        df.to_csv(f, header=f.tell()==0)
    
    • Create file unless exists, otherwise append
    • Add header if file is being created, otherwise skip it
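
A minimal sketch of the `f.tell() == 0` check above, wrapped in a hypothetical helper (`append_with_header_once` is not part of pandas) and run against a temporary file. It assumes Python 3, where a file opened in append mode starts positioned at its end, and it adds `index=False` and `newline=''`, which the original snippet does not use:

```python
import os
import tempfile

import pandas as pd


def append_with_header_once(df, filename):
    # newline='' leaves line endings to the CSV writer (avoids blank rows on Windows).
    with open(filename, 'a', newline='') as f:
        # In append mode the position starts at the end of the file,
        # so tell() == 0 only when the file is new or empty.
        df.to_csv(f, header=f.tell() == 0, index=False)


path = os.path.join(tempfile.mkdtemp(), 'log.csv')
append_with_header_once(pd.DataFrame({'A': [1], 'B': [2]}), path)  # creates file, writes header
append_with_header_once(pd.DataFrame({'A': [3], 'B': [4]}), path)  # appends rows only

with open(path) as f:
    lines = f.read().splitlines()
print(lines)
```

The header row appears once, no matter how many times the helper is called on the same path.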
  • 2020-11-22 10:20

    I started with PySpark dataframes and got type conversion errors when converting them to pandas dataframes and appending to a CSV, because of the schema/column types in the PySpark dataframes.

    I solved the problem by casting all columns in each dataframe to string before appending to the CSV:

    with open('testAppend.csv', 'a') as f:
        df2.toPandas().astype(str).to_csv(f, header=False)
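
To illustrate what `astype(str)` does before the write, here is a small pandas-only sketch. It uses no PySpark; the column names and values are made up for illustration, and an in-memory buffer stands in for the CSV file:

```python
import io

import pandas as pd

# Hypothetical mixed-type frame standing in for the result of toPandas().
df = pd.DataFrame({'id': [1, 2], 'score': [0.5, 1.5], 'ok': [True, False]})

as_text = df.astype(str)  # every cell becomes its string representation

buf = io.StringIO()
as_text.to_csv(buf, header=False, index=False)
rows = buf.getvalue().splitlines()
print(rows)
```

The trade-off is that the column types are gone: anything reading the file back has to parse the values again.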
    
  • 2020-11-22 10:20

    A bit late to the party but you can also use a context manager, if you're opening and closing your file multiple times, or logging data, statistics, etc.

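    from contextlib import contextmanager
    import pandas as pd

    @contextmanager
    def open_file(path, mode):
        file_to = open(path, mode, newline='')
        try:
            yield file_to
        finally:
            file_to.close()  # close the file even if the body raises


    ## later
    saved_df = pd.DataFrame(data)  # `data` is whatever you are logging
    # open in append mode and write through the handle the context manager yields
    with open_file('yourcsv.csv', 'a') as outfile:
        saved_df.to_csv(outfile, header=False)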
    
  • 2020-11-22 10:26

    You can specify a Python file mode in the pandas to_csv function. For append it is 'a'.

    In your case:

    df.to_csv('my_csv.csv', mode='a', header=False)
    

    The default mode is 'w'.
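
A short end-to-end sketch of the `mode='a'` approach, using a temporary file and made-up data. It also passes `index=False`, which is an addition to the call above, so the row index is not written as an extra first column:

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), 'my_csv.csv')

first = pd.DataFrame({'A': [1, 4], 'B': [2, 5], 'C': [3, 6]})
first.to_csv(path, index=False)  # default mode='w' creates the file with a header

second = first + 6
second.to_csv(path, mode='a', header=False, index=False)  # append rows only

combined = pd.read_csv(path)
print(combined)
```

Reading the file back gives one four-row frame with a single header.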

  • 2020-11-22 10:27

    A little helper function I use with some header checking safeguards to handle it all:

    import os
    import pandas as pd

    def appendDFToCSV_void(df, csvFilePath, sep=","):
        if not os.path.isfile(csvFilePath):
            # No file yet: create it and write the header.
            df.to_csv(csvFilePath, mode='a', index=False, sep=sep)
            return
        # Read only the first row to get the existing column names.
        csvColumns = pd.read_csv(csvFilePath, nrows=1, sep=sep).columns
        if len(df.columns) != len(csvColumns):
            raise Exception("Columns do not match!! Dataframe has " + str(len(df.columns)) + " columns. CSV file has " + str(len(csvColumns)) + " columns.")
        elif not (df.columns == csvColumns).all():
            raise Exception("Columns and column order of dataframe and csv file do not match!!")
        else:
            df.to_csv(csvFilePath, mode='a', index=False, sep=sep, header=False)
    