What's the fastest way to merge multiple csv files by column?

长情又很酷 2021-02-09 04:08

I have about 50 CSV files with 60,000 rows in each, and a varying number of columns. I want to merge all the CSV files by column. I've tried doing this in MATLAB by transposing

5 answers
  •  面向向阳花
    2021-02-09 05:05

    import csv
    import itertools
    
    # put files in the order you want concatenated
    csv_names = [...whatever...] 
    
    readers = [csv.reader(open(fn, 'rb')) for fn in csv_names]
    writer = csv.writer(open('result.csv', 'wb'))
    
    for row_chunks in itertools.izip(*readers):
        writer.writerow(list(itertools.chain.from_iterable(row_chunks)))
    

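    This concatenates the files horizontally, row by row. It assumes all files have the same number of rows: izip stops at the shortest input, so extra rows in longer files are silently dropped. Only one row per file is held in memory at a time, so memory overhead is low and the merge is fast.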

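    The code above applies to Python 2. In Python 3, use the built-in zip instead of itertools.izip, and open the files in text mode with newline='' (an argument to open, not to csv.reader or csv.writer):

    readers = [csv.reader(open(fn, 'r', newline='')) for fn in csv_names]
    writer = csv.writer(open('result.csv', 'w', newline=''))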
    
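For reference, the Python 3 variant can be put together as one function. This is a minimal sketch, not the answerer's original code: the function name merge_csv_by_column and its parameters are hypothetical, and contextlib.ExitStack is used here only so that every file is closed when the merge finishes.

```python
import csv
from contextlib import ExitStack


def merge_csv_by_column(csv_names, out_path):
    """Join the rows of each input CSV side by side and write them to out_path.

    Hypothetical helper: csv_names is a list of paths in the order the
    columns should appear in the result.
    """
    with ExitStack() as stack:
        # newline='' is passed to open(), as the csv module expects
        readers = [
            csv.reader(stack.enter_context(open(fn, "r", newline="")))
            for fn in csv_names
        ]
        out_file = stack.enter_context(open(out_path, "w", newline=""))
        writer = csv.writer(out_file)
        # zip pulls one row from every reader per step and stops at the
        # shortest input, so only one row per file is in memory at a time
        for row_chunks in zip(*readers):
            writer.writerow([cell for chunk in row_chunks for cell in chunk])
```

Like the original, this truncates the output to the shortest input. On Python 3.10 or later, zip(*readers, strict=True) raises ValueError when the files differ in length instead of dropping rows.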
