Splitting a CSV file by columns

小鲜肉 2020-12-20 05:34

I have a really huge CSV file. It has about 1700 columns and 40000 rows, like below:

x,y,z,x1,x2,x3,x4,x5,x6,x7,x8,x9,...(about 1700 more)...,x1700
0,0,0,         


        
4 Answers
  • 2020-12-20 05:49

    Use a small Python script like:

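    fin = 'file_in.csv'
    fout1 = 'file_out1.csv'
    fout1_fd = open(fout1, 'w')
    ...
    
    # Stream the input line by line instead of loading the whole file,
    # which also avoids writing an extra empty row at the end.
    with open(fin) as fin_fd:
        for l in fin_fd:
            l_arr = l.rstrip('\n').split(',')
            # The first three columns go to the first output file.
            fout1_fd.write(','.join(l_arr[0:3]))
            fout1_fd.write('\n')
            ...
    
    ...
    fout1_fd.close()
    ...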
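
Splitting each line on commas only works when no field contains a quoted comma. As a minimal sketch of the difference, assuming a made-up sample with one quoted field (the helper name `first_columns` and the sample data are illustrative, not from the question), the standard `csv` module can do the parsing instead:

```python
import csv
import io


def first_columns(text, n):
    """Return the first n fields of every CSV row in text, parsed by the csv module."""
    reader = csv.reader(io.StringIO(text, newline=''))
    return [row[:n] for row in reader]


# Hypothetical sample: the first data field contains a comma inside quotes.
sample = 'x,y,z,x1\n"1,5",2,3,4\n'

# Naive splitting cuts the quoted field in two.
naive = [line.split(',')[:3] for line in sample.splitlines()]

# The csv module keeps the quoted field together.
parsed = first_columns(sample, 3)
```

For purely numeric data like the sample in the question, both approaches should give the same rows.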
    
  • 2020-12-20 05:54

    A one-line-per-file solution for your example data and desired output:

    cut -d, -f -3 huge.csv > file1.csv
    cut -d, -f 4-1004 huge.csv > file2.csv
    cut -d, -f 1005- huge.csv > file3.csv
    

    The cut program is available on most POSIX platforms and is part of GNU Core Utilities. There is also a Windows version.
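
When porting `cut` field ranges to another language, note that `cut` counts fields from 1 and includes both ends of a range, while Python slices count from 0 and exclude the end. A small sketch of that conversion follows; the helper name `cut_fields_to_slice` is hypothetical, and it handles only a single range or field number, not comma-separated lists.

```python
def cut_fields_to_slice(spec):
    """Convert a cut-style field spec such as '-3', '4-1004', '1005-' or '7' to a slice."""
    start, sep, end = spec.partition('-')
    if not sep:
        # A single field number such as '7'.
        index = int(spec) - 1
        return slice(index, index + 1)
    # cut is 1-based and inclusive; slices are 0-based and exclude the end,
    # so only the start shifts by one.
    return slice(int(start) - 1 if start else None, int(end) if end else None)


# Header row shaped like the one in the question: x, y, z, x1 ... x1700.
row = ['x', 'y', 'z'] + ['x%d' % i for i in range(1, 1701)]

first = row[cut_fields_to_slice('-3')]
middle = row[cut_fields_to_slice('4-1004')]
last = row[cut_fields_to_slice('1005-')]
```

With the header from the question, `4-1004` corresponds to `slice(3, 1004)`, which covers `x1` through `x1001`, and `1005-` starts at `x1002`.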

    Update: a Python version, since the OP asked for a program in an acceptable language:

    # Python 3
    import csv
    import fileinput
    
    output_specifications = (  # csv file name, column slice
        ('file1.csv', slice(3)),
        ('file2.csv', slice(3, 1003)),
        ('file3.csv', slice(1003, 1703)),
    )
    output_row_writers = [
        (
            # In Python 3 the csv module needs text mode with newline='';
            # binary mode ('wb') raises a TypeError on the first write.
            csv.writer(open(file_name, 'w', newline=''), quoting=csv.QUOTE_MINIMAL).writerow,
            selector,
        ) for file_name, selector in output_specifications
    ]
    
    reader = csv.reader(fileinput.input())
    for row in reader:
        for row_writer, selector in output_row_writers:
            row_writer(row[selector])
    

    This works with the sample data given and can be called with the input.csv as an argument or by piping from stdin.
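
The same idea can be wrapped in a function that also closes every file when it finishes. This is only a sketch under assumptions: the function name `split_columns` and its `(file name, slice)` argument format are illustrative, and it reads from a named file rather than stdin.

```python
import csv
from contextlib import ExitStack


def split_columns(in_path, outputs):
    """Copy the columns picked by each slice into the matching output file.

    outputs is a list of (file name, slice) pairs.
    """
    with ExitStack() as stack:
        # ExitStack closes the input and all outputs, even if a write fails.
        source = stack.enter_context(open(in_path, newline=''))
        writers = [
            (csv.writer(stack.enter_context(open(name, 'w', newline=''))), selector)
            for name, selector in outputs
        ]
        # Rows are streamed one at a time, so the whole file is never held in memory.
        for row in csv.reader(source):
            for writer, selector in writers:
                writer.writerow(row[selector])
```

A call such as `split_columns('huge.csv', [('file1.csv', slice(3)), ('file2.csv', slice(3, 1003)), ('file3.csv', slice(1003, None))])` would mirror the specification above.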

  • 2020-12-20 06:03

    You can open the file in Microsoft Excel, delete the extra columns, and save the result as CSV for file #1. Repeat the same procedure for the other two files.

  • 2020-12-20 06:03

    I usually use OpenOffice (or Microsoft Excel, if you are on Windows) for this: open the file, remove the columns you don't need, and save it, without writing any program. The following two links show how to do that.

    https://superuser.com/questions/407082/easiest-way-to-open-csv-with-commas-in-excel

    http://office.microsoft.com/en-us/excel-help/import-or-export-text-txt-or-csv-files-HP010099725.aspx
