What's the fastest way to merge multiple CSV files by column?

长情又很酷 2021-02-09 04:08

I have about 50 CSV files with 60,000 rows in each, and a varying number of columns. I want to merge all the CSV files by column. I've tried doing this in MATLAB by transposing

5 answers
  •  说谎 (OP)
    2021-02-09 04:57

    Use Go: https://github.com/chrislusf/gleam

    Assume file "a.csv" has fields "a1, a2, a3, a4, a5".

    And assume file "b.csv" has fields "b1, b2, b3".

    We want to join the rows where a1 = b2, and the output format should be "a1, a4, b3".

    package main
    
    import (
        "os"
    
        "github.com/chrislusf/gleam"
        "github.com/chrislusf/gleam/source/csv"
    )
    
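    func main() {
        // Build a gleam flow and read each file, keeping only the needed columns.
        // Select takes 1-based column positions; the first selected column is the join key.
        f := gleam.New()
        a := f.Input(csv.New("a.csv")).Select(1, 4) // a1 (key), a4
        b := f.Input(csv.New("b.csv")).Select(2, 3) // b2 (key), b3

        // Join on the key column and print each joined row as "a1,a4,b3".
        a.Join(b).Fprintf(os.Stdout, "%s,%s,%s\n").Run()
    }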
    
