Quicker way to read a single column of a CSV file

遇见更好的自我 2021-02-13 17:51

I am trying to read a single column of a CSV file to R as quickly as possible. I am hoping to cut down on standard methods in terms of the time it take

2 Answers
  • 2021-02-13 18:27

    There is a speed comparison of methods for reading large CSV files in this blog. fread (from the data.table package) is the fastest by an order of magnitude.

    As mentioned in the comments above, you can use the select parameter to choose which columns to read, so the following will work:

    library(data.table)
    fread("main.csv", sep = ",", select = c("f1"))
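
    To try the select approach without the original data, here is a minimal sketch. It assumes the data.table package is installed; the temporary file and the column names f2 and f3 are made up for illustration and are not from the question.

    ```r
    library(data.table)

    # Hypothetical sample data written to a temporary file (illustration only).
    tmp <- tempfile(fileext = ".csv")
    writeLines(c("f1,f2,f3", "1,a,10.5", "2,b,20.5", "3,c,30.5"), tmp)

    # Read only the column named f1; the other columns are not loaded.
    dt <- fread(tmp, sep = ",", select = "f1")

    # fread returns a data.table, so pull the column out as a plain vector.
    f1 <- dt[["f1"]]
    print(f1)
    ```

    select also accepts column positions (for example select = 1), which may help when the file has no header row.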

  • 2021-02-13 18:46

    I would suggest

    scan(pipe("cut -f1 -d, Main.csv"))
    

    This differs from the original proposal (read.table(pipe("cut -f1 Main.csv"))) in a couple of different ways:

    • since the file is comma-separated and cut assumes tab separation by default, you need to pass -d, to set the delimiter to a comma
    • scan() is much faster than read.table for simple/unstructured data reads.

    According to the comments by the OP this takes about 4 rather than 40+ seconds.
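
    A small self-contained sketch of the same idea, assuming a Unix-like system where the cut utility is on the PATH. The temporary header-less file is hypothetical and stands in for Main.csv.

    ```r
    # Hypothetical header-less sample file (illustration only).
    tmp <- tempfile(fileext = ".csv")
    writeLines(c("1,a,10.5", "2,b,20.5", "3,c,30.5"), tmp)

    # cut keeps field 1 (-f1) using a comma as the delimiter (-d,);
    # shQuote() protects the temporary path from the shell.
    cmd <- paste("cut -f1 -d,", shQuote(tmp))

    # scan() reads the piped stream as a numeric vector (its default type).
    x <- scan(pipe(cmd), quiet = TRUE)
    print(x)
    ```

    If the real file has a header row, scan() would likely need skip = 1, and a non-numeric column would need something like what = character().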
