Most elegant way to load csv with point as thousands separator in R

-上瘾入骨i 2021-01-02 08:16

NB: To the best of my knowledge this question is not a duplicate! All the questions/answers I found are either how to eliminate points from data that are alr

2 answers
  • 2021-01-02 09:03

    Adapted from this post: Specify custom Date format for colClasses argument in read.table/read.csv

    #some sample data
    write.csv(data.frame(a=c("1.234,56","1.234,56"),
                         b=c("1.234,56","1.234,56")),
              "test.csv",row.names=FALSE,quote=TRUE)
    
    #define your own (virtual) class to use as a colClasses label
    setClass("myNum")
    #define the conversion: drop the thousands points, then turn the decimal comma into a point
    setAs("character", "myNum",
          function(from) as.numeric(gsub(",", ".", gsub(".", "", from, fixed = TRUE), fixed = TRUE)))
    
    #read data with custom colClasses
    read_data <- read.csv("test.csv", stringsAsFactors = FALSE, colClasses = c("myNum", "myNum"))
    #check that the column really is numeric
    read_data[1,1]*2
    
    #[1] 2469.12
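
    When only some columns hold numbers in this format, the same conversion can be tied to individual columns by naming the entries of colClasses. The following is a minimal sketch under assumptions: the column names id and amount and the sample values are made up for illustration, and the file is written to a temporary path.

    ```r
    library(methods)  # provides setClass/setAs; attached by default in interactive sessions

    # Hypothetical sample file: one text column, one column with "." as thousands
    # separator and "," as decimal mark
    path <- tempfile(fileext = ".csv")
    write.csv(data.frame(id = c("x", "y"),
                         amount = c("1.234,56", "7.890,12")),
              path, row.names = FALSE, quote = TRUE)

    # Same class and conversion as in the answer above
    setClass("myNum")
    setAs("character", "myNum",
          function(from) as.numeric(gsub(",", ".", gsub(".", "", from, fixed = TRUE), fixed = TRUE)))

    # Named colClasses: only "amount" goes through the custom conversion
    mixed <- read.csv(path, stringsAsFactors = FALSE,
                      colClasses = c(id = "character", amount = "myNum"))
    sapply(mixed, class)
    ```

    Columns left out of a named colClasses vector are type-guessed by read.csv as usual.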
    
  • 2021-01-02 09:13

    Rather than try to fix it all at loading time, I would load the data into R as a string, then process it to numeric.

    So after loading, it's a column of strings like "4.123,98"

    Then do something like:

     # remove the thousands points, then turn the decimal comma into a point
     number.string <- gsub(".", "", number.string, fixed = TRUE)
     number.string <- gsub(",", ".", number.string, fixed = TRUE)
     number <- as.numeric(number.string)
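
    Applied to a whole file, the load-as-string-then-convert approach might look like the sketch below. The helper name parse_european, the column names and the sample values are assumptions for illustration only; the sketch also assumes every column in the file holds numbers in this format.

    ```r
    # Hypothetical helper: convert strings such as "4.123,98" to numeric
    parse_european <- function(x) {
      x <- gsub(".", "", x, fixed = TRUE)   # drop thousands separators
      x <- gsub(",", ".", x, fixed = TRUE)  # decimal comma -> decimal point
      as.numeric(x)
    }

    # Made-up sample data written to a temporary file
    path <- tempfile(fileext = ".csv")
    write.csv(data.frame(a = c("4.123,98", "1.234,56"),
                         b = c("10,5", "2.000,00")),
              path, row.names = FALSE, quote = TRUE)

    # Load every column as a string first ...
    raw <- read.csv(path, colClasses = "character")

    # ... then convert column by column, keeping the data frame structure
    converted <- raw
    converted[] <- lapply(raw, parse_european)
    str(converted)
    ```

    If the file also contains genuine text columns, the lapply call would need to be restricted to the numeric ones, for example converted[cols] <- lapply(raw[cols], parse_european) with cols chosen by hand.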
    