Most elegant way to load csv with point as thousands separator in R

Posted by 偶尔善良 on 2019-11-30 19:05:54
cryo111

Adapted from this post: Specify custom Date format for colClasses argument in read.table/read.csv

#some sample data
write.csv(data.frame(a=c("1.234,56","1.234,56"),
                     b=c("1.234,56","1.234,56")),
          "test.csv",row.names=FALSE,quote=TRUE)

#define your own numeric class
setClass('myNum')
#define conversion
#strip the "." thousands separators, then turn the decimal "," into "."
setAs("character","myNum", function(from) as.numeric(gsub(",",".",gsub(".","",from,fixed=TRUE),fixed=TRUE)))

#read data with custom colClasses
read_data=read.csv("test.csv",stringsAsFactors=FALSE,colClasses=c("myNum","myNum"))
#check that the column really is numeric
read_data[1,1]*2

#[1] 2469.12

Rather than trying to fix everything at load time, I would read the data into R as strings and then convert them to numeric.

After loading, the column holds strings like "4.123,98".

Then do something like:

 number.string <- gsub("\\.", "", number.string)
 number.string <- gsub(",", ".", number.string, fixed = TRUE)
 number <- as.numeric(number.string)
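
The same idea can be applied to a whole file at once. The following is only a minimal sketch under some assumptions: the helper name parse_european_number and the sample values are made up for illustration, and every column of the file is assumed to hold numbers in the "1.234,56" format.

```r
# Hypothetical helper (not from the original answers):
# converts strings like "1.234,56" to the numeric value 1234.56
parse_european_number <- function(x) {
  x <- gsub(".", "", x, fixed = TRUE)  # drop "." thousands separators
  x <- sub(",", ".", x, fixed = TRUE)  # decimal comma -> decimal point
  as.numeric(x)
}

# Illustrative sample file, written to a temporary location
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(a = c("1.234,56", "4.123,98"),
                     b = c("7,5", "1.000.000,25")),
          csv_path, row.names = FALSE, quote = TRUE)

# Read every column as character, then convert column by column
raw <- read.csv(csv_path, colClasses = "character")
converted <- as.data.frame(lapply(raw, parse_european_number))
str(converted)
```

If only some columns use this format, the helper could instead be applied to just those columns, for example raw$a <- parse_european_number(raw$a).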