How to convert factor format to numeric format in R without changing the values? [duplicate]

Submitted by 混江龙づ霸主 on 2019-12-06 04:53:14

Question


Below is the data frame df1, in which I want to convert column "V2" from factor to numeric without changing the current values (0; 0; 8,5; 3).

df1=

             V1  V2 V3       X2 X3
4470 2010-03-28   0  A 21.53675  0
4471 2010-03-29   0  A 19.21611  0
4472 2010-03-30 8,5  A 21.54541  0
4473 2010-03-31   3  A       NA NA

Since column "V2" is in factor format I first convert it to character format: df1[,2]=as.character(df1[,2])

Then I try to convert "V2" to numeric format:

df1[,2]=as.numeric(df1[,2])

Leading to this R message:

Warning message: NAs introduced by coercion

And to the data frame below, where df1[3,2] has changed to NA instead of remaining "8,5":

             V1 V2 V3       X2 X3
4470 2010-03-28  0  A 21.53675  0
4471 2010-03-29  0  A 19.21611  0
4472 2010-03-30 NA  A 21.54541  0
4473 2010-03-31  3  A       NA NA 

It might have to do with the fact that 8,5 is not a whole number. Still I do not know how to solve this problem. Help would be much appreciated!


Answer 1:


Replace the commas with dots, which are the decimal separator in R. Otherwise R cannot parse the string "8,5" as a number and coerces the value to NA.

Then, to extract values:

as.numeric(levels(df1[,2])[df1[,2]])

(thanks @SimonO101 for the correction)
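The two steps above (replace the comma, then map the levels back onto the rows) can be combined as in the sketch below. The data frame here is a hypothetical reconstruction of only the V1 and V2 columns from the question, and the helper name lv is introduced purely for illustration.

```r
# Hypothetical reconstruction of part of the question's data frame
df1 <- data.frame(
  V1 = as.Date(c("2010-03-28", "2010-03-29", "2010-03-30", "2010-03-31")),
  V2 = factor(c("0", "0", "8,5", "3"))
)

# Replace the decimal comma in the level labels only (one entry per level)
lv <- sub(",", ".", levels(df1$V2), fixed = TRUE)

# Convert the labels to numeric, then index by the factor so that
# each row receives the numeric value of its own level
df1$V2 <- as.numeric(lv)[df1$V2]

str(df1$V2)
```

Working on the levels rather than on every element means the string replacement and the numeric conversion run once per distinct value, which is the same idea as the as.numeric(levels(f))[f] idiom.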




Answer 2:


Try this to replace the comma in your data:

fac<- c( "0" , "0" , "1,5" , "0" , "0" , "8" )
#[1] "0"   "0"   "1,5" "0"   "0"   "8" 
fac <- as.numeric( sub(",", ".", fac) )
#[1] 0.0 0.0 1.5 0.0 0.0 8.0

More generally, to convert a factor to its underlying values rather than to its internal integer codes:

fac <- as.factor( fac )
as.numeric(fac)
#[1] 1 1 2 1 1 3
as.numeric(as.character(fac))
#[1] 0.0 0.0 1.5 0.0 0.0 8.0

However, this is the canonical way of transforming a factor back to its original values:

as.numeric(levels(fac))[fac]

From the help page ?as.factor

In particular, as.numeric applied to a factor is meaningless, and may happen by implicit coercion. To transform a factor f to approximately its original numeric values, as.numeric(levels(f))[f] is recommended and slightly more efficient than as.numeric(as.character(f)).




Answer 3:


Add the following line of code after you have converted the column to character:

df1[3,2] <- 8.5

You should then be able to convert the characters to numerics. Because R's decimal separator is . and not , the value is replaced by NA without that step.
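Since the root cause is the decimal separator, another option is to declare it when the data is read, so that no manual fix per cell is needed. The sketch below assumes the data originates from a whitespace-separated text file, which the question does not state; the inline text txt and the name dat are hypothetical stand-ins for that file and its contents.

```r
# Hypothetical input mimicking the first three columns of the question's data
txt <- "V1 V2 V3
2010-03-28 0 A
2010-03-29 0 A
2010-03-30 8,5 A
2010-03-31 3 A"

# dec = "," tells read.table to treat the comma as the decimal separator,
# so V2 is parsed as numeric directly instead of as text
dat <- read.table(text = txt, header = TRUE, dec = ",")

str(dat$V2)
```

With a real file, the text argument would be replaced by the file path.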



Source: https://stackoverflow.com/questions/16335516/how-to-convert-factor-format-to-numeric-format-in-r-without-changing-the-values
