Coerce variables in data frame to appropriate format

暗喜 2021-01-26 00:34

I'm working with a data frame that contains several different data types (numerics, characters, timestamps), but unfortunately all of them are received as characters. Hence I n

2 Answers
  •  执念已碎
    2021-01-26 00:54

    You should check out the dataPreparation package. Its findAndTransformNumerics function does exactly what you want.

    require(dataPreparation)
    data("messy_adult")
    sapply(messy_adult[, .(num1, num2, mail)], class)
       num1        num2        mail 
    "character" "character"    "factor" 
    

    messy_adult is a deliberately messy data set used to illustrate the functions in this package. Here num1 and num2 are stored as strings :/

    messy_adult <- findAndTransformNumerics(messy_adult)
    [1] "findAndTransformNumerics: It took me 0.18s to identify 3 numerics column(s), i will set them as numerics"
    [1] "setColAsNumeric: I will set some columns as numeric"
    [1] "setColAsNumeric: I am doing the columnnum1"
    [1] "setColAsNumeric: 0 NA have been created due to transformation to numeric."
    [1] "setColAsNumeric: I will set some columns as numeric"
    [1] "setColAsNumeric: I am doing the columnnum2"
    [1] "setColAsNumeric: 0 NA have been created due to transformation to numeric."
    [1] "setColAsNumeric: I am doing the columnnum3"
    [1] "setColAsNumeric: 0 NA have been created due to transformation to numeric."
    [1] "findAndTransformNumerics: It took me 0.09s to transform 3 column(s) to a numeric format."
    

    Here we performed the search, and it logged what it found.

    And now:

    sapply(messy_adult[, .(num1, num2, mail)], class)
         num1      num2      mail 
    "numeric" "numeric"  "factor" 
    

    Hope it helps!

    Disclaimer: I'm the author of this package.
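
For comparison, the same idea can be sketched in base R without any extra package. This is only a minimal sketch, not the method the answer above uses: the data frame `df` and its column names (`id`, `price`, `name`, `ts`) are hypothetical, and the timestamp format is an assumption. `type.convert()` guesses logical, integer, or numeric types per column, while timestamps have to be parsed explicitly.

```r
# Hypothetical example data: every column arrives as character.
df <- data.frame(
  id    = c("1", "2", "3"),
  price = c("9.99", "12.50", "7.00"),
  name  = c("a", "b", "c"),
  ts    = c("2021-01-26 00:34:00", "2021-01-26 00:54:00", "2021-01-27 08:00:00"),
  stringsAsFactors = FALSE
)

# Let type.convert() guess a type for each column. as.is = TRUE keeps
# strings that do not look numeric as character instead of factor.
converted <- df
converted[] <- lapply(df, type.convert, as.is = TRUE)

# type.convert() does not detect timestamps, so parse that column explicitly.
# The format string and time zone are assumptions about the incoming data.
converted$ts <- as.POSIXct(converted$ts, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Inspect the resulting column classes.
sapply(converted, function(x) class(x)[1])
```

Unlike the package function, this sketch does not log what it changed or report how many NA values a conversion introduced, so it is worth checking the result by hand on real data.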
