Converting from a character to a numeric data frame

-上瘾入骨i 2021-01-04 18:16

I have a character data frame in R which has NaNs in it. I need to remove any row with a NaN and then convert it to a numeric data frame.

I

2 answers
  • 2021-01-04 18:54

    As @thijs van den bergh points you to,

    dat <- data.frame(x=c("NaN","2"),y=c("NaN","3"),stringsAsFactors=FALSE)
    
    dat <- as.data.frame(sapply(dat, as.numeric)) #<- sapply is here
    
    dat[complete.cases(dat), ]
    #  x y
    #2 2 3
    

    Is one way to do this.

    Your error comes from calling as.numeric on the data.frame as a whole. The sapply call I show instead converts each column vector to numeric, one at a time.
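
    As a variant sketch, assuming every column holds number-like strings: assigning into dat[] with lapply keeps the result a data.frame even when there is only one row, a case where sapply would simplify to a plain vector instead of a matrix. The name clean is only illustrative.

    ```r
    # Example data from above: two character columns containing "NaN".
    dat <- data.frame(x = c("NaN", "2"), y = c("NaN", "3"), stringsAsFactors = FALSE)

    # Assigning into dat[] keeps the data.frame structure while replacing
    # every column with its numeric version.
    dat[] <- lapply(dat, as.numeric)

    # complete.cases() is FALSE for any row that contains NA or NaN,
    # so this keeps only the fully populated rows.
    clean <- dat[complete.cases(dat), , drop = FALSE]

    str(clean)
    ```

    The drop = FALSE argument is a precaution so the result stays a data.frame even if only one column is present.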

  • Note that data.frames are not numeric or character, but rather are a list which can be all numeric columns, all character columns, or a mix of these or other types (e.g.: Date/logical).

    dat <- data.frame(x=c("NaN","2"),y=c("NaN","3"),stringsAsFactors=FALSE)
    is.list(dat)
    # [1] TRUE
    

    The example data just has two character columns:

    > str(dat)
    'data.frame':   2 obs. of  2 variables:
     $ x: chr  "NaN" "2"
     $ y: chr  "NaN" "3"
    

    ...which you could add a numeric column to like so:

    > dat$num.example <- c(6.2,3.8)
    > dat
        x   y num.example
    1 NaN NaN         6.2
    2   2   3         3.8
    > str(dat)
    'data.frame':   2 obs. of  3 variables:
     $ x          : chr  "NaN" "2"
     $ y          : chr  "NaN" "3"
     $ num.example: num  6.2 3.8
    

    So when you call as.numeric on the whole data.frame, R fails because it cannot coerce a list, which may hold columns of several types, into a single numeric vector. user1317221_G's answer uses sapply (see ?sapply), which applies a function to each element of an object. You could alternatively use lapply (see ?lapply), a very similar function that always returns a list. For more on the *apply functions, see "R Grouping functions: sapply vs. lapply vs. apply. vs. tapply vs. by vs. aggregate".

    That is, in this case you can apply the as.numeric function to each column of your data.frame, like so:

    data.frame(lapply(dat,as.numeric))
    

    The lapply call is wrapped in a data.frame to make sure the output is a data.frame and not a list. That is, running:

    lapply(dat,as.numeric)
    

    will give you:

    > lapply(dat,as.numeric)
    $x
    [1] NaN   2
    
    $y
    [1] NaN   3
    
    $num.example
    [1] 6.2 3.8
    

    While:

    data.frame(lapply(dat,as.numeric))
    

    will give you:

    >  data.frame(lapply(dat,as.numeric))
        x   y num.example
    1 NaN NaN         6.2
    2   2   3         3.8
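
    To make the difference between the three calls concrete, here is a small sketch on the original two-column example. The variable names whole, m and l are only illustrative, and the error is captured with tryCatch so the script keeps running.

    ```r
    dat <- data.frame(x = c("NaN", "2"), y = c("NaN", "3"), stringsAsFactors = FALSE)

    # Calling as.numeric() on the whole data.frame fails; capture the
    # error message as a string instead of stopping.
    whole <- tryCatch(as.numeric(dat), error = function(e) conditionMessage(e))
    print(whole)

    # sapply() simplifies the per-column results to a numeric matrix ...
    m <- sapply(dat, as.numeric)
    print(is.matrix(m))

    # ... while lapply() returns a plain list with one element per column,
    # which is why it gets wrapped in data.frame().
    l <- lapply(dat, as.numeric)
    print(is.list(l) && !is.data.frame(l))
    ```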
    