Convert numeric values into binary (0/1)

余生分开走 2021-02-06 08:15

I have a data frame with counts of different kinds of fruit for different people, like the one below:

    apple  banana  orange
Tim     3       0       2
Tom     0                


        
5 answers
  • 2021-02-06 08:52

    Just use a comparison:

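    d = t(matrix(c(3,0,2,0,1,1,1,2,2), 3))  # 3 x 3 example counts, one row per person
    d > 0                                   # logical matrix: TRUE where the count is positive
    matrix(as.numeric(d > 0), nrow(d))      # 0/1 matrix; as.numeric() drops dims, so rebuild with nrow(d)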
    
  • 2021-02-06 08:57

    Here's your data.frame:

    x <- structure(list(apple = c(3L, 0L, 1L), banana = 0:2, orange = c(2L, 
    1L, 2L)), .Names = c("apple", "banana", "orange"), class = "data.frame", row.names = c("Tim", 
    "Tom", "Bob"))
    

    And your matrix:

    as.matrix((x > 0) + 0)
        apple banana orange
    Tim     1      0      1
    Tom     0      1      1
    Bob     1      1      1
    

    Update

    I had no idea that a quick pre-bedtime posting would generate any discussion, but the discussions themselves are quite interesting, so I wanted to summarize here:

    My instinct was to rely on the fact that, underneath, TRUE and FALSE in R are the numbers 1 and 0. If you check for equivalence (in a not-so-good way) with 1 == TRUE or 0 == FALSE, you'll get TRUE. My shortcut (which turns out to take more time than the correct, or at least more conceptually correct, way) was to just add 0 to my TRUEs and FALSEs, since I knew that R would coerce the logical vectors to numeric.

    The correct, or at least, more appropriate way, would be to convert the output using as.numeric (I think that's what @JoshO'Brien intended to write). BUT.... unfortunately, that removes the dimensional attributes of the input, so you need to re-convert the resulting vector to a matrix, which, as it turns out, is still faster than adding 0 as I did in my answer.
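
    The loss of dimensions described above can be seen in a small sketch. It reuses the example data from this answer; the variable names (`logical_m`, `flat`, `rebuilt`, `kept`) are illustrative only, and the `storage.mode<-` variant is an alternative that was not part of the original discussion or benchmark:

    ```r
    x <- data.frame(apple = c(3L, 0L, 1L), banana = 0:2, orange = c(2L, 1L, 2L),
                    row.names = c("Tim", "Tom", "Bob"))

    logical_m <- as.matrix(x > 0)   # logical matrix with row and column names
    flat <- as.numeric(logical_m)   # plain numeric vector: dim and dimnames are dropped
    is.null(dim(flat))              # TRUE

    # Option 1: rebuild the matrix by hand, as in the X3 benchmark case
    rebuilt <- matrix(flat, nrow = nrow(x), dimnames = dimnames(logical_m))

    # Option 2 (assumption, not from the thread): convert the storage mode,
    # which keeps dim and dimnames in place
    kept <- logical_m
    storage.mode(kept) <- "double"
    ```

    Both `rebuilt` and `kept` should hold the same 0/1 values with the original row and column names; which one is faster on large inputs was not measured here.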

    Having read the comments and criticisms, I thought I would add one more option---using apply to loop through the columns and use the as.numeric approach. That is slower than manually re-creating the matrix, but slightly faster than adding 0 to the logical comparison.

    x <- data.frame(replicate(1e4,sample(0:1e3)))
    library(rbenchmark)
    benchmark(X1 = {
                x1 <- as.matrix((x > 0) + 0)
              },
              X2 = {
                x2 <- apply(x, 2, function(y) as.numeric(y > 0))
              },
              X3 = {
                x3 <- as.numeric(as.matrix(x) > 0)
                x3 <- matrix(x3, nrow = 1001)
              },
              X4 = {
                x4 <- ifelse(x > 0, 1, 0)
              },
              columns = c("test", "replications", "elapsed", 
                          "relative", "user.self"))
    #   test replications elapsed relative user.self
    # 1   X1          100 116.618    1.985   110.711
    # 2   X2          100 105.026    1.788    94.070
    # 3   X3          100  58.750    1.000    46.007
    # 4   X4          100 382.410    6.509   311.567
    
    all.equal(x1, x2, check.attributes=FALSE)
    # [1] TRUE
    all.equal(x1, x3, check.attributes=FALSE)
    # [1] TRUE
    all.equal(x1, x4, check.attributes=FALSE)
    # [1] TRUE
    

    Thanks for the discussion y'all!

  • 2021-02-06 09:10
    > pippo
      person apple banana orange
    1    Tim     1      0      2
    2    Tom     0      1      1
    3    Bob     1      2      2
    > cols <- c("apple", "banana", "orange")
    > pippo[cols] <- lapply(pippo[cols], function(x) as.numeric(x >= 1))
    
  • 2021-02-06 09:13

    You can use ifelse. It works on both matrices and data frames; however, the result will be a matrix.

    > df <- cbind(apple = c(3, 0, 1), banana = c(0, 1, 2), orange = c(2, 1, 2))
    > df
         apple banana orange
    [1,]     3      0      2
    [2,]     0      1      1
    [3,]     1      2      2
    
    > ifelse(df>0, 1, 0)
         apple banana orange
    [1,]     1      0      1
    [2,]     0      1      1
    [3,]     1      1      1
    
  • 2021-02-06 09:15

    I usually use this approach:

    df[df > 0] = 1
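
    As a minimal sketch of how this one-liner behaves, assuming the counts are non-negative with no missing values: the assignment overwrites the data in place, so the example below works on a copy (the name `binary` is hypothetical) to keep the original counts.

    ```r
    df <- data.frame(apple = c(3, 0, 1), banana = c(0, 1, 2), orange = c(2, 1, 2),
                     row.names = c("Tim", "Tom", "Bob"))

    binary <- df              # copy, so the original counts survive
    binary[binary > 0] <- 1   # every positive count becomes 1; zeros are untouched

    class(binary)             # still a data.frame, unlike the ifelse() result
    ```

    Unlike the comparison-based answers, this keeps the data frame class and row names, which may or may not be what you want if a matrix is required downstream.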
    