How to use cast or another function to create a binary table in R

予麋鹿 2020-12-02 01:06

I am trying to create a list of factors that have a binary response and have been using cast.

DF2 <- cast(data.frame(DM), id ~ region)
names(DF2)[-1] <         


        
4 Answers
  • 2020-12-02 01:27

    No specialized functions are needed:

    x <- data.frame(id=1:4, region=factor(c(3,2,1,2)))
    x
       id region
    1  1      3
    2  2      2
    3  3      1
    4  4      2
    
    x.bin <- data.frame(x$id, sapply(levels(x$region), `==`, x$region))
    names(x.bin) <- c("id", paste("region", levels(x$region),sep=''))
    x.bin
    
      id region1 region2 region3
    1  1   FALSE   FALSE    TRUE
    2  2   FALSE    TRUE   FALSE
    3  3    TRUE   FALSE   FALSE
    4  4   FALSE    TRUE   FALSE
    

    Or for integer results:

    x.bin2 <- data.frame(x$id,  
        apply(sapply(levels(x$region), `==`, x$region),2,as.integer)
    ) 
    names(x.bin2) <- c("id", paste("region", levels(x$region),sep=''))
    x.bin2
    
    
      id region1 region2 region3
    1  1       0       0       1
    2  2       0       1       0
    3  3       1       0       0
    4  4       0       1       0
    
  • 2020-12-02 01:30

    Original data:

    x <- data.frame(id=c(1,1,2,3,3), region=factor(c(2,3,2,1,1)))
    
    > x
      id region
    1  1      2
    2  1      3
    3  2      2
    4  3      1
    5  3      1
    

    Group up the data:

    aggregate(model.matrix(~ region - 1, data=x), x["id"], max)
    

    Result:

      id region1 region2 region3
    1  1       0       1       1
    2  2       0       1       0
    3  3       1       0       0
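
    To see why this works, it can help to look at the intermediate step. The sketch below reuses the example data from this answer and prints the indicator matrix that model.matrix builds (one row per input row) before aggregate collapses it to one row per id. The names ind and res are introduced here for illustration only.

    ```r
    x <- data.frame(id = c(1, 1, 2, 3, 3), region = factor(c(2, 3, 2, 1, 1)))

    # One 0/1 indicator column per factor level; "- 1" drops the intercept column
    ind <- model.matrix(~ region - 1, data = x)
    ind

    # Taking the max within each id gives 1 if the region occurs at least once
    res <- aggregate(ind, x["id"], max)
    res
    ```

    Using max rather than sum is what keeps the result binary: id 3 appears twice with region 1, and sum would report 2 there.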
    
  • 2020-12-02 01:31

    I kind of prefer dcast from reshape2:

    library(reshape2)
    dat <- read.table(text = "id region
     1   2
     1   3
     2   2
     3   1
     3   1",header = TRUE,sep = "")
    
    dcast(dat,id~region,fun.aggregate = function(x){as.integer(length(x) > 0)})
    
      id 1 2 3
    1  1 0 1 1
    2  2 0 1 0
    3  3 1 0 0
    

    There may be a smoother way to do that, but I'll be honest, I don't cast stuff all that often.

  • 2020-12-02 01:36

    Here's sort of a "tricky" way to do it in one line using table (the parentheses are important). Assuming your data.frame is named df and has only the id and region columns:

    (table(df) > 0)+0
    #    region
    # id  1 2 3
    #   1 0 1 1
    #   2 0 1 0
    #   3 1 0 0
    

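    table(df) > 0 gives us TRUE and FALSE; adding 0 converts the TRUE and FALSE to numbers. The parentheses matter because + binds more tightly than >, so without them the expression is read as table(df) > (0 + 0) and the result stays logical.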
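
    For anyone who wants to run this step by step, here is a minimal sketch. It assumes df holds the same id/region pairs used in the other answers; the names counts, is_present and binary are made up for illustration.

    ```r
    # Assumed example data: same id/region pairs as in the other answers
    df <- data.frame(id = c(1, 1, 2, 3, 3), region = c(2, 3, 2, 1, 1))

    counts <- table(df)        # id-by-region contingency table of row counts
    counts

    is_present <- counts > 0   # logical: does this id/region pair occur at all?
    binary <- is_present + 0   # arithmetic coerces TRUE/FALSE to 1/0
    binary

    # Optional: turn the result into a plain data frame, one column per region
    as.data.frame.matrix(binary)
    ```

    The comparison step is what makes the result binary: id 3 has two rows in region 1, so counts holds a 2 there, while binary holds a 1.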
