Reshaping data frame with duplicates

庸人自扰 2020-12-01 15:19

I have what should be a simple reshaping problem, but I can't figure it out. Part of my data looks like this:

foo <- structure(list(grade = c(3, 3, 4, 4,         


        
4 answers
  • 2020-12-01 15:39

    It is not as pretty as reshape, but

    data.frame(grade = foo[2 * (1:(nrow(foo)/2)),]$grade, 
               SS =  foo[foo$var.type == "SS", ]$var.val, 
               SE =  foo[foo$var.type == "SE", ]$var.val ) 
    

    produces

       grade  SS SE
    1      3 120 47
    2      4 120 46
    3      5 120 46
    4      6 120 47
    5      7 120 46
    6      8 120 46
    7      3 120 12
    8      4 120 14
    9      5 120 16
    10     6 120 20
    

    This only works if the data come in consecutive pairs of rows, with one SS row and one SE row for each grade.
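
    Before relying on row positions, it may be worth checking that the pairing assumption actually holds. The sketch below uses a small made-up data frame in place of the truncated `foo` from the question; the names `is_first`, `first`, `second` and `pairs_ok` are illustrative only.

    ```r
    # Hypothetical toy data standing in for the truncated `foo` in the question
    foo <- data.frame(
      grade    = c(3, 3, 4, 4, 3, 3),
      var.type = c("SS", "SE", "SS", "SE", "SS", "SE"),
      var.val  = c(120, 47, 120, 46, 120, 12),
      stringsAsFactors = FALSE
    )

    # Split into the first and second row of each presumed pair
    is_first <- seq_len(nrow(foo)) %% 2 == 1
    first  <- foo[is_first, ]
    second <- foo[!is_first, ]

    # Positional indexing is only safe if every pair shares a grade
    # and holds one SS row and one SE row
    pairs_ok <- nrow(first) == nrow(second) &&
      all(foo$var.type %in% c("SS", "SE")) &&
      all(first$grade == second$grade) &&
      all(first$var.type != second$var.type)

    stopifnot(pairs_ok)
    ```

    If `stopifnot` fails, the rows are not laid out in clean pairs and one of the id-based approaches is probably the safer choice.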

  • 2020-12-01 15:50
    library(plyr)
    library(reshape2)
    # First we add a grouping variable to deal with the duplicates
    foo <- ddply(foo, .(grade, var.type), function(x) { x$group <- seq_len(nrow(x)); x })
    # Cast to wide format, then drop the helper `group` column (column 2)
    dcast(foo, grade + group ~ var.type, value.var = "var.val")[-2]
    
       grade SE  SS
    1      3 47 120
    2      3 12 120
    3      4 46 120
    4      4 14 120
    5      5 46 120
    6      5 16 120
    7      6 47 120
    8      6 20 120
    9      7 46 120
    10     8 46 120
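
    The same idea (number each occurrence of a grade/var.type combination, then go wide) can also be sketched without extra packages. This is not the approach from the answer above, just a base R variant using `ave` and `reshape`; the toy data frame is made up because the original `foo` is truncated in the question.

    ```r
    # Hypothetical toy data standing in for the truncated `foo` in the question
    foo <- data.frame(
      grade    = c(3, 3, 4, 4, 3, 3),
      var.type = c("SS", "SE", "SS", "SE", "SS", "SE"),
      var.val  = c(120, 47, 120, 46, 120, 12),
      stringsAsFactors = FALSE
    )

    # Occurrence counter within each grade/var.type combination
    foo$group <- ave(seq_len(nrow(foo)), foo$grade, foo$var.type, FUN = seq_along)

    # Long to wide: one row per grade/group, one column per var.type
    wide <- reshape(foo, idvar = c("grade", "group"),
                    timevar = "var.type", direction = "wide")
    wide
    ```

    Here the value columns come out named `var.val.SS` and `var.val.SE`, and the helper `group` column can be dropped afterwards if it is not needed.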
    
  • 2020-12-01 15:58

    If you want to reshape and you have duplicates, you're going to need to give each pair a unique id:

    foorle <- rle(foo$grade)
    fooids <- rep(seq_along(foorle$values), times = foorle$lengths)
    
    fooids
     [1]  1  1  2  2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10
    

    Now you'll be able to use reshape properly:

    idfoo <- cbind(id=fooids, foo)
    
    library(reshape2)  # dcast() is provided by reshape2, not reshape
    dcast(idfoo, id+grade~var.type, value.var="var.val")
    
       id grade SE  SS
    1   1     3 47 120
    2   2     4 46 120
    3   3     5 46 120
    4   4     6 47 120
    5   5     7 46 120
    6   6     8 46 120
    7   7     3 12 120
    8   8     4 14 120
    9   9     5 16 120
    10 10     6 20 120
    

    EDIT: Please note I'm assuming your data is in order, else you'll have problems distinguishing between duplicates. If it isn't, you can always use order so that it is.

  • 2020-12-01 16:05

    If you don't have any duplicates, this will work nicely:

    ss <- subset(foo, var.type=='SS')
    se <- subset(foo, var.type=='SE')
    ss <- data.frame(grade=ss$grade,SS=ss$var.val)
    se <- data.frame(grade=se$grade,SE=se$var.val)
    bar <- merge(ss,se,by='grade')
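
    To see why the no-duplicates condition matters, here is a small sketch with made-up values (not the question's data): when a grade occurs more than once on both sides, `merge` returns every combination of the matching rows.

    ```r
    # Hypothetical toy data: grade 3 appears twice on each side
    ss <- data.frame(grade = c(3, 4, 3), SS = c(120, 120, 120))
    se <- data.frame(grade = c(3, 4, 3), SE = c(47, 46, 12))

    # Quick check for duplicated keys before merging
    has_dups <- any(duplicated(ss$grade)) || any(duplicated(se$grade))

    # With duplicates, each grade-3 SS row is paired with each grade-3 SE row
    bar <- merge(ss, se, by = "grade")
    nrow(bar)
    ```

    With three rows on each side, the merged result has more rows than either input, so checking `has_dups` first is a cheap way to decide whether this approach is applicable.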
    