R subtract value for the same ID (from the first ID that shows)

前端 未结 3 2012
感情败类
感情败类 2020-12-22 04:25

I have a similar question to the question found here: R, subtract value from previous row, group by (slight modification; see below):

In R, lets say I have this da

相关标签:
3条回答
  • 2020-12-22 04:38

    Let's assume that dates are already sorted. I would probably retrieve the first value for each id and then use this to compute the diff feature. Something like this.

    my.df <- data.frame(
      id = c(2380, 2380, 2380, 2380, 20100,20100,20100, 20100, 20103, 20103),
      date = c("10/30/12", "10/31/12", "11/1/12", "11/2/12", "10/30/12", "10/31/12", "11/1/12", "11/2/12", "10/30/12", "10/31/12"),
      value = c(21.01, 22.04, 22.65, 23.11, 35.21, 37.07, 38.17, 38.97, 57.98, 60.83),
      stringsAsFactors = F)
    #
    # get ids
    my.ids <- unique(my.df$id) # or levels(my.df$id)
    
    # get first val (assuming sorting by date)
    id.val0 <- sapply(my.ids, (function(id){
      my.df$value[my.df$id == id][1]
    }))
    names(id.val0) <- my.ids
    
    # do operation
    my.df$diff <- sapply(1:nrow(my.df), (function(i){
      tmp.id <- my.df$id[i]
      my.df$value[i] - id.val0[as.character(tmp.id)]
    }))
    
    0 讨论(0)
  • 2020-12-22 04:51

    We can use functions from dplyr. dt is the original data. dt2 is the final output.

    library(dplyr)
    
    dt2 <- dt %>%
      group_by(id) %>%
      mutate(diff = value - first(value))
    
    dt2
    Source: local data frame [10 x 4]
    Groups: id [3]
    
          id       date value  diff
       <int>      <chr> <dbl> <dbl>
    1   2380 10/30/2012 21.01  0.00
    2   2380 10/31/2012 22.04  1.03
    3   2380  11/1/2012 22.65  1.64
    4   2380  11/2/2012 23.11  2.10
    5  20100 10/30/2012 35.21  0.00
    6  20100 10/31/2012 37.07  1.86
    7  20100  11/1/2012 38.17  2.96
    8  20100  11/2/2012 38.97  3.76
    9  20103 10/30/2012 57.98  0.00
    10 20103 10/31/2012 60.83  2.85
    
    0 讨论(0)
  • 2020-12-22 05:00

    Using base R, you could do something like this.

    res <- lapply(split(Data, Data$id), function(x) {
                x$diff <- x$value - x$value[1]
                x})
    res <- do.call(rbind, res)
    row.names(res) <- NULL
    res
          id     date value diff
    1   2380 10/30/12 21.01 0.00
    2   2380 10/31/12 22.04 1.03
    3   2380  11/1/12 22.65 1.64
    4   2380  11/2/12 23.11 2.10
    5  20100 10/30/12 35.21 0.00
    6  20100 10/31/12 37.07 1.86
    7  20100  11/1/12 38.17 2.96
    8  20100  11/2/12 38.97 3.76
    9  20103 10/30/12 57.98 0.00
    10 20103 10/31/12 60.83 2.85
    
    0 讨论(0)
提交回复
热议问题