R - Comparing values in a column and creating a new column with the results of this comparison. Is there a better way than looping?

余生分开走 2021-01-06 11:41

I'm a beginner in R. Although I have read a lot in the manuals and here on this board, I have to ask my first question. It's a little bit the same as here but not really the s

3 answers
  • 2021-01-06 11:54

    There are a couple of things to consider in your example.

    First, to avoid a loop, you can create a copy of the vector that is shifted by one position. (There are about 20 ways to do this.) Then, when you compare the original vector with the shifted copy, R does an element-by-element comparison of each position with its neighbor.

    Second, equality comparisons don't work with NA: they always return NA. So NA == NA is not TRUE, it is NA! Again, there are about 20 ways to get around this, but here I have just replaced all the NAs in the temporary vector with a placeholder that will work for the tests of equality.

    Finally, you have to decide what you want to do with the last value (which doesn't have a neighbor). Here I have put 1, which is your assignment for "doesn't match its neighbor".

    So, as long as the placeholder cannot occur among the possible values of b, you could do:

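    b_tmp <- df$b                   # temporary copy (avoids naming a variable `c`, which masks the base function)
    n <- length(b_tmp)
    b_tmp[is.na(b_tmp)] <- 'x'      # replace NA with a placeholder so the equality test works (coerces to character)
    df$mov <- c(1 * !(b_tmp[-n] == b_tmp[-1]), 1)   # compare each value with the next; append 1 for the last value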
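
To try the placeholder idea on something concrete, here is a minimal sketch wrapped in a helper. The function name `differs_from_next`, its arguments, and the sample vector are all made up for illustration; none of them come from the question. It assumes that comparing the character representation of the values is acceptable for your data.

```r
# Hypothetical helper (not from the original answer):
# returns 0 where an element equals its successor, 1 otherwise.
differs_from_next <- function(x, placeholder = "x", last = 1) {
  # the placeholder must not collide with a real value
  stopifnot(!(placeholder %in% x))
  x <- as.character(x)          # compare everything as character
  x[is.na(x)] <- placeholder    # NA becomes an ordinary, comparable value
  n <- length(x)
  if (n < 2) return(rep(last, n))
  # x[-n] is "all but the last", x[-1] is "all but the first"
  c(as.numeric(x[-n] != x[-1]), last)
}

b <- c(1, 1, 1, NA, NA, 2)      # made-up sample data
differs_from_next(b)
# [1] 0 0 1 0 1 1
```

The `last` argument makes the choice for the final element explicit, since it has no neighbor to be compared with.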
    
  • 2021-01-06 12:00

    You could do something like this to mark the ones which match

    df$bnext <- c(tail(df$b,-1),NA)
    df$bnextsame <- ifelse(df$bnext == df$b | (is.na(df$b) & is.na(df$bnext)),0,1)
    

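    There are plenty of NAs in the result because column b itself contains plenty of NAs, and any comparison involving NA returns NA rather than TRUE/FALSE; only the case where both values are NA is handled explicitly. You could add df[is.na(df$bnextsame),"bnextsame"] <- 0 to fix that, or assign 1 instead if a value next to an NA should count as a change.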
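
If you would rather not patch the NAs afterwards, one possible variation is to guard the == test with is.na() so that the condition never evaluates to NA. This is only a sketch on a made-up data frame (the values below are invented for illustration), and it assumes that a value next to an NA should count as "different".

```r
# Made-up sample data, not the data from the question
df <- data.frame(a = c(5, 1, 9, 5, 9, 1),
                 b = c(1, 1, 1, NA, NA, 2))

bnext   <- c(tail(df$b, -1), NA)                 # b shifted up by one, padded with NA
both_na <- is.na(df$b) & is.na(bnext)            # NA followed by NA counts as "same"
equal   <- !is.na(df$b) & !is.na(bnext) & df$b == bnext   # FALSE, never NA, when either side is NA

df$bnextsame <- ifelse(equal | both_na, 0, 1)
df$bnextsame
# [1] 0 0 1 0 1 1
```

Note that the last row is compared against the padding NA, so it gets 1 unless the last value of b is itself NA; overwrite it explicitly if you want a different convention.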

  • 2021-01-06 12:15

    You can use a "rolling equality test" with zoo's rollapply. Also, identical is preferable to == here, because it treats NA as equal to NA and always returns a single TRUE or FALSE.

    #identical(NA, NA)
    #[1] TRUE
    #NA == NA
    #[1] NA
    
    library(zoo)
    
    df$mov <- c(rollapply(df$b, width = 2, 
            FUN = function(x) as.numeric(!identical(x[1], x[2]))), "no_comparison")
          # `!` because equal neighbors should give 0 and different ones 1
          # "no_comparison" is appended for the last value, which has no neighbor;
          # note that appending a string turns the whole column into character
    df
    #   a  b           mov
    #1  5  1             0
    #2  1  1             0
    #3  9  1             1
    #4  5 NA             1
    #5  9  1             1
    #.....
    #19 1 NA             0
    #20 1 NA no_comparison
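
If you want to keep mov numeric and avoid the zoo dependency, a base-R sketch of the same pairwise identical test could look like the following. This is a swapped-in technique (mapply instead of rollapply), the data frame is made up for illustration, and NA is used here instead of "no_comparison" for the last value.

```r
# Made-up sample data, not the data from the question
df <- data.frame(a = c(5, 1, 9, 5, 9, 1),
                 b = c(1, 1, 1, NA, NA, 2))

n <- nrow(df)
# identical() is applied to each pair (b[i], b[i + 1]); NA vs NA gives TRUE
same <- mapply(identical, df$b[-n], df$b[-1])

# 0 = same as next value, 1 = different; NA marks the last row (nothing to compare)
df$mov <- c(as.numeric(!same), NA)
df$mov
# [1]  0  0  1  0  1 NA
```

Because identical is strict about types, this relies on all values coming from the same vector, which is always the case for a single column.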
    