Count number of columns by a condition (>) for each row

小蘑菇 2020-11-27 19:37

I am trying to work out, for each row of a matrix, how many columns have values greater than a specified value. I am sorry that I am asking this simple question but I wasn't

4 answers
  • The third argument of apply needs to be a function. Also, you can count the TRUE values of a logical vector with sum.

    apply(data, 1, function(x)sum(x > 30))
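
    For a concrete check, here is a minimal sketch. The matrix is an assumption: it is rebuilt from the example output shown in another answer on this page, since the question's own data is not included here.

    ```r
    # Assumed sample data (not taken from the question): 5 rows, 3 year columns.
    data <- matrix(
      c(25, 23, 20,
        22, 28, 20,
        35, 33, 30,
        42, 40, 41,
        44, 45, 43),
      ncol = 3, byrow = TRUE,
      dimnames = list(NULL, c("1990", "1991", "1992"))
    )

    # MARGIN = 1 calls the function once per row. x > 30 is a logical vector,
    # and sum() counts its TRUE values.
    counts <- apply(data, 1, function(x) sum(x > 30))
    print(counts)
    # [1] 0 0 2 3 3
    ```

    Note that the comparison is strict, so the 30 in the third row is not counted.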
    
  • 2020-11-27 20:02

    This will give you the vector you are looking for:

    rowSums(data > 30)
    

    This works whether data is a matrix or a data.frame. Because it uses vectorized functions, it is preferable to apply, which is little more than a (slow) for loop.

    If data is a data.frame, you can add the result as a column by doing:

    data$yr.above <- rowSums(data > 30)
    

    or if data is a matrix:

    data <- cbind(data, yr.above = rowSums(data > 30))
    

    You can also create a whole new data.frame:

    data.frame(yr.above = rowSums(data > 30))
    

    or a whole new matrix:

    cbind(yr.above = rowSums(data > 30))
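
    One case the snippets above do not cover is missing values. The sketch below uses a small made-up matrix (an assumption, not the question's data) to show how rowSums behaves with NA and how its na.rm argument changes the count.

    ```r
    # Hypothetical data with two missing values.
    data <- matrix(
      c(NA, 23, 20,
        35, 33, 30,
        42, NA, 41),
      ncol = 3, byrow = TRUE,
      dimnames = list(NULL, c("1990", "1991", "1992"))
    )

    # NA > 30 is NA, so any row containing an NA gets an NA count by default.
    strict <- rowSums(data > 30)

    # na.rm = TRUE drops the NA comparisons and counts only the known values.
    lenient <- rowSums(data > 30, na.rm = TRUE)

    print(strict)
    # [1] NA  2 NA
    print(lenient)
    # [1] 0 2 2
    ```

    Whether an NA should count as "unknown" or simply be ignored depends on the analysis, so choose na.rm deliberately.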
    
  • 2020-11-27 20:07

    We can also do this with Reduce and + (assuming there are no NA elements)

     Reduce(`+`, lapply(as.data.frame(data), `>`, 30))
    

    This should be efficient as we are not converting to a matrix.
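
    To make the two steps visible, here is a minimal sketch on assumed sample data (rebuilt from the example output elsewhere on this page, not from the question itself). lapply produces one logical vector per column, and Reduce adds them element by element.

    ```r
    # Assumed sample data: 5 rows, 3 year columns.
    data <- matrix(
      c(25, 23, 20,
        22, 28, 20,
        35, 33, 30,
        42, 40, 41,
        44, 45, 43),
      ncol = 3, byrow = TRUE,
      dimnames = list(NULL, c("1990", "1991", "1992"))
    )

    # Step 1: compare each column with 30, giving a list of logical vectors.
    flags <- lapply(as.data.frame(data), `>`, 30)

    # Step 2: add the logical vectors together; TRUE counts as 1, FALSE as 0.
    counts <- Reduce(`+`, flags)
    print(counts)
    # [1] 0 0 2 3 3

    # The result agrees with the rowSums approach.
    same <- all(counts == rowSums(data > 30))
    ```

    Because + propagates NA, a single missing value makes that row's count NA, which is why the no-NA assumption matters here.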

  • 2020-11-27 20:08

    With the dplyr package, you can try either of the following two solutions.

    library(dplyr)
    df <- as.data.frame(data)
    

    Option 1

    df %>%
      mutate(yr.above = rowSums(select(df, `1990`:`1992`) > 30))
    

    Option 2

    Since dplyr 1.0.0, you can use c_across() with rowwise() to perform row-wise aggregations easily.

    df %>%
      rowwise() %>%
      mutate(yr.above = sum(c_across(`1990`:`1992`) > 30)) %>%
      ungroup()
    

    Note: One of the benefits of using dplyr is its support for tidy selection, which provides a concise dialect of R for selecting variables based on their names or properties.


    Output

    # # A tibble: 5 x 4
    #   `1990` `1991` `1992` yr.above
    #    <dbl>  <dbl>  <dbl>    <int>
    # 1     25     23     20        0
    # 2     22     28     20        0
    # 3     35     33     30        2
    # 4     42     40     41        3
    # 5     44     45     43        3
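
    As an illustration of tidy selection, the sketch below picks the year columns by name prefix instead of by range. The data frame is an assumption reconstructed from the output above, and starts_with("199") is just one possible helper for this case.

    ```r
    library(dplyr)

    # Assumed sample data matching the output above; check.names = FALSE
    # keeps the numeric-looking column names as they are.
    df <- data.frame(
      `1990` = c(25, 22, 35, 42, 44),
      `1991` = c(23, 28, 33, 40, 45),
      `1992` = c(20, 20, 30, 41, 43),
      check.names = FALSE
    )

    # starts_with("199") selects the same three columns as `1990`:`1992`
    # without spelling out the range.
    result <- df %>%
      mutate(yr.above = rowSums(select(df, starts_with("199")) > 30))

    print(result$yr.above)
    ```

    Selecting by prefix keeps working if more columns with the same prefix are added later, whereas a range depends on column order.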
    