Select rows of a matrix that meet a condition

半阙折子戏 2020-11-29 15:26

In R with a matrix:

     one two three four
 [1,]   1   6    11   16
 [2,]   2   7    12   17
 [3,]   3   8    11   18
 [4,]   4   9    11   19
 [5,]   5  10         


        
6 Answers
  • subset is a very slow function, and I personally find it useless.

    I assume you have a data.frame, array, or matrix called Mat with A, B, C as column names; then all you need to do is:

    • In the case of one condition on one column, let's say column A:

      Mat[which(Mat[,'A'] == 10), ]
      

    In the case of multiple conditions on different columns, you can narrow down an auxiliary index vector step by step. Suppose the conditions are A == 10, B == 5, and C > 2; then we have:

        aux <- which(Mat[, 'A'] == 10)
        aux <- aux[which(Mat[aux, 'B'] == 5)]
        aux <- aux[which(Mat[aux, 'C'] > 2)]
        Mat[aux, ]
    

    In my tests with system.time, the which method was about 10x faster than the subset method.
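
The step-by-step narrowing above can also be written as one logical condition combined with `&`. A minimal sketch, assuming a small made-up matrix `Mat` (the values are illustrative only and do not come from the question):

```r
# Hypothetical example data: a numeric matrix with columns A, B, C.
# matrix() fills column by column, so A = 10,10,10,3 and so on.
Mat <- matrix(c(10, 10, 10,  3,
                 5,  5,  7,  5,
                 3,  1,  9,  4),
              ncol = 3,
              dimnames = list(NULL, c("A", "B", "C")))

# Step-by-step narrowing of the row indices.
aux <- which(Mat[, "A"] == 10)
aux <- aux[which(Mat[aux, "B"] == 5)]
aux <- aux[which(Mat[aux, "C"] > 2)]

# The same selection with a single combined condition.
idx <- which(Mat[, "A"] == 10 & Mat[, "B"] == 5 & Mat[, "C"] > 2)

# drop = FALSE keeps the result a matrix even when only one row matches.
chained  <- Mat[aux, , drop = FALSE]
combined <- Mat[idx, , drop = FALSE]
```

Both forms pick the same rows here; which form is faster on a given data set is something to measure rather than assume.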

  • 2020-11-29 15:45

    If m is a data frame (the $ operator does not work on a plain matrix), just use:

    R> m[m$three == 11, ]
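
To illustrate the difference between a matrix and a data frame here, a small sketch. The question's matrix is cut off, so `m` below is an assumed stand-in built from `1:20` with the question's column names:

```r
# Assumed stand-in for the question's matrix (values are illustrative).
m <- matrix(1:20, ncol = 4,
            dimnames = list(NULL, c("one", "two", "three", "four")))

# On a matrix, `$` raises an error; capture it instead of stopping.
dollar_on_matrix <- tryCatch(m$three, error = function(e) "error")

# After conversion to a data frame, `$` selects the column by name.
df <- as.data.frame(m)
rows <- df[df$three == 11, ]
```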
    
  • 2020-11-29 15:46
    m <- matrix(1:20, ncol = 4) 
    colnames(m) <- letters[1:4]
    

    The following command will select the first row of the matrix above.

    subset(m, m[,4] == 16)
    

    And this will select the last three.

    subset(m, m[,4] > 17)
    

    The result will be a matrix in both cases. If you want to refer to columns by name in the condition, you would be best off converting the matrix to a data frame with

    mf <- data.frame(m)
    

    Then you can select with

    mf[mf$d == 16, ]
    

    Or, you could use the subset command.
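
As a sketch of that last option: on a data frame, `subset` evaluates the condition inside the data frame, so the column can be written as a bare name. This reuses the `m` and `mf` defined above and repeats the two selections on column `d`:

```r
m <- matrix(1:20, ncol = 4)
colnames(m) <- letters[1:4]
mf <- data.frame(m)

# subset() looks up `d` among the columns of mf.
first_row  <- subset(mf, d == 16)
last_three <- subset(mf, d > 17)
```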

  • 2020-11-29 15:47

    If the dataset is a data frame called data, all rows where the value of column 'pm2.5' is greater than 300 can be retrieved with:

    data[data['pm2.5'] >300,]
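
The row condition can also be built from a plain vector with `[[` or `$`. A hedged sketch, assuming a small made-up data frame (the `hour` column and all readings are hypothetical):

```r
# Hypothetical data frame; the readings are made up for illustration.
data <- data.frame(hour = 1:4, pm2.5 = c(120, 310, 450, 80))

# `[[` and `$` both extract the column as a numeric vector,
# so the comparison yields a plain logical vector of row flags.
by_double <- data[data[["pm2.5"]] > 300, ]
by_dollar <- data[data$pm2.5 > 300, ]
```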

  • 2020-11-29 15:48

    This is easier to do if you convert your matrix to a data frame using as.data.frame(). In that case the previous answers that refer to columns by name (subset with a bare column name, or m$three) will work; on a plain matrix they will not.

    To perform the operation on a matrix, you can define a column by name:

    m[m[, "three"] == 11,]
    

    Or by number:

    m[m[,3] == 11,]
    

    Note that if only one row matches, the result is an integer vector, not a matrix.
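
If a matrix is needed even for a single match, `drop = FALSE` prevents the result from collapsing to a vector. A minimal sketch; the question's matrix is cut off, so `m` here is an assumed stand-in in which only one row has `three == 11`:

```r
# Assumed stand-in for the question's matrix (values are illustrative).
m <- matrix(1:20, ncol = 4,
            dimnames = list(NULL, c("one", "two", "three", "four")))

# Only row 1 matches, so the default result drops to a named vector.
as_vector <- m[m[, "three"] == 11, ]

# drop = FALSE keeps the one-row result as a matrix.
as_matrix <- m[m[, "three"] == 11, , drop = FALSE]
```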

  • 2020-11-29 15:56

    I will choose a simple approach using the dplyr package.

    If the data frame is called data (convert a matrix with as.data.frame() first):

    library(dplyr)
    result <- filter(data, three == 11)
    