Make a table showing the 10 largest values of a variable in R?

温柔的废话 2021-02-04 15:03

I want to make a simple table that showcases the largest 10 values for a given variable in my dataset, as well as 4 other variables for each observation, so basically a small su

4 Answers
  • 2021-02-04 15:36

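    Sort the data frame by Score in descending order, then keep the first 10 rows. head() is used instead of data[1:10,] so that a data frame with fewer than 10 rows is not padded with NA rows:

    data <- data[with(data, order(-Score)), ]

    data <- head(data, 10)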
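
    The question also asks for four other variables alongside the ranked one. A minimal sketch, assuming the built-in mtcars data stands in for the asker's dataset, with mpg as the ranked variable and cyl, hp, wt, and qsec as arbitrarily chosen companions:

```r
# mtcars is a stand-in for the real data; the column names are illustrative.
keep <- c("mpg", "cyl", "hp", "wt", "qsec")

# Order rows by mpg (descending), keep only the chosen columns,
# then take the first 10 rows.
top10 <- head(mtcars[order(-mtcars$mpg), keep], 10)
print(top10)
```

    Swapping in the real data frame and column names should be the only change needed.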

  • 2021-02-04 15:38

    Using sqldf:

    library(sqldf)
    sqldf("SELECT * FROM mtcars 
          ORDER BY mpg DESC 
          LIMIT 10", row.names = TRUE)
    

    Output:

                   mpg cyl  disp  hp drat    wt  qsec vs am gear carb
    Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
    Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
    Honda Civic    30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
    Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
    Fiat X1-9      27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
    Porsche 914-2  26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
    Merc 240D      24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
    Datsun 710     22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
    Merc 230       22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
    Toyota Corona  21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
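
    To return only the ranked variable plus a few companions, a column list can replace the * in the SELECT. A sketch, assuming mtcars as stand-in data; the helper data frame mt and the picks cyl, hp, wt, and qsec are illustrative and not from the question. The car names are copied into an ordinary column first so they can be selected by name:

```r
library(sqldf)

# Copy the row names into a regular column called model (hypothetical name).
mt <- data.frame(model = rownames(mtcars), mtcars, row.names = NULL)

# Select the ranked column (mpg) and four illustrative companions.
top10 <- sqldf("SELECT model, mpg, cyl, hp, wt, qsec
                FROM mt
                ORDER BY mpg DESC
                LIMIT 10")
print(top10)
```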
    
  • 2021-02-04 15:55

    You can get the highest values of a vector using the code below:

    my_vec <- c(1:100)
    tail(sort(my_vec),10)
    

    To use this method as a data frame filter (here keeping the 4 largest mpg values; use 10 for the top 10), you could do:

    data(mtcars)
    mtcars[mtcars$mpg %in% tail(sort(mtcars$mpg),4),]
    

    which would produce:

    > mtcars[mtcars$mpg %in% tail(sort(mtcars$mpg),4),]
                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
    Fiat 128       32.4   4 78.7  66 4.08 2.200 19.47  1  1    4    1
    Honda Civic    30.4   4 75.7  52 4.93 1.615 18.52  1  1    4    2
    Toyota Corolla 33.9   4 71.1  65 4.22 1.835 19.90  1  1    4    1
    Lotus Europa   30.4   4 95.1 113 3.77 1.513 16.90  1  1    5    2
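
    One caveat with the %in% approach: it matches on values, so ties at the cutoff return more rows than requested, and rows come back in their original order rather than ranked. A small sketch on mtcars illustrating this, with an order()-based variant for comparison:

```r
data(mtcars)

# Asking for the top 3 values matches 4 rows, because two cars share mpg 30.4.
by_value <- mtcars[mtcars$mpg %in% tail(sort(mtcars$mpg), 3), ]
nrow(by_value)   # 4

# Ordering first and then taking head() returns exactly 3 rows, already ranked.
by_rank <- head(mtcars[order(-mtcars$mpg), ], 3)
nrow(by_rank)    # 3
```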
    
  • 2021-02-04 15:57

    You can do this with arrange from dplyr. It also works when there are grouping variables: just add group_by before the arrange. slice then keeps the first 10 observations.

     library(dplyr)
     df1 %>%
        arrange(desc(Score)) %>%
        slice(1:10) 
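
    As a sketch of the grouped case, assuming mtcars as stand-in data with cyl as the grouping variable and mpg in place of Score (both are illustrative choices), the following keeps the top 2 rows per group:

```r
library(dplyr)

# arrange() sorts the whole table; slice() then picks rows within each group.
top2_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  arrange(desc(mpg)) %>%
  slice(1:2) %>%
  ungroup()

print(top2_by_cyl)
```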
    

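    Another option is top_n from dplyr (see ?top_n; suggested by @docendodiscimus), a wrapper that uses filter and min_rank to select the top n (here 10) entries for 'Score'. Because it ranks with min_rank, ties at the cutoff are all kept, so it can return more than 10 rows.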

     top_n(df1, 10, Score)    
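
    In dplyr 1.0.0 and later, top_n is superseded by slice_max. A brief sketch, assuming a recent dplyr and using mtcars (with mpg standing in for Score) as illustrative data; the helper data frame mt and the selected columns are hypothetical choices:

```r
library(dplyr)

# Copy the row names into a regular column so the car names are an ordinary variable.
mt <- data.frame(model = rownames(mtcars), mtcars, row.names = NULL)

# with_ties = FALSE returns exactly n rows; the default (TRUE) keeps ties.
top10 <- mt %>%
  slice_max(order_by = mpg, n = 10, with_ties = FALSE) %>%
  select(model, mpg, cyl, hp, wt)

print(top10)
```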
    

    Or we can use filter with a logical condition built from row_number, which is equivalent to rank(ties.method='first') and so breaks ties by order of appearance (contributed by @Steven Beaupre):

     filter(df1, row_number(desc(Score)) <= 10)
    

    Or a data.table option (by @David Arenburg). We convert the 'data.frame' to a 'data.table' by reference with setDT(df1), order by 'Score' in decreasing order, and select the first 10 observations. .SD stands for Subset of Data.

     library(data.table)
     setDT(df1)[order(-Score), .SD[1:10]]
    