Ranking multiple columns by different orders using data.table

99封情书 posted on 2021-02-16 14:32:08

Question


Using the example below, how can I rank multiple columns in different orders, for example y descending and z ascending?

library(data.table)

dt <- data.table(x = c(rep("a", 5), rep("b", 5)),
                 y = abs(rnorm(10)) * 10, z = abs(rnorm(10)) * 10)

cols <- c("y", "z")

dt[, paste0("rank_", cols) := lapply(.SD, function(x) frankv(x, ties.method = "min")), .SDcols = cols, by = .(x)]

Answer 1:


data.table's frank() function has some useful features that base R's rank() function lacks (see ?frank). For example, the ranking of a numeric column can be reversed by prefixing the variable with a minus sign:

library(data.table)
# create reproducible data
set.seed(1L)
dt <- data.table(x = c(rep("a", 5), rep("b", 5)),
                 y = abs(rnorm(10)) * 10, z = abs(rnorm(10)) * 10)
# rank y descending, z ascending
dt[, rank_y := frank(-y), x][, rank_z := frank(z), x][]
    x         y          z rank_y rank_z
 1: a  6.264538 15.1178117      3      4
 2: a  1.836433  3.8984324      5      1
 3: a  8.356286  6.2124058      2      2
 4: a 15.952808 22.1469989      1      5
 5: a  3.295078 11.2493092      4      3
 6: b  8.204684  0.4493361      1      2
 7: b  4.874291  0.1619026      4      1
 8: b  7.383247  9.4383621      2      5
 9: b  5.757814  8.2122120      3      4
10: b  3.053884  5.9390132      5      3
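
The output above contains no ties, so the tie-breaking rule does not show. As a minimal sketch (the vector v is made up for illustration and is not part of the original answer), the following compares frank()'s default ties.method = "average" with the "min" method used in the question, and shows that negating a numeric vector and passing order = -1L to frankv() give the same descending ranks:

```r
library(data.table)

v <- c(10, 20, 20, 30)  # hypothetical vector with one tie

frank(v)                                     # default "average": 1.0 2.5 2.5 4.0
frank(v, ties.method = "min")                # 1 2 2 4
frank(-v, ties.method = "min")               # descending via negation: 4 2 2 1
frankv(v, order = -1L, ties.method = "min")  # descending via order:    4 2 2 1

# Unary minus is not defined for character vectors, so order = -1L is used here
frankv(c("a", "b", "b"), order = -1L, ties.method = "min")
```

If the ranks should match the question's code exactly when ties occur, ties.method = "min" probably needs to be passed explicitly to frank() as well.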

If many columns are to be ranked individually, some descending and some ascending, we can do this in two steps:

# first, rank the columns listed in cols_desc in descending order
cols_desc <- c("y")
dt[, paste0("rank_", cols_desc) := lapply(.SD, frankv, ties.method = "min", order = -1L), 
   .SDcols = cols_desc, by = x][]
# then, rank the columns listed in cols_asc in ascending order
cols_asc <- c("z")
dt[, paste0("rank_", cols_asc) := lapply(.SD, frankv, ties.method = "min", order = +1L), 
   .SDcols = cols_asc, by = x][]
    x         y          z rank_y rank_z
 1: a  6.264538 15.1178117      3      4
 2: a  1.836433  3.8984324      5      1
 3: a  8.356286  6.2124058      2      2
 4: a 15.952808 22.1469989      1      5
 5: a  3.295078 11.2493092      4      3
 6: b  8.204684  0.4493361      1      2
 7: b  4.874291  0.1619026      4      1
 8: b  7.383247  9.4383621      2      5
 9: b  5.757814  8.2122120      3      4
10: b  3.053884  5.9390132      5      3
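
The two steps can also be folded into a single pass. The following is only a sketch, not part of the original answer: it assumes a hypothetical named vector ord that maps each column to its direction (-1L for descending, +1L for ascending) and uses Map() to pair every column in .SD with its own order argument for frankv():

```r
library(data.table)

# same reproducible data as above
set.seed(1L)
dt <- data.table(x = c(rep("a", 5), rep("b", 5)),
                 y = abs(rnorm(10)) * 10, z = abs(rnorm(10)) * 10)

# hypothetical lookup: column name -> direction (-1L descending, +1L ascending)
ord <- c(y = -1L, z = 1L)

# Map() walks over the columns of .SD and the entries of ord in parallel,
# so each column is ranked within each group of x using its own direction
dt[, paste0("rank_", names(ord)) :=
       Map(function(col, o) frankv(col, order = o, ties.method = "min"), .SD, ord),
   .SDcols = names(ord), by = x]

dt[]
```

Keeping the directions in one named vector means that adding a column only requires a new entry in ord, at the cost of a slightly less readable call than the two explicit steps.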


Source: https://stackoverflow.com/questions/46280337/ranking-multiple-columns-by-different-orders-using-data-table
