Question
I have a dataset with multiple columns (one per site) and rows (one per day), and I am trying to rank the values within each day using R. That is, I would like to rank each site against all the other sites on the same day (so ranking across each row). It would be possible to do in Excel, but would obviously take a long time. Below is a [much smaller] example of what I'm trying to achieve:
date - site1 - site2 - site3 - site4
1/1/00 - 24 - 33 - 10 - 13
2/1/00 - 13 - 25 - 6 - 2
leading to:
date - site1 - site2 - site3 - site4
1/1/00 - 2 - 1 - 4 - 3
2/1/00 - 2 - 1 - 3 - 4
Hopefully there's some simple command. Thanks a lot!
Answer 1:
You can use rank() to get the ranks of the data.
# your data
mydf <- read.table(text="date - site1 - site2 - site3 - site4
1/1/00 - 24 - 33 - 10 - 13
2/1/00 - 13 - 25 - 6 - 2", sep="-", header=TRUE, strip.white=TRUE)
# find ranks
t(apply(-mydf[-1], 1, rank))
# add to your dates
mydf.rank <- cbind(mydf[1], t(apply(-mydf[-1], 1, rank)))
About the code
mydf[-1] # drops the first column (the dates)
-mydf[-1] # negating the values with `-` makes the ranks run in decreasing order
apply() with MARGIN=1 applies rank() to each row.
Because apply() returns each row's result as a column, t() transposes the matrix to give the layout you want.
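The steps above can be wrapped in a small helper. This is only a sketch under a few assumptions: the name rank_rows is hypothetical (it does not come from the answer), the first column is assumed to hold the dates, and ties.method = "min" is just one possible way to handle tied values (the default, "average", can produce fractional ranks).

```r
# Hypothetical helper (name not from the original answer): rank each row of a
# data frame in decreasing order, leaving the first column (dates) untouched.
rank_rows <- function(df, ties = "min") {
  values <- as.matrix(df[-1])  # drop the date column
  # Negate so the largest value gets rank 1. apply() returns one column per
  # input row, so t() restores the original row/column layout.
  ranks <- t(apply(-values, 1, rank, ties.method = ties))
  cbind(df[1], ranks)
}

# Same example data as in the question, built directly as a data frame
mydf <- data.frame(
  date  = c("1/1/00", "2/1/00"),
  site1 = c(24, 13),
  site2 = c(33, 25),
  site3 = c(10, 6),
  site4 = c(13, 2)
)

ranked <- rank_rows(mydf)
print(ranked)
```

With ties = "min", sites that share a value on a given day receive the same (lowest) rank; pass another value accepted by rank()'s ties.method argument if a different rule suits the data better.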
Answer 2:
This is a tidy way.
Reshape to long format, sort (arrange), group, and spread. The only tricky part is knowing that once each group is sorted, a value's position within its group is its rank (ascending or descending, depending on the sort). The function row_number() relies on this: it simply numbers the rows within each group.
library(tidyverse)
library(lubridate)
# Data
df <- tribble(
  ~date, ~site1, ~site2, ~site3, ~site4,
  mdy("1/1/2000"), 24, 33, 10, 13,
  mdy("2/1/2000"), 13, 25, 6, 2
)
df %>%
  gather(site, days, -date) %>% #< Make Tidy
  arrange(date, desc(days)) %>% #< Sort relevant columns
  group_by(date) %>%
  mutate(ranking = row_number()) %>% #< Ranking function
  select(-days) %>% #< Remove unneeded column. Worth keeping in tidy format!
  spread(site, ranking)
#> # A tibble: 2 x 5
#> # Groups: date [2]
#> date site1 site2 site3 site4
#> <date> <int> <int> <int> <int>
#> 1 2000-01-01 2 1 4 3
#> 2 2000-02-01 2 1 3 4
Created on 2018-03-06 by the reprex package (v0.2.0).
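In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). A possible variant of the pipeline above, assuming those newer functions are available, is sketched below. It also swaps the sort plus row_number() for min_rank(desc(value)), which gives tied values the same rank instead of numbering them in row order. The column name value is an illustrative choice, not taken from the answer.

```r
library(dplyr)
library(tidyr)

# Same example data; dates are built with base as.Date() instead of lubridate
df <- tibble(
  date  = as.Date(c("2000-01-01", "2000-02-01")),
  site1 = c(24, 13),
  site2 = c(33, 25),
  site3 = c(10, 6),
  site4 = c(13, 2)
)

ranked <- df %>%
  pivot_longer(-date, names_to = "site", values_to = "value") %>%  # long format
  group_by(date) %>%
  mutate(ranking = min_rank(desc(value))) %>%  # rank 1 = largest value that day
  ungroup() %>%
  select(-value) %>%
  pivot_wider(names_from = site, values_from = ranking)            # back to wide

print(ranked)
```

Calling ungroup() before reshaping means the result is a plain tibble rather than one still grouped by date.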
Source: https://stackoverflow.com/questions/23530731/ranking-rows-in-r