When there are ties in the original data, is there a way to create a ranking without gaps in the ranks (consecutive, integer rank values)? Suppose:
x <-
For those fond of using dplyr:
library(dplyr)
dense_rank(x)
[1] 2 2 2 1 1 3 3
The "loopless" way to do it is to simply treat the vector as an ordered factor, then convert it to numeric:
> as.numeric( ordered( c( 10,10,10,10, 5,5,5, 10, 10 ) ) )
[1] 2 2 2 2 1 1 1 2 2
> as.numeric( ordered( c(0.5,0.56,0.76,0.23,0.33,0.4) ))
[1] 4 5 6 1 2 3
> as.numeric( ordered( c(1,1,2,3,4,5,8,8) ))
[1] 1 1 2 3 4 5 6 6
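To see why this works: `ordered()` builds a factor whose levels are the sorted unique values, and converting the factor to a number returns the index of each element's level, which is exactly a gap-free rank. A minimal sketch, where the vector `x` is made up for illustration and is not the data from the question:

```r
# Illustrative data (an assumption; not the vector from the question)
x <- c(10, 10, 5, 20, 5)

f <- ordered(x)

# The levels are the sorted unique values, stored as character strings
lv <- levels(f)
print(lv)

# The integer code of each element is the position of its level,
# i.e. its dense rank
dense <- as.integer(f)
print(dense)
```

Here `as.integer()` is used instead of `as.numeric()` only so the result is an integer vector; the values are the same.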
Update: Another way, which seems faster, is to use findInterval and sort(unique()):
> x <- c( 10, 10, 10, 10, 5,5,5, 10, 10)
> findInterval( x, sort(unique(x)))
[1] 2 2 2 2 1 1 1 2 2
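The reason this gives dense ranks: `findInterval(x, vec)` counts, for each element of `x`, how many entries of the sorted vector `vec` are less than or equal to it. Since every element of `x` appears in `sort(unique(x))`, that count is its position among the distinct values. A small sketch on made-up data (the vector and the variable names are illustrative assumptions), with `match()` shown as an equivalent lookup for comparison:

```r
# Illustrative data (an assumption; not the vector from the question)
x <- c(10, 10, 5, 20, 5)

# Distinct values in increasing order; findInterval() requires a sorted vec
breaks <- sort(unique(x))

# Number of break points <= each element of x
by_interval <- findInterval(x, breaks)

# Same result by looking up each element's position among the distinct values
by_match <- match(x, breaks)

print(by_interval)
print(by_match)
```

Note that `findInterval()` expects its second argument to be sorted and free of `NA`, so missing values in `x` would need handling before this is applied.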
> x <- round( abs( rnorm(1000000)*10))
> system.time( z <- as.numeric( ordered( x )))
   user  system elapsed
  0.996   0.025   1.021
> system.time( z <- findInterval( x, sort(unique(x))))
   user  system elapsed
  0.077   0.003   0.080
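Before trusting a timing comparison like this, it is worth confirming that both approaches return the same ranks on the test data. A rough sketch, assuming a smaller sample and a fixed seed (both are my choices for reproducibility, not part of the original benchmark); actual timings will vary by machine and R version:

```r
# Smaller, reproducible version of the benchmark data (seed and size are assumptions)
set.seed(1)
x <- round(abs(rnorm(100000) * 10))

z_factor   <- as.numeric(ordered(x))
z_interval <- findInterval(x, sort(unique(x)))

# Both methods should assign the same dense rank to every element
same <- all(z_factor == z_interval)
print(same)

# The ranks should be consecutive integers starting at 1
consecutive <- identical(sort(unique(z_interval)), seq_along(unique(x)))
print(consecutive)
```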