vector1 = c(2, 2, 2, 2, 2, 2)
vector2 = c(2, 2, 3, 3, 3, 3)
vector3 = c(2, 2, 1, 2, 2, 2)
I want to know if the numbers in the vector are ascending/sta
You can use diff() to compute the differences between consecutive elements and all() to check that they are all non-negative:
all(diff(vector1) >= 0)
# [1] TRUE
all(diff(vector2) >= 0)
# [1] TRUE
all(diff(vector3) >= 0)
# [1] FALSE
The code above checks whether each vector is non-decreasing; you could replace >= 0 with <= 0 to check whether it is non-increasing. If instead your goal is to identify vectors that are either non-decreasing or non-increasing (i.e. they don't contain both an increasing and a decreasing step), there's a simple modification:
!all(c(-1, 1) %in% sign(diff(vector1)))
# [1] TRUE
!all(c(-1, 1) %in% sign(diff(vector2)))
# [1] TRUE
!all(c(-1, 1) %in% sign(diff(vector3)))
# [1] FALSE
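The two checks above can also be wrapped into one small helper. This is only a sketch: the name is_monotone and its strict argument are hypothetical (not part of base R or of the answer above), and it assumes the input has no NA values.

```r
# Hypothetical helper: TRUE if x never changes direction.
# Assumes x is numeric and contains no NA values.
is_monotone <- function(x, strict = FALSE) {
  d <- diff(x)
  if (strict) {
    # strictly increasing or strictly decreasing (ties not allowed)
    all(d > 0) || all(d < 0)
  } else {
    # non-decreasing or non-increasing (ties allowed)
    all(d >= 0) || all(d <= 0)
  }
}

is_monotone(c(2, 2, 3, 3, 3, 3))        # TRUE  (non-decreasing)
is_monotone(c(3, 3, 2, 2))              # TRUE  (non-increasing)
is_monotone(c(2, 2, 1, 2, 2, 2))        # FALSE (goes down, then up)
is_monotone(c(2, 2, 3), strict = TRUE)  # FALSE (contains a tie)
```

Testing the two directions separately avoids the %in% lookup, at the cost of a second pass over the differences when the first test fails.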
There is a base R function called is.unsorted that is ideal for this situation:
!is.unsorted(vector1)
# [1] TRUE
!is.unsorted(vector2)
# [1] TRUE
!is.unsorted(vector3)
# [1] FALSE
This function is very fast because it calls almost directly into compiled C code.
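For completeness, here is a short sketch of the other arguments of is.unsorted (strictly and na.rm), plus one way to test for a non-increasing vector by reversing it first. The vectors x_flat and x_desc are made up for illustration.

```r
x_flat <- c(2, 2, 3, 3)   # non-decreasing, with ties
x_desc <- c(5, 4, 4, 1)   # non-increasing

!is.unsorted(x_flat)                    # TRUE:  ties are allowed by default
!is.unsorted(x_flat, strictly = TRUE)   # FALSE: ties count as unsorted
!is.unsorted(rev(x_desc))               # TRUE:  reversed vector is non-decreasing

# With missing values the result is NA unless they are dropped first
is.unsorted(c(1, NA, 3))                # NA
is.unsorted(c(1, NA, 3), na.rm = TRUE)  # FALSE
```

Note that rev() makes a copy of the vector, so the descending check costs a little more than the ascending one.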
My initial thought was to use sort and identical, a la identical(sort(vector1), vector1), but this is pretty slow; that said, I think this approach can be extended to more flexible situations.
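As one possible reading of "more flexible situations", the same idea can check a descending order or an ordering by some derived key. The helper names below (is_sorted_desc, is_sorted_by) are hypothetical, and the sketch assumes vectors without NA values, since sort() drops them by default.

```r
# Hypothetical helpers built on the sort/identical idea; assume no NAs.

# TRUE if x is already in non-increasing order
is_sorted_desc <- function(x) identical(sort(x, decreasing = TRUE), x)

# TRUE if x is already ordered by key(x), e.g. by absolute value
is_sorted_by <- function(x, key) identical(x[order(key(x))], x)

is_sorted_desc(c(5, 4, 4, 1))      # TRUE
is_sorted_desc(c(2, 2, 3))         # FALSE
is_sorted_by(c(-1, 2, -3), abs)    # TRUE:  |x| is 1, 2, 3
is_sorted_by(c(-3, 2, -1), abs)    # FALSE: |x| is 3, 2, 1
```

These still pay the full cost of sorting, so they trade speed for flexibility.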
If speed were really crucial, we could skip some of the overhead of is.unsorted and call the internal function directly:
.Internal(is.unsorted(vector1, FALSE))
(the second argument, FALSE, is passed to the strictly parameter). This offered a ~4x speed-up on a small vector.
To get a sense of just how fast the final option is, here's a benchmark:
library(microbenchmark)
set.seed(10101)
srtd <- sort(sample(1e6, replace = TRUE))  # a sorted test case
unsr <- sample(1e6, replace = TRUE)        # an unsorted test case
microbenchmark(times = 1000L,
  josilber = {all(diff(srtd) >= 0)
              all(diff(unsr) >= 0)},
  mikec    = {identical(sort(srtd), srtd)
              identical(sort(unsr), unsr)},
  baser    = {!is.unsorted(srtd)
              !is.unsorted(unsr)},
  intern   = {!.Internal(is.unsorted(srtd, FALSE))
              !.Internal(is.unsorted(unsr, FALSE))})
Results on my machine:
# Unit: microseconds
#      expr       min         lq       mean     median        uq        max neval  cld
#  josilber 30349.108 30737.6440 34550.6599 34113.5970 34964.171 155283.320  1000   c
#     mikec 93167.836 94183.8865 97119.4493 94852.7530 97528.859 229692.328  1000    d
#     baser  1089.670  1168.7400  1322.9341  1296.7375  1347.946   6301.866  1000  b
#    intern   514.816   532.4405   576.2867   560.5955   566.236   2456.237  1000 a
So calling the internal function directly (caveat: you need to be sure your vector is perfectly clean, with no NAs, etc.) gives you ~2x the speed of the base R function, which is in turn ~26x faster (by median) than using diff, which is in turn ~2.8x as fast as my initial choice.