R: how to check whether a vector is ascending/descending


vector1 = c(2, 2, 2, 2, 2, 2)
vector2 = c(2, 2, 3, 3, 3, 3)
vector3 = c(2, 2, 1, 2, 2, 2)

I want to know if the numbers in the vector are ascending/sta

2 Answers

陌清茗

2021-02-12 21:06
You can use diff to compute the differences between consecutive elements and all to check that they are all non-negative:
```
all(diff(vector1) >= 0)
# [1] TRUE
all(diff(vector2) >= 0)
# [1] TRUE
all(diff(vector3) >= 0)
# [1] FALSE
```
The code above checks whether each vector is non-decreasing; replace >= 0 with <= 0 to check whether it is non-increasing. If instead your goal is to identify vectors that are either non-decreasing or non-increasing (that is, they never have both an increasing and a decreasing step), there's a simple modification:
```
!all(c(-1, 1) %in% sign(diff(vector1)))
# [1] TRUE
!all(c(-1, 1) %in% sign(diff(vector2)))
# [1] TRUE
!all(c(-1, 1) %in% sign(diff(vector3)))
# [1] FALSE
```
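If you also want to know which direction a vector moves in, the two checks above can be folded into one function. This is only a sketch: the name monotone_direction and its return labels are made up for illustration, and it assumes a numeric vector with no NAs.

```r
# Hypothetical helper (not from the answer above); assumes x has no NAs.
monotone_direction <- function(x) {
  steps <- sign(diff(x))       # -1, 0, or 1 for each consecutive pair
  has_up <- any(steps > 0)     # at least one increasing step
  has_down <- any(steps < 0)   # at least one decreasing step
  if (has_up && has_down) {
    "neither"
  } else if (has_up) {
    "non-decreasing"
  } else if (has_down) {
    "non-increasing"
  } else {
    "constant"
  }
}

monotone_direction(c(2, 2, 2, 2, 2, 2))  # "constant"
monotone_direction(c(2, 2, 3, 3, 3, 3))  # "non-decreasing"
monotone_direction(c(2, 2, 1, 2, 2, 2))  # "neither"
```

A constant vector is both non-decreasing and non-increasing, so it gets its own label here; whether that is the right call depends on what you do with the result.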

耶瑟儿～

2021-02-12 21:09

There is a base R function called is.unsorted that is ideal for this situation:

```
!is.unsorted(vector1)
# [1] TRUE
!is.unsorted(vector2)
# [1] TRUE
!is.unsorted(vector3)
# [1] FALSE
```

This function is very fast because it calls almost directly into compiled C code.
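is.unsorted only tests for ascending order, so the descending and strictly-ascending cases need a small adjustment. As a rough sketch, using the documented strictly argument and reversing the vector for the descending case (the example vectors are made up):

```r
asc <- c(2, 2, 3, 3)
desc <- c(3, 2, 2, 1)

!is.unsorted(asc)                   # TRUE: non-decreasing
!is.unsorted(asc, strictly = TRUE)  # FALSE: ties break strict ordering
!is.unsorted(rev(desc))             # TRUE: desc is non-increasing
!is.unsorted(rev(asc))              # FALSE: asc is not non-increasing
```

Reversing copies the vector, so for very large inputs this costs somewhat more than the plain ascending check.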

My initial thought was to use sort and identical, a la identical(sort(vector1), vector1), but this is pretty slow; that said, I think this approach can be extended to more flexible situations.
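One way the sort-and-compare idea might extend is to descending order, through the decreasing argument of sort. This is a sketch on a made-up vector; note that sort drops NAs by default, so it assumes the input has none.

```r
x <- c(3, 2, 2, 1)

identical(sort(x, decreasing = TRUE), x)  # TRUE: x is non-increasing
identical(sort(x), x)                     # FALSE: x is not non-decreasing
```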

If speed were really crucial, we could skip some of the overhead of is.unsorted and call the internal function directly:

```
.Internal(is.unsorted(vector1, FALSE))
```

(the FALSE passes FALSE to the argument strictly). This offered a ~4x speed-up on a small vector.

To get a sense of just how fast the final option is, here's a benchmark:

```
library(microbenchmark)
set.seed(10101)
srtd <- sort(sample(1e6, replace = TRUE)) # a sorted test case
unsr <- sample(1e6, replace = TRUE)       # an unsorted test case

# Each expression runs the check on both the sorted and the unsorted vector.
microbenchmark(times = 1000L,
               josilber = {all(diff(srtd) >= 0)
                           all(diff(unsr) >= 0)},
               mikec = {identical(sort(srtd), srtd)
                        identical(sort(unsr), unsr)},
               baser = {!is.unsorted(srtd)
                        !is.unsorted(unsr)},
               intern = {!.Internal(is.unsorted(srtd, FALSE))
                         !.Internal(is.unsorted(unsr, FALSE))})
```

Results on my machine:

```
# Unit: microseconds
#      expr       min         lq       mean     median        uq        max neval  cld
#  josilber 30349.108 30737.6440 34550.6599 34113.5970 34964.171 155283.320  1000   c 
#     mikec 93167.836 94183.8865 97119.4493 94852.7530 97528.859 229692.328  1000    d
#     baser  1089.670  1168.7400  1322.9341  1296.7375  1347.946   6301.866  1000  b  
#    intern   514.816   532.4405   576.2867   560.5955   566.236   2456.237  1000 a
```

So, comparing medians, calling the internal function directly (caveat: you need to be sure your vector is perfectly clean, with no NAs, etc.) is roughly 2x faster than the base R function, which is in turn roughly 26x faster than using diff, which is in turn roughly 3x faster than my initial choice.
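To make the NA caveat concrete, here is how the public approaches behave on a vector with a missing value. The wrapper checked_is_sorted is a hypothetical name, shown only as one way to fail loudly instead of returning NA.

```r
x <- c(1, NA, 3)

is.unsorted(x)                # NA: the wrapper refuses to guess
is.unsorted(x, na.rm = TRUE)  # FALSE: NAs are dropped before checking
all(diff(x) >= 0)             # NA: every difference involves the NA

# Hypothetical guard: stop early on input the fast path cannot handle.
checked_is_sorted <- function(x) {
  stopifnot(is.atomic(x), !anyNA(x))
  !is.unsorted(x)
}

checked_is_sorted(c(1, 2, 3))  # TRUE
```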
