R: how to check whether a vector is ascending/descending

被撕碎了的回忆 2021-02-12 20:31
vector1 = c(2, 2, 2, 2, 2, 2)
vector2 = c(2, 2, 3, 3, 3, 3)
vector3 = c(2, 2, 1, 2, 2, 2)

I want to know if the numbers in the vector are ascending/sta

2 answers
  • 2021-02-12 21:06

    You can use diff to compute the differences between consecutive elements and all to check whether they are all non-negative:

    all(diff(vector1) >= 0)
    # [1] TRUE
    all(diff(vector2) >= 0)
    # [1] TRUE
    all(diff(vector3) >= 0)
    # [1] FALSE
    

    The above code checks whether each vector is non-decreasing; replace >= 0 with <= 0 to check whether it is non-increasing. If instead your goal is to identify vectors that are either non-decreasing or non-increasing (that is, they don't contain both an increasing and a decreasing step), there's a simple modification:

    !all(c(-1, 1) %in% sign(diff(vector1)))
    # [1] TRUE
    !all(c(-1, 1) %in% sign(diff(vector2)))
    # [1] TRUE
    !all(c(-1, 1) %in% sign(diff(vector3)))
    # [1] FALSE
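
    The checks above can be folded into a single helper. This is only a minimal sketch: the name is_monotone and its direction and strict arguments are hypothetical (not part of base R), and it assumes a numeric vector without NAs.

    ```r
    # Hypothetical helper (not part of base R): classify a numeric vector
    # by looking at the differences between consecutive elements.
    # Assumes x contains no NAs.
    is_monotone <- function(x, direction = c("either", "up", "down"), strict = FALSE) {
      direction <- match.arg(direction)
      d <- diff(x)
      # Vectors of length 0 or 1 have no steps and count as monotone here.
      if (length(d) == 0L) return(TRUE)
      up   <- if (strict) all(d > 0) else all(d >= 0)
      down <- if (strict) all(d < 0) else all(d <= 0)
      switch(direction, up = up, down = down, either = up || down)
    }

    is_monotone(c(2, 2, 3, 3, 3, 3), "up")                 # non-decreasing?
    is_monotone(c(2, 2, 3, 3, 3, 3), "up", strict = TRUE)  # ties break strictness
    is_monotone(c(2, 2, 1, 2, 2, 2))                       # goes down, then up
    ```

    With strict = TRUE a constant vector such as vector1 is neither "up" nor "down", which may or may not be what you want.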
    
  • 2021-02-12 21:09

    There is a base R function called is.unsorted that is ideal for this situation:

    !is.unsorted(vector1)
    # [1] TRUE
    !is.unsorted(vector2)
    # [1] TRUE
    !is.unsorted(vector3)
    # [1] FALSE
    

    This function is very fast because it calls almost directly into compiled C code.
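
    is.unsorted only tests ascending order, but it also takes strictly and na.rm arguments. The sketch below (v_up and v_down are made-up example vectors) shows a strict check, a descending check done by reversing the vector, and what happens with missing values:

    ```r
    v_up   <- c(2, 2, 3, 3, 3, 3)
    v_down <- c(5, 4, 4, 1)

    # strictly = TRUE also treats ties as "unsorted".
    !is.unsorted(v_up)                   # non-decreasing?
    !is.unsorted(v_up, strictly = TRUE)  # strictly increasing?

    # is.unsorted only tests ascending order, so reverse the vector
    # to ask whether it is non-increasing.
    !is.unsorted(rev(v_down))

    # With missing values the default result is NA;
    # na.rm = TRUE drops them before checking.
    is.unsorted(c(1, NA, 3))
    is.unsorted(c(1, NA, 3), na.rm = TRUE)
    ```

    Note that rev makes a copy of the vector, so the descending check gives up some of the speed advantage.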

    My initial thought was to use sort and identical, a la identical(sort(vector1), vector1), but this is pretty slow; that said, I think this approach can be extended to more flexible situations.
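
    As one possible illustration of that flexibility (a sketch, with v_down as a made-up example vector), the sort/identical idea extends to descending order via the decreasing argument of sort, and to strict ordering by also ruling out repeated values:

    ```r
    v_down <- c(5, 4, 4, 1)

    # Non-increasing: compare against the vector sorted in decreasing order.
    identical(sort(v_down, decreasing = TRUE), v_down)

    # Strictly decreasing: additionally require that no value repeats.
    identical(sort(v_down, decreasing = TRUE), v_down) &&
      anyDuplicated(v_down) == 0L
    ```

    Keep in mind that sort drops NAs by default, so a vector containing NA will never compare identical to its sorted version.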

    If speed were really crucial, we could skip some of the overhead of is.unsorted and call the internal function directly:

    .Internal(is.unsorted(vector1, FALSE))
    

    (the FALSE is the value passed to the strictly argument). This offered a ~4x speed-up on a small vector.

    To get a sense of just how fast the final option is, here's a benchmark:

    library(microbenchmark)
    set.seed(10101)
    srtd <- sort(sample(1e6, replace = TRUE)) # a sorted test case
    unsr <- sample(1e6, replace = TRUE)       # an unsorted test case
    
    microbenchmark(times = 1000L,
                   josilber = {all(diff(srtd) >= 0)
                             all(diff(unsr) >= 0)},
                   mikec = {identical(sort(srtd), srtd)
                          identical(sort(unsr), unsr)},
                   baser = {!is.unsorted(srtd)
                          !is.unsorted(unsr)},
                   intern = {!.Internal(is.unsorted(srtd, FALSE)) 
                           !.Internal(is.unsorted(unsr, FALSE))})
    

    Results on my machine:

    # Unit: microseconds
    #      expr       min         lq       mean     median        uq        max neval  cld
    #  josilber 30349.108 30737.6440 34550.6599 34113.5970 34964.171 155283.320  1000   c 
    #     mikec 93167.836 94183.8865 97119.4493 94852.7530 97528.859 229692.328  1000    d
    #     baser  1089.670  1168.7400  1322.9341  1296.7375  1347.946   6301.866  1000  b  
    #    intern   514.816   532.4405   576.2867   560.5955   566.236   2456.237  1000 a   
    

    So, comparing medians, calling the internal function directly (caveat: you need to be sure your vector is perfectly clean, with no NAs, etc.) is ~2x as fast as the base R function, which is in turn ~26x faster than using diff, which is in turn almost 3x as fast as my initial choice.
