Finding local maxima and minima

前端 未结 14 1654
说谎
说谎 2020-11-22 07:24

I\'m looking for a computationally efficient way to find local maxima/minima for a large list of numbers in R. Hopefully without for loops...

For exampl

相关标签:
14条回答
  • 2020-11-22 08:10

    Use the zoo library function rollapply:

    x <- c(1, 2, 3, 2, 1, 1, 2, 1)
    library(zoo)
     xz <- as.zoo(x)
     rollapply(xz, 3, function(x) which.min(x)==2)
    #    2     3     4     5     6     7 
    #FALSE FALSE FALSE  TRUE FALSE FALSE 
     rollapply(xz, 3, function(x) which.max(x)==2)
    #    2     3     4     5     6     7 
    #FALSE  TRUE FALSE FALSE FALSE  TRUE 
    

    Then pull the index using the 'coredata' for those values where 'which.max' is a "center value" signaling a local maximum. You could obviously do the same for local minima using which.min instead of which.max.

     rxz <- rollapply(xz, 3, function(x) which.max(x)==2)
     index(rxz)[coredata(rxz)]
    #[1] 3 7
    

    I am assuming you do not want the starting or ending values, but if you do , you could pad the ends of your vectors before processing, rather like telomeres do on chromosomes.

    (I'm noting the ppc package ("Peak Probability Contrasts" for doing mass spectrometry analyses, simply because I was unaware of its availability until reading @BenBolker's comment above, and I think adding these few words will increase the chances that someone with a mass-spec interest will see this on a search.)

    0 讨论(0)
  • 2020-11-22 08:13

    I posted this elsewhere, but I think this is an interesting way to go about it. I'm not sure what its computational efficiency is, but it's a very concise way of solving the problem.

    vals=rbinom(1000,20,0.5)
    
    text=paste0(substr(format(diff(vals),scientific=TRUE),1,1),collapse="")
    
    sort(na.omit(c(gregexpr('[ ]-',text)[[1]]+1,ifelse(grepl('^-',text),1,NA),
     ifelse(grepl('[^-]$',text),length(vals),NA))))
    
    0 讨论(0)
  • 2020-11-22 08:17

    I took a stab at this today. I know you said hopefully without for loops but I stuck with using the apply function. Somewhat compact and fast and allows threshold specification so you can go greater than 1.

    The function:

    inflect <- function(x, threshold = 1){
      up   <- sapply(1:threshold, function(n) c(x[-(seq(n))], rep(NA, n)))
      down <-  sapply(-1:-threshold, function(n) c(rep(NA,abs(n)), x[-seq(length(x), length(x) - abs(n) + 1)]))
      a    <- cbind(x,up,down)
      list(minima = which(apply(a, 1, min) == a[,1]), maxima = which(apply(a, 1, max) == a[,1]))
    }
    

    To a visualize it/play with thresholds you can run the following code:

    # Pick a desired threshold # to plot up to
    n <- 2
    # Generate Data
    randomwalk <- 100 + cumsum(rnorm(50, 0.2, 1)) # climbs upwards most of the time
    bottoms <- lapply(1:n, function(x) inflect(randomwalk, threshold = x)$minima)
    tops <- lapply(1:n, function(x) inflect(randomwalk, threshold = x)$maxima)
    # Color functions
    cf.1 <- grDevices::colorRampPalette(c("pink","red"))
    cf.2 <- grDevices::colorRampPalette(c("cyan","blue"))
    plot(randomwalk, type = 'l', main = "Minima & Maxima\nVariable Thresholds")
    for(i in 1:n){
      points(bottoms[[i]], randomwalk[bottoms[[i]]], pch = 16, col = cf.1(n)[i], cex = i/1.5)
    }
    for(i in 1:n){
      points(tops[[i]], randomwalk[tops[[i]]], pch = 16, col = cf.2(n)[i], cex = i/1.5)
    }
    legend("topleft", legend = c("Minima",1:n,"Maxima",1:n), 
           pch = rep(c(NA, rep(16,n)), 2), col = c(1, cf.1(n),1, cf.2(n)), 
           pt.cex =  c(rep(c(1, c(1:n) / 1.5), 2)), cex = .75, ncol = 2)
    

    0 讨论(0)
  • 2020-11-22 08:18

    In the case I'm working on, duplicates are frequent. So I have implemented a function that allows finding first or last extrema (min or max):

    locate_xtrem <- function (x, last = FALSE)
    {
      # use rle to deal with duplicates
      x_rle <- rle(x)
    
      # force the first value to be identified as an extrema
      first_value <- x_rle$values[1] - x_rle$values[2]
    
      # differentiate the series, keep only the sign, and use 'rle' function to
      # locate increase or decrease concerning multiple successive values.
      # The result values is a series of (only) -1 and 1.
      #
      # ! NOTE: with this method, last value will be considered as an extrema
      diff_sign_rle <- c(first_value, diff(x_rle$values)) %>% sign() %>% rle()
    
      # this vector will be used to get the initial positions
      diff_idx <- cumsum(diff_sign_rle$lengths)
    
      # find min and max
      diff_min <- diff_idx[diff_sign_rle$values < 0]
      diff_max <- diff_idx[diff_sign_rle$values > 0]
    
      # get the min and max indexes in the original series
      x_idx <- cumsum(x_rle$lengths)
      if (last) {
        min <- x_idx[diff_min]
        max <- x_idx[diff_max]
      } else {
        min <- x_idx[diff_min] - x_rle$lengths[diff_min] + 1
        max <- x_idx[diff_max] - x_rle$lengths[diff_max] + 1
      }
      # just get number of occurences
      min_nb <- x_rle$lengths[diff_min]
      max_nb <- x_rle$lengths[diff_max]
    
      # format the result as a tibble
      bind_rows(
        tibble(Idx = min, Values = x[min], NB = min_nb, Status = "min"),
        tibble(Idx = max, Values = x[max], NB = max_nb, Status = "max")) %>%
        arrange(.data$Idx) %>%
        mutate(Last = last) %>%
        mutate_at(vars(.data$Idx, .data$NB), as.integer)
    }
    

    The answer to the original question is:

    > x <- c(1, 2, 3, 2, 1, 1, 2, 1)
    > locate_xtrem(x)
    # A tibble: 5 x 5
        Idx Values    NB Status Last 
      <int>  <dbl> <int> <chr>  <lgl>
    1     1      1     1 min    FALSE
    2     3      3     1 max    FALSE
    3     5      1     2 min    FALSE
    4     7      2     1 max    FALSE
    5     8      1     1 min    FALSE
    

    The result indicates that the second minimum is equal to 1 and that this value is repeated twice starting at index 5. Therefore, a different result could be obtained by indicating this time to the function to find the last occurrences of local extremas:

    > locate_xtrem(x, last = TRUE)
    # A tibble: 5 x 5
        Idx Values    NB Status Last 
      <int>  <dbl> <int> <chr>  <lgl>
    1     1      1     1 min    TRUE 
    2     3      3     1 max    TRUE 
    3     6      1     2 min    TRUE 
    4     7      2     1 max    TRUE 
    5     8      1     1 min    TRUE 
    

    Depending on the objective, it is then possible to switch between the first and the last value of a local extremas. The second result with last = TRUE could also be obtained from an operation between columns "Idx" and "NB"...

    Finally to deal with noise in the data, a function could be implemented to remove fluctuations below a given threshold. Code is not exposed since it goes beyond the initial question. I have wrapped it in a package (mainly to automate the testing process) and I give below a result example:

    x_series %>% xtrem::locate_xtrem()
    

    x_series %>% xtrem::locate_xtrem() %>% remove_noise()
    

    0 讨论(0)
  • 2020-11-22 08:19

    In the pracma package, use the

    tt <- c(1,2,3,2,1, 1, 2, 1)
    tt_peaks <- findpeaks(tt, zero = "0", peakpat = NULL,
           minpeakheight = -Inf, minpeakdistance = 1, threshold = 0, npeaks = 0, sortstr = FALSE)
    
      [,1] [,2] [,3] [,4]
      [1,]  3    3    1    5
      [2,]  2    7    6    8
    

    That returns a matrix with 4 columns. The first column is showing the local peaks' absolute values. The 2nd column are the indices The 3rd and 4th column are the start and end of the peaks (with potential overlap).

    See https://www.rdocumentation.org/packages/pracma/versions/1.9.9/topics/findpeaks for details.

    One caveat: I used it in a series of non-integers, and the peak was one index too late (for all peaks) and I do not know why. So I had to manually remove "1" from my index vector (no big deal).

    0 讨论(0)
  • 2020-11-22 08:20

    Here's the solution for minima:

    @Ben's solution

    x <- c(1,2,3,2,1,2,1)
    which(diff(sign(diff(x)))==+2)+1 # 5
    

    Please regard the cases at Tommy's post!

    @Tommy's solution:

    localMinima <- function(x) {
      # Use -Inf instead if x is numeric (non-integer)
      y <- diff(c(.Machine$integer.max, x)) > 0L
      rle(y)$lengths
      y <- cumsum(rle(y)$lengths)
      y <- y[seq.int(1L, length(y), 2L)]
      if (x[[1]] == x[[2]]) {
        y <- y[-1]
      }
      y
    }
    
    x <- c(1,2,9,9,2,1,1,5,5,1)
    localMinima(x) # 1, 7, 10
    x <- c(2,2,9,9,2,1,1,5,5,1)
    localMinima(x) # 7, 10
    x <- c(3,2,9,9,2,1,1,5,5,1)
    localMinima(x) # 2, 7, 10
    

    Please regard: Neither localMaxima nor localMinima can handle duplicated maxima/minima at start!

    0 讨论(0)
提交回复
热议问题