Finding local maxima and minima

说谎 2020-11-22 07:24

I'm looking for a computationally efficient way to find local maxima/minima for a large list of numbers in R. Hopefully without for loops...

For exampl

14 Answers
  •  悲&欢浪女
    2020-11-22 08:18

    In the case I'm working on, duplicates are frequent. So I have implemented a function that allows finding first or last extrema (min or max):

    # dplyr provides %>%, tibble(), bind_rows(), arrange(), mutate() and across()
    library(dplyr)

    locate_xtrem <- function (x, last = FALSE)
    {
      # use rle to deal with duplicates
      x_rle <- rle(x)
    
      # force the first value to be identified as an extremum
      first_value <- x_rle$values[1] - x_rle$values[2]
    
      # difference the run values, keep only the sign, and apply 'rle' again to
      # group successive increases or decreases into single runs.
      # The resulting values form a series of (only) -1 and 1.
      #
      # ! NOTE: with this method, the last value is always reported as an extremum
      diff_sign_rle <- c(first_value, diff(x_rle$values)) %>% sign() %>% rle()
    
      # this vector will be used to get the initial positions
      diff_idx <- cumsum(diff_sign_rle$lengths)
    
      # find min and max
      diff_min <- diff_idx[diff_sign_rle$values < 0]
      diff_max <- diff_idx[diff_sign_rle$values > 0]
    
      # get the min and max indexes in the original series
      x_idx <- cumsum(x_rle$lengths)
      if (last) {
        min <- x_idx[diff_min]
        max <- x_idx[diff_max]
      } else {
        min <- x_idx[diff_min] - x_rle$lengths[diff_min] + 1
        max <- x_idx[diff_max] - x_rle$lengths[diff_max] + 1
      }
      # number of occurrences of each extremum (length of its run)
      min_nb <- x_rle$lengths[diff_min]
      max_nb <- x_rle$lengths[diff_max]
    
      # format the result as a tibble
      bind_rows(
        tibble(Idx = min, Values = x[min], NB = min_nb, Status = "min"),
        tibble(Idx = max, Values = x[max], NB = max_nb, Status = "max")) %>%
        arrange(.data$Idx) %>%
        mutate(Last = last) %>%
        mutate(across(c(Idx, NB), as.integer))
    }
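
    The index bookkeeping is the core of the function, so here is a minimal base R sketch of that step alone. It is an illustration, not the function above: the names run_start, run_end, turn, min_run and max_run are made up for this example, and unlike locate_xtrem it reports interior extrema only (the first and last values are not forced to be extrema).

    ```r
    x <- c(1, 2, 3, 2, 1, 1, 2, 1)

    # collapse duplicates: one entry per run of equal values
    runs <- rle(x)

    # position of the last and first element of each run in the original series
    run_end   <- cumsum(runs$lengths)
    run_start <- run_end - runs$lengths + 1

    # sign change of the slope between consecutive runs:
    # negative = the slope goes from up to down (maximum),
    # positive = the slope goes from down to up (minimum)
    turn <- diff(sign(diff(runs$values)))
    max_run <- which(turn < 0) + 1
    min_run <- which(turn > 0) + 1

    run_start[max_run]  # first index of each interior maximum
    run_start[min_run]  # first index of each interior minimum
    run_end[min_run]    # last index of each interior minimum
    ```

    Because the slope is computed on the run values rather than on x itself, a plateau such as the repeated 1 at positions 5 and 6 counts as a single minimum, and run_start or run_end selects which end of the plateau is reported.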
    

    The answer to the original question is:

    > x <- c(1, 2, 3, 2, 1, 1, 2, 1)
    > locate_xtrem(x)
    # A tibble: 5 x 5
        Idx Values    NB Status Last 
      <int>  <dbl> <int> <chr>  <lgl>
    1     1      1     1 min    FALSE
    2     3      3     1 max    FALSE
    3     5      1     2 min    FALSE
    4     7      2     1 max    FALSE
    5     8      1     1 min    FALSE
    

    The result shows that the second minimum is equal to 1 and that this value occurs twice, starting at index 5. Calling the function with last = TRUE instead returns the last occurrence of each local extremum:

    > locate_xtrem(x, last = TRUE)
    # A tibble: 5 x 5
        Idx Values    NB Status Last 
      <int>  <dbl> <int> <chr>  <lgl>
    1     1      1     1 min    TRUE 
    2     3      3     1 max    TRUE 
    3     6      1     2 min    TRUE 
    4     7      2     1 max    TRUE 
    5     8      1     1 min    TRUE 
    

    Depending on the objective, it is then possible to switch between the first and the last occurrence of a local extremum. The second result (last = TRUE) could also be derived from the "Idx" and "NB" columns of the first one...
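
    As a small sketch of that column operation: the last occurrence of a plateau is its first index plus its length minus one. The data frame below simply retypes the Idx and NB columns of the first result (last = FALSE); the names first and last_idx are chosen for this example only.

    ```r
    # Idx and NB columns copied from the locate_xtrem(x) output above
    first <- data.frame(
      Idx = c(1L, 3L, 5L, 7L, 8L),
      NB  = c(1L, 1L, 2L, 1L, 1L)
    )

    # last index of each extremum = first index + run length - 1
    last_idx <- first$Idx + first$NB - 1L
    last_idx
    # 1 3 6 7 8, the Idx column of locate_xtrem(x, last = TRUE)
    ```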

    Finally, to deal with noise in the data, a function could be implemented to remove fluctuations below a given threshold. The code is not shown here since it goes beyond the initial question. I have wrapped it in a package (mainly to automate the testing process), and an example of the result is given below:

    x_series %>% xtrem::locate_xtrem()
    

    x_series %>% xtrem::locate_xtrem() %>% remove_noise()
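
    Since the filtering code is not shown, here is one possible way such a threshold filter could work. It is only a sketch under my own assumptions and not the remove_noise() function from the package: the function name drop_small_swings, its arguments, and the rule (repeatedly delete the adjacent pair of extrema with the smallest swing while that swing is below the threshold) are all hypothetical.

    ```r
    # idx, values: positions and values of alternating local extrema
    # threshold:   smallest swing between neighbouring extrema worth keeping
    drop_small_swings <- function(idx, values, threshold) {
      repeat {
        # with fewer than three extrema there is nothing left to merge
        if (length(values) < 3) break

        # amplitude of each swing between neighbouring extrema
        swings <- abs(diff(values))
        k <- which.min(swings)
        if (swings[k] >= threshold) break

        # drop both ends of the smallest swing; the remaining
        # extrema still alternate between minima and maxima
        keep <- -c(k, k + 1)
        idx <- idx[keep]
        values <- values[keep]
      }
      data.frame(Idx = idx, Values = values)
    }

    # min 1, max 5, min 4.8, max 9, min 2: the 5 -> 4.8 dip is treated as noise
    res <- drop_small_swings(1:5, c(1, 5, 4.8, 9, 2), threshold = 1)
    res
    ```

    The loop runs over the extrema only, which are usually far fewer than the raw samples. One limitation of this sketch is that a small swing at either end of the series removes the endpoint as well.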
    
