Closest value to a specific column in R

自闭症患者 2021-02-13 06:26

For each row, I would like to find the value in the other columns (x1, x2) that is closest to the value in column x3 below.

data <- data.frame(x1 = c(24, 12, 76), x2 = c(15, 30, 20), x3 = c(45, 27, 15))
data
  x1 x2 x3
1 24 15 45
2 12 30 27
3 76 20 15


        
4 Answers
  • 2021-02-13 06:35

    A tidyverse solution:

    library(tidyverse)

    data %>%
      rowid_to_column() %>%
      gather(var, val, -c(x3, rowid)) %>%
      mutate(temp = x3 - val) %>%
      group_by(rowid) %>%
      filter(abs(temp) == min(abs(temp))) %>%
      ungroup() %>%
      select(val)
    
        val
      <dbl>
    1    24
    2    30
    3    20
    

    First, it adds a row ID. Second, it reshapes the data from wide to long format. Third, it calculates the difference between "x3" and each of the other variables. Finally, it groups by the row ID and keeps the row(s) where the absolute difference is the smallest; if two values are equally close, both are kept.
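    The same steps can be sketched with newer tidyr/dplyr verbs. This is not the answerer's code: pivot_longer() stands in for gather(), and slice_min() with with_ties = FALSE keeps exactly one value per row even when two columns are equally close. It assumes dplyr >= 1.0.0 and tidyr >= 1.0.0; the name closest is only illustrative.

    ```r
    library(dplyr)
    library(tidyr)
    library(tibble)

    data <- data.frame(x1 = c(24, 12, 76), x2 = c(15, 30, 20), x3 = c(45, 27, 15))

    closest <- data %>%
      rowid_to_column() %>%                        # remember the original row
      pivot_longer(-c(x3, rowid),                  # wide to long, keeping x3 alongside
                   names_to = "var", values_to = "val") %>%
      group_by(rowid) %>%
      slice_min(abs(x3 - val), n = 1, with_ties = FALSE) %>%  # one closest value per row
      ungroup() %>%
      arrange(rowid) %>%                           # make the row order explicit
      pull(val)

    closest
    ```

    Sorting by rowid at the end makes the result line up with the rows of data regardless of how the long table happens to be ordered.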

    Or:

    data %>%
      rowid_to_column() %>%
      gather(var, val, -c(x3, rowid)) %>%
      mutate(temp = x3 - val) %>%
      group_by(rowid) %>%
      filter(abs(temp) == min(abs(temp))) %>%
      ungroup() %>%
      pull(val)
    
    [1] 24 30 20
    

    Or using an approach originally proposed by @markus (it assumes that your columns are named "x" followed by their position, i.e. x1, x2, ...):

    data %>%
     mutate(temp = paste0("x", max.col(-abs(.[, -3] - .[, 3])))) %>%
     rowwise() %>%
     summarise(val = eval(as.symbol(temp)))
    
        val
      <dbl>
    1   24.
    2   30.
    3   20.
    

    First, it finds, for each row, the column index of the variable whose absolute difference from "x3" is the smallest and pastes that index onto "x" to form a column name. Then, it evaluates that name as a variable and returns the corresponding value.

    Also borrowing the idea from @markus (not assuming that your columns are named "x"):

    data %>%
     mutate(temp = max.col(-abs(.[, -3] - .[, 3]))) %>%
     rowwise %>%
     mutate(temp = names(.)[[temp]]) %>%
     summarise(val = eval(as.symbol(temp)))
    

    First, it finds, for each row, the column index of the variable whose absolute difference from "x3" is the smallest. Second, it looks up the column name at that index. Finally, it evaluates that name as a variable and returns the corresponding value.

    Or a variant where you can reference the "x3" variable by its name and not by column index (the basic idea still from @markus):

    data %>%
     mutate(temp = max.col(-abs(.[, !grepl("x3", colnames(.))] - .[, grepl("x3", colnames(.))]))) %>% 
     rowwise %>%
     mutate(temp = names(.)[[temp]]) %>%
     summarise(val = eval(as.symbol(temp)))
    
  • 2021-02-13 06:48

    Define a function closest_to_3 that operates on a vector and returns the value in the vector that's closest to the third member:

    closest_to_3 <- function(v) v[-3][which.min(abs( v[-3]-v[3] ))]
    

    (The idiom v[-3] returns v without its 3rd member; v itself is not modified.) Then apply this function to each row of your data frame:

    apply(data, 1, closest_to_3)
    #[1] 24 30 20
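
    As a hedged sketch, the same idea can be written without hard-coding position 3. Here closest_to_ref and ref_col are hypothetical names introduced only for illustration, and the reference column is looked up by name:

    ```r
    data <- data.frame(x1 = c(24, 12, 76), x2 = c(15, 30, 20), x3 = c(45, 27, 15))

    # Hypothetical helper: `ref` is the position of the reference element in v.
    closest_to_ref <- function(v, ref) {
      others <- v[-ref]                          # everything except the reference
      others[which.min(abs(others - v[ref]))]    # which.min picks the first on ties
    }

    ref_col <- match("x3", names(data))          # position of the reference column
    result <- unname(apply(data, 1, closest_to_ref, ref = ref_col))
    result
    ```

    Because apply() first converts the data frame to a matrix, this sketch assumes all columns are numeric.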
    
  • 2021-02-13 06:51

    Here is another approach using matrixStats:

    x <- as.matrix(data[,-3L])
    y <- abs(x - .subset2(data, 3L))
    # transpose so that matches come back in row order rather than column order
    t(x)[t(y == matrixStats::rowMins(y))]
    # [1] 24 30 20
    

    Or in base R using vapply:

    x <- as.matrix(data[,-3L])
    y <- abs(x - .subset2(data, 3L))
    vapply(seq_len(nrow(data)),
           function(k) x[k,][which.min(y[k,])], 
           numeric(1))
    # [1] 24 30 20
    
  • 2021-02-13 06:53

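    Use max.col(-abs(data[, 3] - data[, -3])) to find, for each row, the column position of the closest value. Then combine the row numbers and these column positions with cbind into a two-column index matrix and use it to extract the desired values from your data. Note that max.col breaks ties at random by default; pass ties.method = "first" for a deterministic result.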

    col <- 3
    data[, -col][cbind(1:nrow(data),
                       max.col(-abs(data[, col] - data[, -col])))]
    #[1] 24 30 20
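
    A hedged variant of the same technique, selecting the reference column by name and breaking ties deterministically. The names ref, others, dist and idx are illustrative and not part of the original answer:

    ```r
    data <- data.frame(x1 = c(24, 12, 76), x2 = c(15, 30, 20), x3 = c(45, 27, 15))

    ref <- "x3"                                             # reference column, by name
    others <- as.matrix(data[, names(data) != ref, drop = FALSE])
    dist <- abs(others - data[[ref]])                       # per-row distance to the reference
    idx <- max.col(-dist, ties.method = "first")            # column of the smallest distance
    result <- others[cbind(seq_len(nrow(others)), idx)]     # (row, column) matrix indexing
    result
    ```

    Negating the distances turns "smallest distance" into "largest value", which is what max.col searches for.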
    