I would like to find the closest value to column x3 below.
data=data.frame(x1=c(24,12,76),x2=c(15,30,20),x3=c(45,27,15))
data
x1 x2 x3
1 24 15 45
2 12 30 27
3
A tidyverse
solution:
data %>%
rowid_to_column() %>%
gather(var, val, -c(x3, rowid)) %>%
mutate(temp = x3 - val) %>%
group_by(rowid) %>%
filter(abs(temp) == min(abs(temp))) %>%
ungroup() %>%
select(val)
val
<dbl>
1 24
2 30
3 20
First, it adds a row ID. Second, it transforms the data from wide to long. Third, it calculates the difference between "x3" and the other variables. Finally, it groups by the row ID and keeps the rows where the absolute difference is the smallest.
Or:
data %>%
rowid_to_column() %>%
gather(var, val, -c(x3, rowid)) %>%
mutate(temp = x3 - val) %>%
group_by(rowid) %>%
filter(abs(temp) == min(abs(temp))) %>%
ungroup() %>%
pull(val)
[1] 24 30 20
Or using an approach originally proposed by @markus (it assumes that your columns are named "x"):
data %>%
mutate(temp = paste0("x", max.col(-abs(.[, -3] - .[, 3])))) %>%
rowwise() %>%
summarise(val = eval(as.symbol(temp)))
val
<dbl>
1 24.
2 30.
3 20.
First, it is assessing the column index of the variable where the absolute difference in regard to "x3" is the smallest and combines it with "x". Then, it evaluates the combination of x and column index as a variable and returns the appropriate value.
Also borrowing the idea from @markus (not assuming that your columns are named "x"):
data %>%
mutate(temp = max.col(-abs(.[, -3] - .[, 3]))) %>%
rowwise %>%
mutate(temp = names(.)[[temp]]) %>%
summarise(val = eval(as.symbol(temp)))
First, it is assessing the column index of the variable where the absolute difference in regard to "x3" is the smallest. Second, it returns the column name based on the column index. Finally, it evaluates it as a variable and returns the appropriate value.
Or a variant where you can reference the "x3" variable by its name and not by column index (the basic idea still from @markus):
data %>%
mutate(temp = max.col(-abs(.[, !grepl("x3", colnames(.))] - .[, grepl("x3", colnames(.))]))) %>%
rowwise %>%
mutate(temp = names(.)[[temp]]) %>%
summarise(val = eval(as.symbol(temp)))
Define a function closest_to_3
that operates on a vector and returns the value in the vector that's closest to the third member:
closest_to_3 <- function(v) v[-3][which.min(abs( v[-3]-v[3] ))]
(The idiom v[-3]
deletes the 3rd member from v
.) Then apply this function to each row of your data frame:
apply(data, 1, closest_to_3)
#[1] 24 30 20
Here is another approach using matrixStats
x <- as.matrix(data[,-3L])
y <- abs(x - .subset2(data, 3L))
x[matrixStats::rowMins(y) == y]
# [1] 24 30 20
Or in base
using vapply
x <- as.matrix(data[,-3L])
y <- abs(x - .subset2(data, 3L))
vapply(1:nrow(data),
function(k) x[k,][which.min(y[k,])],
numeric(1))
# [1] 24 30 20
Use max.col(-abs(data[, 3] - data[, -3]))
to find the column positions of the closest values and use this result as part of a matrix to extract desired values from your data. The matrix is returned by cbind
col <- 3
data[, -col][cbind(1:nrow(data),
max.col(-abs(data[, col] - data[, -col])))]
#[1] 24 30 20