Question
My question is almost identical to this one, except that instead of finding the closest value between a column value and a fixed number (e.g. "2"), I want to find the closest value to the value in another column. Here is some example data:
df <- data.frame(site_no=c("01010500", "01010500", "01010500","02010500", "02010500", "02010500", "03010500", "03010500", "03010500"),
OBS=c(423.9969, 423.9969, 423.9969, 123, 123, 123, 150,150,150),
MOD=c(380,400,360,150,155,135,170,180,140),
HT=c(14,12,15,3,8,19,12,23,10))
Which looks like this:
site_no OBS MOD HT
1 01010500 423.9969 380 14
2 01010500 423.9969 400 12
3 01010500 423.9969 360 15
4 02010500 123.0000 150 3
5 02010500 123.0000 155 8
6 02010500 123.0000 135 19
7 03010500 150.0000 170 12
8 03010500 150.0000 180 23
9 03010500 150.0000 140 10
The goal is, for every "site_no", to find the MOD value closest to the OBS value and then return the corresponding HT. For example, for site_no 01010500, 423.9969 - 400 yields the minimum difference, so the function would return 12. I have tried most of the solutions from the other post, but I get an error about using $ on an atomic vector (the df is recursive, but I think the function is not). I tried:
library(plyr)
ddply(df, .(site_no), function(z) {
  z[abs(z$OBS - z$MOD) == min(abs(z$OBS - z$MOD)), ]
})
Error in z$River_Width..m. - z$chan_width :
non-numeric argument to binary operator
Answer 1:
After grouping by 'site_no', we slice the row that has the minimum absolute difference between 'OBS' and 'MOD':
library(dplyr)
res <- df %>%
group_by(site_no) %>%
slice(which.min(abs(OBS-MOD)))
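One detail worth knowing: which.min() returns only the first minimum, so if two MOD values are equally close to OBS, only the first of those rows is kept. The sketch below illustrates the difference on a small made-up data frame (ties_df and its values are hypothetical, not from the question), comparing slice() with a filter() call that keeps every tied row.

```r
library(dplyr)

# Hypothetical data: MOD values 90 and 110 are equally close to OBS = 100
ties_df <- data.frame(site_no = c("A", "A", "A"),
                      OBS = c(100, 100, 100),
                      MOD = c(90, 110, 150),
                      HT  = c(1, 2, 3))

# which.min() keeps only the first of the tied rows
first_only <- ties_df %>%
  group_by(site_no) %>%
  slice(which.min(abs(OBS - MOD)))

# filter() with == min() keeps every tied row
all_ties <- ties_df %>%
  group_by(site_no) %>%
  filter(abs(OBS - MOD) == min(abs(OBS - MOD)))
```

Which behaviour is preferable depends on whether exactly one HT per site is required.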
NOTE: Using dplyr adds some additional classes such as tbl_df and tibble to the result, which should work with most other functions. If there is any problem, we can convert it back to a data.frame with as.data.frame:
str(res %>%
as.data.frame)
#'data.frame': 3 obs. of 4 variables:
#$ site_no: Factor w/ 3 levels "01010500","02010500",..: 1 2 3
#$ OBS : num 424 123 150
#$ MOD : num 400 135 140
#$ HT : num 12 19 10
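Since the question asks only for the HT value per site, the same idea can also be sketched in base R without dplyr. This is an alternative to the answer above, not part of it; the name closest_ht is chosen here for illustration.

```r
# Example data from the question
df <- data.frame(
  site_no = c("01010500", "01010500", "01010500",
              "02010500", "02010500", "02010500",
              "03010500", "03010500", "03010500"),
  OBS = c(423.9969, 423.9969, 423.9969, 123, 123, 123, 150, 150, 150),
  MOD = c(380, 400, 360, 150, 155, 135, 170, 180, 140),
  HT  = c(14, 12, 15, 3, 8, 19, 12, 23, 10)
)

# Split into one data frame per site, then take the HT on the row
# where MOD is closest to OBS (which.min keeps the first row on ties)
closest_ht <- sapply(split(df, df$site_no), function(z) {
  z$HT[which.min(abs(z$OBS - z$MOD))]
})

print(closest_ht)
```

The result is a numeric vector named by site_no, holding the same HT values (12, 19, 10) as the dplyr output above.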
Source: https://stackoverflow.com/questions/46367939/return-value-based-on-finding-closest-value-between-other-two-columns-in-df