Question
Calculating Euclidean distances in R is easy. A good example can be found here. The vectorised form is:
sqrt((known_data[, 1] - unknown_data[, 1])^2 + (known_data[, 2] - unknown_data[, 2])^2)
What would be the fastest, most efficient way to get the Euclidean distances between each row of one data frame and all rows of another data frame? A particular function from the apply() family? Thanks!
Answer 1:
Maybe you can try outer + dist, like below:
outer(
1:nrow(known_data),
1:nrow(unknown_data),
FUN = Vectorize(function(x,y) dist(rbind(known_data[x,],unknown_data[y,])))
)
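The outer + Vectorize call above invokes dist once per pair of rows, which can become slow for large inputs. As a possible alternative (not part of the original answer), the sketch below computes the same cross-distance matrix in one pass with the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b. It assumes all columns are numeric; the function name cross_dist and the iris-based sample data are hypothetical stand-ins for known_data and unknown_data.

```r
# Hypothetical sample data standing in for known_data / unknown_data
known_data   <- as.matrix(iris[1:5, 1:4])
unknown_data <- as.matrix(iris[6:10, 1:4])

# cross_dist is an illustrative helper, not a base R function.
# Returns an nrow(a) x nrow(b) matrix of Euclidean distances.
cross_dist <- function(a, b) {
  a <- as.matrix(a)
  b <- as.matrix(b)
  # Squared distances: ||a||^2 + ||b||^2 - 2 * a %*% t(b)
  sq <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * tcrossprod(a, b)
  # Clamp tiny negative values caused by floating-point rounding
  sqrt(pmax(sq, 0))
}

d <- cross_dist(known_data, unknown_data)
```

Here d[i, j] holds the distance between row i of known_data and row j of unknown_data, the same layout that the outer call produces.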
Answer 2:
I would use the dist() function (which is very efficient) on the combination of the two data frames and then remove the unneeded distances, if you like. Example:
df1 <- iris[1:5, -5]
df2 <- iris[6:10, -5]
all_distances <- dist(rbind(df1, df2))
all_distances <- as.matrix(all_distances)
# remove unneeded distances
all_distances[1:5, 1:5] <- NA
all_distances[6:10, 6:10] <- NA
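Instead of masking the within-group blocks with NA, the cross-group block can be extracted directly. This is a minimal sketch using the same iris subsets as the example above; the names n1, n2 and cross_distances are introduced here for illustration only.

```r
df1 <- iris[1:5, -5]
df2 <- iris[6:10, -5]
n1 <- nrow(df1)
n2 <- nrow(df2)

# Full pairwise distance matrix over the stacked rows
all_distances <- as.matrix(dist(rbind(df1, df2)))

# Keep only distances from rows of df1 (matrix rows) to rows of df2 (matrix columns)
cross_distances <- all_distances[seq_len(n1), n1 + seq_len(n2), drop = FALSE]
```

The result is an n1 x n2 matrix in which cross_distances[i, j] is the distance between row i of df1 and row j of df2. Note that dist still computes the within-group distances, so for two large data frames this spends time and memory on values that are then discarded.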
Source: https://stackoverflow.com/questions/64269505/euclidean-distances-between-rows-of-two-data-frames-in-r