I am using H2O with R to calculate the Euclidean distance between two data frames:
set.seed(121)
#create the data
df1<-data.frame(matrix(rnorm(1000),ncol=10))
df2<-data.frame(matrix(rnorm(300),ncol=10))
#init h2o
h2o.init()
#transform to h2o
df1.h<-as.h2o(df1)
df2.h<-as.h2o(df2)
If I compute the distance directly in base R, e.g. between the first row of each data frame:
distance1<-sqrt(sum((df1[1,]-df2[1,])^2))
And if I use the H2O library:
distance.h2o<-h2o.distance(df1.h[1,],df2.h[1,],"l2")
print(distance1)
print(distance.h2o)
distance1 and distance.h2o are not the same. Does anybody know why? Thanks!
It seems that h2o.distance with the "l2" measure returns the sum of squared differences without taking the square root, so take the square root yourself to get the standard Euclidean distance:
distance.h2o <- h2o.distance(df1.h[1,],df2.h[1,],"l2")
sqrt(distance.h2o)
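To see the relationship without an H2O cluster, here is a minimal base-R sketch using the same data as in the question. It computes the sum of squared differences and its square root for the first rows, then cross-checks the latter against stats::dist. The variable names d, sum_sq, euclid and ref are illustrative only, and the claim that h2o.distance returns the squared value is the observation made above, not something this snippet verifies; it may depend on the H2O version.

```r
set.seed(121)
df1 <- data.frame(matrix(rnorm(1000), ncol = 10))
df2 <- data.frame(matrix(rnorm(300), ncol = 10))

# Difference between the first rows, flattened to a plain numeric vector
d <- unlist(df1[1, ] - df2[1, ])

# Sum of squared differences (assumed to be what h2o.distance returned here)
sum_sq <- sum(d^2)

# Standard Euclidean (L2) distance, same as distance1 in the question
euclid <- sqrt(sum_sq)

# Cross-check with stats::dist, whose default method is "euclidean"
ref <- as.numeric(dist(rbind(unlist(df1[1, ]), unlist(df2[1, ]))))
print(all.equal(euclid, ref))
```

If print(distance.h2o) matches sum_sq rather than euclid, the missing square root is the whole discrepancy.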
Source: https://stackoverflow.com/questions/45782023/wrong-euclidean-distance-h2o-calculations-r