Convert a dataframe to an object of class “dist” without actually calculating distances in R

醉酒成梦 2020-12-04 01:45

I have a data frame with distances:

df<-data.frame(site.x=c("A","A","A","B","B","C"),
site.y=c("B","C","D","C","D","D"),Distanc

4 Answers
  • 2020-12-04 01:54

    There is nothing stopping you from creating the dist object yourself. It is just a vector of distances with attributes that set up the labels, size, etc.

    Using your df, this is how to do it:

    dij2 <- with(df, Distance)
    nams <- with(df, unique(c(as.character(site.x), as.character(site.y))))
    attributes(dij2) <- with(df, list(Size = length(nams),
                                      Labels = nams,
                                      Diag = FALSE,
                                      Upper = FALSE,
                                      method = "user"))
    class(dij2) <- "dist"
    

    Or you can do this via structure() directly:

    dij3 <- with(df, structure(Distance,
                               Size = length(nams),
                               Labels = nams,
                               Diag = FALSE,
                               Upper = FALSE,
                               method = "user",
                               class = "dist"))
    

    These give:

    > df
      site.x site.y Distance
    1      A      B       67
    2      A      C       57
    3      A      D       64
    4      B      C       60
    5      B      D       67
    6      C      D       60
    > dij2
       A  B  C
    B 67      
    C 57 60   
    D 64 67 60
    > dij3
       A  B  C
    B 67      
    C 57 60   
    D 64 67 60
    

    Note: The code above does not check that the data are in the right order. A dist object stores the lower triangle of the distance matrix column by column, so the rows of df must follow that order, as they do in the example; i.e. sort by site.x, then by site.y, before running the code shown here.
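
    The ordering requirement in the note can be enforced rather than assumed. What follows is a minimal sketch, not part of the original answer: df_to_dist is a hypothetical helper name, and the example df is rebuilt from the values printed above. It maps each pair onto the lower triangle, checks that every pair appears exactly once, and reorders the distances column by column before attaching the attributes.

```r
# Hypothetical helper (an assumption, not from the original answer): build a
# "dist" object from a long table of pairwise distances in any row order.
df_to_dist <- function(df,
                       labels = sort(unique(c(as.character(df$site.x),
                                              as.character(df$site.y))))) {
  n  <- length(labels)
  i  <- match(as.character(df$site.x), labels)
  j  <- match(as.character(df$site.y), labels)
  lo <- pmin(i, j)  # column index within the lower triangle
  hi <- pmax(i, j)  # row index within the lower triangle

  # Every unordered pair must be present exactly once, with no self-pairs.
  stopifnot(!anyNA(lo), all(lo < hi),
            nrow(df) == n * (n - 1) / 2,
            !any(duplicated(cbind(lo, hi))))

  ord <- order(lo, hi)  # column-major order of the lower triangle
  structure(df$Distance[ord],
            Size = n, Labels = labels,
            Diag = FALSE, Upper = FALSE,
            method = "user", class = "dist")
}

# Example data rebuilt from the printed df, then deliberately shuffled.
df <- data.frame(site.x = c("A", "A", "A", "B", "B", "C"),
                 site.y = c("B", "C", "D", "C", "D", "D"),
                 Distance = c(67, 57, 64, 60, 67, 60))
shuffled <- df[c(6, 2, 4, 1, 5, 3), ]
d <- df_to_dist(shuffled)
print(d)
```

    Because the helper sorts internally, the shuffled rows should give the same object as the sorted ones; the stopifnot() call is there to fail early if a pair is missing or duplicated.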

  • 2020-12-04 01:55

    For people coming in from Google: the acast function in the reshape2 package is a much easier way to do this kind of reshaping.

    library(reshape2)
    acast(df, site.x ~ site.y, value.var='Distance', fun.aggregate = sum, margins=FALSE)
    
  • 2020-12-04 02:16

    See ?as.dist. It should help you, though it expects a matrix as input.
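
    Since as.dist() wants a matrix, one way to get there from the long data frame is to fill a square matrix first. This is a rough sketch in base R, assuming the df values printed in the first answer; the names labels, idx, and m are introduced here for illustration only. No distances are computed: the values are just placed into the matrix by their row and column names.

```r
# Example data, rebuilt from the df printed in the first answer.
df <- data.frame(site.x = c("A", "A", "A", "B", "B", "C"),
                 site.y = c("B", "C", "D", "C", "D", "D"),
                 Distance = c(67, 57, 64, 60, 67, 60))

# All site names, used as both row and column names of a square matrix.
labels <- sort(unique(c(as.character(df$site.x), as.character(df$site.y))))
m <- matrix(0, nrow = length(labels), ncol = length(labels),
            dimnames = list(labels, labels))

# A two-column character matrix indexes m by (row name, column name).
idx <- cbind(as.character(df$site.x), as.character(df$site.y))
m[idx] <- df$Distance          # fill the (site.x, site.y) cells
m[idx[, 2:1]] <- df$Distance   # mirror them so the matrix is symmetric

d <- as.dist(m)                # as.dist() reads the lower triangle
print(d)
```

    Filling both triangles makes the result independent of which site happens to be listed in site.x, at the cost of building the full matrix.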

  • 2020-12-04 02:21

    I had a similar problem not too long ago and solved it like this:

    n <- max(table(df$site.x)) + 1  # +1,  so we have diagonal of 
    res <- lapply(with(df, split(Distance, df$site.x)), function(x) c(rep(NA, n - length(x)), x))
    res <- do.call("rbind", res)
    res <- rbind(res, rep(NA, n))
    res <- as.dist(t(res))
    