How to extract intragroup and intergroup distances from a distance matrix in R?

我在风中等你 2021-01-15 12:55

I have this dataset:

values<-c(0.002,0.3,0.4,0.005,0.6,0.2,0.001,0.002,0.3,0.01)
codes<-c("A_1","A_2","A_3","B_1","B_2","B_3","B_4","C_1


        
1 Answer
  •  说谎 (original poster)
    2021-01-15 13:28

    A "dist" object may be called a distance matrix, but it is not really a matrix, so it cannot be indexed by row and column directly. There is, however, an as.matrix method that converts it and gives you matrix indexing:

    > as.matrix(dist.m)[grep("A", codes), grep("A", codes) ]
          A_1   A_2   A_3
    A_1 0.000 0.298 0.398
    A_2 0.298 0.000 0.100
    A_3 0.398 0.100 0.000
    

    So you can get the first part (the intragroup distances) with pretty compact code:

    > sapply(LETTERS[1:3], function(let) as.matrix(dist.m)[grep(let, codes), grep(let, codes) ]
    + )
    $A
          A_1   A_2   A_3
    A_1 0.000 0.298 0.398
    A_2 0.298 0.000 0.100
    A_3 0.398 0.100 0.000
    
    $B
          B_1   B_2   B_3   B_4
    B_1 0.000 0.595 0.195 0.004
    B_2 0.595 0.000 0.400 0.599
    B_3 0.195 0.400 0.000 0.199
    B_4 0.004 0.599 0.199 0.000
    
    $C
          C_1   C_2   C_3
    C_1 0.000 0.298 0.008
    C_2 0.298 0.000 0.290
    C_3 0.008 0.290 0.000
    

    Then use negated logical indexing (!grepl) on the columns to get the rest, the intergroup distances:

    > sapply(LETTERS[1:3], function(let) as.matrix(dist.m)[grepl(let, codes), !grepl(let, codes) ]
    + )
    $A
          B_1   B_2   B_3   B_4   C_1   C_2   C_3
    A_1 0.003 0.598 0.198 0.001 0.000 0.298 0.008
    A_2 0.295 0.300 0.100 0.299 0.298 0.000 0.290
    A_3 0.395 0.200 0.200 0.399 0.398 0.100 0.390
    
    $B
          A_1   A_2   A_3   C_1   C_2   C_3
    B_1 0.003 0.295 0.395 0.003 0.295 0.005
    B_2 0.598 0.300 0.200 0.598 0.300 0.590
    B_3 0.198 0.100 0.200 0.198 0.100 0.190
    B_4 0.001 0.299 0.399 0.001 0.299 0.009
    
    $C
          A_1   A_2   A_3   B_1   B_2   B_3   B_4
    C_1 0.000 0.298 0.398 0.003 0.598 0.198 0.001
    C_2 0.298 0.000 0.100 0.295 0.300 0.100 0.299
    C_3 0.008 0.290 0.390 0.005 0.590 0.190 0.009
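
    In the output above every between-group block shows up twice (A against B under $A, and B against A under $B). If each pair of groups is wanted only once, one possible sketch uses combn over the group labels. The data reconstruction and the names ids_by_group, group_pairs and inter are assumptions made for illustration.

    ```r
    # Assumed reconstruction of the question's data
    values <- c(0.002, 0.3, 0.4, 0.005, 0.6, 0.2, 0.001, 0.002, 0.3, 0.01)
    codes  <- c("A_1", "A_2", "A_3", "B_1", "B_2", "B_3", "B_4", "C_1", "C_2", "C_3")
    names(values) <- codes
    m <- as.matrix(dist(values))

    # Member labels per group, keyed by the prefix before the underscore
    ids_by_group <- split(codes, sub("_.*$", "", codes))

    # Every unordered pair of groups: A-B, A-C, B-C
    group_pairs <- combn(names(ids_by_group), 2, simplify = FALSE)

    # Rectangular block of distances for each pair of groups
    inter <- lapply(group_pairs, function(p) {
      m[ids_by_group[[p[1]]], ids_by_group[[p[2]]], drop = FALSE]
    })
    names(inter) <- vapply(group_pairs, paste, character(1), collapse = "-")
    lapply(inter, dim)
    ```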
    

    I don't see a way of representing this as a two-column data structure, but you can use melt from the reshape2 package to get a three-column structure:

    > melt( as.matrix(dist.m)[grep("A", codes), grep("A", codes) ] )
      Var1 Var2 value
    1  A_1  A_1 0.000
    2  A_2  A_1 0.298
    3  A_3  A_1 0.398
    4  A_1  A_2 0.298
    5  A_2  A_2 0.000
    6  A_3  A_2 0.100
    7  A_1  A_3 0.398
    8  A_2  A_3 0.100
    9  A_3  A_3 0.000
    

    That would give you a rather long data frame for display, but it would be easy enough to put melt inside the function passed to sapply.
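
    As a rough sketch of the long format for the whole matrix at once, the snippet below uses base R's as.data.frame(as.table(...)) as a stand-in for reshape2::melt (a swapped-in technique, chosen only to avoid the package dependency), keeps each pair once, and labels it as "intra" or "inter". The data reconstruction, the prefix-based grouping, and the names long and type are assumptions for illustration.

    ```r
    # Assumed reconstruction of the question's data
    values <- c(0.002, 0.3, 0.4, 0.005, 0.6, 0.2, 0.001, 0.002, 0.3, 0.01)
    codes  <- c("A_1", "A_2", "A_3", "B_1", "B_2", "B_3", "B_4", "C_1", "C_2", "C_3")
    names(values) <- codes
    m <- as.matrix(dist(values))

    # Base-R stand-in for melt(): yields columns Var1, Var2 and value
    long <- as.data.frame(as.table(m), responseName = "value",
                          stringsAsFactors = FALSE)

    # Rows are in column-major order, so the upper-triangle mask lines up;
    # this keeps each unordered pair once and drops the diagonal
    long <- long[as.vector(upper.tri(m)), ]
    rownames(long) <- NULL

    # Label each pair by comparing the group prefixes of its two members
    g1 <- sub("_.*$", "", long$Var1)
    g2 <- sub("_.*$", "", long$Var2)
    long$type <- ifelse(g1 == g2, "intra", "inter")

    head(long)
    table(long$type)
    ```

    If a two-column layout is what the question was after, long[, c("type", "value")] would be one possible reading of it.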
