Question
I want to calculate the distance from each node to the root (dtr). All I have is a vector, rel, that contains the parent node id for each node (in this example, id == 7 is the root):
library(tidyverse)
tmp <- tibble(
  id = 1:12,
  rel = c(2,7,4,2,4,5,7,7,10,8,7,7)
)
In the end I'm looking for this result:
tmp$dtr
[1] 2 1 3 2 3 4 0 1 3 2 1 1
So far I was able to write the following algorithm until I got stuck when trying to reference a different row in my code.
The algorithm should work like this (pseudocode):
1. If not root, increment dtr: if(!equals(tid,trel)): dtr = dtr+1
2. Change tid to trel: tid = trel
3. Change trel to the rel value where id == trel
4. If any !equals(tid,trel), GOTO 1., else END
First I added 2 helper columns to store temporary information:
tmp <- tmp %>%
  mutate(
    tid = id,
    trel = rel,
    dtr = 0
  )
The first two steps in the algorithm work like this:
tmp <- tmp %>%
  mutate(
    dtr = if_else(
      !equals(tid,trel),
      dtr + 1,
      dtr
    ),
    tid = trel
  )
The 3rd step I'm not sure about.... I tried to achieve it with the following code, but that does not work:
tmp <- tmp %>%
  mutate(trel = rel[id == .$tid])
The result is (of course) wrong:
tmp$trel
[1] 7 7 7 7 7 7 7 7 7 7 7 7
But why not this? (Should be the right solution when running step 3 the first time):
[1] 7 7 2 7 2 4 7 7 8 7 7 7
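A likely explanation, sketched in base R: id == tid compares the two columns row by row, and after step 2 only row 7 has id equal to tid. The subset rel[id == tid] therefore has length one, and mutate() recycles that single value into every row. A position lookup with match() returns one value per row instead. The standalone vectors below are an assumption for illustration; they mirror the columns of tmp after step 2.

```r
id   <- 1:12
rel  <- c(2, 7, 4, 2, 4, 5, 7, 7, 10, 8, 7, 7)
tid  <- rel   # state after step 2: tid has been set to trel
trel <- rel

# Row-wise comparison: only row 7 satisfies id == tid,
# so the subset holds a single value that gets recycled.
by_mask <- rel[id == tid]
print(by_mask)

# Position lookup: for each trel, find the row whose id equals it
# and take that row's rel (the parent of the current parent).
by_match <- rel[match(trel, id)]
print(by_match)
```

Inside mutate() the same lookup can be written as trel = rel[match(trel, id)].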
The 4th step is done by checking if I have more than one unique value in trel:
while(length(unique(tmp$trel)) > 1){
...
}
Thus the full algorithm should somewhat look like this:
get_dtr <- function(tib){
  tmp <- tib %>%
    mutate(
      tid = id,
      trel = rel,
      dtr = 0
    )
  while(length(unique(tmp$trel)) > 1){
    tmp <- tmp %>%
      mutate(
        dtr = if_else(
          !equals(tid,trel),
          dtr + 1,
          dtr
        ),
        tid = trel
      )
    ### Step 3
  }
  tmp
}
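One possible way to fill in step 3 is a match() lookup, shown here as a minimal sketch rather than a definitive implementation. Two choices in it are assumptions that differ from the code above: it compares with tid != trel instead of equals(), and it loops while any row still has tid != trel, as step 4 of the pseudocode states. Testing for a single unique trel can stop one iteration early, because trel reaches the root one step before tid does; for this data node 6 would end up with 3 instead of 4.

```r
library(dplyr)

get_dtr <- function(tib) {
  tmp <- tib %>%
    mutate(tid = id, trel = rel, dtr = 0)

  # Step 4: repeat while any row has not yet reached the root
  while (any(tmp$tid != tmp$trel)) {
    tmp <- tmp %>%
      mutate(
        # Step 1: count one more edge for rows not yet at the root
        dtr = if_else(tid != trel, dtr + 1, dtr),
        # Step 2: move up to the parent
        tid = trel,
        # Step 3: look up the rel value of the row whose id equals trel
        trel = rel[match(trel, id)]
      )
  }
  tmp
}

tmp <- tibble(
  id = 1:12,
  rel = c(2, 7, 4, 2, 4, 5, 7, 7, 10, 8, 7, 7)
)
get_dtr(tmp)$dtr
```

The sketch assumes a single root that is its own parent and no cycles in rel; with a cycle the loop would not terminate.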
Any idea how to solve this or a simpler solution? Thanks in advance!
Answer 1:
This is basically already implemented in the tidygraph package. If you are going to be working with graph-like data in the tidyverse, you should look there first. You can do:
library(tidygraph)
as_tbl_graph(tmp, directed=FALSE) %>%
  activate(nodes) %>%
  mutate(depth=bfs_dist(root=7)) %>%
  as_tibble()
# name depth
# <chr> <int>
# 1 1 2
# 2 2 1
# 3 3 3
# 4 4 2
# 5 5 3
# 6 6 4
# 7 7 0
# 8 8 1
# 9 9 3
# 10 10 2
# 11 11 1
# 12 12 1
Answer 2:
If you want to write a function yourself, you can use the following code:
library(tidyverse)
tmp <- tibble(
  id = 1:12,
  rel = c(2,7,4,2,4,5,7,7,10,8,7,7)
)
calc_dtr <- function(id, tmp){
  # find root
  root <- tmp$id[tmp$id == tmp$rel]
  # is this the root node?
  if(id == root){return(0)}
  # initialize counter
  dtr <- 1
  trel <- tmp$rel[tmp$id == id]
  while(trel != root){
    dtr <- dtr + 1
    trel <- tmp$rel[tmp$id == trel]
  }
  return(dtr)
}
tmp %>%
  mutate(
    dtr = map_dbl(id, calc_dtr, tmp)
  )
This produces the following output:
# A tibble: 12 x 3
id rel dtr
<int> <dbl> <dbl>
1 1 2 2
2 2 7 1
3 3 4 3
4 4 2 2
5 5 4 3
6 6 5 4
7 7 7 0
8 8 7 1
9 9 10 3
10 10 8 2
11 11 7 1
12 12 7 1
Source: https://stackoverflow.com/questions/48063592/mutate-value-by-using-a-value-from-a-different-row-in-a-tibble