I am trying to manipulate a data frame. As an example: say I have a dataframe containing customers and the shops they visit:
df = data.frame(customers = c("a", "b", "b", "c", "c"),
shop_visited = c("X", "X", "Y", "X", "Z"))
customers shop_visited
a X
b X
b Y
c X
c Z
Summarizing this dataframe:
- one customer (b) shops at X and also at Y;
- one customer (b) shops at Y and also at X;
- one customer (c) shops at X and also at Z;
- one customer (c) shops at Z and also at X.
Or, more succinctly:
relations = data.frame(source = c("X","Y", "X", "Z"),
target = c("Y","X","Z","X"))
source target
X Y
Y X
X Z
Z X
I am looking for a method that can do the transformation df -> relations. The motivation is that I can then use relations as the edges argument in write.gexf. Cheers for any help.
df <- data.frame(customers = c("a", "b", "b", "c", "c"),
shop_visited = c("X", "X", "Y", "X", "Z"))
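# create an identifier df mapping numeric ids to shop names
# (factor() keeps this working in R >= 4.0, where strings are no longer factors by default)
shop_f <- factor(df$shop_visited)
dfnames <- data.frame(i = as.numeric(shop_f),
                      shop_visited = as.character(shop_f))
library(tnet)
# two-mode edge list: shop id in column 1, customer id in column 2
tdf <- as.tnet(cbind(as.numeric(shop_f), as.numeric(factor(df$customers))), type = "binary two-mode tnet")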
relations <- projecting_tm(tdf, method = "sum")
# match original names
relations[["i"]] <- dfnames[ match(relations[['i']], dfnames[['i']] ) , 'shop_visited']
relations[["j"]] <- dfnames[ match(relations[['j']], dfnames[['i']] ) , 'shop_visited']
# clean up names
names(relations) <- c("source" , "target", "weight")
#> relations
# source target weight
#1 X Y 1
#2 X Z 1
#3 Y X 1
#4 Z X 1
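For comparison, the same shop-to-shop projection can be sketched in base R without tnet. This is only a minimal sketch, assuming a self-join with merge() is acceptable for the data size; the intermediate name pairs is made up for illustration.

```r
df <- data.frame(customers = c("a", "b", "b", "c", "c"),
                 shop_visited = c("X", "X", "Y", "X", "Z"))

# Self-join on customers: one row per pair of shops visited by the same customer
pairs <- merge(df, df, by = "customers")

# Compare as character so this works whether or not the columns are factors
src <- as.character(pairs$shop_visited.x)
tgt <- as.character(pairs$shop_visited.y)

# Drop pairs where a shop is matched with itself
keep <- src != tgt
relations <- data.frame(source = src[keep], target = tgt[keep],
                        stringsAsFactors = FALSE)

# Sort for a stable, readable order and reset the row names
relations <- relations[order(relations$source, relations$target), ]
rownames(relations) <- NULL
relations
```

Unlike projecting_tm, this sketch does not compute a weight column: if several customers share the same pair of shops, the pair simply appears once per customer, so the rows would need to be counted to get a weight.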
Please take a look at the function edge.list of rgexf (http://www.inside-r.org/packages/cran/rgexf/docs/edge.list). Using your example, it would be something like this:
library(rgexf)
# Your data
df = data.frame(customers = c("a", "b", "b", "c", "c"),
shop_visited = c("X", "X", "Y", "X", "Z"))
# Getting nodes and edges
df2 <- edge.list(df)
The result looks like this:
> df2
$nodes
id label
1 1 1
2 2 2
3 3 3
$edges
[,1] [,2]
[1,] 1 1
[2,] 2 1
[3,] 2 2
[4,] 3 1
[5,] 3 3
Finally, you can use this to write a GEXF graph:
# Building the graph
write.gexf(nodes=df2$nodes, edges=df2$edges)
<?xml version="1.0" encoding="UTF-8"?>
<gexf xmlns="http://www.gexf.net/1.2draft" xmlns:viz="http://www.gexf.net/1.1draft/viz" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.gexf.net/1.2draft http://www.gexf.net/1.2draft/gexf.xsd" version="1.2">
<meta lastmodifieddate="2013-08-06">
<creator>NodosChile</creator>
<description>A graph file writing in R using "rgexf"</description>
<keywords>gexf graph, NodosChile, R, rgexf</keywords>
</meta>
<graph mode="static">
<nodes>
<node id="1" label="1"/>
<node id="2" label="2"/>
<node id="3" label="3"/>
</nodes>
<edges>
<edge id="0" source="1" target="1" weight="1.0"/>
<edge id="1" source="2" target="1" weight="1.0"/>
<edge id="2" source="2" target="2" weight="1.0"/>
<edge id="3" source="3" target="1" weight="1.0"/>
<edge id="4" source="3" target="3" weight="1.0"/>
</edges>
</graph>
</gexf>
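Note that the node labels in the output above are numeric ids, not shop names. If the shop names should appear as labels, one possibility is to build the nodes and edges tables by hand from the relations data frame in the question. The following is a base R sketch under that assumption; the variable shops is made up for illustration, and the commented-out call just mirrors the write.gexf usage shown above rather than anything verified against a particular rgexf version.

```r
# The relations data frame from the question
relations <- data.frame(source = c("X", "Y", "X", "Z"),
                        target = c("Y", "X", "Z", "X"),
                        stringsAsFactors = FALSE)

# One node per distinct shop, with the shop name as the label
shops <- sort(unique(c(relations$source, relations$target)))
nodes <- data.frame(id = seq_along(shops), label = shops,
                    stringsAsFactors = FALSE)

# Two-column matrix of node ids (source, target), like df2$edges above
edges <- cbind(match(relations$source, shops),
               match(relations$target, shops))

# With rgexf loaded, these could then be passed as in the call above:
# write.gexf(nodes = nodes, edges = edges)
```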
Please let me know if you have any questions: george dot vega at nodoschile.org
Best!
George (creator of rgexf)
Source: https://stackoverflow.com/questions/16173990/r-gephi-manipulating-dataframe-to-use-with-write-gexf