Question
Using R, I want to plot my data frame as a network of objects, where two objects are connected whenever a respondent answers 'Yes' to both of them. For example, my data looks similar to this:
      V1  V2  V3  V4  V5  V6  V7  V8  V9  V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22
Fish  Yes Yes Yes No  No  Yes No  No  No  Yes Yes No  Yes Yes No  No  Yes No  No  No  No  No
Squid Yes Yes No  No  No  Yes No  No  No  Yes No  No  Yes No  No  No  No  No  No  No  No  No
Pigs  No  No  Yes No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No
In my network, I'd like the vertices to be [Fish, Squid, Pigs], with a connection formed for each respondent (each column) who answers 'Yes' to two objects. For example, from the first respondent's answers, Fish and Squid would be connected. So the edges would look something like:
Fish - Squid
Fish - Squid
Fish - Pigs
Eventually, the network would show all the objects connected to each other according to the combined yes/no answers of the 22 people. I would also like to use the igraph library to plot this network. I more or less understand how to use igraph, so I really just need help with creating the edges from the yes/no answers. Any advice or help would be appreciated!
*Sorry about the vague data table; I'm not sure how to make one in the SO question box.
Answer 1:
This is the output from dput(), as @Thomas suggested. It is a great way to share data on Stack Overflow, because others can easily grab it and test code on it.
df <- structure(list(V1 = c("Yes", "Yes", "No"), V2 = c("Yes", "Yes",
"No"), V3 = c("Yes", "No", "Yes"), V4 = c("No", "No", "No"),
V5 = c("No", "No", "No"), V6 = c("Yes", "Yes", "No"), V7 = c("No",
"No", "No"), V8 = c("No", "No", "No"), V9 = c("No", "No",
"No"), V10 = c("Yes", "Yes", "No"), V11 = c("Yes", "No",
"No"), V12 = c("No", "No", "No"), V13 = c("Yes", "Yes", "No"
), V14 = c("Yes", "No", "No"), V15 = c("No", "No", "No"),
V16 = c("No", "No", "No"), V17 = c("Yes", "No", "No"), V18 = c("No",
"No", "No"), V19 = c("No", "No", "No"), V20 = c("No", "No",
"No"), V21 = c("No", "No", "No"), V22 = c("No", "No", "No"
)), .Names = c("V1", "V2", "V3", "V4", "V5", "V6", "V7",
"V8", "V9", "V10", "V11", "V12", "V13", "V14", "V15", "V16",
"V17", "V18", "V19", "V20", "V21", "V22"), class = "data.frame",
row.names = c("Fish", "Squid", "Pigs"))
Here's some code to create an edge data frame.
# define the vertices
vertices <- row.names(df)
L <- length(vertices)
# set up empty data frame for the edges
numedges <- choose(L, 2)
edges <- data.frame(v1=rep(NA, numedges), v2=NA, numrows=NA)
# cycle through all possible pairs of vertices
# total up the number of people answering "Yes" to both
k <- 0
for(i in 1:(L-1)) {
  for(j in (i+1):L) {
    k <- k + 1
    edges$v1[k] <- vertices[i]
    edges$v2[k] <- vertices[j]
    edges$numrows[k] <- sum(df[vertices[i], ]=="Yes" & df[vertices[j], ]=="Yes")
  }
}
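As an aside, the same pairwise counts can probably be obtained without an explicit loop by treating the answers as a 0/1 matrix and multiplying it by its own transpose. The sketch below uses only base R and a toy subset (the first four respondents, V1 to V4) so that it runs on its own; the names toy, yes, co, pairs and edges_vec are made up for illustration.

```r
# Toy subset: the first four respondents (V1..V4) of the data above
toy <- data.frame(
  V1 = c("Yes", "Yes", "No"),
  V2 = c("Yes", "Yes", "No"),
  V3 = c("Yes", "No", "Yes"),
  V4 = c("No", "No", "No"),
  row.names = c("Fish", "Squid", "Pigs"),
  stringsAsFactors = FALSE
)

# 1 where the respondent answered "Yes", 0 otherwise (row names are kept)
yes <- (as.matrix(toy) == "Yes") * 1

# co[i, j] = number of respondents answering "Yes" to both object i and object j
co <- tcrossprod(yes)

# Keep each unordered pair once by reading off the upper triangle
pairs <- which(upper.tri(co), arr.ind = TRUE)
edges_vec <- data.frame(
  v1 = rownames(co)[pairs[, "row"]],
  v2 = colnames(co)[pairs[, "col"]],
  numrows = co[pairs],
  stringsAsFactors = FALSE
)
print(edges_vec)
```

Applied to the full 22-column df instead of the toy subset, the same steps should reproduce the counts computed by the loop.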
This is what the resulting edge data frame looks like:
     v1    v2 numrows
1  Fish Squid       5
2  Fish  Pigs       1
3 Squid  Pigs       0
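Since the question asks about plotting with igraph, here is a minimal sketch of one way to turn this edge data frame into a graph. It assumes the igraph package is installed, recreates the edge data frame so the block runs on its own, and makes one design choice that is an assumption rather than something the question specifies: each pair is drawn as a single edge whose width reflects numrows, instead of as repeated parallel edges. The names linked and g are illustrative.

```r
library(igraph)

# Edge data frame as shown above (recreated here so the sketch is self-contained)
edges <- data.frame(
  v1 = c("Fish", "Fish", "Squid"),
  v2 = c("Squid", "Pigs", "Pigs"),
  numrows = c(5, 1, 0),
  stringsAsFactors = FALSE
)

# Drop pairs that no respondent linked
linked <- edges[edges$numrows > 0, ]

# Pass the vertex names explicitly so objects without any edge still appear
g <- graph_from_data_frame(
  linked,
  directed = FALSE,
  vertices = data.frame(name = unique(c(edges$v1, edges$v2)),
                        stringsAsFactors = FALSE)
)

# Extra columns of the edge data frame become edge attributes,
# so E(g)$numrows holds the counts; use them as line widths and labels
plot(g, edge.width = E(g)$numrows, edge.label = E(g)$numrows)
```

With this small example, Fish should be connected to both Squid and Pigs, and the Fish-Squid edge should be drawn thicker because five respondents said 'Yes' to both.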
Source: https://stackoverflow.com/questions/20048279/vertex-connection-by-yes-no-in-r-language