Question
My overall goal is to build a co-author network graph. I have a list of PubMed IDs, and these are the only publications I am interested in for the network. I can't figure out how to get the author names and their respective affiliations together in one query using rentrez. I can retrieve both, but my list of affiliations is about 300 entries shorter than my list of authors, so some authors evidently did not provide an affiliation and I can't tell which ones. Is there any way to retrieve author and affiliation combined? [When I requested both in my entrez_fetch call, it gave me the authors and affiliations as separate lists, so I still can't tell which affiliations belong to which authors.]
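library(tidyverse)
library(rentrez)
library(XML)

# Fetch the records for the PMIDs of interest as a parsed XML document
trial <- entrez_fetch(db = "pubmed", id = pub.list$PMID, rettype = "xml", parsed = TRUE)

# Extract affiliations and first names across the whole document
affiliations <- xpathSApply(trial, "//Affiliation", xmlValue)
first.names <- xpathSApply(trial, "//Author/ForeName", xmlValue)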
This all works fine, but because the two vectors have different lengths I can't tell which affiliation belongs to which author.
Any help would be greatly appreciated. Thanks!
Answer 1:
You could try something like:
xpathSApply(trial, "//Author", function(x) {
  author_name <- xmlValue(x[["LastName"]])
  author_affiliation <- xmlValue(x[["AffiliationInfo"]][["Affiliation"]])
  c(author_name, author_affiliation)
})
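This returns a matrix with one column per //Author node: the first row holds the authors' last names and the second row their affiliations. Because both values are read from the same node, they stay paired, and an author without an affiliation gets NA.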
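To see why reading both fields from the same //Author node keeps them aligned, here is a minimal sketch on a made-up, PubMed-style XML snippet (the records, the names, and the `authors_df` variable are all hypothetical, and the real PubMed XML has many more elements). It also carries the PMID along, on the assumption that you will want to know which authors share a paper.

```r
library(XML)

# Hypothetical miniature PubMed-style document (made-up data). The second
# author deliberately has no <AffiliationInfo> element.
xml_text <- '<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>1</PMID>
      <Article>
        <AuthorList>
          <Author>
            <LastName>Smith</LastName>
            <ForeName>Ann</ForeName>
            <AffiliationInfo><Affiliation>University A</Affiliation></AffiliationInfo>
          </Author>
          <Author>
            <LastName>Jones</LastName>
            <ForeName>Bob</ForeName>
          </Author>
        </AuthorList>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>'

doc <- xmlParse(xml_text, asText = TRUE)

# For each article, read the PMID once, then read every field from the
# same <Author> node so the values stay paired row by row.
rows <- xpathApply(doc, "//PubmedArticle", function(article) {
  pmid <- xmlValue(article[["MedlineCitation"]][["PMID"]])
  authors <- getNodeSet(article, ".//Author")
  data.frame(
    pmid = rep(pmid, length(authors)),
    last_name = vapply(authors, function(a) xmlValue(a[["LastName"]]), character(1)),
    fore_name = vapply(authors, function(a) xmlValue(a[["ForeName"]]), character(1)),
    affiliation = vapply(authors, function(a) {
      # A missing child yields NULL, which xmlValue() turns into NA
      xmlValue(a[["AffiliationInfo"]][["Affiliation"]])
    }, character(1)),
    stringsAsFactors = FALSE
  )
})
authors_df <- do.call(rbind, rows)
print(authors_df)
```

On this toy input the second author should come out with NA in the affiliation column rather than shifting the rows. With the real data you would pass `trial` in place of `doc`.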
Answer 2:
last.name <- xpathSApply(trial, "//Author", function(x) {
  xmlValue(x[["LastName"]])
})
affiliation <- xpathSApply(trial, "//Author", function(x) {
  xmlValue(x[["AffiliationInfo"]][["Affiliation"]])
})
This is what I ended up using, following NicE's format, and it worked: I can now see where the NAs for affiliations are.
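Once the two vectors have the same length, finding who is missing an affiliation is a matter of indexing with is.na(). A small sketch with made-up values standing in for the real `last.name` and `affiliation` vectors:

```r
# Hypothetical stand-ins for the vectors built above (made-up values)
last.name   <- c("Smith", "Jones", "Lee")
affiliation <- c("University A", NA, "Institute B")

# TRUE wherever an author has no affiliation recorded
missing_affiliation <- is.na(affiliation)

# Authors without an affiliation, and how many there are
authors_missing <- last.name[missing_affiliation]
n_missing <- sum(missing_affiliation)
print(authors_missing)
print(n_missing)
```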
Answer 3:
I took @NicE's code and @Shirley's comments and wrote this:
lastname_affiliation <- data.frame(cbind(
  xpathSApply(trial, "//Author", function(x) {
    xmlValue(x[["LastName"]])
  }),
  xpathSApply(trial, "//Author", function(x) {
    xmlValue(x[["AffiliationInfo"]][["Affiliation"]])
  })
))
Thanks for putting me on the right path.
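Since the stated goal is a co-author network, one possible next step is to turn a per-author table into an edge list. This is only a sketch in base R under some assumptions: it presumes you also have a `pmid` column identifying each paper (the code above does not extract one), the data shown are made up, and it treats a last name as a unique author identifier, which real data will often violate.

```r
# Hypothetical per-author table: one row per (paper, author); made-up data
authors_df <- data.frame(
  pmid = c("1", "1", "1", "2", "2"),
  last_name = c("Smith", "Jones", "Lee", "Smith", "Lee"),
  stringsAsFactors = FALSE
)

# For every paper, list each unordered pair of co-authors once
edges_per_paper <- lapply(split(authors_df$last_name, authors_df$pmid), function(nm) {
  nm <- sort(unique(nm))
  if (length(nm) < 2) return(NULL)  # single-author papers add no edges
  pairs <- t(combn(nm, 2))
  data.frame(from = pairs[, 1], to = pairs[, 2], stringsAsFactors = FALSE)
})
edges <- do.call(rbind, edges_per_paper)

# Count how many papers each pair shares; usable as an edge weight
edges$weight <- 1
edge_weights <- aggregate(weight ~ from + to, data = edges, FUN = sum)
print(edge_weights)
```

The resulting `edge_weights` table has `from`, `to`, and `weight` columns, which is the shape most graph packages accept as an edge list.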
Source: https://stackoverflow.com/questions/42398935/using-rentrez-to-parse-out-author-and-affiliation-from-pubmed