Question
I am trying to run more than 200 entries through PubMed to record the number of articles published by each author, and to refine the search by including the author's mentor and institution. I tried this with Biopython and xlrd (code below), but I consistently get 0 results for all three query formats: (1) name only, (2) name and institution, and (3) name and mentor's name. Are there troubleshooting steps I can take, or should I format the keywords below differently when searching PubMed?
Example of the input queries; search_terms is a list of lists, each inner list holding the inputs for one author:
print(*search_terms[8:15], sep='\n')
[text:'Andrew Bland', 'Weill Cornell Medical College', text:'David Cutler MD']
[text:'Andy Price', 'University of Alabama at Birmingham School of Medicine', text:'Jason Warem, PhD']
[text:'Bah Chamin', 'University of Texas Southwestern Medical School', text:'Dr. Timothy Hillar']
[text:'Eduo Cera', 'University of Colorado School of Medicine', text:'Dr. Tim']
Code used to search PubMed with the input queries above:
from Bio import Entrez

Entrez.email = "mollyzhaoe@college.harvard.edu"
for search_term in search_terms[8:55]:
    # Query 1: name only; query 2: name and mentor; query 3: name and institution
    handle = Entrez.egquery(term="{0} AND ((2010[Date - Publication] : 2017[Date - Publication])) ".format(search_term[0]))
    handle_1 = Entrez.egquery(term = "{0} AND ((2010[Date - Publication] : 2017[Date - Publication])) AND {1}".format(search_term[0], search_term[2]))
    handle_2 = Entrez.egquery(term = "{0} AND ((2010[Date - Publication] : 2017[Date - Publication])) AND {1}".format(search_term[0], search_term[1]))
    record = Entrez.read(handle)
    record_1 = Entrez.read(handle_1)
    record_2 = Entrez.read(handle_2)
    # Keep only the PubMed count from each cross-database result
    pubmed_count = ['','','']
    for row in record["eGQueryResult"]:
        if row["DbName"] == "pubmed":
            pubmed_count[0] = row["Count"]
    for row in record_1["eGQueryResult"]:
        if row["DbName"] == "pubmed":
            pubmed_count[1] = row["Count"]
    for row in record_2["eGQueryResult"]:
        if row["DbName"] == "pubmed":
            pubmed_count[2] = row["Count"]
Answer 1:
Check your indentation; it is hard to tell which part belongs to which loop.
If you want to troubleshoot, try printing the query you pass to egquery, e.g.
print("{0} AND ((2010[Date - Publication] : 2017[Date - Publication])) ".format(search_term[0]))
and paste the output into PubMed to see what you get. Then modify it a bit at a time to find which search term causes the problem.
Your input format is a little bit hard to guess. Print the query and make sure you are getting the right search values.
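As a rough sketch of that check: the text:'...' prefix in the printed rows looks like the way an xlrd cell object prints, so the formatted query may contain that prefix literally instead of the bare name. The snippet below assumes this is the case and unwraps each item through its .value attribute before building the three queries. The names cell_value, build_queries, and FakeCell are hypothetical helpers for illustration, not part of Biopython or xlrd, and FakeCell only imitates a spreadsheet cell so the example runs without a workbook.

```python
DATE_RANGE = "(2010[Date - Publication] : 2017[Date - Publication])"


def cell_value(item):
    """Return plain text for an xlrd-style cell (via .value) or a plain string."""
    return str(getattr(item, "value", item)).strip()


def build_queries(row):
    """Build the three query strings for one [name, institution, mentor] row."""
    name, institution, mentor = (cell_value(item) for item in row)
    by_name = "{0} AND {1}".format(name, DATE_RANGE)
    return {
        "name": by_name,
        "name_mentor": "{0} AND {1}".format(by_name, mentor),
        "name_institution": "{0} AND {1}".format(by_name, institution),
    }


class FakeCell:
    """Hypothetical stand-in for a spreadsheet cell, used only for this demo."""

    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return "text:{0!r}".format(self.value)


row = [
    FakeCell("Andrew Bland"),
    "Weill Cornell Medical College",
    FakeCell("David Cutler MD"),
]

# What formatting the raw cell sends to PubMed:
print("{0} AND {1}".format(row[0], DATE_RANGE))

# What the unwrapped values send instead:
for label, query in build_queries(row).items():
    print(label, "->", query)
```

If the first printed line still carries the text: prefix while the later ones do not, the cell wrapper is a likely reason for the zero counts, and each cleaned string can then be pasted into PubMed by hand to confirm.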
For the author names, try removing the academic titles. PubMed might confuse them with initials; e.g., "House MD" might be read as Mark David House.
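One way to strip the titles before building the query is a small regular expression. This is a minimal sketch: the list of titles (Dr, MD, PhD, Prof) is an assumption based on the sample rows in the question and would need extending for other spellings such as "Ph.D." or "M.D.", and strip_titles is a hypothetical helper name.

```python
import re

# Assumed list of titles, taken from the sample rows; extend as needed.
TITLE_PATTERN = re.compile(r"\b(?:Dr|MD|PhD|Prof)\b\.?")


def strip_titles(name):
    """Remove academic titles, leftover commas, and extra whitespace from a name."""
    without_titles = TITLE_PATTERN.sub(" ", name)
    # Collapse runs of commas and whitespace left behind by the removal
    cleaned = re.sub(r"[,\s]+", " ", without_titles)
    return cleaned.strip()


for raw in ["David Cutler MD", "Jason Warem, PhD", "Dr. Timothy Hillar"]:
    print(repr(raw), "->", repr(strip_titles(raw)))
```

The match is case-sensitive on purpose, so a given name written as "Md" is left alone; the trade-off is that lowercase titles such as "phd" would not be removed.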
Source: https://stackoverflow.com/questions/40161460/searching-on-pubmed-using-biopython