Question
It's trivial to search for a set of keywords on a certain website within a specific date range: in the Google search box you enter desired-keywords site:desired-website, then pick the date range from the Tools menu. For example, here is the search term "arab spring" on www.cnn.com between 1st Jan 2011 and 31st Dec 2013:
The search returns about 773 results! The search URI looks like this:
https://www.google.co.nz/search?tbs=cdr%3A1%2Ccd_min%3A1%2F1%2F2011%2Ccd_max%3A12%2F31%2F2013&ei=iDcnWoy3Jsj38QW514S4Aw&q=arab+spring+site%3Awww.cnn.com&oq=arab+spring+site%3Awww.cnn.com&gs_l=psy-ab.12...0.0.0.6996.0.0.0.0.0.0.0.0..0.0....0...1c..64.psy-ab..0.0.0....0.a4-ff19obY4
The date range can be seen in the cd_min and cd_max fields of the tbs parameter (which appears in the URI whenever the Tools menu is used).
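To make the structure of that URI concrete, here is a minimal sketch that rebuilds just the q and tbs parameters from a keyword, a site, and two dates. The function name build_search_url is hypothetical, and the tbs format is inferred only from the URI above (unpadded m/d/yyyy dates); it is not a documented interface.

```python
from datetime import date
from urllib.parse import urlencode


def build_search_url(keywords, site, start, end, host="www.google.co.nz"):
    """Rebuild the q and tbs parameters seen in the browser URI (illustrative only)."""

    def fmt(d):
        # The URI above uses month/day/year without zero padding, e.g. 1/1/2011.
        return f"{d.month}/{d.day}/{d.year}"

    tbs = f"cdr:1,cd_min:{fmt(start)},cd_max:{fmt(end)}"
    params = {"q": f"{keywords} site:{site}", "tbs": tbs}
    # urlencode percent-encodes ':', ',' and '/' and turns spaces into '+'.
    return f"https://{host}/search?" + urlencode(params)


url = build_search_url("arab spring", "www.cnn.com", date(2011, 1, 1), date(2013, 12, 31))
print(url)
```

The other parameters in the original URI (ei, oq, gs_l) are left out here because their meaning is not explained in the question.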
I would like to get the same functionality programmatically using Google's Custom Search API client for Python. I defined a custom search engine:
Then I tried different suggestions I found on the web and on Stack Overflow:
This is a related question, which was left unanswered.
This post about date range search using the Google Custom Search API, referred to here, suggests using the 'sort' parameter to do the job (sort = 'date:r:yyyymmdd:yyyymmdd'). It did not work: "totalResults" is "44900".
This post suggests using the dateRestrict field, which does not work either.
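One possible reason dateRestrict cannot reproduce a fixed window: as far as the Custom Search JSON API reference describes it, the parameter takes values such as d[number], w[number], m[number], or y[number], each counted back from the time of the request. The sketch below (the helper name date_restrict_since is hypothetical) shows the closest value one could build for a start date; the window always runs up to "today", so it cannot cut results off at 31st Dec 2013.

```python
from datetime import date


def date_restrict_since(start, today=None):
    """Return a dateRestrict value covering everything from `start` until today."""
    today = today or date.today()
    days = (today - start).days
    if days < 1:
        raise ValueError("start must be at least one day in the past")
    # "d<N>" asks for results from the past N days; there is no way to set an end date.
    return f"d{days}"


# With a fixed "today" for illustration: 1 Jan 2011 seen from 31 Jan 2011 is 30 days back.
print(date_restrict_since(date(2011, 1, 1), today=date(2011, 1, 31)))
```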
Well! Is there any working solution?
Answer 1:
I might be late, but for other people searching for the solution, you can try this:
from googleapiclient.discovery import build

my_api_key = "YOUR_API_KEY"
my_cse_id = "YOUR_CSE_ID"

def google_results_count(query):
    # Build a client for the Custom Search API.
    service = build("customsearch", "v1", developerKey=my_api_key)
    # Restrict results to 1 Jan 2011 through 31 Dec 2013 via the sort parameter.
    result = service.cse().list(q=query, cx=my_cse_id, sort="date:r:20110101:20131231").execute()
    # totalResults is the estimated result count reported by the API.
    return result["searchInformation"]["totalResults"]

print(google_results_count('arab spring site:www.cnn.com'))
This code returns around 1,500 results.
That is still far from the count shown by the web search; Google has an explanation of why.
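The date range in the answer's sort value is hard-coded. As a small sketch, the same 'date:r:yyyymmdd:yyyymmdd' string can be built from date objects and passed as the sort argument; the helper name date_range_sort is hypothetical and not part of the API client.

```python
from datetime import date


def date_range_sort(start, end):
    """Format a sort value of the form date:r:yyyymmdd:yyyymmdd."""
    if start > end:
        raise ValueError("start must not be after end")
    # date objects accept strftime codes inside format specs.
    return f"date:r:{start:%Y%m%d}:{end:%Y%m%d}"


sort_value = date_range_sort(date(2011, 1, 1), date(2013, 12, 31))
print(sort_value)
```

The resulting string would replace the literal in the call, for example service.cse().list(q=query, cx=my_cse_id, sort=sort_value).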
Also, if you haven't set up your CSE to search the entire web, here's a guide on how to set it up.
P.S. If you still want the results and data of the web version, you can scrape them using BeautifulSoup or other libraries.
Source: https://stackoverflow.com/questions/47665573/specifying-a-date-range-in-google-custom-search-api