Specifying a Date Range in Google Custom Search API

Submitted by 冷暖自知 on 2019-12-08 10:10:39

Question


It's trivial to search for a set of keywords on a certain website within a specific date range: in the Google search box you enter desired-keywords site:desired-website, then pick the date range from the Tools menu. For example, here is the search term "arab spring" on www.cnn.com between 1 Jan 2011 and 31 Dec 2013:

As you can see in the second picture, there are about 773 results. The search URI looks like this:

https://www.google.co.nz/search?tbs=cdr%3A1%2Ccd_min%3A1%2F1%2F2011%2Ccd_max%3A12%2F31%2F2013&ei=iDcnWoy3Jsj38QW514S4Aw&q=arab+spring+site%3Awww.cnn.com&oq=arab+spring+site%3Awww.cnn.com&gs_l=psy-ab.12...0.0.0.6996.0.0.0.0.0.0.0.0..0.0....0...1c..64.psy-ab..0.0.0....0.a4-ff19obY4

The date range can be seen in the cd_min and cd_max fields of the tbs parameter (which appears in the URI whenever the Tools menu is used).
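
To make the structure of the tbs parameter concrete, here is a minimal sketch that pulls cd_min and cd_max out of such a URI using only the Python standard library. The helper name extract_date_range is made up for illustration, and the URL is a shortened form of the one above (only the tbs and q parameters are kept). It assumes tbs is a comma-separated list of key:value pairs, which is what the URI above shows.

```python
from urllib.parse import urlsplit, parse_qs

def extract_date_range(search_url):
    """Return (cd_min, cd_max) from the tbs parameter, or None if absent."""
    query = parse_qs(urlsplit(search_url).query)
    tbs_values = query.get("tbs")
    if not tbs_values:
        return None
    # After percent-decoding, tbs looks like:
    #   cdr:1,cd_min:1/1/2011,cd_max:12/31/2013
    fields = dict(
        part.split(":", 1) for part in tbs_values[0].split(",") if ":" in part
    )
    if "cd_min" not in fields or "cd_max" not in fields:
        return None
    return fields["cd_min"], fields["cd_max"]

# Shortened version of the URI above (assumption: tracking parameters dropped).
url = ("https://www.google.co.nz/search?"
       "tbs=cdr%3A1%2Ccd_min%3A1%2F1%2F2011%2Ccd_max%3A12%2F31%2F2013"
       "&q=arab+spring+site%3Awww.cnn.com")
print(extract_date_range(url))
```

In this URI the dates appear in month/day/year order (12/31/2013).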

I would like to get the same functionality programmatically using Google's Custom Search API client for Python. I defined a custom search engine:

Then I tried different suggestions I found on the web and Stack Overflow:

  • This is a related question, which was left unanswered.

  • This post about date range search using the Google Custom Search API, referred to here, suggests using the 'sort' parameter (sort = 'date:r:yyyymmdd:yyyymmdd'). It did not work: "totalResults" is "44900".

  • This post suggests using the date restrict field, which does not work either.
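
For reference, the 'date:r:yyyymmdd:yyyymmdd' value mentioned in the second item can be assembled from two dates as sketched below. The helper name date_range_sort is hypothetical, and the sketch only builds the string; it says nothing about whether the API honours it for a given search engine.

```python
from datetime import date

def date_range_sort(start, end):
    """Build a 'date:r:yyyymmdd:yyyymmdd' string for the sort parameter."""
    if start > end:
        raise ValueError("start must not be after end")
    # date objects accept strftime-style codes in format specs.
    return "date:r:{:%Y%m%d}:{:%Y%m%d}".format(start, end)

print(date_range_sort(date(2011, 1, 1), date(2013, 12, 31)))
# prints date:r:20110101:20131231
```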

Is there any working solution?


Answer 1:


I might be late, but for other people searching for the solution, you can try this:

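from googleapiclient.discovery import build

my_api_key = "YOUR_API_KEY"
my_cse_id = "YOUR_CSE_ID"

def google_results_count(query):
    # Build a client for the Custom Search API.
    service = build("customsearch", "v1",
                    developerKey=my_api_key)
    # sort="date:r:<start>:<end>" asks for results within the date range (yyyymmdd).
    result = service.cse().list(q=query, cx=my_cse_id, sort="date:r:20110101:20131231").execute()
    # totalResults is an estimated count, returned as a string.
    return result["searchInformation"]["totalResults"]

# Python 3: print is a function.
print(google_results_count('arab spring site:www.cnn.com'))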

This code returns roughly 1,500 results.

That is still far from the web results; Google has an explanation of why.

Also, if you haven't set up your CSE to search the entire web, here's a guide on how to set it up.
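
Note that "totalResults" comes back as a string (as in the quoted "44900" in the question), so it needs converting before any numeric comparison. A minimal sketch follows; total_results_as_int is a hypothetical helper, and sample_response is a hand-written fragment shaped like the response, not real API output.

```python
def total_results_as_int(response):
    """Read searchInformation.totalResults from a response dict as an int."""
    return int(response["searchInformation"]["totalResults"])

# Hypothetical, hand-written response fragment (not real API output).
sample_response = {"searchInformation": {"totalResults": "1500"}}
print(total_results_as_int(sample_response))  # prints 1500
```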

P.S. If you still want to get the web version's result/data, you can just scrape it using BeautifulSoup or other libraries.



Source: https://stackoverflow.com/questions/47665573/specifying-a-date-range-in-google-custom-search-api
