Question
I'm working on a project for which I need to extract some data from Google Scholar. My PHP program takes a string from my local machine, passes it to Google Scholar as a search query, takes the first result from the search results page, and saves it to the database.
I have to do this for almost 90 thousand strings/queries. The problem is that after a few hundred entries the program stops because Google Scholar asks for CAPTCHA verification. What can I do about that?
Answer 1:
Because Google Scholar does not have an API, there is no documented way to do what you want. You are not supposed to scrape data like this, which is why you are running into Google's bot-protection features. I think that your only real option is to wait for Google to create an API.
Source: https://stackoverflow.com/questions/6180638/google-scholar-captcha-verification-problem