Question
This might have been asked before in different forms.
My use case is that I'm trying to send an HTTP request from a Chrome/Firefox extension to a Google search results page (using just the URL of a search). I tried the Postman REST client for Chrome, but the Google home page comes back instead of the results. Is Google doing something dynamically to keep the results out of the returned page? Isn't that request just like one from the client machine's browser? I can paste the same URL into a different browser and it works. Any thoughts on this, or does anyone know of someone who has already done this?
Another option is a web service that runs a scraper such as Scrapy (Python) or a Java-based one and crawls through multiple proxy nodes to get the results. However, I didn't want to go that route unless it's the only one.
NPAPI for Chrome: seems unsafe.
While working on this, it hit me: how are all the new search engines able to produce results as good as Google's, if not better, when Google has been around far longer, tuning its algorithms and crawling the web? Even if the fundamentals were the same, the newer algorithms could hardly match the sophistication Google has presumably reached by now. Yet Blekko, DuckDuckGo, Bing and many more all seem to produce results pretty similar to Google's. Are they merely calling Google from numerous proxy nodes to simulate user behavior? Or have they crawled Google enough to build their results from it?
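For reference, the request I'm attempting is roughly equivalent to the sketch below, written in Python only so it can be reproduced outside the browser. The helper names `build_search_url` and `build_request` and the `User-Agent` string are hypothetical, and whether request headers influence which page Google returns is an assumption on my part, not something I have confirmed.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical browser-like User-Agent; the exact value is an assumption.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"


def build_search_url(query: str) -> str:
    """Build the plain search URL that works when pasted into a browser."""
    return "https://www.google.com/search?" + urlencode({"q": query})


def build_request(query: str) -> Request:
    """Prepare (but do not send) a GET request with a browser-like header."""
    return Request(build_search_url(query), headers={"User-Agent": BROWSER_UA})


if __name__ == "__main__":
    # Sending the request is left to the caller; the returned page may differ
    # from what a real browser session shows.
    with urlopen(build_request("web scraping")) as response:
        print(response.status, len(response.read()))
```

Comparing the page this returns with what the browser shows for the same URL should at least tell whether the difference comes from the request itself.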
Any help or thoughts would be great.
Source: https://stackoverflow.com/questions/27453071/automatically-getting-google-search-results-from-extension