Question
I just joined SO, so I was wondering if you could help me with this issue. We used to scrape a website to get the contact information for all CrossFit gyms in the US and the rest of the world, as the information was openly exposed. Now, however, they have moved the site to map.crossfit.com, where the information is embedded in a Google-style map, so you can only get the details for each gym (name, address, phone number, etc.) by zooming in and selecting them one by one. That would take me forever just for the US gyms (approximately 6,000).
I'm not an expert in programming, so I'm assuming that if the information is still there, there should be a way to scrape it. Can you tell me whether that is possible, and perhaps give me some hints?
Really appreciate your help! Rick
Answer 1:
Hello, you can use the following command:
curl 'https://map.crossfit.com/getAffiliateInfo?aid=9347'
The output looks like this:
{"name":"CrossFit Radiate","website":"http://www.crossfitradiate.com/","address":"149 S. Fowler St","city":"Bishop","state":"CA","zip":"93514","country":"United States","cfkids":true,"phone":"(760) 920-7519","courses":[]}
You will get a JSON object with all the information about the gym.
If you change the aid parameter in the request to aid=1, the output is:
{"name":"Golden State CrossFit","website":"http://goldenstatecrossfit.com/","address":"11174 La Grange Ave","city":"Los Angeles","state":"CA","zip":"90025","country":"United States","cfkids":false,"phone":"(818) 665-6512","courses":[]}
Make a for loop that adds 1 to the aid value on each iteration and repeats the request.
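The loop described above could be sketched in Python roughly as follows. This is only a minimal sketch, not a tested scraper: the function names `fetch_affiliate` and `fetch_range`, the `delay` pause between requests, and the rule of skipping IDs that fail or come back without a name are all assumptions added for illustration. Only the URL and the `aid` parameter come from the answer itself, and the highest valid ID is not known.

```python
import json
import time
import urllib.request

# Endpoint taken from the curl command in the answer; {aid} is the gym ID.
BASE_URL = "https://map.crossfit.com/getAffiliateInfo?aid={aid}"


def fetch_affiliate(aid, timeout=10):
    """Download and decode the JSON record for one affiliate ID."""
    url = BASE_URL.format(aid=aid)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def fetch_range(start, stop, fetch=fetch_affiliate, delay=0.5):
    """Fetch IDs start..stop-1 one by one, adding 1 to the ID each time.

    IDs that raise an error or return no "name" are skipped (an assumption
    about how unused IDs behave). `delay` is a pause between requests, in
    seconds, to avoid hammering the server.
    """
    gyms = []
    for aid in range(start, stop):
        try:
            record = fetch(aid)
        except (OSError, ValueError):
            continue  # network/HTTP error or a body that is not valid JSON
        if isinstance(record, dict) and record.get("name"):
            record["aid"] = aid  # remember which ID produced this record
            gyms.append(record)
        time.sleep(delay)
    return gyms
```

For example, `fetch_range(1, 101)` would try IDs 1 through 100 and return a list of dictionaries, one per gym that answered with a name.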
The info can then be converted from JSON to CSV, or Excel, or ...
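As a rough illustration of the JSON-to-CSV step, the records could be written out with Python's standard `csv` module. The column list below is taken from the keys visible in the sample output above; the function name `gyms_to_csv` is hypothetical, and dropping the `courses` list (which does not fit in a single flat cell) is an assumption made for simplicity.

```python
import csv
import io

# Columns taken from the keys in the sample JSON output above.
FIELDS = ["name", "website", "address", "city", "state",
          "zip", "country", "cfkids", "phone"]


def gyms_to_csv(gyms, out):
    """Write a list of gym dictionaries to the file-like object `out` as CSV.

    Keys not listed in FIELDS (such as "courses") are ignored, and missing
    keys are written as empty cells.
    """
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for gym in gyms:
        writer.writerow(gym)


# Small demonstration using the record shown in the answer.
sample = [{
    "name": "CrossFit Radiate",
    "website": "http://www.crossfitradiate.com/",
    "address": "149 S. Fowler St",
    "city": "Bishop",
    "state": "CA",
    "zip": "93514",
    "country": "United States",
    "cfkids": True,
    "phone": "(760) 920-7519",
    "courses": [],
}]
buffer = io.StringIO()
gyms_to_csv(sample, buffer)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, pass a file opened with `open("gyms.csv", "w", newline="", encoding="utf-8")`. The resulting CSV file can be opened directly in Excel.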
Regards
Answer 2:
You're apparently able to search openly by free text at this URL:
https://map.crossfit.com/ac?term=alaska
Replace "alaska" with whatever you want; a loop over the letters a to z, for example, should get you all the results in about 5 minutes. However, I'm not sure they'll approve of this kind of access, and they will probably take measures against it eventually.
Answer 3:
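A loop over the search terms might look something like the Python sketch below. The response format of this endpoint is not shown in the answer, so the code assumes it returns a JSON array of result objects; that assumption, the function names `search` and `collect_all`, and the `delay` pause are all illustrative. Because the same gym can match several letters, results are de-duplicated. It is also possible that the server caps the number of results per term, in which case single letters would not be enough.

```python
import json
import string
import time
import urllib.parse
import urllib.request

# Endpoint taken from the answer; {term} is the free-text search term.
SEARCH_URL = "https://map.crossfit.com/ac?term={term}"


def search(term, timeout=10):
    """Run one free-text search and return the decoded JSON response."""
    url = SEARCH_URL.format(term=urllib.parse.quote(term))
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def collect_all(terms=string.ascii_lowercase, search_fn=search, delay=0.5):
    """Search for every term and return the unique results.

    Assumes each response is a JSON array; anything else is skipped.
    """
    seen = {}
    for term in terms:
        try:
            results = search_fn(term)
        except (OSError, ValueError):
            continue  # network/HTTP error or invalid JSON for this term
        if not isinstance(results, list):
            continue  # response is not the assumed JSON array
        for item in results:
            # Serialize with sorted keys so identical records compare equal.
            key = json.dumps(item, sort_keys=True)
            seen.setdefault(key, item)
        time.sleep(delay)  # pause between requests
    return list(seen.values())
```

Calling `collect_all()` with no arguments would search for each letter from a to z in turn.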
I would suggest a simple Node.js/Express script that pushes each result into an array or object. Keep incrementing the ID as long as one of the result's properties is not "null".
Source: https://stackoverflow.com/questions/33618324/web-scraping-google-map-website-is-it-possible-to-scrape
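The stopping rule suggested here can be sketched as below, written in Python rather than Node.js. Everything in it is an assumption for illustration: `fetch` stands for any function that returns the decoded record for a given ID, `key` is the property checked for null, and `max_misses` is an extra knob (not part of the answer) that tolerates a few unused IDs in a row before giving up. Whether the site really returns null properties for unused IDs is not confirmed by the thread.

```python
import itertools


def scrape_until_null(fetch, start=1, key="name", max_misses=1):
    """Collect records for increasing IDs until `key` comes back null.

    fetch      -- hypothetical callable: ID -> dict (decoded JSON record)
    key        -- property whose null value marks an unused ID
    max_misses -- number of consecutive null records that ends the loop
    """
    gyms = []
    misses = 0
    for aid in itertools.count(start):
        record = fetch(aid)
        if not isinstance(record, dict) or record.get(key) is None:
            misses += 1
            if misses >= max_misses:
                break  # enough empty IDs in a row: assume we are past the end
            continue
        misses = 0  # a valid record resets the counter
        gyms.append(record)
    return gyms
```

With the default `max_misses=1` the loop stops at the first empty ID, which matches the answer literally; a larger value may be safer if the ID sequence has gaps.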