Question
I am using geopy to get latitude/longitude pairs for city names. Single queries work fine. Now I am trying to iterate through a big list of city names (46,000) and get a geocode for each city. Afterwards, I run each result through a check loop that sorts the city (if it is in the US) into the correct state. My problem is that I get "GeocoderTimedOut('Service timed out')" all the time, everything is pretty slow, and I'm not sure whether that is my fault or just geopy's nature. Here is the responsible code snippet:
for tweetcount in range(number_of_tweets):
    # Get the city name from the tweet
    city = data_dict[0]['tweetList'][tweetcount]['user']['location']
    # Sort out useless tweets (check for None first, so len() cannot fail)
    if city is not None and len(city) > 3:
        # THE RESPONSIBLE LINE, here the error occurs
        location = geolocator.geocode(city)
        # Here the sorting into the state takes place
        if location is not None:
            for statecount in range(len(data)):
                if point_in_poly(location.longitude, location.latitude, data[statecount]['geometry']):
                    state_tweets[statecount] += 1
                    break
Somehow, this one line times out on every second or third call. The city has the form "Manchester", "New York, New York" or something similar. I already had try/except blocks around everything, but that didn't really change anything about the problem, so I removed them for now. Any ideas would be great!
Answer 1:
You will be at the mercy of whatever geolocator service you are using. geopy is just a wrapper around different web services and hence may fail if the server is busy. I would create a wrapper around the geolocator.geocode call, something like this:
import time

from geopy.exc import GeocoderTimedOut

def geocode(city, recursion=0):
    try:
        return geolocator.geocode(city)
    except GeocoderTimedOut:
        if recursion >= 10:  # max retries reached, give up
            raise
        time.sleep(1)  # wait a bit
        # try again
        return geocode(city, recursion=recursion + 1)
This retries up to 10 times, waiting 1 second before each retry. Adjust these numbers to your liking.
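The same retry idea can also be written as a loop instead of recursion. The following is only a minimal sketch: retry_call, FakeTimeout and flaky_geocode are hypothetical names made up for illustration, and the fake geocoder stands in for a real service so the snippet runs without network access. With geopy you would presumably pass geolocator.geocode as func and GeocoderTimedOut as retry_on.

```python
import time

def retry_call(func, arg, retry_on, max_retries=10, delay=1.0, sleep=time.sleep):
    """Call func(arg); on retry_on, wait `delay` seconds and try again.

    Gives up and re-raises after max_retries retries.
    """
    for attempt in range(max_retries + 1):
        try:
            return func(arg)
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries, propagate the last error
            sleep(delay)

# --- Stand-ins for illustration only (not part of geopy) ---
class FakeTimeout(Exception):
    pass

attempts = {"n": 0}

def flaky_geocode(city):
    """Pretend geocoder: fails twice, then returns a (lat, lon) tuple."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise FakeTimeout("Service timed out")
    return (53.48, -2.24)

# sleep is replaced by a no-op so the demo does not actually wait
result = retry_call(flaky_geocode, "Manchester", FakeTimeout, sleep=lambda s: None)
```

Passing the sleep function as a parameter is just a convenience for testing; in real use the default time.sleep applies.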
If you repeatedly ask for the same city, you should consider wrapping the call in some kind of memoization, e.g. this decorator. Since you have not posted runnable code, I have not been able to test this.
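As one possible way to memoize, here is a sketch using functools.lru_cache from the standard library (not necessarily the decorator meant above). make_cached_geocoder and fake_geocode are hypothetical names for illustration; the fake geocoder replaces the real service so the snippet runs offline.

```python
from functools import lru_cache

def make_cached_geocoder(geocode_fn, maxsize=None):
    """Wrap geocode_fn so each distinct city string is looked up only once."""
    @lru_cache(maxsize=maxsize)
    def cached(city):
        return geocode_fn(city)
    return cached

# --- Stand-in for illustration only (not part of geopy) ---
calls = []

def fake_geocode(city):
    """Pretend geocoder that records every real lookup."""
    calls.append(city)
    return (0.0, 0.0)

cached_geocode = make_cached_geocoder(fake_geocode)

for name in ["Manchester", "New York, New York", "Manchester"]:
    cached_geocode(name)

# Only the two distinct names reached fake_geocode
lookups = len(calls)
```

Note that lru_cache stores return values only: a None result is cached, but a call that raises (such as a timeout) is not, so it will be attempted again the next time that city comes up.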
Answer 2:
You should change the line
location = geolocator.geocode(city)
to
location = geolocator.geocode(city, timeout=None)
Source: https://stackoverflow.com/questions/31506272/geopy-too-slow-timeout-all-the-time