Get location coordinates using Bing or Google API in Python

北恋 2020-12-30 17:19

Here is my problem. I have a sample text file where I store the text data by crawling various html pages. This text contains information about various events and its time an

3 answers
  • 2020-12-30 17:41

    You really have two questions:

    1. How to extract location text (or potential location text).
    2. How to get location (latitude, longitude) by calling a Geocoding service with location text.

    I can help with the second question. (But see edit below for some help with your first question.)

    With the old Google Maps API (still working when this answer was written), you could get the geocoding down to one line (one ugly line):

    def geocode(address):
        return tuple([float(s) for s in list(urllib.urlopen('http://maps.google.com/maps/geo?' + urllib.urlencode({'output': 'csv','q': address})))[0].split(',')[2:]])
    

    Check out the Google Maps API Geocoding documentation.

    Here’s the readable 7-line version plus some wrapper code (when calling from the command line, remember to enclose the address in quotes):

    import sys
    import urllib
    
    googleGeocodeUrl = 'http://maps.google.com/maps/geo?'
    
    def geocode(address):
        parms = {
            'output': 'csv',
            'q': address}
    
        url = googleGeocodeUrl + urllib.urlencode(parms)
        resp = urllib.urlopen(url)
        resplist = list(resp)
        line = resplist[0]
        # Each response line looks like 'status,accuracy,latitude,longitude'.
        status, accuracy, latitude, longitude = line.strip().split(',')
        return float(latitude), float(longitude)
    
    def main():
        if 1 < len(sys.argv):
            address = sys.argv[1]
        else:
            address = '1600 Amphitheatre Parkway, Mountain View, CA 94043, USA'
    
        coordinates = geocode(address)
        print coordinates
    
    if __name__ ==  '__main__':
        main()
    

    It's simple to parse the CSV format, but the XML format has better error reporting.
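
    The CSV line does carry a status field, which the readable version above unpacks but never checks. Here is a rough sketch of a parser that does check it, written for Python 3. The function name parse_geocode_csv is made up for illustration, and treating 200 as the only success status is an assumption based on the legacy API's behaviour:

```python
def parse_geocode_csv(line):
    """Parse one 'status,accuracy,latitude,longitude' response line.

    Returns (latitude, longitude) as floats. Raises ValueError when the
    line does not have four fields or the status is not '200'
    (assumed here to be the only success code).
    """
    fields = line.strip().split(',')
    if len(fields) != 4:
        raise ValueError('expected 4 comma-separated fields, got %d' % len(fields))
    status, _accuracy, latitude, longitude = fields
    if status != '200':
        raise ValueError('geocoding failed with status %s' % status)
    return float(latitude), float(longitude)
```

    This only handles the parsing step, so it can be tried on a saved response line without calling the service.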

    Edit - Help with your first question

    I looked into NLTK. It's not trivial, but I can recommend the Natural Language Toolkit documentation, Chapter 7 - Extracting Information from Text, specifically section 7.5, Named Entity Recognition. At the end of the section, they point out:

    NLTK provides a classifier that has already been trained to recognize named entities, accessed with the function nltk.ne_chunk(). If we set the parameter binary=True, then named entities are just tagged as NE; otherwise, the classifier adds category labels such as PERSON, ORGANIZATION, and GPE.

    You're specifying binary=True, but you probably want the category labels, so leave it out (in NLTK 3 and later, batch_ne_chunk was renamed to ne_chunk_sents):

    chunked_sentences = nltk.batch_ne_chunk(tagged_sentences)
    

    This provides category labels (the named entity type), which seemed promising. But after trying this on your text and a few simple phrases containing locations, it's clear that more rules are needed. Read the documentation for more info.
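
    To pull candidate location strings out of a chunked sentence, something like the sketch below could be a starting point. It assumes NLTK 3, where subtrees expose label() and leaves(); the function name extract_locations and the choice of the GPE and LOCATION labels are my own assumptions, not anything NLTK prescribes:

```python
def extract_locations(chunked_sentence, labels=('GPE', 'LOCATION')):
    """Collect the text of named-entity chunks whose label is in `labels`.

    `chunked_sentence` is expected to behave like the tree returned by
    nltk.ne_chunk(): iterating it yields either (word, tag) tuples or
    subtrees that offer label() and leaves().
    """
    locations = []
    for node in chunked_sentence:
        # Plain (word, tag) tuples have no label() method, so skip them.
        if hasattr(node, 'label') and node.label() in labels:
            words = [word for word, _tag in node.leaves()]
            locations.append(' '.join(words))
    return locations
```

    You would call it once per sentence in chunked_sentences and pass each resulting string to the geocoder. As noted above, the classifier misses or mislabels many locations, so expect to filter the results.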

  • 2020-12-30 17:44

    The operation you want is called geocoding. Of course, you will have to extract the 'location' information from the block of text yourself.

    You can do it using the service from:

    • Bing Maps: http://msdn.microsoft.com/en-us/library/ff701714.aspx
    • Google Maps: https://developers.google.com/maps/documentation/geocoding/
    • Nokia Maps: http://developer.here.net/javascript_api_explorer

    Please keep in mind the license terms that might apply to you, depending on your use case.

  • 2020-12-30 17:57

    Since September 2013, Google Maps API v2 no longer works. Here is an updated version of @jimhark's great code that works with API v3 (I left out the __main__ part):

    import urllib
    import simplejson
    
    googleGeocodeUrl = 'http://maps.googleapis.com/maps/api/geocode/json?'
    
    def get_coordinates(query, from_sensor=False):
        query = query.encode('utf-8')
        params = {
            'address': query,
            'sensor': "true" if from_sensor else "false"
        }
        url = googleGeocodeUrl + urllib.urlencode(params)
        json_response = urllib.urlopen(url)
        response = simplejson.loads(json_response.read())
        if response['results']:
            location = response['results'][0]['geometry']['location']
            latitude, longitude = location['lat'], location['lng']
            print query, latitude, longitude
        else:
            latitude, longitude = None, None
            print query, "<no results>"
        return latitude, longitude
    

    See official documentation for the complete list of parameters and additional information.
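
    For Python 3, a rough sketch of the same idea might look like the following. It assumes the current service is reached over HTTPS and needs an API key passed as the key parameter; the function names and the split into helpers are mine, so check the official documentation before relying on any of it:

```python
import json
import urllib.parse
import urllib.request

GEOCODE_URL = 'https://maps.googleapis.com/maps/api/geocode/json?'

def build_geocode_url(address, api_key):
    """Build the request URL; `api_key` is your own key (assumed required)."""
    params = {'address': address, 'key': api_key}
    return GEOCODE_URL + urllib.parse.urlencode(params)

def parse_coordinates(response):
    """Return (lat, lng) from a decoded JSON response, or (None, None)."""
    if response.get('status') != 'OK' or not response.get('results'):
        return None, None
    location = response['results'][0]['geometry']['location']
    return location['lat'], location['lng']

def get_coordinates(address, api_key):
    """Call the service and return (lat, lng) of the first result."""
    url = build_geocode_url(address, api_key)
    with urllib.request.urlopen(url) as resp:
        return parse_coordinates(json.loads(resp.read().decode('utf-8')))
```

    Keeping the URL building and the response parsing apart from the network call makes both easy to check against a saved response.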
