I'm new to Python and I was wondering how string comparison is done.
Let's say I have a list of strings containing state names, like this:
states = ["New York", "California", "Nebraska", "Idaho"]
postal_addr = "1234 1st E St San Jose California 95112"
result = None
for state in states:
    if state in postal_addr:
        result = state
print(result)
Unfortunately, this will also match words that contain a state name such as Idahoba.
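To find all matches in the string, you could split the address into words and keep the ones that are state names (note that this only finds single-word names, so "New York" would be missed):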
matches = [m for m in postal_addr.split() if m in states]
I would do:
matches = [ s for s in states if s in postal_addr ]
Then, if you want to get the string from the postal address:
import re
if matches:
    extracted = re.findall(matches[0], postal_addr)[0]
EDIT: ...but this won't work for city/state combos where the city name contains a different state, for example if postal_addr = '1 Arrowhead Dr, Kansas City, Missouri 64129'
and states = ["New York", "California", "Nebraska", "Idaho", "Missouri", "Kansas"]
In this case, take the match that starts latest in the address:
import re
if matches:
    extracted = [(re.search(m, postal_addr).start(), m) for m in matches]
    extracted = sorted(extracted)[-1][1]
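For what it's worth, the same "take the last one" idea can be sketched without regexes. This is only a rough sketch: the helper name last_state_in is made up for illustration, and it assumes the state is the name whose last occurrence sits furthest to the right in the address, which won't hold for every address format.

```python
def last_state_in(address, states):
    """Return the state name whose last occurrence starts latest in address.

    Hypothetical helper; returns None when no state name is found.
    """
    # str.rfind gives the index of the last occurrence of each name
    hits = [(address.rfind(s), s) for s in states if s in address]
    # Tuples compare by position first, so max() picks the rightmost name
    return max(hits)[1] if hits else None


states = ["New York", "California", "Nebraska", "Idaho", "Missouri", "Kansas"]
postal_addr = "1 Arrowhead Dr, Kansas City, Missouri 64129"
print(last_state_in(postal_addr, states))  # Missouri
```

Since it uses plain string search, there is no need to worry about regex metacharacters in the names.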
Here's another alternative answer using a regexp:
import re
states = ["New York", "California", "Nebraska", "Idaho"]
pattern = re.compile(r'.*(' + r'|'.join(states) + ').*')
postal_addr = "1234 1st E St San Jose California 95112"
match = pattern.match(postal_addr)
if match:
    state = match.group(1)
You can try it like this (note that if more than one state matches, the names are concatenated into a single string):
In [2]: states = ["New York", "California", "Nebraska", "Idaho"]
In [3]: postal_addr = "1234 1st E St San Jose California 95112"
In [4]: ''.join(state for state in states if state in postal_addr)
Out[4]: 'California'
>>> states = ["New York", "California", "Nebraska", "Idaho"]
>>> postal_addr = "1234 1st E St San Jose California 95112"
>>> first_match = next(state for state in states if state in postal_addr)
>>> first_match
'California'
However, if you need to match at word boundaries, you might be better off using a regex.
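A minimal sketch of the word-boundary approach might look like the following. The names state_re and find_states are just picked for illustration; re.escape is there in case a name ever contains a regex metacharacter.

```python
import re

states = ["New York", "California", "Nebraska", "Idaho"]

# \b asserts a word boundary, so "Idaho" will not match inside "Idahoba".
# (?:...) is a non-capturing group, so findall returns the whole match.
state_re = re.compile(
    r"\b(?:" + "|".join(re.escape(s) for s in states) + r")\b"
)


def find_states(address):
    """Return every state name that appears as a whole word (or words)."""
    return state_re.findall(address)


print(find_states("1234 1st E St San Jose California 95112"))  # ['California']
print(find_states("12 Main St Idahoba"))                       # []
print(find_states("5th Ave New York 10001"))                   # ['New York']
```

Unlike splitting on whitespace, this also handles multi-word names such as "New York".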