Python - match a word in a string with a list of strings

北恋 2021-01-25 22:20

I'm new to Python and I was wondering how string comparison is done.

Let's say I have a list of strings containing state names like

states = ["New York


        
6 Answers
  • 2021-01-25 22:32
    states = ["New York", "California", "Nebraska", "Idaho"]
    postal_addr = "1234 1st E St San Jose California 95112"
    
    result = None
    for state in states:
        if state in postal_addr:
            result = state
    
    print(result)
    

    Unfortunately, this is a plain substring test, so it also matches words that merely contain a state name, such as "Idahoba".
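
One way around the substring problem is to match state names only at word boundaries. The following is a minimal sketch, not part of the original answer: the helper name find_state and the "Idahoba" test address are made up for illustration.

```python
import re

states = ["New York", "California", "Nebraska", "Idaho"]

# Escape each name so it is treated as literal text, and wrap the
# alternation in \b so a name only matches as a whole word (or words).
state_pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(s) for s in states) + r")\b"
)


def find_state(postal_addr):
    """Return the first state name found as a whole word, or None."""
    match = state_pattern.search(postal_addr)
    return match.group(0) if match else None


print(find_state("1234 1st E St San Jose California 95112"))  # California
print(find_state("12 Main St Idahoba 00000"))                  # None
```

The search is still case-sensitive, so "california" in lower case would not be found.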

  • 2021-01-25 22:34

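    To find all matches in the string, you could do the following (this only finds single-word state names, since split() breaks "New York" into two separate tokens):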

    matches = [m for m in postal_addr.split() if m in states]
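
If multi-word names such as "New York" need to be found with the same token-based idea, one option is to compare runs of adjacent tokens against the list. This is only a sketch under the assumption that tokens are separated by whitespace and carry no punctuation; the function name find_states_by_tokens and the New York address are hypothetical.

```python
states = ["New York", "California", "Nebraska", "Idaho"]


def find_states_by_tokens(postal_addr, states):
    """Return state names that appear as a run of whole tokens."""
    tokens = postal_addr.split()
    state_set = set(states)
    # Longest state name, measured in words; 0 if the list is empty.
    max_words = max((len(s.split()) for s in states), default=0)
    matches = []
    for size in range(1, max_words + 1):
        # Slide a window of `size` tokens across the address.
        for start in range(len(tokens) - size + 1):
            candidate = " ".join(tokens[start:start + size])
            if candidate in state_set:
                matches.append(candidate)
    return matches


print(find_states_by_tokens("20 W 34th St New York 10001", states))
print(find_states_by_tokens("1234 1st E St San Jose California 95112", states))
```

A token like "California," with a trailing comma would not match here, so addresses with punctuation would need cleaning first.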
    
  • 2021-01-25 22:37

    I would do:

    matches = [ s for s in states if s in postal_addr ]
    

    Then, if you want to get the string from the postal address:

    import re
    if matches:
        # re.escape treats the state name as literal text, not as a pattern
        extracted = re.findall(re.escape(matches[0]), postal_addr)[0]
    

    EDIT: But this won't work for city/state combos where the city name contains a different state, for example if postal_addr = '1 Arrowhead Dr, Kansas City, Missouri 64129' and states = ["New York", "California", "Nebraska", "Idaho", "Missouri", "Kansas"]. In this case, take the match that starts furthest to the right:

    import re
    if matches:
        # Pair each match with its start position, then keep the rightmost one
        extracted = [(re.search(re.escape(m), postal_addr).start(), m) for m in matches]
        extracted = max(extracted)[1]
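
The same "rightmost match wins" idea can also be written as a single pass with re.finditer. This is a rough sketch rather than the answerer's code: the function name last_state is hypothetical, and it assumes the state is the last state-like name in the address.

```python
import re

states = ["New York", "California", "Nebraska", "Idaho", "Missouri", "Kansas"]


def last_state(postal_addr, states):
    """Return the state name that occurs last in the address, or None."""
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(s) for s in states) + r")\b"
    )
    # finditer yields matches left to right, so the final one is rightmost.
    found = [m.group(0) for m in pattern.finditer(postal_addr)]
    return found[-1] if found else None


print(last_state("1 Arrowhead Dr, Kansas City, Missouri 64129", states))  # Missouri
```

An address where the city comes after the state would defeat this heuristic.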
    
  • 2021-01-25 22:38

    Here's an alternative using a regular expression:

    import re
    
    states = ["New York", "California", "Nebraska", "Idaho"]
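    # Escape each name so it is matched literally; the greedy leading .* means group(1) is the last state in the address
    pattern = re.compile(r'.*(' + '|'.join(re.escape(s) for s in states) + r').*')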
    
    postal_addr = "1234 1st E St San Jose California 95112"
    match = pattern.match(postal_addr)
    
    if match:
        state = match.group(1)
    
  • 2021-01-25 22:39

    You can try it like this (note that ''.join concatenates the names if more than one state matches):

    In [2]: states = ["New York", "California", "Nebraska", "Idaho"]
    
    In [3]: postal_addr = "1234 1st E St San Jose California 95112"
    
    In [4]: ''.join(state for state in states if state in postal_addr)
    Out[4]: 'California'
    
  • 2021-01-25 22:50
    >>> states = ["New York", "California", "Nebraska", "Idaho"]
    >>> postal_addr = "1234 1st E St San Jose California 95112"
    >>> first_match = next(state for state in states if state in postal_addr)
    >>> first_match
    'California'
    

    However, if you need to match at word boundaries, you might be better off using a regex.
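
A possible variant of the next() approach, sketched here as an assumption rather than the answerer's code, checks each state with a word-boundary regex and passes a default so that next() returns None instead of raising StopIteration when nothing matches. The name first_state is hypothetical.

```python
import re

states = ["New York", "California", "Nebraska", "Idaho"]


def first_state(postal_addr, states):
    """Return the first entry of `states` found as a whole word, else None.

    "First" means first in the order of the `states` list, not first
    in the address.
    """
    return next(
        (
            s for s in states
            if re.search(r"\b" + re.escape(s) + r"\b", postal_addr)
        ),
        None,  # default value avoids StopIteration when nothing matches
    )


print(first_state("1234 1st E St San Jose California 95112", states))  # California
print(first_state("12 Main St Idahoba 00000", states))                  # None
```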
