Finding whether a string starts with one of a list's variable-length prefixes

傲寒 2021-01-01 12:16

I need to find out whether a name starts with any of a list's prefixes and then remove it, like:

if name[:2] in ["i_", "c_", "m_", "l_", "d_", "t


        
11 Answers
  • 2021-01-01 12:49
    for prefix in prefixes:
        if name.startswith(prefix):
            name = name[len(prefix):]  # drop the matched prefix
            break
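
For reuse, the loop above can be wrapped in a small helper. This is only a sketch: the name strip_prefix and the sample prefixes are illustrative and do not come from the question. The first matching prefix in list order wins, so overlapping prefixes should be listed longest first.

```python
def strip_prefix(name, prefixes):
    """Return name without the first prefix it starts with (hypothetical helper)."""
    for prefix in prefixes:
        if name.startswith(prefix):
            return name[len(prefix):]
    # No prefix matched: return the name unchanged.
    return name


prefixes = ["i_", "c_", "longer_"]  # sample data, assumed for illustration
print(strip_prefix("i_count", prefixes))      # count
print(strip_prefix("longer_name", prefixes))  # name
print(strip_prefix("plain", prefixes))        # plain
```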
    
  • 2021-01-01 12:49

    Regex, tested:

    import re
    
    def make_multi_prefix_matcher(prefixes):
        regex_text = "|".join(re.escape(p) for p in prefixes)
        print(repr(regex_text))
        return re.compile(regex_text).match
    
    pfxs = "x ya foobar foo a|b z.".split()
    names = "xenon yadda yeti food foob foobarre foo a|b a b z.yx zebra".split()
    
    matcher = make_multi_prefix_matcher(pfxs)
    for name in names:
        m = matcher(name)
        if not m:
            print(repr(name), "no match")
            continue
        n = m.end()  # length of the matched prefix
        print(repr(name), n, repr(name[n:]))
    

    Output:

    'x|ya|foobar|foo|a\\|b|z\\.'
    'xenon' 1 'enon'
    'yadda' 2 'dda'
    'yeti' no match
    'food' 3 'd'
    'foob' 3 'b'
    'foobarre' 6 're'
    'foo' 3 ''
    'a|b' 3 ''
    'a' no match
    'b' no match
    'z.yx' 2 'yx'
    'zebra' no match
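
One caveat worth noting: regex alternation tries the alternatives from left to right, so the example above strips foobar correctly only because it is listed before foo. A possible variant, sketched here under the assumption that the longest matching prefix should always win, sorts the prefixes by length first. The name make_longest_prefix_matcher is illustrative.

```python
import re


def make_longest_prefix_matcher(prefixes):
    # Longest prefixes first, so "foobar" is tried before "foo"
    # regardless of the order in which they were supplied.
    ordered = sorted(prefixes, key=len, reverse=True)
    regex_text = "|".join(re.escape(p) for p in ordered)
    return re.compile(regex_text).match


matcher = make_longest_prefix_matcher(["foo", "foobar"])
m = matcher("foobarre")
print(m.end(), repr("foobarre"[m.end():]))  # 6 're'
```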
    
  • 2021-01-01 12:50

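    What about using filter? Here name is taken to be a list of names, and every item that starts with one of the prefixes is dropped: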

    prefs = ["i_", "c_", "m_", "l_", "d_", "t_", "e_", "b_"]
    name = list(filter(lambda item: not any(item.startswith(prefix) for prefix in prefs), name))
    

    Note that checking each list item against the prefixes stops at the first match. This is guaranteed by any(), which returns as soon as it finds a true value, e.g.:

    def gen():
        print("yielding False")
        yield False
        print("yielding True")
        yield True
        print("yielding False again")
        yield False
    
    >>> any(gen()) # the last two lines of gen() are never executed
    yielding False
    yielding True
    True
    

    Or, using re.match instead of startswith:

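    import re
    # re.escape keeps the pattern safe if a prefix ever contains regex metacharacters
    patt = '|'.join(re.escape(p) for p in ["i_", "c_", "m_", "l_", "d_", "t_", "e_", "b_"])
    name = list(filter(lambda item: not re.match(patt, item), name))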
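
str.startswith also accepts a tuple of prefixes, which makes both the any() generator and the regex unnecessary. A short sketch with assumed sample data (the names list is hypothetical). Like the filter versions above, it drops the prefixed names rather than stripping the prefix.

```python
prefs = ("i_", "c_", "m_", "l_", "d_", "t_", "e_", "b_")  # must be a tuple, not a list
names = ["i_blah", "plain", "t_total", "index"]  # sample data, assumed

# Keep only the names that do not start with any of the prefixes.
kept = [n for n in names if not n.startswith(prefs)]
print(kept)  # ['plain', 'index']
```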
    
  • 2021-01-01 12:58

    This edits the list in place, stripping the prefixes. The break skips the remaining prefixes once one has matched a particular item.

    items = ['this', 'that', 'i_blah', 'joe_cool', 'what_this']
    prefixes = ['i_', 'c_', 'a_', 'joe_', 'mark_']
    
    for i,item in enumerate(items):
        for p in prefixes:
            if item.startswith(p):
                items[i] = item[len(p):]
                break
    
    print(items)
    

    Output

    ['this', 'that', 'blah', 'cool', 'what_this']
    
  • 2021-01-01 13:00

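    You could use a simple regex:

    import re
    prefixes = ("i_", "c_", "longer_")
    # re.escape guards against prefixes containing regex metacharacters
    pattern = r'^(?:%s)' % '|'.join(re.escape(p) for p in prefixes)
    name = re.sub(pattern, '', name)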
    

    Or if anything preceding an underscore is a valid prefix:

    name.split('_', 1)[-1]
    

    This removes everything up to and including the first underscore; a name without an underscore is returned unchanged.
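
A quick check of how the split approach behaves on a few edge cases. The sample names are illustrative: only the first underscore counts, and a name with no underscore comes back as it was.

```python
samples = ["i_blah", "joe_cool_x", "plain", "_x"]  # sample names, assumed

# Map each name to what is left after cutting at the first underscore.
results = {name: name.split("_", 1)[-1] for name in samples}
for name, stripped in results.items():
    print(repr(name), "->", repr(stripped))
# 'i_blah' -> 'blah'
# 'joe_cool_x' -> 'cool_x'
# 'plain' -> 'plain'
# '_x' -> 'x'
```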

  • 2021-01-01 13:01

    If you define a prefix as the characters before the first underscore, you can check for it with str.partition:

    head, sep, tail = name.partition("_")
    if sep == "_" and head in ["i", "c", "m", "l", "d", "t", "e", "b", "foo"]:
        name = tail
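
The separator check matters because partition returns an empty separator and an empty tail when the name has no underscore. Without it, a bare name such as foo would be reduced to an empty string. A small sketch of the same idea as a function follows. The names VALID and strip_known_prefix are illustrative, and a set is used for the membership test as one possible choice.

```python
VALID = {"i", "c", "m", "l", "d", "t", "e", "b", "foo"}  # prefixes from the answer above


def strip_known_prefix(name):
    # partition returns (head, sep, tail); sep is "" when there is no underscore.
    head, sep, tail = name.partition("_")
    if sep and head in VALID:
        return tail
    return name


print(strip_known_prefix("foo_bar"))  # bar
print(strip_known_prefix("x_bar"))    # x_bar
print(strip_known_prefix("foo"))      # foo
```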
    