Make Python re.sub look multiple times

南笙 2021-01-29 07:46

Suppose I have the following code:

s = 'cucumber apple tomato'

def f(match):
    if match.group(2) not in ('apple', ):
        return '%s (%s)' % (match.gr


        
2 Answers
  • 2021-01-29 08:05

    Use a capturing lookahead:

    >>> s = 'cucumber apple tomato'
    >>> re.findall(r'(\w+)(?=[ \t]+(\w+))', s)
    [('cucumber', 'apple'), ('apple', 'tomato')]
    

    The lookahead captures the word that follows each match without consuming it, so the next match can start at that following word.
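    To see why the lookahead matters, here is a minimal sketch comparing a pattern that consumes both words with the lookahead version. The variable names `consuming` and `overlapping` are only illustrative.

```python
import re

s = 'cucumber apple tomato'

# Consuming pattern: both words are part of the match, so after
# 'cucumber apple' the scan resumes at ' tomato' and 'apple' is
# no longer available as the first word of a pair.
consuming = re.findall(r'(\w+)[ \t]+(\w+)', s)

# Lookahead pattern: only the first word is consumed; the second
# word is captured inside (?=...) and stays available for the next match.
overlapping = re.findall(r'(\w+)(?=[ \t]+(\w+))', s)

print(consuming)    # [('cucumber', 'apple')]
print(overlapping)  # [('cucumber', 'apple'), ('apple', 'tomato')]
```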

    You can then turn that into what I think is your desired result:

    >>> [f'{t[0]} ({t[1]})' if t[1]=='apple' else t for t in re.findall(r'(\w+)(?=[ \t]+(\w+))', s)]
    ['cucumber (apple)', ('apple', 'tomato')]
    

    In your comments, you give a different example with a different expected output. For that result, just use optional groups:

    >>> s='cucumber apple tomato tomato apple cucumber tomato tomato'
    >>> [f'{t[0]} {t[1]} ({t[2]})' if t[2] else f'{t[0]} ({t[1]})' for t in re.findall(r'(\w+)(?:[ \t]+(\w+))?(?:[ \t]+(\w+))?', s)]
    ['cucumber apple (tomato)', 'tomato apple (cucumber)', 'tomato (tomato)']
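    Since the question is about `re.sub`, the same optional-group pattern could also be used with a replacement callback to get a single string back. This is only a sketch: the names `PATTERN` and `fmt` are made up here, and leaving a lone trailing word unchanged is an assumption, not something stated in the question.

```python
import re

# Same pattern as above: one word, then up to two optional following words.
PATTERN = re.compile(r'(\w+)(?:[ \t]+(\w+))?(?:[ \t]+(\w+))?')

def fmt(match):
    # Groups that did not participate in the match are None.
    first, second, third = match.groups()
    if third:
        return f'{first} {second} ({third})'
    if second:
        return f'{first} ({second})'
    # Assumption: a single leftover word is returned unchanged.
    return first

s = 'cucumber apple tomato tomato apple cucumber tomato tomato'
result = PATTERN.sub(fmt, s)
print(result)
# cucumber apple (tomato) tomato apple (cucumber) tomato (tomato)
```

    Note that this groups words purely by position (three at a time); it does not look for "apple" at all.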
    
  • 2021-01-29 08:21

    This is based on the information you've given in the comments, so it may not be exactly what you're looking for:

    There can be any number of words: 'cucumber apple tomato tomato apple cucumber tomato tomato' and the output should be 'cucumber apple (tomato) tomato apple (cucumber) tomato (tomato)'

    This regex captures the run of non-space characters that follows the standalone word "apple" (group 1), or the last word on the line (group 2). The `(?:^| )` prefix makes it ignore words that merely end in "apple", while still allowing "apple" to be the first word on the line.

    (?:^| )apple ([^ ]*)|([^ ]+)$
    

    For the example string
    "apple cucumber pineapple tomato tomato apple cucumber tomato tomato"
    it captures "cucumber" after each standalone "apple" and the final "tomato"; "pineapple" is left alone.
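    One way to plug this regex into `re.sub` is sketched below. The callback name `wrap` and the choice to wrap each capture in parentheses are assumptions based on the output quoted from the comments, not part of the regex itself.

```python
import re

PATTERN = re.compile(r'(?:^| )apple ([^ ]*)|([^ ]+)$')

def wrap(match):
    word = match.group(1)
    if word is not None:
        # First alternative matched: keep the "apple " prefix
        # (and any leading space) and parenthesize the word after it.
        whole = match.group(0)
        return whole[:len(whole) - len(word)] + '(' + word + ')'
    # Second alternative matched: the last word on the line.
    return '(' + match.group(2) + ')'

s1 = 'cucumber apple tomato tomato apple cucumber tomato tomato'
s2 = 'apple cucumber pineapple tomato tomato apple cucumber tomato tomato'

out1 = PATTERN.sub(wrap, s1)
out2 = PATTERN.sub(wrap, s2)
print(out1)  # cucumber apple (tomato) tomato apple (cucumber) tomato (tomato)
print(out2)  # apple (cucumber) pineapple tomato tomato apple (cucumber) tomato (tomato)
```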
