Regex, select closest match

迷失自我 2020-11-29 11:15

Assume the following word sequence:

BLA text text text  text text text BLA text text text text LOOK text text text BLA text text BLA

What I

3 Answers
  • 2020-11-29 11:51

    Another way to extract the desired text is to use the tempered greedy token technique, which matches a series of individual characters that do not begin an unwanted string.

    r'\bBLA\b(?:(?!\bBLA\b).)*\bLOOK\b'
    


    \bBLA\b        : match 'BLA' with word boundaries
    (?:            : begin non-capture group
      (?!\bBLA\b)  : negative lookahead asserts following characters are not
                     'BLA' with word boundaries
      .            : match any character
    )              : end non-capture group
    *              : execute non-capture group 0+ times
    \bLOOK\b       : match 'LOOK' with word boundaries
    

    Word boundaries are included to avoid matching words such as BLACK and TRAILBLAZER.
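
    As a minimal sketch of how this pattern behaves on the sample text from the question, the snippet below runs it with Python's `re` module. The variable names (`text`, `pattern`, `closest`, `false_friend`) are illustrative only and do not come from the original answer.

```python
import re

# Sample text taken from the question.
text = ("BLA text text text  text text text BLA text text text text "
        "LOOK text text text BLA text text BLA")

# Tempered greedy token: each '.' is consumed only if the word 'BLA'
# does not start at that position.
pattern = re.compile(r'\bBLA\b(?:(?!\bBLA\b).)*\bLOOK\b')

match = pattern.search(text)
closest = match.group() if match else None
print(closest)  # BLA text text text text LOOK

# Word boundaries keep 'BLACK' from being treated as 'BLA'.
false_friend = pattern.search("BLACK text LOOK")
print(false_friend)  # None
```

    Because `.` does not match newlines by default, this sketch assumes the text sits on one line; passing `re.DOTALL` would be one way to lift that restriction.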

  • 2020-11-29 12:08
    (?s)BLA(?:(?!BLA).)*?LOOK
    

    Try this. The (?s) flag lets . match newlines, and the lazy *? stops at the first LOOK that follows.

    Alternatively, use

    BLA(?:(?!BLA|LOOK)[\s\S])*LOOK
    

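    This version is safer: the lookahead also excludes LOOK, so the match always ends at the first LOOK, and [\s\S] matches newlines without needing a flag.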
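
    The following sketch compares the two patterns on a multi-line string. The input is hypothetical (it is not from the question) and was chosen so that a newline sits between the last BLA and LOOK, and a second LOOK follows the first.

```python
import re

# Hypothetical multi-line input with two LOOKs after the last BLA.
text = "BLA one\nBLA two\nthree LOOK four LOOK"

# Lazy tempered token; (?s) lets '.' cross the newline.
lazy = re.search(r'(?s)BLA(?:(?!BLA).)*?LOOK', text)

# Lookahead excludes both BLA and LOOK; [\s\S] matches any character.
strict = re.search(r'BLA(?:(?!BLA|LOOK)[\s\S])*LOOK', text)

# Same as the lazy pattern but without (?s): '.' cannot cross '\n'.
no_flag = re.search(r'BLA(?:(?!BLA).)*?LOOK', text)

print(repr(lazy.group()))    # 'BLA two\nthree LOOK'
print(repr(strict.group()))  # 'BLA two\nthree LOOK'
print(no_flag)               # None
```

    Both patterns pick the BLA nearest to the first LOOK; only the variant without the flag fails, because the newline blocks it.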

  • 2020-11-29 12:12

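    Simply find the text between BLA and LOOK that contains no other BLA. Note that [^(BLA)] is a character class: it excludes the individual characters (, B, L, A and ), not the string BLA, so this only works when the text in between contains none of those characters.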

    In : re.search(r'BLA [^(BLA)]+ LOOK', 'BLA text text text  text text text BLA text text text text LOOK text text text BLA text text BLA').group()
    Out: 'BLA text text text text LOOK'
    

    :-)
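
    To see where a character class and a negative lookahead diverge, here is a small sketch on a hypothetical input (not from the question) in which the words between BLA and LOOK contain an uppercase A. The names `char_class` and `tempered` are illustrative.

```python
import re

# Character class: rejects any single '(', 'B', 'L', 'A' or ')' character.
char_class = re.compile(r'BLA [^(BLA)]+ LOOK')

# Negative lookahead: rejects only the three-letter sequence 'BLA'.
tempered = re.compile(r'BLA (?:(?!BLA).)+ LOOK')

# Hypothetical input: 'ALPHA' contains 'A' and 'L' but not 'BLA'.
text = "BLA text BLA some ALPHA text LOOK text"

class_result = char_class.search(text)
tempered_result = tempered.search(text)

print(class_result)             # None
print(tempered_result.group())  # BLA some ALPHA text LOOK
```

    On the question's own sample both patterns return the same match, since the filler words there are all lowercase.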
