Getting Everything Between Two Characters Across New Lines

青春惊慌失措 2021-01-22 04:47

This is a sample of the text I am working with.

6) Jake's Taxi Service is a new entrant to the taxi industry. It has achieved success by staking out a u

1 answer
  •  夕颜 (original poster)
    2021-01-22 05:11

    Your regex fails because you read the file line by line with for line in myfile:, so each search only ever sees one line, while your pattern is written to find a match that spans several lines of a single multiline string.

    Replace for line in myfile: with contents = myfile.read() and then use result = question_pattern.search(contents) to get the first match, or result = question_pattern.findall(contents) to get multiple matches.
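    To make the difference concrete, here is a minimal sketch that runs the same pattern both ways. The sample text is invented for illustration, and io.StringIO stands in for the opened file so that the snippet runs without StratMasterKey.txt.

```python
import io
import re

# Invented sample text (an assumption, not the real file contents).
sample = (
    "6) First line of the question\n"
    "second line\n"
    "Answer: B\n"
    "7) Next question\n"
    "Answer: C\n"
)
pattern = re.compile(r"(?<=6\) )[\s\S]*?Answer.*")

# Line by line: no single line holds both the "6) " prefix and "Answer",
# so the pattern never matches.
line_hits = []
for line in io.StringIO(sample):
    m = pattern.search(line)
    if m:
        line_hits.append(m.group())

# Whole text at once: the match can now span the line breaks.
whole = pattern.search(io.StringIO(sample).read())

print(line_hits)
print(whole.group() if whole else None)
```

    The line-by-line loop collects nothing, while the search over the full string returns the question text through the end of its Answer line.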

    A note on the regex: I am not fixing the whole pattern since, as you mentioned, that is out of scope for this question. However, because the input is now a multiline string, you need to remove re.DOTALL, use [\s\S] where the pattern should match any char, and use . where it should match any char except a line break char. Also, the lookaround construct is redundant: you may safely replace (?=Answer) with Answer. Finally, to check whether there is a match, you may simply use if result: and then grab the whole match value with result.group().
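    The points in this note can be checked with a short sketch. The sample string is invented, and the lookahead variant is only a guess at how the original pattern used (?=Answer), since the question's pattern is not shown here.

```python
import re

# Invented sample text (an assumption for illustration).
text = "6) line one\nline two\nAnswer: B\nextra\n"

# Without re.DOTALL, . cannot cross a line break, so this finds nothing.
dot_only = re.search(r"(?<=6\) ).*?Answer.*", text)

# [\s\S] matches any char, line breaks included; the final .* stays on one line.
any_char = re.search(r"(?<=6\) )[\s\S]*?Answer.*", text)

# A lookahead directly followed by .* consumes the same text as the plain literal.
with_lookahead = re.search(r"(?<=6\) )[\s\S]*?(?=Answer).*", text)

print(dot_only)
print(repr(any_char.group()))
print(with_lookahead.group() == any_char.group())
```

    Keeping . for the tail of the pattern is what stops the match at the end of the Answer line instead of running to the end of the file.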

    Full code snippet:

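    import re

    searchCounter = 6  # example value (assumed): the question number to look up

    with open('StratMasterKey.txt', 'rt') as myfile:
        contents = myfile.read()  # read the whole file as one multiline string

    # [\s\S]*? lazily spans line breaks; the trailing .* stops at the end of the Answer line
    question_pattern = re.compile(rf'(?<={searchCounter}\) )[\s\S]*?Answer.*')
    result = question_pattern.search(contents)
    if result:
        print(result.group())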
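
    If you need several questions, one option is to wrap the search in a small helper and call it once per number. This is only a sketch: extract_question is a hypothetical name, and the sample contents are invented rather than taken from StratMasterKey.txt.

```python
import re

def extract_question(contents, number):
    """Return the text after 'N) ' through the end of the next Answer line, or None."""
    # int() keeps anything other than a plain number out of the pattern.
    pattern = re.compile(rf"(?<={int(number)}\) )[\s\S]*?Answer.*")
    match = pattern.search(contents)
    return match.group() if match else None

# Invented sample contents (an assumption for illustration).
contents = (
    "5) Which option fits?\nA) one\nB) two\nAnswer: A\n"
    "6) Another question\nspanning two lines\nAnswer: B\n"
)

results = {n: extract_question(contents, n) for n in (5, 6, 7)}
for n, text in results.items():
    print(n, repr(text))
```

    Note that the lookbehind for 6) would also succeed right after 16) , so with a longer file you may want to anchor the number more strictly.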
    
