How to remove text between <script> and </script> using Python?

眼角桃花 2021-02-04 19:19

How can I remove the text between <script> and </script> using Python?

9 answers
  •  轻奢々 (original poster)
    2021-02-04 20:02

    Building on the answers posted by Pev and wr, why not improve the regular expression, e.g.:

    import re

    pattern = r"(?is)<script[^>]*>(.*?)</script>"
    text = """<script>var x = 1;</script>"""  # illustrative sample input
    re.sub(pattern, '', text)  # returns ''
    

    (?is) makes the match case-insensitive (i) and lets . match newlines (s), so script bodies that span several lines are covered. This version should also handle script tags that carry attributes.
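
    To make the effect of the two flags concrete, here is a small sketch. The pattern is written out in full as an assumption about what the answer intends, and the sample string is made up for illustration.

```python
import re

# Assumed pattern: "<script", optional attributes, the body, then "</script>".
flagged = r"(?is)<script[^>]*>.*?</script>"
unflagged = r"<script[^>]*>.*?</script>"

# Hypothetical sample: upper-case tag, an attribute, and a body spanning lines.
html = 'before<SCRIPT type="text/javascript">\nvar x = 1;\n</SCRIPT>after'

cleaned = re.sub(flagged, "", html)
untouched = re.sub(unflagged, "", html)

print(cleaned)            # beforeafter
print(untouched == html)  # True: without the flags nothing matches
```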

    EDIT: I can't add comments yet, so I'm editing my answer instead. I completely agree with the comment below: regular expressions are the wrong tool for this kind of task, and Beautiful Soup or lxml are much better. But the question gave only a simple example, and a regexp should be enough for such a simple task. Using Beautiful Soup just to remove a bit of text could be overkill.
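
    For comparison, a parser-based version might look roughly like the sketch below. It does not use Beautiful Soup or lxml; it swaps in the standard library's html.parser so that nothing has to be installed. The names ScriptStripper and strip_scripts are hypothetical, and the sketch silently drops comments and doctype declarations, so treat it as a starting point only.

```python
from html.parser import HTMLParser


class ScriptStripper(HTMLParser):
    """Rebuild the input while dropping <script> elements and their contents."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.parts = []
        self.in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
        else:
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False
        else:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_script:
            self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def strip_scripts(html):
    parser = ScriptStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Hypothetical sample: the "<b>" inside the script body would confuse a naive tag matcher.
sample = '<p>Hello</p><script type="text/javascript">var x = "<b>";</script><p>World</p>'
print(strip_scripts(sample))  # <p>Hello</p><p>World</p>
```

    Unlike the regexp above, this removes the script tags themselves as well as their contents.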

    BTW, I made a mistake: the tags should be kept and only the text between them removed, so the code should look like this:

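    pattern = r"(?is)(<script[^>]*>)(.*?)(</script>)"
    text = """<script>var x = 1;</script>"""  # illustrative sample input
    re.sub(pattern, r'\1\3', text)  # raw string, so \1 and \3 are group references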
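
    One detail worth checking when keeping groups 1 and 3: the replacement has to be a raw string. The sketch below shows the difference, using a three-group pattern written out as an assumption and a made-up sample string.

```python
import re

# Assumed pattern: group 1 = opening tag, group 2 = body, group 3 = closing tag.
pattern = r"(?is)(<script[^>]*>)(.*?)(</script>)"
html = '<script src="a.js">var x = 1;</script>'  # hypothetical sample

# In a plain string, '\1' and '\3' are octal escapes (control characters),
# not group references, so the whole match is replaced by two odd bytes.
wrong = re.sub(pattern, '\1\3', html)

# In a raw string, re.sub sees the backslashes and substitutes the groups.
right = re.sub(pattern, r'\1\3', html)

print(repr(wrong))  # '\x01\x03'
print(right)        # <script src="a.js"></script>
```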
    
