Python Regex can't find substring but it should

情深已故 2021-01-25 20:49

I am trying to parse HTML using BeautifulSoup to extract the webpage title. Sometimes this does not work because the website is badly written, for example with a bad end tag. W

2 answers
  •  无人共我
    2021-01-25 21:09

    If you want to grab the text between the <title> and </title> tags, you should use this regexp:

        import re
        pattern = r"<title>([^<]+)"
        re.findall(pattern, html_string)
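A slightly more tolerant variant of the same idea might look like the sketch below. It assumes the title may be written in upper case, may carry attributes, and may be followed by a malformed end tag. The helper name `extract_title` and the constant `TITLE_RE` are hypothetical, introduced only for illustration. Like the pattern above, it stops at the first `<` after the opening tag, so it does not depend on the closing tag being well formed.

```python
import re

# Hypothetical names for illustration; not part of the original answer.
# Matches an opening <title> tag (any case, optional attributes) and
# captures everything up to the next "<", so a bad end tag is harmless.
TITLE_RE = re.compile(r"<title\b[^>]*>([^<]+)", re.IGNORECASE)


def extract_title(html_string):
    """Return the page title with whitespace collapsed, or None if absent."""
    match = TITLE_RE.search(html_string)
    if match is None:
        return None
    # Collapse newlines and repeated spaces inside the title text.
    return " ".join(match.group(1).split())
```

Calling `extract_title(html_string)` returns a single cleaned string instead of the list that `re.findall` produces, and returns `None` when no title text is found, so the caller can fall back to another strategy.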
