Strip HTML tags with Perl

粉色の甜心 2020-12-17 03:02

What's the easiest way to strip HTML tags in Perl? I am using a regular expression to parse HTML fetched from a URL, which works well, but how can I strip the HTML tags?

5 Answers
  •  时光说笑
    2020-12-17 03:51

    If you just want to remove HTML tags:

    s/<script.*?<\/script>//sg;
    s/<.+?>//sg;
    

    This will (most of the time) remove script tags and their contents, along with all other HTML tags. You could probably also safely remove everything before the <body> tag with a regex.
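
    As a rough sketch of how the two substitutions behave in practice, the script below wraps them in a helper and runs them on two sample strings. The name strip_tags and both sample inputs are made up for illustration, and the first pattern assumes the script-removal regex described above. The second sample shows one reason for the "most of the time" caveat: a > inside an attribute value ends the match too early.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper wrapping the two substitutions from the answer.
    sub strip_tags {
        my ($html) = @_;
        $html =~ s/<script.*?<\/script>//sg;  # drop script elements and their contents
        $html =~ s/<.+?>//sg;                 # drop every remaining tag
        return $html;
    }

    # Well-behaved input: tags and the script body are removed.
    my $ok = '<p>Hello <b>world</b></p><script>var x = "<i>";</script>';
    print strip_tags($ok), "\n";    # prints: Hello world

    # A '>' inside an attribute value cuts the tag match short.
    my $bad = '<a title="a > b">link</a>';
    print strip_tags($bad), "\n";   # prints:  b">link
    ```

    The /s flag lets . match newlines, so tags and script blocks that span several lines are still caught; the patterns are case-sensitive as written, so <SCRIPT> would need an added /i.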

    For anything more complex than that, though, regular expressions are not a suitable tool: you really need to parse the HTML with an actual HTML parser and then work on the parsed result to remove the tags.
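
    One possible way to do this, not necessarily what the answer had in mind, is the HTML::Parser module from CPAN (it is not part of core Perl and has to be installed). The sketch below keeps only text nodes and skips script and style elements; the helper name html_to_text and the sample input are made up for illustration.

    ```perl
    use strict;
    use warnings;
    use HTML::Parser;   # CPAN module, assumed to be installed

    # Hypothetical helper: return only the text content of an HTML string.
    sub html_to_text {
        my ($html) = @_;
        my $text = '';
        my $parser = HTML::Parser->new(
            api_version => 3,
            # 'dtext' hands the callback text with entities already decoded
            text_h => [ sub { $text .= shift }, 'dtext' ],
        );
        # Suppress all events inside these elements, including their text
        $parser->ignore_elements(qw(script style));
        $parser->parse($html);
        $parser->eof;   # signal end of input so buffered text is flushed
        return $text;
    }

    print html_to_text('<p>Fish &amp; chips</p><script>var x = 1;</script>'), "\n";
    ```

    Because the parser tokenizes the markup rather than pattern-matching it, quoted attribute values and entities are handled by the module instead of by hand-written patterns.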
