Regex to match the first closing HTML tag

醉酒成梦 2021-01-14 21:00

I am trying to write a regex that matches the first closing form tag.

  [^~]* 

The above regex

3 answers
  • 2021-01-14 21:19

    Just make the pattern non-greedy so that it matches the fewest possible characters instead of the most:

    <form[^>]*name="loginForm"[^>]*>[^~]*?</form>
    

    Edit:
    Changed .* to [^>]* in the form tag, so that it doesn't match outside the tag.
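
    To illustrate the difference, here is a minimal Python sketch that runs the greedy and the non-greedy variant of the pattern against the same input. The sample markup (two forms, with made-up names and attributes) is an assumption for illustration only and is not taken from the question.

```python
import re

# Hypothetical sample page: two forms, the first one is the login form.
html = (
    '<form name="loginForm" action="/login"><input name="user"></form>'
    '<p>or</p>'
    '<form name="searchForm"><input name="q"></form>'
)

# Greedy: [^~]* consumes as much as it can, so the match ends at the LAST </form>.
greedy = re.compile(r'<form[^>]*name="loginForm"[^>]*>[^~]*</form>')
# Non-greedy: [^~]*? consumes as little as it can, so the match ends at the FIRST </form>.
lazy = re.compile(r'<form[^>]*name="loginForm"[^>]*>[^~]*?</form>')

greedy_match = greedy.search(html).group(0)
lazy_match = lazy.search(html).group(0)

print(len(greedy_match), len(lazy_match))
```

    With this input the greedy pattern swallows both forms, while the non-greedy one stops after the login form. Note that the non-greedy pattern would still stop too early if a form were nested inside another element that contains the text </form>, for example inside a comment or a script.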

  • 2021-01-14 21:23

    Use a real parser like DOMDocument, SimpleXML or SimpleHTMLDOM. Regular expressions are not suitable for parsing non-regular languages like HTML.

  • 2021-01-14 21:32

    You should NOT use regular expressions, but parse it with DOM:

    JavaScript:

    var forms = document.getElementsByTagName('form');
    var first = forms[0]; // reference to the first form element
    

    PHP:

    $dom = new DOMDocument();
    $dom->loadHTML( $html );
    $forms = $dom->getElementsByTagName('form');
    $first = $forms->item(0); // reference to first form
    

    You can use minidom and ElementTree for Python.
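
    As a rough sketch of the Python route, the following uses the standard library's xml.etree.ElementTree. The sample markup and the form names are assumptions for illustration. ElementTree is an XML parser, so this only works when the page is well-formed (XHTML-style); it raises ParseError on typical tag-soup HTML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed (XHTML-style) sample document.
html = """<html><body>
<form name="loginForm" action="/login"><input name="user"/></form>
<form name="searchForm"><input name="q"/></form>
</body></html>"""

root = ET.fromstring(html)

# iter() walks the tree in document order, so the first hit is the first form.
first_form = next(root.iter("form"))

# Alternatively, select a form by its name attribute using
# ElementTree's limited XPath support.
login_form = root.find(".//form[@name='loginForm']")

print(first_form.get("name"), first_form.get("action"))
```

    Because the parser tracks element boundaries itself, there is no need to reason about where the first closing tag is; the element object already covers exactly the form's content.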
