Using xmllint to parse an HTML file

情书的邮戳 2021-01-18 05:31

I was trying to extract the text between specific tags in various HTML files on a Mac. I was looking for the first heading in the body. Example:

1 answer
  • 2021-01-18 06:02

    Try the --html option. Without it, xmllint parses your document as XML, which is much stricter than HTML. Also note that XPath indices are 1-based and that the HTML parser converts tag names to lowercase, so the XPath expression must use lowercase names. The command

    xmllint --html --xpath '/html/body/h1[1]' - <<EOF
    <BODY>
    <H1>Dublin</H1>
    EOF
    

    prints

    <h1>Dublin</h1>
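
If only the text inside the heading is wanted, without the surrounding tags, one option is to wrap the path in the XPath string() function. The following is a minimal sketch, not a definitive recipe: first_h1 is a hypothetical helper name, the loop assumes the files sit in the current directory with an .html extension, and stderr is discarded on the assumption that parser warnings about unrecognised markup are not of interest.

```shell
# Print the text of the first <h1> inside <body> of the given HTML file.
# first_h1 is a hypothetical helper name, not part of xmllint.
first_h1() {
    # string(...) yields the text content of the first matching node.
    # 2>/dev/null hides any parser warnings written to stderr.
    xmllint --html --xpath 'string(/html/body/h1[1])' "$1" 2>/dev/null
}

# Example: report the first heading of every .html file in this directory.
for f in *.html; do
    [ -e "$f" ] || continue   # skip when the glob matched nothing
    printf '%s: %s\n' "$f" "$(first_h1 "$f")"
done
```

For the sample input above, first_h1 should print Dublin rather than <h1>Dublin</h1>.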
    