Scala: Parse HTML-fragment

Deadly 提交于 2019-12-11 12:13:34

问题


Our database stores HTML fragments like f.ex. <p>A.</p><p>B.</p>. I want to include the Html fragements from the database into a Lift snippet.

To do that, I tried to use the XML.loadString()-method to convert the fragement into a scala.xml.Elem, but this only works for full valid XML-documents:

import scala.xml.XML
@Test
def doesnotWork() {
  val result = XML.loadString("<p>A</p><p>B</p>")
  assert(result === <p>A</p><p>B</p>)
}

@Test
def thisWorks() {
  val result = XML.loadString("<test><p>A</p><p>B</p></test>")
  assert(result === <test><p>A</p><p>B</p></test>)
}

The test doesnotWork results in an exception:

org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 10; The markup in the document following the root element must be well-formed.

Is it possible to convert just (valid) fragements to XML?


回答1:


Since you're using Lift, you can wrap your XML in lift:children as a workaround. The Children snippet simply returns the element's children; and is very useful for wrapping fragments you need to parse.

@Test
def thisAlsoWorks() {
  val result = XML.loadString("<lift:children><p>A</p><p>B</p></lift:children>")
  assert(result === <lift:children><p>A</p><p>B</p></lift:children>)
 }



回答2:


You don't need a full valid XML document, but you do need a single top-level tag.

As you observed, the following works:

XML.loadString("<fragment><p>A</p><p>B</p></fragment>")

You could then either store a sequence of Elems, or wrap them in a custom tag and extract the sequence using .descendant.



来源:https://stackoverflow.com/questions/9377234/scala-parse-html-fragment

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!