Performing complicated XPath queries in Scala

灰色年华 2021-01-04 08:14

What's the simplest API to use in Scala to perform the following XPath queries on a document?

//s:Annotation[@type='attitude']/s:Content/s:Parameter[@role

5 Answers
  • 2021-01-04 08:51
    //s:Annotation[@type='attitude']/s:Content/s:Parameter[@role='type' and not(text())]
    

    Well, I don't understand the s: notation, and couldn't find it in the XPath spec either. Ignoring it, the query would look like this:

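    // Assumes: import scala.xml._ and an `xml` value holding the parsed document.
    (
      (xml
        \\ "Annotation"
        filter (a => (a \ "@type").text == "attitude")   // [@type='attitude']
      )
      \ "Content"
      \ "Parameter"
      // [@role='type' and not(text())]: matching role, no text-node children
      filter (p => (p \ "@role").text == "type" && !p.child.exists(_.isInstanceOf[Text]))
    )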
    

    Note the necessity of parentheses, because \ has higher precedence than filter. I have formatted this as a multi-line expression because the Scala equivalent is too verbose for a single line.

    I can't answer about namespaces, though. I have no clue how to work with them in searches, if it's even possible. The docs mention @{uri}attribute for prefixed attributes, but do not mention anything about prefixed elements. Also, note that you need to pass a URI that resolves to the namespace you want, as literal namespaces in searches are not supported.
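
    For the namespace-prefixed query itself, one option that needs no extra library is the JDK's javax.xml.xpath API, called from Scala. The following is only a sketch under stated assumptions: the namespace URI http://example.com/s, the sample document, and the names NamespacedXPath, SUri, sample, and values are all made up for illustration (the question does not say what s: is bound to), and the trailing /@value step is borrowed from the kantan.xpath answer on this page.

```scala
import java.io.StringReader
import javax.xml.XMLConstants
import javax.xml.namespace.NamespaceContext
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.{XPathConstants, XPathFactory}
import org.w3c.dom.NodeList
import org.xml.sax.InputSource

object NamespacedXPath {
  // Hypothetical namespace URI; the question does not state the real one.
  val SUri = "http://example.com/s"

  // Hypothetical sample document: one matching Parameter, one with text content.
  val sample: String =
    s"""<s:Doc xmlns:s="$SUri">
       |  <s:Annotation type="attitude">
       |    <s:Content>
       |      <s:Parameter role="type" value="foobar"/>
       |      <s:Parameter role="type" value="ignored">has text</s:Parameter>
       |    </s:Content>
       |  </s:Annotation>
       |</s:Doc>""".stripMargin

  // Binds the "s" prefix used in the XPath expression to the namespace URI.
  private val nsContext = new NamespaceContext {
    def getNamespaceURI(prefix: String): String =
      if (prefix == "s") SUri else XMLConstants.NULL_NS_URI
    def getPrefix(uri: String): String = null
    def getPrefixes(uri: String): java.util.Iterator[String] = null
  }

  def values(xml: String): List[String] = {
    val factory = DocumentBuilderFactory.newInstance()
    factory.setNamespaceAware(true) // without this, prefixed steps match nothing
    val doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)))

    val xpath = XPathFactory.newInstance().newXPath()
    xpath.setNamespaceContext(nsContext)
    // Compiled once; the compiled expression can be reused across documents.
    val expr = xpath.compile(
      "//s:Annotation[@type='attitude']/s:Content/s:Parameter[@role='type' and not(text())]/@value")

    val nodes = expr.evaluate(doc, XPathConstants.NODESET).asInstanceOf[NodeList]
    (0 until nodes.getLength).map(i => nodes.item(i).getNodeValue).toList
  }
}
```

    With the sample above, NamespacedXPath.values(NamespacedXPath.sample) should return List("foobar"), since the second Parameter has a text child and is excluded by not(text()).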

  • 2021-01-04 09:03

    I think I'm going to go with lightly pimping XOM. It's a bit of a shame the XOM authors decided against exposing collections of child nodes and the like, but they had more work and less advantage to doing so in Java than in Scala. (And it is an otherwise well-designed library.)

    EDIT: I wound up pimping JDOM after all, because XOM doesn't compile XPath queries ahead of time. Since most of my effort was directed towards XPath this time, I was able to come up with a good model that sidesteps most of the generics issues. It shouldn't be too hard to come up with reasonable genericized versions of the methods getChildren and getAttributes and getAdditionalNamespaces in org.jdom.Element (by pimping the library with new methods that have slightly changed names.) I don't think there's a fix for getContent, and I'm not sure about getDescendants.
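
    The "pimping" idea (adding new, differently named methods to a Java class through implicit wrappers) can be sketched as follows. This is not the JDOM code described above: to stay dependency-free it applies the same pattern to the JDK's org.w3c.dom types, and the names DomEnrichment, toNodes, childElements, and parseRoot are hypothetical.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.{Element, Node, NodeList}
import org.xml.sax.InputSource

object DomEnrichment {
  // Exposes a NodeList as an immutable Scala List.
  implicit class RichNodeList(underlying: NodeList) {
    def toNodes: List[Node] =
      (0 until underlying.getLength).map(i => underlying.item(i)).toList
  }

  // Adds a typed child accessor under a new name, leaving getChildNodes untouched.
  implicit class RichElement(underlying: Element) {
    def childElements: List[Element] =
      underlying.getChildNodes.toNodes.collect { case e: Element => e }
  }

  // Small helper for trying the wrappers out on an XML string.
  def parseRoot(xml: String): Element =
    DocumentBuilderFactory.newInstance().newDocumentBuilder()
      .parse(new InputSource(new StringReader(xml))).getDocumentElement
}
```

    After import DomEnrichment._, an expression such as parseRoot("<a><b/>text<c/></a>").childElements.map(_.getTagName) yields the element children only, skipping the text node.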

  • 2021-01-04 09:13

    I would suggest using kantan.xpath:

     import kantan.xpath._
     import kantan.xpath.implicits._
    
     input.evalXPath[List[String]](xp"/annotation[@type='attitude']/content/parameter[@role='type' and not(text())]/@value")
    

    This yields:

    res1: kantan.xpath.XPathResult[List[String]] = Success(List(foobar))
    
  • 2021-01-04 09:15

    I guess when scalaxmljaxen is mature, we'll be able to do this reliably on scala's built-in XML classes.

  • 2021-01-04 09:16

    Scales Xml adds both string-based full XPath evaluation and an internal DSL that provides fairly complete coverage for querying.
