Is factoring an arrow out of arrow do notation a valid transformation?

Submitted by 巧了我就是萌 on 2019-12-14 03:57:56

Question


I'm trying to get my head around HXT, a Haskell library for parsing XML that uses arrows. For my specific use case I'd rather not use deep, because there are cases where <outer_tag><payload_tag>value</payload_tag></outer_tag> is distinct from <outer_tag><inner_tag><payload_tag>value</payload_tag></inner_tag></outer_tag>. While avoiding it, I ran into behavior that felt like it should work but doesn't.

I've managed to come up with a test case based on this example from the docs:

{-# LANGUAGE Arrows, NoMonomorphismRestriction #-}
module Main where

import Text.XML.HXT.Core

data Guest = Guest { firstName, lastName :: String }
  deriving (Show, Eq)


getGuest = deep (isElem >>> hasName "guest") >>> 
  proc x -> do
    fname <- getText <<< getChildren <<< deep (hasName "fname") -< x
    lname <- getText <<< getChildren <<< deep (hasName "lname") -< x
    returnA -< Guest { firstName = fname, lastName = lname }

getGuest' = deep (isElem >>> hasName "guest") >>> 
  proc x -> do
    fname <- getText <<< getChildren <<< (hasName "fname") <<< getChildren -< x
    lname <- getText <<< getChildren <<< (hasName "lname") <<< getChildren -< x
    returnA -< Guest { firstName = fname, lastName = lname }

getGuest'' = deep (isElem >>> hasName "guest") >>> getChildren >>>
  proc x -> do
    fname <- getText <<< getChildren <<< (hasName "fname") -< x
    lname <- getText <<< getChildren <<< (hasName "lname") -< x
    returnA -< Guest { firstName = fname, lastName = lname }


driver finalArrow = runX (readDocument [withValidate no] "guestbook.xml" >>> finalArrow)

main = do 
  guests <- driver getGuest
  print "getGuest"
  print guests

  guests' <- driver getGuest'
  print "getGuest'"
  print guests'

  guests'' <- driver getGuest''
  print "getGuest''"
  print guests''

Between getGuest and getGuest' I expand deep into the correct number of getChildren calls, and the resulting function still works. I then factor the getChildren out of the do block, and the resulting function returns no results at all. The output is:

"getGuest"
[Guest {firstName = "John", lastName = "Steinbeck"},Guest {firstName = "Henry", lastName = "Ford"},Guest {firstName = "Andrew", lastName = "Carnegie"},Guest {firstName = "Anton", lastName = "Chekhov"},Guest {firstName = "George", lastName = "Washington"},Guest {firstName = "William", lastName = "Shakespeare"},Guest {firstName = "Nathaniel", lastName = "Hawthorne"}]
"getGuest'"
[Guest {firstName = "John", lastName = "Steinbeck"},Guest {firstName = "Henry", lastName = "Ford"},Guest {firstName = "Andrew", lastName = "Carnegie"},Guest {firstName = "Anton", lastName = "Chekhov"},Guest {firstName = "George", lastName = "Washington"},Guest {firstName = "William", lastName = "Shakespeare"},Guest {firstName = "Nathaniel", lastName = "Hawthorne"}]
"getGuest''"
[]

I feel like this should be a valid transformation to perform, but my understanding of arrows is a little shaky. Am I doing something wrong? Is this a bug that I should report?

I'm using HXT version 9.3.1.3 (the latest at the time of writing). ghc --version prints "The Glorious Glasgow Haskell Compilation System, version 7.4.1". I've also tested on a box with ghc 7.6.3 and got the same result.

The XML file had the following repetitive structure (the full file can be found here)

<guestbook>
  <guest>
    <fname>John</fname>
    <lname>Steinbeck</lname>
  </guest>
  <guest>
    <fname>Henry</fname>
    <lname>Ford</lname>
  </guest>
  <guest>
    <fname>Andrew</fname>
    <lname>Carnegie</lname>
  </guest>
</guestbook>

Answer 1:


In getGuest'' you have

... (hasName "fname") -< x
... (hasName "lname") -< x

That is, you require the same node x to be named "fname" and also to be named "lname", which no x can satisfy!




Answer 2:


I've managed to work out the specific reason the construction is interpreted the way it is. The following arrow translation, found here, provides a base to work from:

addA :: Arrow a => a b Int -> a b Int -> a b Int
addA f g = proc x -> do
                y <- f -< x
                z <- g -< x
                returnA -< y + z

Becomes:

addA :: Arrow a => a b Int -> a b Int -> a b Int
addA f g = arr (\ x -> (x, x)) >>>
           first f >>> arr (\ (y, x) -> (x, y)) >>>
           first g >>> arr (\ (z, y) -> y + z)
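As a quick sanity check of that translation, here is a minimal sketch that runs both forms side by side. It uses only Control.Arrow from base; the names addA', viaProc, viaHand, listProc and listHand are made up for this illustration, and Kleisli [] is used as a stand-in for a list-valued arrow, which is an assumption about how HXT's arrows behave rather than HXT itself.

```haskell
{-# LANGUAGE Arrows #-}
import Control.Arrow

-- proc-notation version, as quoted above.
addA :: Arrow a => a b Int -> a b Int -> a b Int
addA f g = proc x -> do
  y <- f -< x
  z <- g -< x
  returnA -< y + z

-- Hand-translated version, as quoted above (renamed so both can coexist).
addA' :: Arrow a => a b Int -> a b Int -> a b Int
addA' f g =
  arr (\x -> (x, x)) >>>
  first f >>> arr (\(y, x) -> (x, y)) >>>
  first g >>> arr (\(z, y) -> y + z)

-- Plain functions are arrows: (3 + 1) + (3 * 2) = 10 in both forms.
viaProc, viaHand :: Int
viaProc = addA  (+ 1) (* 2) 3
viaHand = addA' (+ 1) (* 2) 3

-- List-valued arrows (Kleisli []): f yields two results, g yields one,
-- so both forms produce [1 + 10, 2 + 10] = [11, 12].
listProc, listHand :: [Int]
listProc = runKleisli (addA  (Kleisli (\x -> [x, x + 1])) (Kleisli (\x -> [10 * x]))) 1
listHand = runKleisli (addA' (Kleisli (\x -> [x, x + 1])) (Kleisli (\x -> [10 * x]))) 1
```

Both forms agree for these two arrow types, which suggests the hand translation is a fair basis for reasoning about what the do block does to its input x.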

From this we can, by analogy, derive:

getGuest''' = preproc >>>
           arr (\ x -> (x, x)) >>>
           first f >>> arr (\ (y, x) -> (x, y)) >>>
           -- y is the result of f (fname), z is the result of g (lname)
           first g >>> arr (\ (z, y) -> Guest {firstName = y, lastName = z})

    where preproc = deep (isElem >>> hasName "guest") >>> getChildren
          f = getText <<< getChildren <<< (hasName "fname")
          g = getText <<< getChildren <<< (hasName "lname")

In HXT, the arrows can be imagined as streams of values flowing through filters. arr (\x->(x,x)) does not "split the stream", as I'd hoped. Instead it turns every node into a tuple (x, x); first f keeps a tuple only if f succeeds on its first component, and first g then does the same to the survivors. Since no single node is named both "fname" and "lname", f and g are mutually exclusive and there are no survivors.

The examples with getChildren inside the do block worked because there the tuple stream held nodes from one level further up the XML document, each looking something like

<guest>
    <fname>John</fname>
    <lname>Steinbeck</lname>
</guest>

so the two filters, each descending into that node's children, were not mutually exclusive.
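The difference can be reproduced without HXT. The sketch below is a toy model, not HXT's real implementation: it assumes HXT's filters behave like list-valued functions and models them with Kleisli [] from base. Node, Filter, children, named, text, pairUp, guest, inside and outside are all hypothetical names standing in for XmlTree, the HXT arrow type, getChildren, hasName, getText and the desugared do block.

```haskell
import Control.Arrow

-- Toy XML tree: an element with a name and children, or a text node.
data Node = Elem String [Node] | Text String
  deriving (Show, Eq)

-- A filter maps one input to zero or more outputs.
type Filter a b = Kleisli [] a b

-- Stand-in for getChildren.
children :: Filter Node Node
children = Kleisli $ \n -> case n of
  Elem _ cs -> cs
  Text _    -> []

-- Stand-in for hasName: passes the node through only if the name matches.
named :: String -> Filter Node Node
named s = Kleisli $ \n -> case n of
  Elem t _ | t == s -> [n]
  _                 -> []

-- Stand-in for getText.
text :: Filter Node String
text = Kleisli $ \n -> case n of
  Text s -> [s]
  _      -> []

-- The desugared do block: run f and g on the same input and pair the results.
pairUp :: Filter Node String -> Filter Node String -> Filter Node (String, String)
pairUp f g =
  arr (\x -> (x, x)) >>>
  first f >>> arr (\(y, x) -> (x, y)) >>>
  first g >>> arr (\(z, y) -> (y, z))

guest :: Node
guest = Elem "guest"
  [ Elem "fname" [Text "John"]
  , Elem "lname" [Text "Steinbeck"] ]

-- Like getGuest': children is applied inside each branch, so x is <guest>.
inside :: [(String, String)]
inside = runKleisli
  (pairUp (children >>> named "fname" >>> children >>> text)
          (children >>> named "lname" >>> children >>> text))
  guest

-- Like getGuest'': children is factored out, so x is <fname> or <lname>.
outside :: [(String, String)]
outside = runKleisli
  (children >>> pairUp (named "fname" >>> children >>> text)
                       (named "lname" >>> children >>> text))
  guest
```

In this model inside evaluates to [("John","Steinbeck")] and outside to [], mirroring the output of getGuest' and getGuest''. Factoring children out changes what x is bound to, so under these assumptions the two programs are not equivalent.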



Source: https://stackoverflow.com/questions/21995888/is-factoring-an-arrow-out-of-arrow-do-notation-a-valid-transformation
