问题
With the following code
{-# LANGUAGE Arrows #-}
{-# LANGUAGE NoMonomorphismRestriction #-}
import Text.XML.HXT.Core
parseXml :: IOSArrow XmlTree XmlTree
parseXml = getChildren >>> getChildren >>>
proc x -> do
y <- x >- hasName "item"
returnA -< x
main :: IO ()
main = do
person <- runX (readString [withValidate no]
"<xml><item>John</item><item2>Smith</item2></xml>"
>>> parseXml)
putStrLn $ show person
return ()
I get the output
[NTree (XTag "item" []) [NTree (XText "John") []]]
So it seems that hasName "item"
was applied to x
which I did not expect. Using arrowp I get for parseXml
:
parseXml
= getChildren >>> getChildren >>>
(arr (\ x -> (x, x)) >>>
(first (hasName "item") >>> arr (\ (y, x) -> x)))
So I have the arrow diagram
y
/-- hasName "item" ---
x /
-- getChildren -- getChildren ---\x->(x,x) \(y,x)->x --- final result
\ /
\---------------------/
Why is hasName "item"
also applied to second place of the tuple? I thought there is no state in haskell and hasName "item" x
returns a new object instead of changing the internal state of x
.
Related question: Is factoring an arrow out of arrow do notation a valid transformation?
My original problem
I have the following code:
{-# LANGUAGE Arrows #-}
import Text.XML.HXT.Core
data Person = Person { forname :: String, surname :: String } deriving (Show)
parseXml :: IOSArrow XmlTree Person
parseXml = proc x -> do
forname <- x >- this /> this /> hasName "fn" /> getText
surname <- x >- this /> this /> hasName "sn" /> getText
returnA -< Person forname surname
main :: IO ()
main = do
person <- runX (readString [withValidate no]
"<p><fn>John</fn><sn>Smith</sn></p>"
>>> parseXml)
putStrLn $ show person
return ()
If I run it everything works fine and I get the output
[Person {forname = "John", surname = "Smith"}]
But if I change the parseXml
to avoid the this
statements
parseXml :: IOSArrow XmlTree Person
parseXml = (getChildren >>> getChildren) >>> proc x -> do
forname <- x >- hasName "fn" /> getText
surname <- x >- hasName "sn" /> getText
returnA -< Person forname surname
no person can be parsed anymore (output is []
). Investigating the problem with
parseXml :: IOSArrow XmlTree Person
parseXml = (getChildren >>> getChildren) >>>
proc x -> do
forname <- x >- withTraceLevel 5 traceTree >>> hasName "fn" /> getText
surname <- x >- hasName "sn" /> getText
returnA -< Person forname surname
I got the output
content of:
============
---XTag "fn"
|
+---XText "John"
content of:
============
---XTag "sn"
|
+---XText "Smith"
[]
So everything seems fine, but with the code
parseXml :: IOSArrow XmlTree Person
parseXml = (getChildren >>> getChildren) >>>
proc x -> do
forname <- x >- hasName "fn" /> getText
surname <- x >- withTraceLevel 5 traceTree >>> hasName "sn" /> getText
returnA -< Person forname surname
I got
content of:
============
---XTag "fn"
|
+---XText "John"
[]
So it seems to me, that the value of the input x
changes between the two statements. It looks like the hasName "fn"
was applied to x
before it was attached to the surname
arrow. But shall x
not remain the same between the two lines?
回答1:
No, the input can't change and it doesn't.
What you've programmed in the lines
proc x -> do
y <- x >- hasName "item"
returnA -< x
is just a filter removing all nodes not named item
.
His is equivalent to the arrow
hasName "item" `guards` this
You can test this with
{-# LANGUAGE Arrows #-}
{-# LANGUAGE NoMonomorphismRestriction #-}
module Main where
import Text.XML.HXT.Core
parseXml0 :: IOSArrow XmlTree XmlTree
parseXml0 = getChildren >>> getChildren >>>
proc x -> do
_ <- hasName "item" -< x
returnA -< x
parseXml1 :: IOSArrow XmlTree XmlTree
parseXml1 = getChildren >>> getChildren >>>
(hasName "item" `guards` this)
main1 :: Show c => IOSArrow XmlTree c -> IO ()
main1 parseXml = do
person <- runX (readString [withValidate no]
"<xml><item>John</item><item2>Smith</item2></xml>"
>>> parseXml)
putStrLn $ show person
return ()
main :: IO ()
main = main1 parseXml0 >> main1 parseXml1
回答2:
EDIT: OK, well now you've complete changed your question!
The working example should be interpreted as follows:
For the top level tag x
- iterate over all the texts (
getText
) of grandchildren (this /> this
) where the name is"fn"
(hasName "fn"
), usingforname
to hold these values - iterate over all the texts (
getText
) of grandchildren (this /> this
) where the name is"sn"
(hasName "sn"
), usingsurname
to hold these values - yield
Person forname surname
for each such pair
This looks like it works, but is perhaps not doing what you think it's doing. Try running the code on the input "<p><fn>John</fn><sn>Smith</sn><fn>Anne</fn><sn>Jones</sn></p>"
for example. Four names are printed.
The broken example should be interpreted as follows:
For every grandchild x
- if
x
has the name"fn"
then store the text inforname
(otherwise skip to the nextx
) - if
x
has the name"sn"
then store the text insurname
(otherwise skip to the nextx
)
A tag can't have the name "fn"
and the name "sn"
! Thus every tag is skipped.
Your investigation is just showing the point of the computation at which the tags are skipped. In the first case both the tags are present, as nothing has been filtered yet. In the second case only the "fn"
tag is present because the first command has filtered everything else out.
EDIT: You may find this example (done in terms of the list monad) instructive.
import Control.Monad ((>=>))
data XML = Text String | Tag String [XML] deriving Show
this :: a -> [a]
this = return
(/>) :: (a -> [XML]) -> (XML -> [c]) -> a -> [c]
f /> g = f >=> getChildren >=> g
(>--) :: a -> (a -> b) -> b
x >-- f = f x
getChildren :: XML -> [XML]
getChildren (Text _) = []
getChildren (Tag _ c) = c
hasName :: String -> XML -> [XML]
hasName _ (Text _) = []
hasName n i@(Tag n' _) = if n == n' then [i] else []
getText :: XML -> [String]
getText (Text t) = [t]
getText (Tag _ _) = []
parseXML :: XML -> [(String, String)]
parseXML = \x -> do
forname <- x >-- (this /> this /> hasName "fn" /> getText)
surname <- x >-- (this /> this /> hasName "sn" /> getText)
return (forname, surname)
parseXMLBroken :: XML -> [(String, String)]
parseXMLBroken = getChildren >=> getChildren >=> \x -> do
forname <- x >-- (hasName "fn" /> getText)
surname <- x >-- (hasName "sn" /> getText)
return (forname, surname)
runX :: (XML -> a) -> XML -> a
runX f xml = f (Tag "/" [xml])
xml :: XML
xml = (Tag "p" [ Tag "fn" [Text "John"]
, Tag "sn" [Text "Smith"] ])
example1 = runX parseXML xml
example2 = runX parseXMLBroken xml
*Main> example1
[("John","Smith")]
*Main> example2
[]
来源:https://stackoverflow.com/questions/22701238/hxt-can-an-input-change-with-the-arrow-syntax