问题
I'd like to use xml-conduit
to parse GPX files. So far I've got the following:
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative
import Data.Text as T
import Text.XML
import Text.XML.Cursor
data Trkpt = Trkpt {
trkptLat :: Text,
trkptLon :: Text,
trkptEle :: Text,
trkptTime :: Text
} deriving (Show)
trkptsFromFile path =
gpxTrkpts . fromDocument <$> Text.XML.readFile def path
gpxTrkpts =
child >=> element "{http://www.topografix.com/GPX/1/0}trk" >=>
child >=> element "{http://www.topografix.com/GPX/1/0}trkseg" >=>
child >=> element "{http://www.topografix.com/GPX/1/0}trkpt" >=>
child >=> \e -> do
let ele = T.concat $ element "{http://www.topografix.com/GPX/1/0}ele" e >>= descendant >>= content
let time = T.concat $ element "{http://www.topografix.com/GPX/1/0}time" e >>= descendant >>= content
let lat = T.concat $ attribute "lat" e
let lon = T.concat $ attribute "lon" e
return $ Trkpt lat lon ele time
A sample GPX file is here.
I'm getting strange results where the parsed text is mostly empty, with some sporadic actual values, although the original GPX file data is all valid. When there is an actual value, it is only in one of the fields of the record.
I'm quite certain I'm not using the xml-conduit
API properly. What am I doing wrong?
回答1:
Two issues. Firstly, there is a typo in the namespace; it should be http://www.topografix.com/GPX/1/1
. Secondly, your final Kleisli arrow (\e -> do -- etc.
) is acting on the children of the trkpt
elements, rather than on the trkpt
themselves. Here is a gpxTrkpts
which should do what you want:
gpxTrkpts =
child >=> element "{http://www.topografix.com/GPX/1/1}trk" >=>
child >=> element "{http://www.topografix.com/GPX/1/1}trkseg" >=>
child >=> element "{http://www.topografix.com/GPX/1/1}trkpt" >=>
\e -> do
let cs = child e
ele = T.concat $ cs >>= element "{http://www.topografix.com/GPX/1/1}ele" >>= descendant >>= content
time = T.concat $ cs >>= element "{http://www.topografix.com/GPX/1/1}time" >>= descendant >>= content
lat = T.concat $ attribute "lat" e
lon = T.concat $ attribute "lon" e
return $ Trkpt lat lon ele time
回答2:
@duplode has pointed out the problem. Here are some more comments.
How about using the gpx-conduit package
Here is some code which can help debug parsing problems:
Code:
{-# LANGUAGE OverloadedStrings #-}
module Lib2 where
import qualified Data.Text as T
import Data.Text (Text)
import Text.XML
import Text.XML.Cursor
import qualified Filesystem.Path.CurrentOS as Path
import Control.Monad
showNode (NodeElement e) = "NodeEement " ++ T.unpack (nameLocalName $ elementName e)
showNode (NodeInstruction _) = "NodeInstruction ..."
showNode (NodeContent t) = "NodeContent " ++ show t
showNode (NodeComment _) = "NodeComment"
testParser parser = do
content <- Text.XML.readFile def (Path.decodeString "sample.xml")
let nodes = map node $ parser (fromDocument content)
forM_ nodes $ \n -> putStrLn (showNode n)
Use it in ghci like this:
ghci> :set -XOverloadedStrings
ghci> :l Lib2
Lib2> testParser child
NodeContent "\n "
NodeEement metadata
NodeContent "\n "
NodeEement trk
NodeContent "\n "
NodeEement extensions
NodeContent "\n"
Lib2> testParser $ child >=> element "trk"
Lib2> testParser $ child >=> laxElement "trk"
NodeEement trk
Lib2> testParser $ child >=> laxElement "trk" >=> child >=> laxElement "trkseg"
NodeElement trkseg
Lib2> testParser $ child >=> laxElement "trk" >=> child >=> laxElement "trkseg" >=> child >=> laxElement "trkpt"
NodeEement trkpt
NodeEement trkpt
NodeEement trkpt
NodeEement trkpt
Lib2>
来源:https://stackoverflow.com/questions/31897737/how-to-parse-gpx-file-using-haskells-xml-conduit