With Haskell, how do I process large volumes of XML?
问题 I've been exploring the Stack Overflow data dumps and thus far taking advantage of the friendly XML and “parsing” with regular expressions. My attempts with various Haskell XML libraries to find the first post in document-order by a particular user all ran into nasty thrashing. TagSoup import Control.Monad import Text.HTML.TagSoup userid = "83805" main = do posts <- liftM parseTags (readFile "posts.xml") print $ head $ map (fromAttrib "Id") $ filter (~== ("<row OwnerUserId=" ++ userid ++ ">")