I am reading Programming in Haskell. In chapter 8, the author gives an example of writing parsers.
The full source is here: http://www.cs.nott.ac.uk/~gmh/Parsing.lhs
I can't understand the following part: many permits zero or more applications of p, whereas many1 requires at least one successful application:
    many :: Parser a -> Parser [a]
    many p = many1 p +++ return []

    many1 :: Parser a -> Parser [a]
    many1 p = do v <- p
                 vs <- many p
                 return (v : vs)
How does the recursive call at vs <- many p work? vs is the result value of many p, but many p calls many1 p, and all many1 has in its definition is a do block, which again has the result values v and vs. When does the recursive call return?
Why does the following snippet return [("123","abc")]?
> parse (many digit) "123abc"
[("123", "abc")]
For the last question:
> parse (many digit) "123abc"
[("123", "abc")]
This means that parsing succeeded, since at least one result was returned in the answer list. Hutton parsers always return a list; the empty list means parsing failure.
The result ("123", "abc") means that parsing has found three digits "123" and stopped at 'a' which is not a digit - so the "rest of the input" is "abc".
Note that many means "as many as possible", not "one or more". If it meant "one or more" in the sense of returning every possible match, you'd get this result instead:
[("1", "23abc"), ("12", "3abc"), ("123", "abc")]
This behaviour wouldn't be very good for deterministic parsing, though it might sometimes be needed for natural language parsing.
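The result above can be reproduced with a small self-contained parser. This is only a sketch modelled on the definitions quoted in the question, not the code from Parsing.lhs: the Functor and Applicative instances, the helper names failure and sat, and the use of ((v, out) : _) patterns to keep the case expressions total are assumptions added here so that it compiles with a current GHC.

```haskell
import Data.Char (isDigit)

-- A parser maps an input string to a list of (result, remaining input) pairs.
-- The empty list signals failure.
newtype Parser a = P (String -> [(a, String)])

parse :: Parser a -> String -> [(a, String)]
parse (P p) inp = p inp

-- Assumed instances: current GHC requires Functor and Applicative for a Monad.
instance Functor Parser where
  fmap g p = P (\inp -> [(g v, out) | (v, out) <- parse p inp])

instance Applicative Parser where
  pure v = P (\inp -> [(v, inp)])
  pg <*> px = P (\inp -> case parse pg inp of
                           []             -> []
                           ((g, out) : _) -> parse (fmap g px) out)

instance Monad Parser where
  p >>= f = P (\inp -> case parse p inp of
                         []             -> []  -- failure: f is never called
                         ((v, out) : _) -> parse (f v) out)

-- Always fails (hypothetical helper name).
failure :: Parser a
failure = P (const [])

-- Consumes one character, failing on empty input.
item :: Parser Char
item = P (\inp -> case inp of
                    []       -> []
                    (x : xs) -> [(x, xs)])

-- Consumes one character that satisfies the predicate.
sat :: (Char -> Bool) -> Parser Char
sat ok = do x <- item
            if ok x then return x else failure

digit :: Parser Char
digit = sat isDigit

-- Deterministic choice: the second parser runs only if the first fails.
(+++) :: Parser a -> Parser a -> Parser a
p +++ q = P (\inp -> case parse p inp of
                       [] -> parse q inp
                       rs -> rs)

many :: Parser a -> Parser [a]
many p = many1 p +++ return []

many1 :: Parser a -> Parser [a]
many1 p = do v <- p
             vs <- many p
             return (v : vs)
```

Loaded into GHCi, parse (many digit) "123abc" evaluates to [("123","abc")], while parse (many1 digit) "abc" evaluates to [] because many1 needs at least one digit.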
The recursion stops at the v <- p line. When p can no longer parse the input, the monadic behavior of Parser propagates [] to the end of many1's do block without running the remaining lines. In many, +++ then falls back to return [], which ends the list.
    p >>= f = P (\inp -> case parse p inp of
                           []        -> []                -- this line here does not call f
                           [(v,out)] -> parse (f v) out)
The second function is written in do-notation, which is just a nice syntax for the following:
many1 p = p >>= (\v -> many p >>= (\vs -> return (v : vs)))
If parsing p produces an empty list [], the function \v -> many p >>= (\vs -> return (v : vs)) will not be called, stopping the recursion.
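To see where the recursion bottoms out, it can help to inline >>= and +++ by hand for the specific case of many digit. The sketch below does that over plain functions on String. The names manyDigits, many1Digits and digitAt are hypothetical and exist only for this illustration.

```haskell
import Data.Char (isDigit)

-- "many digit" with (+++) inlined: try many1 first, and if it fails,
-- succeed with the empty result without consuming input.
manyDigits :: String -> [(String, String)]
manyDigits inp = case many1Digits inp of
  [] -> [("", inp)]  -- base case of the mutual recursion
  rs -> rs

-- "many1 digit" with both uses of (>>=) inlined.
many1Digits :: String -> [(String, String)]
many1Digits inp = case digitAt inp of
  []             -> []  -- v <- p failed: manyDigits is never called
  ((v, out) : _) -> case manyDigits out of
    []               -> []
    ((vs, out') : _) -> [(v : vs, out')]

-- The digit parser written directly on strings.
digitAt :: String -> [(Char, String)]
digitAt (x : xs) | isDigit x = [(x, xs)]
digitAt _ = []
```

For the input "123abc", many1Digits recurses three times, once per digit. On the fourth attempt digitAt "abc" returns [], so many1Digits returns [] without recursing, manyDigits turns that into [("", "abc")], and the pending calls prepend '3', '2' and '1' on the way back, giving [("123","abc")].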
Let me strip this down to the barest bones to make absolutely clear why do-blocks can be misunderstood if they're read simply as imperative code. Consider this snippet:
    doStuff :: Maybe Int
    doStuff = do
        a <- Nothing
        doStuff
It looks like doStuff will recurse forever; after all, it's defined to do a sequence of things ending with doStuff. But the sequence of lines in a do-block is not simply a sequence of operations performed in order. At any point in a do-block, the way the rest of the block is handled is determined by the definition of >>=. In my example, the second argument to >>= is only used if the first argument isn't Nothing. So the recursion never happens.
Something similar can happen in many different monads. Your example is just a little more complex: when there are no more ways to parse something, the stuff after the >>= is ignored.
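The Maybe case can be checked directly. In the sketch below, bindMaybe is a hypothetical name for a hand-written copy of Maybe's >>=, included only so the short-circuit is visible. doStuffDesugared is the do-block translated by hand, and doStuffDo is the same definition in do-notation.

```haskell
-- Hand-written copy of (>>=) for Maybe, under a hypothetical name.
bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe Nothing  _ = Nothing  -- the continuation is dropped, never run
bindMaybe (Just x) f = f x

-- The do-block translated by hand: the recursive call sits inside the
-- continuation, which bindMaybe discards when its first argument is Nothing.
doStuffDesugared :: Maybe Int
doStuffDesugared = Nothing `bindMaybe` (\_ -> doStuffDesugared)

-- The same definition in do-notation, using the real Monad instance.
doStuffDo :: Maybe Int
doStuffDo = do
  _ <- Nothing
  doStuffDo
```

Both definitions evaluate to Nothing immediately; neither loops.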
Source: https://stackoverflow.com/questions/6080997/unable-to-understand-a-mutual-recursion