There seems to be a consensus that you should use Parsec as an applicative rather than a monad. What are the benefits of applicative parsing over monadic parsing?
The main reason I can see to prefer applicative parsers over monadic parsers is the same as the main reason to prefer applicative code over monadic code in any context: being less powerful, applicatives are simpler to use.
This is an instance of a more general engineering dictum: use the simplest tool which gets the job done. Don't use a forklift when a dolly will do. Don't use a table saw to cut out coupons. Don't write code in IO when it could be pure. Keep it simple.
But sometimes, you need the extra power of Monad. A sure sign of this is when you need to change the course of the computation based on what has been computed so far. In parsing terms, this means determining how to parse what comes next based on what has been parsed so far; in other words, you can parse context-sensitive grammars this way.
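As a minimal sketch of that kind of dependency, assuming Parsec is available: the format below (a decimal length, a colon, then exactly that many characters) and the names lengthPrefixed and runLengthPrefixed are made up for illustration. The number parsed first decides how much input the next step consumes, which is what needs (>>=) (here via do-notation).

```haskell
import Text.Parsec (anyChar, char, count, digit, many1, parse)
import Text.Parsec.String (Parser)

-- Hypothetical format: "<length>:<payload>", e.g. "3:abc".
lengthPrefixed :: Parser String
lengthPrefixed = do
  n <- read <$> many1 digit   -- the result of this parser...
  _ <- char ':'
  count n anyChar             -- ...determines what the next parser is

-- Illustrative helper: Nothing on a parse error, Just the payload otherwise.
runLengthPrefixed :: String -> Maybe String
runLengthPrefixed input =
  either (const Nothing) Just (parse lengthPrefixed "" input)
```

The parser handed to count depends on n, a value that only exists once parsing has started, so this cannot be expressed with (<*>) alone.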
The main difference between monadic and applicative parsing is in how sequential composition is handled. In the case of an applicative parser, we use (<*>), whereas with a monad we use (>>=).
(<*>) :: Parser (a -> b) -> Parser a -> Parser b
(>>=) :: Parser a -> (a -> Parser b) -> Parser b
The monadic approach is more flexible, because it allows the grammar of the second part to depend on the result from the first one, but we rarely need this extra flexibility in practice.
You might think that having some extra flexibility can't hurt, but in reality it can. It prevents us from doing useful static analysis on a parser without running it. For example, let's say we want to know whether a parser can match the empty string or not, and what the possible first characters can be in a match. We want functions
empty :: Parser a -> Bool
first :: Parser a -> Set Char
With an applicative parser, we can easily answer this question. (I'm cheating a little here. Imagine we have data constructors corresponding to (<*>) and (>>=) in our candidate parser "languages".)
empty (f <*> x) = empty f && empty x
first (f <*> x) | empty f = first f `union` first x
| otherwise = first f
However, with a monadic parser we don't know what the grammar of the second part is without knowing the input.
empty (x >>= f) = empty x && empty (f ???)
first (x >>= f) | empty x = first x `union` first (f ???)
| otherwise = first x
By allowing more, we're able to reason less. This is similar to the choice between dynamic and static type systems.
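The "cheat" above can be made concrete for the applicative case. This is only a sketch under assumptions: the Parser type and its constructors Pure, Sym, Ap and Alt are hypothetical, invented here so that the parser is plain data that can be inspected without running it.

```haskell
{-# LANGUAGE GADTs #-}
import Data.Set (Set)
import qualified Data.Set as Set

-- Hypothetical parser "language": every constructor is inspectable data.
data Parser a where
  Pure :: a -> Parser a                            -- succeeds, consumes nothing
  Sym  :: Char -> Parser Char                      -- matches exactly one character
  Ap   :: Parser (a -> b) -> Parser a -> Parser b  -- stands in for (<*>)
  Alt  :: Parser a -> Parser a -> Parser a         -- choice between two parsers

-- Can the parser match the empty string?
empty :: Parser a -> Bool
empty (Pure _)  = True
empty (Sym _)   = False
empty (Ap f x)  = empty f && empty x
empty (Alt p q) = empty p || empty q

-- Which characters can start a match?
first :: Parser a -> Set Char
first (Pure _) = Set.empty
first (Sym c)  = Set.singleton c
first (Ap f x)
  | empty f    = first f `Set.union` first x
  | otherwise  = first f
first (Alt p q) = first p `Set.union` first q
```

A constructor standing in for (>>=) would have to store a function of type a -> Parser b, and neither empty nor first could look inside that function without a parsed value to apply it to.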
But what is the point of this? What might we use this extra static knowledge for? Well, we can for example use it to avoid backtracking in LL(1) parsing by comparing the next character to the first set of each alternative. We can also determine statically whether this would be ambiguous by checking if the first sets of two alternatives overlap.
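Both uses of the first sets can be sketched in a few lines. The representation is an assumption made for illustration: each alternative is just a label paired with a precomputed first set, and the names Branch, selectBranch and conflicts are hypothetical. Alternatives that can match the empty string are ignored to keep the sketch short.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set

-- Hypothetical representation: a label plus the alternative's first set.
type Branch = (String, Set Char)

-- Choose an alternative from one character of lookahead, without backtracking.
selectBranch :: [Branch] -> Char -> Maybe String
selectBranch branches c =
  case [name | (name, fs) <- branches, c `Set.member` fs] of
    (name : _) -> Just name
    []         -> Nothing

-- Pairs of alternatives whose first sets overlap, i.e. LL(1) conflicts.
conflicts :: [Branch] -> [(String, String)]
conflicts branches =
  [ (n1, n2)
  | (i, (n1, f1)) <- indexed
  , (j, (n2, f2)) <- indexed
  , i < j
  , not (Set.null (Set.intersection f1 f2))
  ]
  where
    indexed = zip [0 :: Int ..] branches
```

If conflicts returns an empty list, selectBranch can never have more than one candidate, so a single character of lookahead is enough to commit to an alternative.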
Another example is that it can be used for error recovery, as shown in the paper Deterministic, Error-Correcting Combinator Parsers by S. Doaitse Swierstra and Luc Duponcheel.
Usually, however, the choice between applicative and monadic parsing has already been made by the authors of the parsing library you're using. When a library such as Parsec exposes both interfaces, the choice of which one to use is purely a stylistic one. In some cases applicative code is easier to read than monadic code and sometimes it's the other way round.
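To make the stylistic point concrete, here is the same small parser written both ways, assuming Parsec. The input format (a pair such as "(12,34)") and the names number, pairMonadic, pairApplicative and parsePair are made up for this sketch.

```haskell
import Text.Parsec (char, digit, many1, parse)
import Text.Parsec.String (Parser)

-- A non-negative decimal integer.
number :: Parser Int
number = read <$> many1 digit

-- Monadic style: every intermediate result gets a name.
pairMonadic :: Parser (Int, Int)
pairMonadic = do
  _ <- char '('
  x <- number
  _ <- char ','
  y <- number
  _ <- char ')'
  return (x, y)

-- Applicative style: the shape of the expression mirrors the shape of the input.
pairApplicative :: Parser (Int, Int)
pairApplicative =
  (,) <$> (char '(' *> number) <*> (char ',' *> number <* char ')')

-- Illustrative helper: Nothing on a parse error.
parsePair :: Parser (Int, Int) -> String -> Maybe (Int, Int)
parsePair p = either (const Nothing) Just . parse p ""
```

Neither parser needs an earlier result to decide what to parse next, so the two definitions accept the same inputs and which one reads better is a matter of taste.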
If a parser is purely applicative, it is possible to analyse its structure and "optimise" it before running it. If a parser is monadic, it's basically a Turing-complete program, and performing almost any interesting analysis of it is equivalent to solving the halting problem (i.e., impossible).
Oh, and yes, there's a stylistic difference too...
Monads are strictly a more featureful abstraction than Applicatives. You could write
instance (Monad m) => Applicative m where
pure = return
(<*>) = ap
But there is no way to write
instance (Applicative a) => Monad a where
return = pure
(>>=) = ???
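In current GHC, Applicative is a superclass of Monad, so the first direction is written once per type rather than as a blanket instance. A minimal sketch, using a hypothetical toy Parser over String that exists only to have a concrete Monad to work with:

```haskell
import Control.Monad (ap, liftM)

-- Hypothetical toy parser: consume a prefix of the input or fail.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap = liftM            -- fmap recovered from the Monad instance

instance Applicative Parser where
  pure x = Parser (\s -> Just (x, s))
  (<*>)  = ap             -- (<*>) recovered from (>>=)

instance Monad Parser where
  Parser p >>= f = Parser $ \s ->
    case p s of
      Nothing      -> Nothing
      Just (a, s') -> runParser (f a) s'

-- Consume a single character, failing on empty input.
item :: Parser Char
item = Parser $ \s ->
  case s of
    []       -> Nothing
    (c : cs) -> Just (c, cs)
```

Only pure and (>>=) are written by hand here; fmap and (<*>) come for free. There is no analogous generic helper that produces (>>=) from pure and (<*>).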
So yes, it is essentially a matter of style. I imagine that if you use return and ap, there should usually be no performance loss over using pure and <*>. That said, because Applicative is a strictly smaller interface than Monad, <*> can sometimes be more highly optimized than ap. (But with clever GHC rewrite rules, one can often achieve the same optimizations regardless.)
Is monadic parsing out?
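Since every Monad is also an Applicative (monads are a subset of applicatives), I would conclude that applicative parsing is a subset of monadic parsing: anything you can write with the applicative interface you can also write with the monadic one.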
With Parsec, the benefit of using Applicative is just style. Monad has the advantage that it is more powerful: you can implement context-sensitive parsers.
Doaitse Swierstra's UU parsing is more efficient if used only applicatively.