I'm working on a small parser using Megaparsec and trying to parse arithmetic.
-- Arithmetic expressions
data Aexp = N Num
| V Var
| M
The types are lying to you: when you define a recursive parser p, you're not actually allowed to use p itself wherever you want! You need to munch part of the input first in order to guarantee that you are making progress. Otherwise Haskell will indeed happily go into an infinite loop.
This problem typically gets solved by defining different "tiers" of expressions and only allowing either "simpler" ones or parentheses-wrapped "more complex" ones in left-recursive positions (because matching an opening parenthesis does force you to make your way through part of the input string).
E.g. the grammar for your expressions would be turned into (from simplest to most complex):
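<Literal> ::= [0-9]+
<Var>     ::= [a-zA-Z]+
<Base>    ::= '(' <Expr> ')' | <Var> | <Literal>
<Factor>  ::= <Base> | <Base> '*' <Factor>
<Expr>    ::= <Factor> | <Factor> '+' <Expr> | <Factor> '-' <Expr>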
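To make the tiers concrete, here is a minimal Megaparsec sketch of a grammar of this shape, with one parser per tier. The AST type `Expr` and its constructors, the helpers `lexeme` and `symbol`, the entry point `parseExpr`, and the nonterminal names in the comments are all names made up for this sketch, not taken from your code. Note that every recursive call happens only after at least one token has been consumed.

```haskell
import Data.Void (Void)
import Text.Megaparsec (Parsec, between, eof, errorBundlePretty, parse, some, (<|>))
import Text.Megaparsec.Char (char, digitChar, letterChar, space)

type Parser = Parsec Void String

-- Hypothetical AST for this sketch (constructor names are assumptions).
data Expr
  = Lit Integer
  | Var String
  | Mul Expr Expr
  | Add Expr Expr
  | Sub Expr Expr
  deriving (Eq, Show)

-- Run a parser, then skip any trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* space

-- Match a single character token, then skip trailing whitespace.
symbol :: Char -> Parser Char
symbol = lexeme . char

-- <Literal> ::= [0-9]+
literal :: Parser Expr
literal = Lit . read <$> lexeme (some digitChar)

-- <Var> ::= [a-zA-Z]+
-- (letterChar also accepts non-ASCII letters, so this is slightly more permissive.)
var :: Parser Expr
var = Var <$> lexeme (some letterChar)

-- <Base> ::= '(' <Expr> ')' | <Literal> | <Var>
-- The recursive use of expr is guarded by the opening parenthesis.
base :: Parser Expr
base = between (symbol '(') (symbol ')') expr <|> literal <|> var

-- <Factor> ::= <Base> | <Base> '*' <Factor>
-- A simpler tier is parsed first, so input is consumed before recursing.
factor :: Parser Expr
factor = do
  b <- base
  (Mul b <$> (symbol '*' *> factor)) <|> pure b

-- <Expr> ::= <Factor> | <Factor> '+' <Expr> | <Factor> '-' <Expr>
expr :: Parser Expr
expr = do
  f <- factor
  (Add f <$> (symbol '+' *> expr))
    <|> (Sub f <$> (symbol '-' *> expr))
    <|> pure f

-- Parse a whole input string, rendering any parse error as text.
parseExpr :: String -> Either String Expr
parseExpr s = case parse (space *> expr <* eof) "<input>" s of
  Left err -> Left (errorBundlePretty err)
  Right e  -> Right e
```

One caveat: a grammar written this way nests to the right, so `8 - 3 - 2` comes out as `Sub (Lit 8) (Sub (Lit 3) (Lit 2))`. If you need left-associative subtraction, you would have to parse a factor followed by a list of operator/factor pairs and fold them from the left.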
This is where a total language shines: because the types have to be completely honest when it comes to termination, it becomes literally impossible to write these badly behaved left-recursive parsers. The typechecker tells you that you have to find another way of recognizing the terms of your language.
For instance the fixpoint combinator fix I use in my total parser combinators library doesn't have type (a -> a) -> a but rather (ignoring the funny brackets) (□ a → a) → a, which precisely prevents you from using the recursive call before you've made some progress. You can still write a parser for Expr but the typechecker is here to warn you when you're making an illegal move.