Question
Traditionally, arithmetic operators are considered to be binary (left or right associative), so most tools deal only with binary operators.
Is there an easy way to use Parsec to parse arithmetic operators that can take an arbitrary number of arguments?
For example, the following expression should be parsed into the tree
(a + b) + c + d * e + f
Answer 1:
Yes! The key is to first solve a simpler problem, which is to model + and * as tree nodes with only two children. To add four things, we'll just use + three times.
This is a great problem to solve, since the Text.Parsec.Expr module exists for exactly this purpose. Your example can actually be parsed by the example code in its documentation. I've slightly simplified it here:
module Lib where

import Text.Parsec
import Text.Parsec.Language
import qualified Text.Parsec.Expr as Expr
import qualified Text.Parsec.Token as Tokens

data Expr =
    Identifier String
  | Multiply Expr Expr
  | Add Expr Expr

instance Show Expr where
  show (Identifier s) = s
  show (Multiply l r) = "(* " ++ show l ++ " " ++ show r ++ ")"
  show (Add l r) = "(+ " ++ show l ++ " " ++ show r ++ ")"

-- Some sane parser combinators that we can plagiarize from the Haskell parser.
parens = Tokens.parens haskell
identifier = Tokens.identifier haskell
reserved = Tokens.reservedOp haskell

-- Infix parser.
infix_ operator func =
  Expr.Infix (reserved operator >> return func) Expr.AssocLeft

parser =
  Expr.buildExpressionParser table term <?> "expression"
  where
    table = [[infix_ "*" Multiply], [infix_ "+" Add]]
    term =
      parens parser
      <|> (Identifier <$> identifier)
      <?> "term"
Running this in GHCi:
λ> runParser parser () "" "(a + b) + c + d * e + f"
Right (+ (+ (+ (+ a b) c) (* d e)) f)
There are lots of ways of converting this tree to the desired form. Here's a hacky gross slow one:
data Expr' =
    Identifier' String
  | Add' [Expr']
  | Multiply' [Expr']
  deriving (Show)

-- Gather the operands of a chain of nodes that satisfy the predicate.
collect :: Expr -> (Expr -> Bool) -> [Expr]
collect e f | not (f e) = [e]
collect (Add l r) f =
  collect l f ++ collect r f
collect (Multiply l r) f =
  collect l f ++ collect r f

isAdd :: Expr -> Bool
isAdd (Add _ _) = True
isAdd _ = False

isMultiply :: Expr -> Bool
isMultiply (Multiply _ _) = True
isMultiply _ = False

optimize :: Expr -> Expr'
optimize (Identifier s) = Identifier' s
optimize e@(Add _ _) = Add' (map optimize (collect e isAdd))
optimize e@(Multiply _ _) = Multiply' (map optimize (collect e isMultiply))
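To see what the flattening step produces, here is a minimal, self-contained sketch that does not depend on Parsec. It repeats the two data types and swaps the repeated ++ for an accumulator, so each chain is walked once. The names addends, factors, flatten and example are illustrative, not part of the code above, and example is simply the binary tree from the GHCi session written out by hand.

```haskell
-- Binary tree, as produced by the Text.Parsec.Expr parser.
data Expr
  = Identifier String
  | Multiply Expr Expr
  | Add Expr Expr

-- N-ary tree, the desired output form.
data Expr'
  = Identifier' String
  | Add' [Expr']
  | Multiply' [Expr']
  deriving (Show, Eq)

-- Collect the operands of a chain of additions, left to right.
-- The second argument accumulates the operands that come after this subtree.
addends :: Expr -> [Expr] -> [Expr]
addends (Add l r) rest = addends l (addends r rest)
addends e rest = e : rest

-- Same idea for a chain of multiplications.
factors :: Expr -> [Expr] -> [Expr]
factors (Multiply l r) rest = factors l (factors r rest)
factors e rest = e : rest

-- Hypothetical replacement for optimize: flatten every chain into one node.
flatten :: Expr -> Expr'
flatten (Identifier s) = Identifier' s
flatten e@(Add _ _) = Add' (map flatten (addends e []))
flatten e@(Multiply _ _) = Multiply' (map flatten (factors e []))

-- The tree for "(a + b) + c + d * e + f", i.e. (+ (+ (+ (+ a b) c) (* d e)) f).
example :: Expr
example =
  Add
    (Add
      (Add (Add (Identifier "a") (Identifier "b")) (Identifier "c"))
      (Multiply (Identifier "d") (Identifier "e")))
    (Identifier "f")
```

Evaluating flatten example in GHCi gives Add' [Identifier' "a",Identifier' "b",Identifier' "c",Multiply' [Identifier' "d",Identifier' "e"],Identifier' "f"]. Note that the parenthesized (a + b) is merged into the outer sum, because the binary tree does not record where the parentheses were.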
I will note, however, that Expr is almost always Good Enough™ for the purposes of a parser or compiler.
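If the n-ary tree really is what you want, another option is to skip the binary tree and parse the operand lists directly with sepBy1, one parser per precedence level. This is only a sketch under assumptions: it reuses the haskell token parser from Text.Parsec.Language as above, and nary, sumExpr, productExpr, term and parseNary are names made up for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import Text.Parsec.Language (haskell)
import qualified Text.Parsec.Token as Tokens

data Expr'
  = Identifier' String
  | Add' [Expr']
  | Multiply' [Expr']
  deriving (Show, Eq)

-- Wrap the operands in a node only when there is more than one of them.
nary :: ([Expr'] -> Expr') -> [Expr'] -> Expr'
nary _ [single] = single
nary node operands = node operands

-- Lowest precedence: one or more products separated by "+".
sumExpr :: Parser Expr'
sumExpr = nary Add' <$> (productExpr `sepBy1` Tokens.reservedOp haskell "+")

-- Higher precedence: one or more terms separated by "*".
productExpr :: Parser Expr'
productExpr = nary Multiply' <$> (term `sepBy1` Tokens.reservedOp haskell "*")

-- A term is a parenthesized expression or an identifier.
term :: Parser Expr'
term =
  Tokens.parens haskell sumExpr
    <|> (Identifier' <$> Tokens.identifier haskell)

-- Skip leading whitespace and require that the whole input is consumed.
parseNary :: String -> Either ParseError Expr'
parseNary = parse (Tokens.whiteSpace haskell *> sumExpr <* eof) ""
```

With this sketch, parseNary "(a + b) + c + d * e + f" yields Right (Add' [Add' [Identifier' "a",Identifier' "b"],Identifier' "c",Multiply' [Identifier' "d",Identifier' "e"],Identifier' "f"]). Unlike flattening the binary tree afterwards, the parenthesized (a + b) stays a separate node here, since each pair of parentheses starts a fresh sumExpr.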
Source: https://stackoverflow.com/questions/33987104/parsing-non-binary-operators-with-parsec