问题
I'm trying to parse SQL search conditions and having trouble getting the parser to differentiate logical (AND
, OR
) from other infix operators. I'm parsing them as different nodes (perhaps that's difficult to do), but simplifies the evaluation phase. Here's the relevant code snippet (I can include more if necessary).
let opp = OperatorPrecedenceParser<_,_,_>()
let scalarExpr = opp.ExpressionParser
opp.TermParser <- constant <|> id <|> between lparen rparen scalarExpr <|> scalarExpr
//infix operators added here
let comparison = //(e.g., 1 < 2)
let compareExpr = pipe3 scalarExpr compareOp scalarExpr (fun l op r -> Comparison(op, l, r))
between lparen rparen compareExpr <|> compareExpr
let andTerm = pstringCI "and" .>> ws
let orTerm = pstringCI "or" .>> ws
let searchCondition, searchConditionRef = createParserForwardedToRef()
searchConditionRef :=
[ comparison
pipe3 searchCondition andTerm searchCondition (fun l _ r -> And(l, r))
pipe3 searchCondition orTerm searchCondition (fun l _ r -> Or(l, r))
between lparen rparen searchCondition ]
|> choice
let filter : Parser<_,unit> = ws >>. searchCondition .>> eof
"1 = 1"
correctly parses to Comparison (Eq,Constant (Int32 1),Constant (Int32 1))
but once I try to join two comparisons with a logical operator, e.g., "1 = 1 or 2 = 2"
, it fails to parse with
Error in Ln: 1 Col: 7
1 = 1 or 2 = 2
^
Expecting: end of input or infix operator
: 7
I expected it to parse the 1
before the error as a scalar expression and upon hitting or
backtrack, realizing it's not an infix operator, return 1
as the complete scalar, and recognize it's parsing the left-hand side of a condition joined by logical operator or
.
Instead, it seems to continue assuming 1
begins a more complex scalar expression, possibly involving an infix operator.
Is there a problem with the code, or is the solution to parse AND
/OR
as infix operators (using the same OperatorPrecedenceParser
)? I'd rather not go that route, so I'm hoping I've made a simple mistake somewhere.
The complete code is on gist.
回答1:
I think ultimately you'll find you need to treat and
and or
as infix operators with precedence rules because that is exactly what they are and is the reason why most parsers including fparsec and fsyacc have special features to handle them (i.e. resolve ambiguity through precedence and associativity rules).
You've found one case highlighting this, but consider another:
1 = 1 or 2 = 2 and 3 =3
should that parse as (1 = 1 or 2 = 2) and 3 = 3
or 1 = 1 or (2 = 2 and 3 = 3)
?
回答2:
Your parser stops after the first equation, because the choice
combinator of the searchCondition
applies the first argument parser comparison
to the input and upon success simply returns the result of the argument parser. You then get an error because filter
fails to parse the eof
after the searchCondition
.
The choice
and <|>
combinators do not implement a longest match rule and they do not backtrack after an error, as explained in the tutorial. So your searchCondition
parser can't work.
Another problem is that your searchCondition
parser is left-recursive, since the second and third choice
arguments will try to apply searchCondition
again without previously consuming any input. Left-recursion will lead to a stack overflow.
Similary, having <|> scalarExpr
at the end of the opp.TermParser
definition is unnecessary and can lead to infinite recursions.
When you translate a left-recursive parser grammar to FParsec, you need to eliminate the left-recursion.
One way to fix the searchCondition
parser is to factor out the left-hand-side expression:
let andTerm = stringCIReturn "and" (fun l r -> And(l, r)) .>> ws
let orTerm = stringCIReturn "or" (fun l r -> Or(l, r)) .>> ws
let searchCondition, searchConditionRef = createParserForwardedToRef()
do searchConditionRef:=
let comparisonTerm =
comparison <|> between lparen rparen searchCondition
pipe2 comparisonTerm (many ((andTerm <|> orTerm) .>>. comparisonTerm))
(fun l opRList ->
List.fold (fun l (op, r) -> op l r) l opRList)
Or even simpler:
do searchConditionRef:=
chainl1 (comparison <|> between lparen rparen searchCondition)
(andTerm <|> orTerm)
Update: In the grammar there's also a problem with the parens parsing, see the comments below.
回答3:
Creating a separate OperatorPrecedenceParser
for logical operators seems to have fixed it.
I replaced
let andTerm = pstringCI "and" .>> ws
let orTerm = pstringCI "or" .>> ws
let searchCondition, searchConditionRef = createParserForwardedToRef()
searchConditionRef :=
[ comparison
pipe3 searchCondition andTerm searchCondition (fun l _ r -> And(l, r))
pipe3 searchCondition orTerm searchCondition (fun l _ r -> Or(l, r))
between lparen rparen searchCondition ]
|> choice
with
let condOpp = OperatorPrecedenceParser()
let searchCondition = condOpp.ExpressionParser
condOpp.TermParser <- (attempt comparison) <|> between lparen rparen searchCondition <|> searchCondition
condOpp.AddOperator(InfixOperator("or", ws, 1, Assoc.Left, fun l r -> Or(l, r)))
condOpp.AddOperator(InfixOperator("and", ws, 2, Assoc.Left, fun l r -> And(l, r)))
(1 = 1 or 2 = 2) and 3 = 3
and 1 = 1 or (2 = 2 and 3 = 3)
parse correctly.
来源:https://stackoverflow.com/questions/9215975/differentiating-logical-from-other-infix-operators