Imagine the following types of strings:
if ((a1 and b) or (a2 and c)) or (c and d) or (e and f)
Now, I'd like to get the expressions in parentheses, so I wrote a PEG
parser with the following grammar:
from parsimonious.grammar import Grammar
grammar = Grammar(
r"""
program = if expr+
expr = term (operator term)*
term = (factor operator factor) / factor
factor = (lpar word operator word rpar) / (lpar expr rpar)
if = "if" ws
and = "and"
or = "or"
operator = ws? (and / or) ws?
word = ~"\w+"
lpar = "("
rpar = ")"
ws = ~"\s*"
""")
which parses just fine with
tree = grammar.parse(string)
Now the question arises: how to write a NodeVisitor
class for this tree to get only the factors? My problem here is the second branch which can be deeply nested.
I tried with
def walk(node, level = 0):
if node.expr.name == "factor":
print(level * "-", node.text)
for child in node.children:
walk(child, level + 1)
walk(tree)
but to no avail, really (factors bubble up in duplicates).
Note: This question is based on another one on StackOverflow.
How would I go about it to get ((a1 and b) or (a2 and c)), (c and d) and (e and f) as three parts?
You could create a visitor that "listens" when a node in the parse tree is a (
, in which a depth-variable is increased, and when a )
is encountered, the depth-variable is decreased. Then in the method that is called that matches a parenthesised expression, you inspect the depth before adding it to your list of expressions to return from the visitor.
Here a is a quick example:
from parsimonious.grammar import Grammar
from parsimonious.nodes import NodeVisitor
grammar = Grammar(
r"""
program = if expr+
expr = term (operator term)*
term = (lpar expr rpar) / word
if = "if" ws
and = "and"
or = "or"
operator = ws? (and / or) ws?
word = ~"\w+"
lpar = "("
rpar = ")"
ws = ~"\s*"
""")
class ParExprVisitor(NodeVisitor):
def __init__(self):
self.depth = 0
self.par_expr = []
def visit_term(self, node, visited_children):
if self.depth == 0:
self.par_expr.append(node.text)
def visit_lpar(self, node, visited_children):
self.depth += 1
def visit_rpar(self, node, visited_children):
self.depth -= 1
def generic_visit(self, node, visited_children):
return self.par_expr
tree = grammar.parse("if ((a1 and b) or (a2 and c)) or (c and d) or (e and f)")
visitor = ParExprVisitor()
for expr in visitor.visit(tree):
print(expr)
which prints:
((a1 and b) or (a2 and c))
(c and d)
(e and f)
If you only want to return each outermost factor, return
early and do not descend into its children.
def walk(node, level = 0):
if node.expr.name == "factor":
print(level * "-", node.text)
return
for child in node.children:
walk(child, level + 1)
Output:
----- ((a1 and b) or (a2 and c))
----- (c and d)
------ (e and f)
来源:https://stackoverflow.com/questions/55405055/nodevisitor-class-for-peg-parser-in-python