Hi, I'm working on a small bison parser to learn how it works. The parser is supposed to parse a sentence. The sentence is made of expressions, and expressions are made of words.
Usually, the point of writing a parser is so that you end up with a data structure that represents the input. You then transform the structure in some way, or, in your case, just print it out.
At each expression production, you want to construct a node in that structure that represents what you have recognized so far.
I'm a little rusty, but it would be something like this:
query: /* empty */
     | query expression { printNode($2); /* printf()s are in here */ }
     ;
expression: term { $$ = makeTermNode($1); }
          | expression OR term { $$ = makeOrNode($1, makeTermNode($3)); }
          | expression AND term { $$ = makeAndNode($1, makeTermNode($3)); }
          ;
The data structure to hold your nodes:
struct Node {
    int nodeType; /* WORD or operator token like AND, OR */
    struct Node *leftOperand;
    struct Node *rightOperand; /* will be NULL if the node is a term */
};
Add a node member to your %union so that a grammar symbol can carry a pointer to a node:
%union
{
    int number;
    char *string;
    struct Node *node;
}
Update:
It's been a while since I coded in C, so I will have to resort to pseudocode. There is no code here to reclaim memory once we're done with it. Apologies for any other blunders.
#include <stdlib.h> /* malloc */
struct Node *makeTermNode(int word) {
    struct Node *node = malloc(sizeof(struct Node));
    node->nodeType = word;
    node->rightOperand = NULL; /* a term is a leaf: no operands */
    node->leftOperand = NULL;
    return node;
}
Notice that your WORD token just denotes that a string of letters of some sort was scanned; the specific sequence of letters is discarded. (If you want to know the sequence, have your lexer store a copy of yytext in yylval, for example in the string member of the %union, before it returns the WORD token.)
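If you do want to keep the letters, the leaf node needs somewhere to put them. Here is a minimal sketch of that idea; struct TextNode, its text field, and makeTextTermNode() are hypothetical names that are not part of the code above, and the enum is only a stand-in for the token code that the bison-generated header would normally define. The important part is that the node takes its own copy, because the scanner reuses its text buffer for the next token.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the token code from the bison-generated header (assumption). */
enum { WORD = 258 };

/* Hypothetical variant of Node with an extra field for the scanned text. */
struct TextNode {
    int nodeType;                  /* WORD for a leaf */
    char *text;                    /* private copy of the scanned letters */
    struct TextNode *leftOperand;
    struct TextNode *rightOperand;
};

/* Copies `text`, so the caller is free to overwrite its own buffer later. */
struct TextNode *makeTextTermNode(const char *text) {
    struct TextNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return NULL;
    }
    size_t len = strlen(text) + 1; /* include the terminating '\0' */
    node->text = malloc(len);
    if (node->text == NULL) {
        free(node);
        return NULL;
    }
    memcpy(node->text, text, len);
    node->nodeType = WORD;
    node->leftOperand = NULL;
    node->rightOperand = NULL;
    return node;
}
```

With something like this, the WORD case of the print routine could print node->text instead of the token code. Whoever frees the node also has to free node->text.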
struct Node *makeAndNode(struct Node *leftOperand, struct Node *rightOperand) {
    struct Node *node = malloc(sizeof(struct Node));
    node->nodeType = AND;
    node->leftOperand = leftOperand;
    node->rightOperand = rightOperand;
    return node;
}
And likewise for makeOrNode(). Alternatively, you could write just makeNodeWithOperator(int operator, struct Node* leftOperand, struct Node *rightOperand) to handle the "and" and "or" cases.
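The single-constructor alternative could look roughly like this. It is a sketch, not tested against a real grammar: the enum is a stand-in for the token codes bison would generate, and I named the first parameter operatorToken rather than operator only so the code also compiles as C++. The two thin wrappers are optional; they just keep the grammar actions unchanged.

```c
#include <stdlib.h>

/* Stand-ins for the token codes from the bison-generated header (assumption). */
enum { WORD = 258, AND, OR };

struct Node {
    int nodeType;
    struct Node *leftOperand;
    struct Node *rightOperand;
};

/* One constructor for every binary operator; returns NULL if malloc fails. */
struct Node *makeNodeWithOperator(int operatorToken,
                                  struct Node *leftOperand,
                                  struct Node *rightOperand) {
    struct Node *node = malloc(sizeof *node);
    if (node == NULL) {
        return NULL;
    }
    node->nodeType = operatorToken;
    node->leftOperand = leftOperand;
    node->rightOperand = rightOperand;
    return node;
}

/* Optional wrappers so existing calls to makeAndNode()/makeOrNode() still work. */
struct Node *makeAndNode(struct Node *leftOperand, struct Node *rightOperand) {
    return makeNodeWithOperator(AND, leftOperand, rightOperand);
}

struct Node *makeOrNode(struct Node *leftOperand, struct Node *rightOperand) {
    return makeNodeWithOperator(OR, leftOperand, rightOperand);
}
```

Adding another binary operator later then only needs a new token and a new grammar rule, not another copy of the allocation code.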
I changed printAllNodes() to printNode(). It starts at the root of the expression tree structure we have built, recursively visiting the left side of each subexpression first, then the right. It goes something like this:
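#include <stdio.h> /* printf */
void printNode(struct Node *node) {
    switch (node->nodeType) {
    case WORD:
        printf("%i", node->nodeType); /* leaf: print the token code */
        return;
    case AND:
    case OR:
        printf("(");
        printNode(node->leftOperand);  /* left subexpression first */
        printf("%i", node->nodeType);  /* then the operator's token code */
        printNode(node->rightOperand); /* then the right subexpression */
        printf(")");
        return;
    }
}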
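As mentioned, nothing above reclaims memory. A possible cleanup routine walks the tree the same way printNode() does, except that it has to free both children before the node itself. This is only a sketch under a few assumptions: freeNode() and newNode() are names I made up, the enum stands in for the bison-generated token codes, and freeNode() returns the number of nodes it released purely so the result is easy to check.

```c
#include <stdlib.h>

/* Stand-ins for the token codes from the bison-generated header (assumption). */
enum { WORD = 258, AND, OR };

struct Node {
    int nodeType;
    struct Node *leftOperand;
    struct Node *rightOperand;
};

/* Hypothetical helper used only to build a tree for this example. */
struct Node *newNode(int nodeType, struct Node *left, struct Node *right) {
    struct Node *node = malloc(sizeof *node);
    if (node == NULL) {
        return NULL;
    }
    node->nodeType = nodeType;
    node->leftOperand = left;
    node->rightOperand = right;
    return node;
}

/* Post-order walk: release both subtrees, then the node itself.
   Returns how many nodes were freed; a NULL subtree counts as zero. */
int freeNode(struct Node *node) {
    if (node == NULL) {
        return 0;
    }
    int freed = freeNode(node->leftOperand) + freeNode(node->rightOperand);
    free(node);
    return freed + 1;
}
```

In the grammar, the natural place to call it would be right after printing, for example { printNode($2); freeNode($2); } in the query rule, since the tree is not needed once it has been printed.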