The expressions you are trying to parse into an abstract syntax tree form a context-free language: arbitrarily nested parentheses cannot be matched by a regular expression alone. This means that you need a parser for a context-free grammar, so let's create one.
To simplify the parsing we'll separate out the lexical analysis phase, so the first thing we need is a lexer. Luckily there are a lot of handy lexer libraries available. We'll use this one:
https://github.com/aaditmshah/lexer
So here's the lexical analyzer:
var lexer = new Lexer;
lexer.addRule(/\s+/, function () {
    /* skip whitespace */
});
lexer.addRule(/[a-z]/, function (lexeme) {
    return lexeme; // symbols
});
lexer.addRule(/[\(\+\-\*\/\)]/, function (lexeme) {
    return lexeme; // punctuation (i.e. "(", "+", "-", "*", "/", ")")
});
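If you want to see what those three rules amount to without pulling in the library, here is a minimal sketch using a single regular expression. The `tokenize` function is a hypothetical stand-in written for illustration; it is not the Lexer library's API or implementation, and it assumes the same single-letter symbols as the rules above.

```javascript
// Hypothetical stand-in for the lexer above; not the Lexer library's API.
function tokenize(input) {
    var tokens = [];
    // One alternative per rule: whitespace, single-letter symbol, punctuation.
    var rule = /(\s+)|([a-z])|([()+\-*\/])/g;
    var position = 0, match;
    while (position < input.length) {
        rule.lastIndex = position;
        match = rule.exec(input);
        // Reject input that no rule matches at the current position.
        if (match === null || match.index !== position) {
            throw new SyntaxError("Unexpected character at index " + position);
        }
        // Whitespace (group 1) is skipped; everything else becomes a token.
        if (match[1] === undefined) tokens.push(match[0]);
        position = rule.lastIndex;
    }
    return tokens;
}

console.log(tokenize("e * (a + b)").join(" ")); // e * ( a + b )
```

Unlike the library, this sketch collects all tokens up front instead of handing them out one at a time.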
Next we create a parser. We'll use the following implementation of Dijkstra's shunting yard algorithm for parsing:
https://gist.github.com/aaditmshah/6683499
So here's the parser:
var factor = {
    precedence: 2,
    associativity: "left"
};
var term = {
    precedence: 1,
    associativity: "left"
};
var parser = new Parser({
    "+": term,
    "-": term,
    "*": factor,
    "/": factor
});
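In case you're curious how the precedence and associativity settings are used, here is a minimal sketch of the shunting yard algorithm. The `toPostfix` function and its error messages are hypothetical names of my own; this is not the code from the gist, only an illustration that assumes the same shape of operator table.

```javascript
// Same shape of operator table as above.
var operators = {
    "+": { precedence: 1, associativity: "left" },
    "-": { precedence: 1, associativity: "left" },
    "*": { precedence: 2, associativity: "left" },
    "/": { precedence: 2, associativity: "left" }
};

// Minimal shunting yard sketch; `toPostfix` is a hypothetical name.
function toPostfix(tokens, table) {
    var output = [], stack = [];
    tokens.forEach(function (token) {
        if (token === "(") {
            stack.push(token);
        } else if (token === ")") {
            // Pop operators back to the matching "(" and discard it.
            while (stack.length && stack[stack.length - 1] !== "(") {
                output.push(stack.pop());
            }
            if (stack.pop() !== "(") throw new SyntaxError("Mismatched parentheses");
        } else if (Object.prototype.hasOwnProperty.call(table, token)) {
            var current = table[token];
            while (stack.length) {
                var top = table[stack[stack.length - 1]];
                if (!top) break; // reached a "("
                var binds = top.precedence > current.precedence ||
                    (top.precedence === current.precedence &&
                     current.associativity === "left");
                if (!binds) break;
                output.push(stack.pop());
            }
            stack.push(token);
        } else {
            output.push(token); // operand
        }
    });
    while (stack.length) {
        var rest = stack.pop();
        if (rest === "(") throw new SyntaxError("Mismatched parentheses");
        output.push(rest);
    }
    return output;
}

console.log(toPostfix("a-b-c".split(""), operators).join(" ")); // a b - c -
```

Left associativity is what makes a-b-c come out as (a-b)-c rather than a-(b-c).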
Finally we create a parse function as follows:
function parse(input) {
    lexer.setInput(input);
    var tokens = [], token;
    while (token = lexer.lex()) tokens.push(token);
    return parser.parse(tokens);
}
Now you simply call parse to get a parsed stream of tokens in postfix notation:
var output = parse("e*((a*(b+c))+d)");
alert(output.join(" ")); // "e a b c + * d + *"
The advantage of postfix form is that you can easily manipulate it using a stack:
- Push e onto the stack.
- Push a onto the stack.
- Push b onto the stack.
- Push c onto the stack.
- Pop b and c and push b + c onto the stack.
- Pop a and b + c and push a * (b + c) onto the stack.
- Push d onto the stack.
- Pop a * (b + c) and d and push a * (b + c) + d onto the stack.
- Pop e and a * (b + c) + d and push e * (a * (b + c) + d) onto the stack.
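The steps above can be sketched as a short loop. The `postfixToInfix` name is hypothetical, and to keep the sketch simple it wraps every intermediate result in parentheses instead of printing the minimal ones used in the list.

```javascript
// Rebuild a fully parenthesized infix string from postfix tokens.
function postfixToInfix(postfix) {
    var stack = [];
    postfix.forEach(function (token) {
        if ("+-*/".indexOf(token) !== -1) {
            // The right operand was pushed last, so it is popped first.
            var right = stack.pop();
            var left = stack.pop();
            stack.push("(" + left + " " + token + " " + right + ")");
        } else {
            stack.push(token); // symbols are pushed unchanged
        }
    });
    return stack.pop();
}

console.log(postfixToInfix("e a b c + * d + *".split(" ")));
// (e * ((a * (b + c)) + d))
```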
Similarly it's easy to create the output you want using stacks too. It's the same steps; you only push different values back onto the stack for different operations.
See the demo: http://jsfiddle.net/d2UYZ/2/
Edit 1: I was so bored that I solved the problem for you:
var stack = [];
var operator = {
    "+": "add",
    "-": "subtract",
    "*": "multiply",
    "/": "divide"
};
parse("e*((a*(b+c))+d)").forEach(function (c) {
    switch (c) {
    case "+":
    case "-":
    case "*":
    case "/":
        var b = stack.pop(); // right operand is popped first
        var a = stack.pop();
        stack.push(operator[c] + "(" + a + ", " + b + ")");
        break;
    default:
        stack.push(c);
    }
});
var output = stack.pop();
alert(output);
The output is (as you expect) the string "multiply(e, add(multiply(a, add(b, c)), d))". See the demo: http://jsfiddle.net/d2UYZ/4/
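If you would rather have an actual tree than a string, the same loop can push objects instead. This is only a sketch: the `postfixToAst` name and the node shape (`type`, `left`, `right`, `name`) are assumptions of mine, so adapt them to whatever your code expects.

```javascript
// Build a nested object tree from postfix tokens.
// The node shape used here is hypothetical.
function postfixToAst(postfix) {
    var names = {
        "+": "add",
        "-": "subtract",
        "*": "multiply",
        "/": "divide"
    };
    var stack = [];
    postfix.forEach(function (token) {
        if (Object.prototype.hasOwnProperty.call(names, token)) {
            var right = stack.pop();
            var left = stack.pop();
            stack.push({ type: names[token], left: left, right: right });
        } else {
            stack.push({ type: "symbol", name: token });
        }
    });
    return stack.pop();
}

var ast = postfixToAst("e a b c + * d + *".split(" "));
console.log(JSON.stringify(ast, null, 2));
```

The root node here has type "multiply", its left child is the symbol e, and its right child is the "add" node for (a*(b+c))+d.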
Edit 2: If you need to evaluate the expression you could do that easily too. All you need is a context mapping symbols to values and functions for each operator:
var stack = [];
var context = {
    "a": 1,
    "b": 2,
    "c": 3,
    "d": 4,
    "e": 5
};
var operator = {
    "+": function (a, b) { return a + b; },
    "-": function (a, b) { return a - b; },
    "*": function (a, b) { return a * b; },
    "/": function (a, b) { return a / b; }
};
parse("e*((a*(b+c))+d)").forEach(function (c) {
    switch (c) {
    case "+":
    case "-":
    case "*":
    case "/":
        var b = +stack.pop(); // unary plus coerces the operand to a number
        var a = +stack.pop();
        stack.push(operator[c](a, b));
        break;
    default:
        stack.push(context[c]);
    }
});
var output = stack.pop();
Thus the expression e*((a*(b+c))+d) becomes 5*((1*(2+3))+4), which evaluates to 45. See the demo: http://jsfiddle.net/d2UYZ/6/
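If you plan to reuse this, you may want to wrap the loop in a function that also complains about unknown symbols and malformed input. Here's a rough sketch; the `evaluate` name and the error messages are hypothetical, and it takes the postfix tokens directly so it runs without the lexer and parser.

```javascript
// Evaluate postfix tokens against a context of symbol values.
function evaluate(postfix, context) {
    var apply = {
        "+": function (a, b) { return a + b; },
        "-": function (a, b) { return a - b; },
        "*": function (a, b) { return a * b; },
        "/": function (a, b) { return a / b; }
    };
    var stack = [];
    postfix.forEach(function (token) {
        if (Object.prototype.hasOwnProperty.call(apply, token)) {
            if (stack.length < 2) {
                throw new SyntaxError("Missing operand for " + token);
            }
            var b = stack.pop(); // right operand is popped first
            var a = stack.pop();
            stack.push(apply[token](a, b));
        } else if (Object.prototype.hasOwnProperty.call(context, token)) {
            stack.push(context[token]);
        } else {
            throw new ReferenceError("Unknown symbol: " + token);
        }
    });
    // A well-formed expression leaves exactly one value behind.
    if (stack.length !== 1) throw new SyntaxError("Malformed expression");
    return stack[0];
}

var values = { a: 1, b: 2, c: 3, d: 4, e: 5 };
console.log(evaluate("e a b c + * d + *".split(" "), values)); // 45
```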