I wanted to make a simple parser for a "pseudo code"-like language (kept rigid) in Java. A sample of the pseudo code would be:
//This is a comment
$x1 = readi
In simple cases, writing a parser by hand makes sense.
However, using StringTokenizer is an indicator of doing it wrong, because a StringTokenizer is already a simple parser.
A parser usually reads one character at a time and changes its state depending on the value of that character.
Here is a simple parser: a 'b' makes the following characters uppercase, an 'e' switches back to lowercase, and a '.' stops.
String input = "aDDbcDDeaaef.";
int pos = 0;
int state = 0; // 0 = print lowercase, 1 = print uppercase
while (pos < input.length()) {
    char z = input.charAt(pos);
    if (z == '.') break; // '.' ends the input
    switch (z) {
        case 'b': state = 1; break; // switch to uppercase
        case 'e': state = 0; break; // switch back to lowercase
        default:
            if (state == 0) {
                System.out.print(Character.toLowerCase(z));
            } else {
                System.out.print(Character.toUpperCase(z));
            }
    }
    pos++;
}
// prints: addCDDaaf
Although I think that it's great that you want to build a parser for a language like this, doing so is much harder than it looks. Parsing is a very well-studied problem and there are many excellent algorithms that you can use, but they are extremely difficult to implement by hand. While you can use tricks like conversions to RPN for smaller examples like parsing expressions, building up a full programming language requires a much more complex set of tricks.
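To make the RPN trick concrete, here is a minimal sketch of the shunting-yard conversion for arithmetic expressions. The class name RpnSketch, the four operators, and the requirement that tokens be separated by spaces are all assumptions made for illustration; nothing in the question says what operators the language has.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class RpnSketch {
    // Precedence of the four binary operators; 0 means "not an operator".
    static int precedence(char op) {
        switch (op) {
            case '+': case '-': return 1;
            case '*': case '/': return 2;
            default: return 0;
        }
    }

    // Converts a space-separated infix expression to RPN (shunting-yard).
    static String toRpn(String infix) {
        StringBuilder out = new StringBuilder();
        Deque<String> ops = new ArrayDeque<>();
        for (String tok : infix.trim().split("\\s+")) {
            if (tok.equals("(")) {
                ops.push(tok);
            } else if (tok.equals(")")) {
                // Pop operators until the matching "(" is found.
                while (!ops.isEmpty() && !ops.peek().equals("(")) {
                    out.append(ops.pop()).append(' ');
                }
                if (ops.isEmpty()) throw new IllegalArgumentException("unbalanced )");
                ops.pop(); // discard the "("
            } else if (tok.length() == 1 && precedence(tok.charAt(0)) > 0) {
                // Operators are left-associative: pop equal or higher precedence first.
                while (!ops.isEmpty()
                        && precedence(ops.peek().charAt(0)) >= precedence(tok.charAt(0))) {
                    out.append(ops.pop()).append(' ');
                }
                ops.push(tok);
            } else {
                out.append(tok).append(' '); // operand: variable or number
            }
        }
        while (!ops.isEmpty()) {
            String op = ops.pop();
            if (op.equals("(")) throw new IllegalArgumentException("unbalanced (");
            out.append(op).append(' ');
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(toRpn("$x1 + 2 * ( 3 - $y )")); // prints: $x1 2 3 $y - * +
    }
}
```

This handles expressions only. Statements, control flow, and error recovery are where the hand-rolled approach starts to get hard.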
To parse a language of this complexity, you are probably best off using a parser generator rather than trying to write your own by hand. ANTLR and Java CUP are two well-known tools for doing precisely what you're interested in accomplishing, and I would strongly suggest using one of the two of them.
Hope this helps!
For simple languages (this is a judgement call, and if you are inexperienced you may not be able to make that call correctly), one can often write a recursive descent parser by hand that does well enough. The good news is that coding a recursive descent parser is pretty straightforward.
If you aren't sure, use overkill in the form of the strongest parser generator you can get.
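As a rough illustration of the hand-written route, here is a sketch of a recursive descent parser for single assignment lines like the one in the question. The grammar is an assumption (only "$x1 = readi" and "//" comments come from the question; the arithmetic operators, parentheses, and the class name DescentSketch are made up for the example). Each grammar rule becomes one method, and the parser returns a parenthesized string standing in for a parse tree.

```java
import java.util.ArrayList;
import java.util.List;

public class DescentSketch {
    private final List<String> tokens;
    private int pos = 0;

    DescentSketch(String line) { tokens = tokenize(line); }

    // Splits a line into $variables, words, numbers and single-character symbols.
    static List<String> tokenize(String line) {
        List<String> result = new ArrayList<>();
        int i = 0;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            if (line.startsWith("//", i)) break; // comment runs to end of line
            int start = i++;
            if (c == '$' || Character.isLetterOrDigit(c)) {
                while (i < line.length() && Character.isLetterOrDigit(line.charAt(i))) i++;
            }
            result.add(line.substring(start, i));
        }
        return result;
    }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : ""; }

    private String next() {
        if (pos >= tokens.size()) throw new IllegalStateException("unexpected end of line");
        return tokens.get(pos++);
    }

    private void expect(String s) {
        String t = next();
        if (!t.equals(s)) throw new IllegalStateException("expected " + s + " but found " + t);
    }

    private static boolean isVariable(String t) { return t.length() > 1 && t.charAt(0) == '$'; }

    // statement := VAR '=' expr
    String statement() {
        String target = next();
        if (!isVariable(target)) throw new IllegalStateException("expected variable, found " + target);
        expect("=");
        String value = expr();
        if (pos != tokens.size()) throw new IllegalStateException("unexpected " + peek());
        return "(assign " + target + " " + value + ")";
    }

    // expr := term (('+' | '-') term)*
    private String expr() {
        String left = term();
        while (peek().equals("+") || peek().equals("-")) {
            String op = next();
            left = "(" + op + " " + left + " " + term() + ")";
        }
        return left;
    }

    // term := factor (('*' | '/') factor)*
    private String term() {
        String left = factor();
        while (peek().equals("*") || peek().equals("/")) {
            String op = next();
            left = "(" + op + " " + left + " " + factor() + ")";
        }
        return left;
    }

    // factor := '(' expr ')' | VAR | NUMBER | 'readi'
    private String factor() {
        String t = next();
        if (t.equals("(")) {
            String inner = expr();
            expect(")");
            return inner;
        }
        if (isVariable(t) || t.equals("readi") || t.matches("\\d+")) return t;
        throw new IllegalStateException("unexpected " + t);
    }

    public static void main(String[] args) {
        System.out.println(new DescentSketch("$x1 = readi").statement());
        // prints: (assign $x1 readi)
        System.out.println(new DescentSketch("$y = $x1 + 2 * (3 - 1) // note").statement());
        // prints: (assign $y (+ $x1 (* 2 (- 3 1))))
    }
}
```

Adding a new construct means adding a grammar rule and a matching method, which is why this style stays manageable for small, rigid languages.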