I want to build a Brainfuck (Damn that name) interpreter in my freshly created programming language to prove it\'s turing-completeness.
Now, everything is clear so far (
When you reach [
, you test the data pointer.
If it's false, you can scan for the next matched ]
character, counting up how many [
you see and making sure you mark them off as you see each ]
.
If it's true, you need to keep track of its position so you can jump back to it later. I suggest using a stack. Push the current program position onto the stack, then when you reach ]
, test the data pointer. If it's true, go to the topmost program position on the stack. If it's false, pop the position off the stack and continue.
As you nest into inner loops, the stack will cleanly record the context of each loop.
See stack (wikipedia). This is analogous to how assembly programs deal with function calls.
From the top of my head, might be some errors in it, but something like this should work:
char* interpret(char* instructions){
char* c = instructions;
while (*c) {
if (*c == ".") putchar(*p);
else if (*c == ",") *p = getchar();
else if (*c == '+') (*p)++;
else if (*c == '-') (*p)--;
else if (*c == '<') p--;
else if (*c == '>') p++;
else if (*c == '[') c = interpret(c+1);
else if (*c == ']') { if (*p) c = instructions else return c; }
c++;
}
return 0;
}
Call interpret with the brainf*ck source code. The pointer p is to the current memory position. Do a recursive call when spotting a [. Return from this recursive call when encountering a ].
I prefer to use a jump table (along with conversion of raw BF to 'bytecode'). Optimizing BF interpreters are clearly the way to go!
How do I implement the BF loops in my interpreter?
That’s the point – it entirely depends on your language. For the case of a stack-based programming language (or any language which can use a stack), @rjh has given a good solution. Other languages would use different solutions, such as recursion (i.e. implicit use of a stack).
Here is my "optimizing" version of interpreter, that pre-compiles the loop jumps.
def interpret2(code):
data = [0] * 5000 # data memory
cp = 0 # code pointer
dp = 0 # data pointer
# pre-compile a jump table
stack = []
jump = [None] * len(code)
for i,o in enumerate(code):
if o=='[':
stack.append(i)
elif o==']':
jump[i] = stack.pop()
jump[jump[i]] = i
# execute
while cp < len(code):
cmd = code[cp]
if cmd == '>': dp += 1
elif cmd == '<': dp -= 1
elif cmd == '+': data[dp] += 1
elif cmd == '-': data[dp] -= 1
elif cmd == '.': stdout.write(chr(data[dp]))
elif cmd == ',': data[dp] = ord(stdin.read(1))
elif cmd == '[' and not data[dp]: # skip loop if ==0
cp = jump[cp]
elif cmd == ']' and data[dp]: # loop back if !=0
cp = jump[cp]
cp += 1
It does a dry run of the code, keeping track of the brackets (in a stack) and marks the goto addresses in parallel jump
array which is later consulted during execution.
I compared the execution speed on long-running BF program (calculate N digits of Pi) and this increased the speed 2x compared to an innocent implementation in which source is scanned forward to exit [
and scanned backwards to loop on ]
.