I\'m developing a small python like language using flex, byacc (for lexical and parsing) and C++, but i have a few questions regarding scope control.
just as python it u
I am currently implementing a programming language rather similar to this (including the multilevel break oddly enough). My solution was to have the tokenizer emit indent and dedent tokens based on indentation. Eg:
while 1: # colons help :)
print('foo')
break 1
becomes:
["while", "1", ":",
indent,
"print", "(", "'foo'", ")",
"break", "1",
dedent]
It makes the tokenizer's handling of '\n' somewhat complicated though. Also, i wrote the tokenizer and parser from scratch, so i'm not sure whether this is feasable in lex and yacc.
Semi-working pseudocode example:
level = 0
levels = []
for c = getc():
if c=='\n':
emit('\n')
n = 0
while (c=getc())==' ':
n += 1
if n > level:
emit(indent)
push(levels,n)
while n < level:
emit(dedent)
level = pop(levels)
if level < n:
error tokenize
# fall through
emit(c) #lazy example