I'm going to implement a tokenizer in Python and I was wondering if you could offer some style advice?
I've implemented a tokenizer before in C and in Java, so I'm familiar with the general approach.
"Is there a better alternative to simply returning a list of tuples?"
I once had to implement a tokenizer that needed a more complex approach than a list of tuples, so I implemented a class for each token. You can then return a list of class instances, or, if you want to save resources, return something implementing the iterator protocol and generate the next token lazily as parsing progresses.
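A minimal sketch of that idea in Python: the token names, regular expressions, and the tiny arithmetic grammar below are all hypothetical, chosen just to illustrate returning token objects from a generator rather than a list of tuples.

```python
import re
from dataclasses import dataclass
from typing import Iterator

# Hypothetical token specification for a tiny arithmetic language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=]"),
    ("SKIP", r"\s+"),  # whitespace: matched but not emitted
]

@dataclass
class Token:
    kind: str
    value: str
    pos: int

def tokenize(text: str) -> Iterator[Token]:
    """Lazily yield Token instances; nothing is materialized up front."""
    pattern = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))
    pos = 0
    while pos < len(text):
        m = pattern.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":
            yield Token(m.lastgroup, m.group(), pos)
        pos = m.end()
```

Because `tokenize` is a generator, a parser can pull one token at a time (`next(tokens)`), and only callers that really need the whole stream pay for `list(tokenize(...))`.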