Question
I'm using the Python module `ply.lex` to write a lexer. I have some of my tokens specified with regular expressions, but now I'm stuck. I have a list of keywords that should each be a token. `data` is a list of about 1000 keywords, which should all be recognised as one kind of keyword, for example `_Function1`, `_UDFType2` and so on. All words in the list are separated by whitespace, that's it. I just want the lexer to recognise the words in this list, so that it returns a token of type `KEYWORD`.
import ply.lex as lex

data = 'Keyword1 Keyword2 Keyword3 Keyword4'

def t_KEYWORD(t):
    # ... r'\$' + data ??
    return t

text = '''
Some test data
even more
$var = 2231
$[]Test this 2.31 + / &
'''

autoit = lex.lex()
autoit.input(text)
while True:
    tok = autoit.token()
    if not tok:
        break
    print(tok)
So I was trying to add the variable to that regex, but it didn't work. I always get:
No regular expression defined for rule 't_KEYWORD'
Thank you in advance! John
Answer 1:
As @DSM suggests, you can use the `TOKEN` decorator. The regular expression to match tokens like `cat` or `dog` is `'cat|dog'` (that is, words separated by `'|'` rather than a space). So try:
from ply.lex import TOKEN

data = data.split()  # make data a list of keywords

@TOKEN('|'.join(data))
def t_KEYWORD(t):
    return t
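Joining the raw keywords with `|` is fine for plain words, but two details may matter with a real list of about 1000 entries. Keywords containing regex metacharacters (such as `$` or `.`) need escaping, and Python's `re` alternation takes the first alternative that matches, so a short keyword that is a prefix of a longer one can shadow it. The following is a minimal sketch using only the standard `re` module; the helper name `build_keyword_pattern` and the sample keywords are assumptions for illustration, not part of the original question.

```python
import re

def build_keyword_pattern(data):
    """Build one alternation regex from a whitespace-separated keyword string."""
    keywords = data.split()
    # Longest first, so '_Function10' is tried before its prefix '_Function1'.
    keywords.sort(key=len, reverse=True)
    # Escape each keyword so characters like '$' or '.' match literally.
    return '|'.join(re.escape(k) for k in keywords)

# Sample keywords for illustration only.
data = '_Function1 _Function10 _UDFType2 $Var'
pattern = build_keyword_pattern(data)

print(pattern)
print(re.match(pattern, '_Function10(x)').group())
```

The resulting string could then be passed to `@TOKEN(pattern)` in place of `'|'.join(data)`.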
Answer 2:
`ply.lex` uses the docstring for the regular expression. Note that the order in which you define tokens determines their precedence, which is usually important to manage.
The docstring at the top of a function cannot be an expression, so you need to assign it separately for each token definition.
We can test this in the interpreter:
def f():
    "this is " + "my help"  # an expression, so not a docstring :(

f.__doc__  # is None (func_doc is the Python 2-only alias of __doc__)
f.__doc__ = "this is " + "my help"  # now it is!
Hence this ought to work:
def t_KEYWORD(token):
    return token

# The right-hand side can be any expression; func_doc is the Python 2-only spelling of __doc__.
t_KEYWORD.__doc__ = r'REGULAR EXPRESSION HERE'
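Applied to the keyword list from the question, the same idea might look like the sketch below. It assigns to `__doc__`, which `func_doc` aliases in Python 2 and which is the only spelling available in Python 3. The keyword string is a placeholder, and the sketch only shows the attribute being set; that ply picks it up when `lex.lex()` runs is assumed rather than shown.

```python
import re

data = 'Keyword1 Keyword2 Keyword3 Keyword4'  # placeholder keyword list

def t_KEYWORD(t):
    return t

# The docstring slot accepts any string computed at runtime.
t_KEYWORD.__doc__ = '|'.join(re.escape(k) for k in data.split())

print(t_KEYWORD.__doc__)
```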
Answer 3:
Not sure if this works with ply, but the docstring is the `__doc__` attribute of a function, so if you write a decorator that takes a string expression and assigns it to the function's `__doc__` attribute, ply might use that.
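A minimal sketch of such a decorator is shown below. The name `regex_doc` and the sample keywords are hypothetical, and the sketch only demonstrates that the docstring ends up set; whether ply then uses it is the open question raised above.

```python
def regex_doc(pattern):
    """Return a decorator that stores `pattern` as the function's docstring."""
    def decorate(func):
        func.__doc__ = pattern
        return func
    return decorate

keywords = ['cat', 'dog']  # placeholder keywords

@regex_doc('|'.join(keywords))
def t_KEYWORD(t):
    return t

print(t_KEYWORD.__doc__)  # cat|dog
```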
Source: https://stackoverflow.com/questions/12217816/regex-with-variable-data-in-it-ply-lex