RegEx with variable data in it - ply.lex

Posted by 此生再无相见时 on 2019-12-07 12:35:34

Question


I'm using the Python module ply.lex to write a lexer. I have specified some of my tokens with regular expressions, but now I'm stuck. I have a list of keywords that should each be recognised as a token. data holds about 1000 keywords, all of which should be recognised as one kind of keyword, for example _Function1, _UDFType2 and so on. The words in the list are separated by whitespace, that's it. I just want the lexer to recognise the words in this list and return a token of type `KEYWORD`.

data = 'Keyword1 Keyword2 Keyword3 Keyword4'
def t_KEYWORD(t):
    # ... r'\$' + data ??
    return t

text = '''
Some test data


even more

$var = 2231




$[]Test this 2.31 + / &
'''

autoit = lex.lex()
autoit.input(text)
while True:
    tok = autoit.token()
    if not tok: break
    print(tok)

So I tried to add the variable to the regex, but it didn't work. I always get: No regular expression defined for rule 't_KEYWORD'.

Thank you in advance! John


Answer 1:


As @DSM suggests, you can use the TOKEN decorator. The regular expression that matches tokens like cat or dog is 'cat|dog' (that is, words separated by '|' rather than a space). So try:

from ply.lex import TOKEN
data = data.split() #make data a list of keywords

@TOKEN('|'.join(data))
def t_KEYWORD(t):
    return t
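
Joining the raw keywords works for simple names, but two details may matter with a list of about 1000 entries: keywords containing regex metacharacters need escaping, and in an alternation the first alternative that matches wins, so a keyword that is a prefix of another (Keyword1 vs. Keyword10) should be tried later. Below is a minimal sketch of building such a pattern using only the standard `re` module. `build_keyword_pattern` is a hypothetical helper, not part of ply; the string it returns is what would be passed to `@TOKEN`.

```python
import re

def build_keyword_pattern(data):
    """Build an alternation regex from a whitespace-separated keyword string.

    Hypothetical helper for illustration; not part of ply.
    """
    keywords = data.split()
    # Longest first, so 'Keyword10' is tried before its prefix 'Keyword1'.
    keywords.sort(key=len, reverse=True)
    # Escape each keyword in case it contains regex metacharacters.
    return '|'.join(re.escape(k) for k in keywords)

pattern = build_keyword_pattern('Keyword1 Keyword10 _Function1 _UDFType2')
match = re.match(pattern, 'Keyword10 = 5')
print(match.group(0))  # Keyword10
```

Without the length sort, the same input would match only `Keyword1` and leave the trailing `0` for the next token.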



Answer 2:


ply.lex uses the docstring for the regular expression. Note that the order in which you define tokens determines their precedence, which is usually important to manage.


The docstring at the top of a function must be a plain string literal, not an expression, so you have to assign the regular expression separately for each token definition.

We can test this in the interpreter:

def f():
    "this is " + "my help"  # an expression, so not a docstring :(
f.__doc__  # is None (the attribute was also called func_doc in Python 2)
f.__doc__ = "this is " + "my help"  # now it is!

Hence this ought to work:

def t_KEYWORD(token):
    return token
t_KEYWORD.__doc__ = r'REGULAR EXPRESSION HERE'  # can be an expression
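
Applied to the keyword list from the question, the same idea might look like the sketch below. It uses `__doc__`, which exists in both Python 2 and 3, and only the standard library; the `re.escape` call is an added precaution that the original answer does not mention.

```python
import re

data = 'Keyword1 Keyword2 Keyword3 Keyword4'

def t_KEYWORD(t):
    return t

# Assigned after the definition, because a docstring literal cannot be computed.
t_KEYWORD.__doc__ = '|'.join(re.escape(k) for k in data.split())

print(t_KEYWORD.__doc__)  # Keyword1|Keyword2|Keyword3|Keyword4
```

The assignment has to run before the lexer is built, since the rule functions are inspected at that point.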



Answer 3:


Not sure if this works with ply, but the docstring is the __doc__ attribute of a function, so if you write a decorator that takes a string expression and assigns it to the function's __doc__ attribute, ply might use that.
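
A minimal sketch of such a decorator, using only the standard library. The name `with_regex` is hypothetical, and whether ply picks up the pattern this way is the assumption stated above, not something this snippet verifies.

```python
def with_regex(pattern):
    """Return a decorator that stores `pattern` as the function's docstring.

    Hypothetical helper for illustration; not part of ply.
    """
    def decorator(func):
        func.__doc__ = pattern
        return func
    return decorator

data = 'Keyword1 Keyword2 Keyword3 Keyword4'

# The decorator argument is an ordinary expression, evaluated at definition time.
@with_regex('|'.join(data.split()))
def t_KEYWORD(t):
    return t

print(t_KEYWORD.__doc__)  # Keyword1|Keyword2|Keyword3|Keyword4
```

This is the same usage pattern as the TOKEN decorator from Answer 1, so in practice the built-in decorator is probably the simpler choice.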



Source: https://stackoverflow.com/questions/12217816/regex-with-variable-data-in-it-ply-lex
