How to write a regular expression to match a string literal where the escape is a doubling of the quote character?

Posted by 北城余情 on 2019-12-14 00:22:06

Question


I am writing a parser using PLY that needs to identify FORTRAN string literals. These are delimited by single quotes, and a single quote inside the literal is escaped by doubling it. For example,

'I don''t understand what you mean'

is a valid escaped FORTRAN string.

PLY takes its token definitions as regular expressions. My attempt so far does not work, and I don't understand why.

t_STRING_LITERAL = r"'[^('')]*'"

Any ideas?


Answer 1:


A string literal is:

  1. An open single-quote, followed by:
  2. Any number of doubled-single-quotes and non-single-quotes, then
  3. A close single quote.

Thus, our regex is:

r"'(''|[^'])*'"
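As a quick check outside of PLY, the pattern above can be tried with Python's standard re module. This is only a minimal sketch: the helper names find_literals and unescape are hypothetical, and turning the doubled quote back into a single one is an assumption about what a token action would want to do with the matched text.

```python
import re

# The regex from this answer: an opening quote, then any number of
# doubled quotes or non-quote characters, then a closing quote.
STRING_LITERAL = re.compile(r"'(''|[^'])*'")

def find_literals(source):
    """Return every quoted literal found in source, delimiters included."""
    return [m.group() for m in STRING_LITERAL.finditer(source)]

def unescape(literal):
    """Strip the delimiters and collapse each doubled quote to one quote."""
    return literal[1:-1].replace("''", "'")

line = "print *, 'I don''t understand what you mean', 'ok'"
for lit in find_literals(line):
    print(lit, "->", unescape(lit))
```

Putting '' first in the alternation is not strictly required here, since [^'] can never consume a quote, but it makes the intent easy to read.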



Answer 2:


You want something like this:

r"'([^']|'')*'"

This says that between the enclosing single quotes you can have either a non-quote character or a doubled single quote ('').

The brackets define a character class, in which you list the characters that may or may not match. It doesn't allow anything more complicated than that, so trying to use parentheses and match a multiple-character sequence ('') doesn't work. Instead your [^('')] character class is equivalent to [^'()], i.e. it matches anything that's not a single quote or a left or right parenthesis.
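The equivalence of the two character classes can be checked with a short experiment using Python's re module. This is an illustrative sketch; the sample strings and the names attempt, equivalent, and results are made up for the demonstration.

```python
import re

# The pattern from the question and the character class it is equivalent to.
attempt = re.compile(r"'[^('')]*'")
equivalent = re.compile(r"'[^'()]*'")

samples = ["'plain'", "'f(x)'", "'I don''t'"]

# For each sample, record whether each pattern matches the whole string.
results = {
    s: (bool(attempt.fullmatch(s)), bool(equivalent.fullmatch(s)))
    for s in samples
}

for s, (a, e) in results.items():
    print(f"{s!r}: attempt={a}, equivalent={e}")
```

Both patterns accept 'plain' and reject the other two samples: the one containing parentheses, and the one containing a doubled quote.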




Answer 3:


It's usually easy to get something quick-and-dirty for parsing particular string literals that are giving you problems, but for a general solution you can get a very powerful and complete regex for string literals from the pyparsing module:

>>> import pyparsing
>>> pyparsing.quotedString.reString
'(?:"(?:[^"\\n\\r\\\\]|(?:"")|(?:\\\\x[0-9a-fA-F]+)|(?:\\\\.))*")|(?:\'(?:[^\'\\n\\r\\\\]|(?:\'\')|(?:\\\\x[0-9a-fA-F]+)|(?:\\\\.))*\')'

I'm not sure about significant differences between FORTRAN's string literals and Python's, but it's a handy reference if nothing else.




Answer 4:


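import re

ch = "'I don''t understand what you mean' and you' ?"

# Lazy match: stops at the first quote, which is half of the doubled quote
print(re.search(r"'.*?'", ch).group())
# The closing quote must have no quote directly before or after it
print(re.search(r"'.*?(?<!')'(?!')", ch).group())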

Result:

'I don'
'I don''t understand what you mean'
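One caveat worth checking before relying on the lookaround pattern: because it requires that the closing quote has no quote next to it, it appears to miss literals that are empty or that end with an escaped quote. The sketch below compares it with the alternation pattern from the earlier answers; the helper first_match and the test strings are hypothetical.

```python
import re

LOOKAROUND = re.compile(r"'.*?(?<!')'(?!')")
ALTERNATION = re.compile(r"'(''|[^'])*'")

def first_match(pattern, text):
    """Return the first match of pattern in text, or None if there is none."""
    m = pattern.search(text)
    return m.group() if m else None

cases = [
    "'I don''t understand what you mean' and you' ?",  # sample from this answer
    "''",        # empty literal
    "'abc'''",   # literal whose last character is an escaped quote
]

for text in cases:
    print(repr(text))
    print("  lookaround :", first_match(LOOKAROUND, text))
    print("  alternation:", first_match(ALTERNATION, text))
```

On the first case both patterns agree; on the other two only the alternation pattern finds a match.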


Source: https://stackoverflow.com/questions/2143235/how-to-write-a-regular-expression-to-match-a-string-literal-where-the-escape-is
