Is it possible to implement something like this simple Perl example in Python:
#!/usr/bin/perl
my $a = 'Use HELLO1 code';
if($a =~ /(?i:use)\s+([A-Z0-9]+)\s+(?i:c
According to the docs, this is not possible. The (?x) syntax only allows you to modify a flag for the whole expression. Therefore, you must either split this into three regexes and apply them one after the other, or handle the case-insensitivity manually: /[uU][sS][eE]...
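The manual approach can be automated instead of typing each character class by hand. The following is only a sketch: the helper names `ci` and `extract_token` are made up for illustration and are not part of the `re` module.

```python
import re

def ci(word):
    """Expand a literal word into per-letter classes, e.g. 'use' -> '[uU][sS][eE]'.

    Hypothetical helper, not part of the standard library.
    """
    return ''.join(
        '[{}{}]'.format(ch.lower(), ch.upper()) if ch.isalpha() else re.escape(ch)
        for ch in word
    )

# Only the keywords are case-insensitive; the token class stays case-sensitive.
pattern = re.compile(ci('use') + r'\s+([A-Z0-9]+)\s+' + ci('code'))

def extract_token(s):
    """Return the uppercase token, or None if the string does not match."""
    m = pattern.search(s)
    return m.group(1) if m else None

if __name__ == '__main__':
    for s in ['Use HELLO1 code', 'USE hello1 CODE', 'uSe ABC9 cOdE']:
        print(s, '->', extract_token(s))
```

This keeps everything in a single regex and works on Python versions without scoped inline flags, at the cost of a less readable compiled pattern.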
Since Python 3.6 you can use flags inside groups:
(?imsx-imsx:...)
(Zero or more letters from the set 'i', 'm', 's', 'x', optionally followed by '-' followed by one or more letters from the same set.) The letters set or remove the corresponding flags: re.I (ignore case), re.M (multi-line), re.S (dot matches all), and re.X (verbose), for the part of the expression.
Thus (?i:use) is now valid syntax. From a Python 3.6 terminal:
>>> import re
>>> regex = re.compile(r'(?i:use)\s+([A-Z0-9]+)\s+(?i:code)')
>>> regex.match('Use HELLO1 code')
<_sre.SRE_Match object; span=(0, 15), match='Use HELLO1 code'>
>>> regex.match('use HELLO1 Code')
<_sre.SRE_Match object; span=(0, 15), match='use HELLO1 Code'>
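To confirm that the flag really is limited to the groups that carry it, it helps to try an input whose middle token is lowercase. A small sketch, assuming Python 3.6 or later (the helper name `token_or_none` is hypothetical):

```python
import re

# (?i:...) applies IGNORECASE only inside that group; the token group is case-sensitive.
regex = re.compile(r'(?i:use)\s+([A-Z0-9]+)\s+(?i:code)')

def token_or_none(s):
    """Return the captured token, or None when the string does not match."""
    m = regex.match(s)
    return m.group(1) if m else None

if __name__ == '__main__':
    for s in ['Use HELLO1 code', 'USE hello1 CODE', 'use ABC Code']:
        print(s, '->', token_or_none(s))
```

The second input is rejected because `hello1` is lowercase, even though `USE` and `CODE` are accepted in any case.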
As far as I could find, the Python regular expression engine does not support partial ignore-case. Here is a solution that uses a case-insensitive regular expression and then checks afterward whether the token is uppercase.
#!/usr/bin/env python3
import re

# Match the whole phrase case-insensitively; the token's case is checked afterward.
token_re = re.compile(r'use\s+([a-z0-9]+)\s+code', re.IGNORECASE)

def find_token(s):
    """Return the token if it is uppercase, otherwise None."""
    m = token_re.search(s)
    if m is not None:
        token = m.group(1)
        if token.isupper():
            return token
    return None

if __name__ == '__main__':
    for s in ['Use HELLO1 code',
              'USE hello1 CODE',
              'this does not match',
              ]:
        print(s, '->', find_token(s))
Here is the program's output:
Use HELLO1 code -> HELLO1
USE hello1 CODE -> None
this does not match -> None