Conditional construct not working in Python regex


I am a newbie in Python and I want to use my regex in re.sub. I tried it on regex101 and it works. Somehow when I tried to use it in my Python (version 3.6) it does

2 Answers

深忆病人

2021-01-29 06:44
The problem is that Python's re module does not allow lookarounds as the condition of a conditional construct. The condition can only be a capturing group ID or name, and it tests whether that earlier group matched.

(?(id/name)yes-pattern|no-pattern)
Will try to match with yes-pattern if the group with given id or name exists, and with no-pattern if it doesn’t. no-pattern is optional and can be omitted.
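To make the supported form concrete, here is a minimal sketch. The pattern and the helper name `is_balanced_number` are made up for illustration and are not from the question: the conditional `(?(1)\))` requires a closing parenthesis only when group 1 captured an opening one. The last part tries to compile the question's pattern, which, as far as I can tell, re rejects with `re.error` because the condition is not a group ID or name.

```python
import re

# Hypothetical example: digits, optionally wrapped in one pair of parentheses.
# (?(1)\)) means: if group 1 matched "(", then ")" is required here.
pattern = re.compile(r'^(\()?\d+(?(1)\))$')

def is_balanced_number(text):
    """Return True if text is digits, bare or fully parenthesized."""
    return pattern.match(text) is not None

print(is_balanced_number('123'))    # group 1 did not match, nothing more required
print(is_balanced_number('(123)'))  # group 1 matched, so ")" is required and present
print(is_balanced_number('(123'))   # group 1 matched, but ")" is missing

# A lookahead as the condition is not accepted by re.
try:
    re.compile(r'(?(?=[^\t]*)([\t]+))')
    lookahead_condition_ok = True
except re.error:
    lookahead_condition_ok = False
print(lookahead_condition_ok)
```

regex101 defaults to the PCRE flavor, which does accept lookaround conditions, so a pattern can work there and still fail in Python.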

The (?(?=[^\t]*)([\t]+)) regex is meant to check whether there are 0+ chars other than tabs at the current location and, if so, match and capture 1 or more tabs. Because [^\t]* can match an empty string, that check would always succeed, so the condition adds nothing. If you want to replace the first occurrence of 1 or more tabs, you may use re.sub with a plain "\t+" pattern and the count=1 argument.
```
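import re

# Match a run of one or more tabs; count=1 replaces only the first run.
reg = r"\t+"
# Tabs are written as \t escapes so they survive copy/paste.
s = 'a\t\t\tbold, italic,\t\t\tteletype'
result = re.sub(reg, ',', s, count=1)
print(result)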
```
See the Python demo
爱一瞬间的悲伤

2021-01-29 07:04
I suppose you could do this:
```
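import re

# From the start of the string: the first word, then the tabs right after it.
regex = r'(^\w*?[\t]+)'
# Tabs are written as \t escapes so they survive copy/paste.
s = 'a\t\t\tbold, italic,\t\t\tteletype'

def repl(match):
    # Strip the trailing tabs from the matched text and append ", ".
    text = match.group(0)
    return text.rstrip() + ', '

print(re.sub(regex, repl, s))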
```
Output:
```
a, bold, italic,            teletype
```
Here we match from the beginning of the string through the tabs that directly follow the first word, and pass the match to a callable. The callable removes the trailing tabs with rstrip and appends a comma and a space.

Note: if the first tab does not come directly after the first word, it's not replaced, i.e. 'a bold, italic, teletype' is left unchanged. Is that what you want?
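The difference between the anchored pattern and a plain `\t+` with count=1 can be seen side by side in the sketch below. The helper names and both sample strings are hypothetical, and both helpers insert ', ' so the results are comparable (the other answer inserts ',' without a space).

```python
import re

def replace_anchored(text):
    """Replace tabs only when they directly follow the first word."""
    return re.sub(r'^\w*?\t+', lambda m: m.group(0).rstrip() + ', ', text)

def replace_first_run(text):
    """Replace the first run of tabs wherever it occurs."""
    return re.sub(r'\t+', ', ', text, count=1)

# Assumed inputs: tabs right after the first word vs. after the second word.
direct = 'a\t\tbold, italic,\tteletype'
later = 'a bold\t\titalic, teletype'

print(repr(replace_anchored(direct)))
print(repr(replace_first_run(direct)))
print(repr(replace_anchored(later)))   # no match at the start, so unchanged
print(repr(replace_first_run(later)))
```

Which one fits depends on whether tabs later in the line should ever be touched.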