matching any character including newlines in a Python regex subexpression, not globally

I want to use re.MULTILINE but NOT re.DOTALL, so that I can have a regex that includes both an "any character" wildcard and the normal . wild

3 Answers

生来不讨喜

2020-12-01 16:00
```
[^]
```
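In regex, brackets contain a list and/or range of possible values for one matching character. If that list is empty, i.e. [], no character of the string can match it.

The caret in front of that list and/or range negates the permitted values. So, in front of an empty list, any character (including newline) will match. Note that this relies on the regex flavor accepting an empty class, as JavaScript does; Python's re module treats a ] placed directly after [^ as a literal character and rejects [^] with an "unterminated character set" error.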
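Before relying on this in Python, it is worth checking how re handles the pattern. The following is a minimal sketch; the helper name compiles is made up for illustration and is not part of any library:

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# The empty negated class is rejected by Python's re module,
# while a class built from a shorthand and its negation is accepted.
empty_negated_ok = compiles('[^]')
complement_ok = compiles(r'[\s\S]')
print(empty_negated_ok, complement_ok)
```

So in Python, a class such as [\s\S] is the safer way to express "any character including newline".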
梦谈多话

2020-12-01 16:06
Match any character (including new line):

Regular expression (note that the class also contains a literal space ' '):
```
[\S\n\t\v ]
```
Example:
```
import re

text = 'abc def ###A quick brown fox.\nIt jumps over the lazy dog### ghi jkl'
# We want to extract "A quick brown fox.\nIt jumps over the lazy dog"
# The raw string keeps \S intact for the regex engine; the capture group
# makes findall return only the text between the ### markers.
matches = re.findall(r'###([\S\n ]+)###', text)
print(matches[0])
```
matches[0] will contain:
'A quick brown fox.\nIt jumps over the lazy dog'

Description of '\S' from the Python docs:

\S Matches any character which is not a whitespace character.

(See: https://docs.python.org/3/library/re.html#regular-expression-syntax)
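One caveat, shown here as a quick sketch rather than a definitive test: the class above lists only some whitespace characters explicitly, so whitespace it does not list, such as '\r' and '\f', is not matched, whereas [\s\S] covers it. The variable names below are illustrative only.

```python
import re

listed = re.compile(r'[\S\n\t\v ]')   # non-whitespace plus four listed whitespace characters
complement = re.compile(r'[\s\S]')    # whitespace or non-whitespace, i.e. everything

# Print, for each sample character, whether each class matches it.
for ch in ['a', ' ', '\n', '\t', '\r', '\f']:
    print(repr(ch), bool(listed.fullmatch(ch)), bool(complement.fullmatch(ch)))
```

If the input may contain Windows line endings ('\r\n'), this difference matters.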
不思量自难忘°

2020-12-01 16:09
To match a newline, or "any symbol" without re.S/re.DOTALL, you may use any of the following:
```
[\s\S]
[\w\W]
[\d\D]
```
The main idea is that a shorthand class and its opposite, placed together inside one character class, cover every possible character, so the class matches any symbol there is in the input string.

Compared with (.|\s) and other variations that use alternation, the character class solution is much more efficient because it involves much less backtracking (when used with a * or + quantifier). In the small example, (?:.|\n)+ takes 45 steps to complete, while [\s\S]+ takes just 2 steps.
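To tie this back to the question, here is a minimal sketch that uses re.MULTILINE without re.DOTALL and mixes both wildcards in one pattern. The sample text and the BEGIN/END markers are invented for illustration.

```python
import re

# Hypothetical sample input.
text = "BEGIN report\nline one\nline two\nEND\nafter\n"

# (.+) uses the normal dot, so it stays on the BEGIN line.
# ([\s\S]*?) is the "any character" wildcard, so it can span lines.
# ^ and $ anchor at line boundaries because of re.MULTILINE.
pattern = re.compile(r'^BEGIN (.+)$([\s\S]*?)^END$', re.MULTILINE)

m = pattern.search(text)
title = m.group(1)
body = m.group(2)
print(repr(title))
print(repr(body))
```

Here title is limited to the first line, while body spans the lines between the markers, newlines included.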