I want to use re.search to extract the first set of non-whitespace characters. I have the following pseudoscript that recreates my problem:
#!/usr/b
The [^\S] is a negated character class that is equivalent to \s (the whitespace pattern). The *? is a lazy quantifier that matches zero or more characters, but as few as possible; when used at the end of a pattern, it never actually matches any characters.
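To see this behaviour directly, here is a minimal sketch that runs the original pattern and a greedy \S+ pattern against the same sample line (the variable names lazy and greedy are only illustrative):

```python
import re

line = "STARC-1.1.1.5 ConsCase WARNING Warning"

# [^\S] is a double negative ("not non-whitespace"), i.e. the same as \s.
# The lazy *? at the end of the pattern is satisfied by zero characters,
# so the match succeeds but is empty.
lazy = re.search(r'^[^\S]*?', line)
print(repr(lazy.group(0)))    # ''

# A greedy \S+ consumes the whole first run of non-whitespace characters.
greedy = re.search(r'^\S+', line)
print(repr(greedy.group(0)))  # 'STARC-1.1.1.5'
```

Note that the lazy pattern still returns a match object, so a plain "if m:" check passes even though nothing useful was captured.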
Replace your m = re.search('^[^\S]*?',line) line with
m = re.match(r'\S+',line)
or, if you also want to allow an empty-string match:
m = re.match(r'\S*',line)
The re.match method anchors the pattern at the start of the string. With re.search, you need to keep the ^ anchor at the start of the pattern:
m = re.search(r'^\S+',line)
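The difference only shows up when the string does not start with a non-whitespace character. A small sketch, assuming a hypothetical input with leading spaces:

```python
import re

line = "  STARC-1.1.1.5 ConsCase"  # hypothetical input with leading spaces

# re.match only tries position 0, where there is a space, so \S+ fails.
anchored_match = re.match(r'\S+', line)
print(anchored_match)                # None

# re.search with ^ is anchored the same way and also fails.
anchored_search = re.search(r'^\S+', line)
print(anchored_search)               # None

# re.search without ^ scans forward and finds the first token.
unanchored = re.search(r'\S+', line)
print(unanchored.group(0))           # STARC-1.1.1.5
```

So if leading whitespace should be skipped rather than rejected, dropping the ^ and using re.search(r'\S+', line) is the variant to pick. Keep in mind that with the re.MULTILINE flag, ^ also matches after each newline, so re.search with ^ is then no longer the same as re.match.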
See the Python demo:
import re
line = "STARC-1.1.1.5 ConsCase WARNING Warning"
m = re.search(r'^\S+', line)
if m:
    print(m.group(0))
# => STARC-1.1.1.5
However, in this case, you can simply use split():
res = line.split()
print(res[0])
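One thing to watch with split(): on an empty or whitespace-only line it returns an empty list, so res[0] raises an IndexError. A minimal sketch of a guarded version follows; the helper name first_token is hypothetical and not part of the original code:

```python
def first_token(line):
    """Return the first whitespace-delimited token, or None if there is none."""
    # sep=None splits on runs of whitespace and ignores leading whitespace;
    # maxsplit=1 stops after the first token instead of splitting the whole line.
    parts = line.split(None, 1)
    return parts[0] if parts else None

print(first_token("STARC-1.1.1.5 ConsCase WARNING Warning"))  # STARC-1.1.1.5
print(first_token("   "))                                     # None
```

Unlike re.match(r'\S+', line), this version skips leading whitespace before returning the first token.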