I am trying to use this regular expression to remove all instances of square brackets (and everything in them) from strings. For example, this works when there is only one p
By default, * (or +) matches greedily, so the pattern given in the question will match up to the last ].
>>> import re
>>> re.findall(r'\[[^()]*\]', "Issachar is a rawboned[a] donkey lying down among the sheep pens.[b]")
['[a] donkey lying down among the sheep pens.[b]']
By appending ? after the repetition operator (*), you can make it match non-greedily, so each match stops at the first ].
>>> import re
>>> pattern = r'\[.*?\]'
>>> s = """Issachar is a rawboned[a] donkey lying down among the sheep pens.[b]"""
>>> re.sub(pattern, '', s)
'Issachar is a rawboned donkey lying down among the sheep pens.'
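To make the difference concrete, here is a small sketch that runs both quantifiers over the same sentence with re.findall. The variable names greedy and lazy are only for illustration.

```python
import re

s = "Issachar is a rawboned[a] donkey lying down among the sheep pens.[b]"

# Greedy: .* consumes as much as possible, then backtracks to the LAST ]
greedy = re.findall(r'\[.*\]', s)

# Non-greedy: .*? consumes as little as possible, stopping at the FIRST ]
lazy = re.findall(r'\[.*?\]', s)

print(greedy)  # ['[a] donkey lying down among the sheep pens.[b]']
print(lazy)    # ['[a]', '[b]']
```

The greedy version returns one long match spanning both bracketed notes, while the non-greedy version returns each note separately.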
Try:
import re
pattern = r'\[[^\]]*\]'
s = """Issachar is a rawboned[a] donkey lying down among the sheep pens.[b]"""
t = re.sub(pattern, '', s)
print(t)
Output:
Issachar is a rawboned donkey lying down among the sheep pens.
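One practical difference between the negated character class and a non-greedy .*? is worth a quick sketch: by default . does not match a newline, so a bracketed note that spans two lines is skipped by \[.*?\] but still removed by \[[^\]]*\]. The sample string below is made up for illustration.

```python
import re

# Hypothetical input: the first bracketed note spans two lines
s = "first[note\ncontinued] second[b]"

# . does not match "\n" by default, so the multi-line note survives
lazy = re.sub(r'\[.*?\]', '', s)

# [^\]] matches any character except ], including "\n"
negated = re.sub(r'\[[^\]]*\]', '', s)

print(repr(lazy))     # 'first[note\ncontinued] second'
print(repr(negated))  # 'first second'
```

If the non-greedy form is preferred, passing flags=re.DOTALL to re.sub should give the same result as the negated class here.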
For brackets that contain only digits (no letters), e.g. [89], [23], [11], use this pattern:
import re
text = "The[TEXT] rain in[33] Spain[TEXT] falls[12] mainly in[23] the plain![45]"
pattern = r"\[\d*?\]"  # raw string, so the backslashes reach the regex engine unchanged
numBrackets = re.findall(pattern, text)
print(numBrackets)
Output:
['[33]', '[12]', '[23]', '[45]']
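Since the original question was about removing the brackets, the same idea can be combined with re.sub. This is a sketch rather than part of the answer above: it assumes that \d+ is preferable to \d*? because \d* also matches an empty pair of brackets, and the names cleaned and edge are only illustrative.

```python
import re

text = "The[TEXT] rain in[33] Spain[TEXT] falls[12] mainly in[23] the plain![45]"

# Remove only the numeric brackets; [TEXT] is left in place
cleaned = re.sub(r'\[\d+\]', '', text)
print(cleaned)  # The[TEXT] rain in Spain[TEXT] falls mainly in the plain!

# \d* (lazy or not) also accepts zero digits, so it matches "[]"; \d+ does not
edge = "a[]b[7]"
print(re.findall(r'\[\d*?\]', edge))  # ['[]', '[7]']
print(re.findall(r'\[\d+\]', edge))   # ['[7]']
```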