I don't want to use string split because I have numbers 1-99, and a column of strings that contain '#/#' somewhere in the text.
How can I write a regex to extract the number before the slash?
Use a lookahead to match only digits that are immediately followed by a /, without including the / in the match, like this:
\d+(?=/)
You may need to escape the / if your implementation uses it as its delimiter.
Live example: https://regex101.com/r/xdT4vq/1
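In Python, the same lookahead could be used with re.findall. This is only a minimal sketch, assuming the sample sentence used in the other answers; the variable names are illustrative:

```python
import re

text = "He got 10/19 questions right."

# \d+ matches a run of digits; (?=/) requires a slash to follow
# but does not consume it, so the slash is not part of the match.
matches = re.findall(r"\d+(?=/)", text)

# Convert the first hit to an int, or fall back to None if nothing matched.
first_number = int(matches[0]) if matches else None
print(matches, first_number)
```

The / needs no escaping here because Python does not use it as a regex delimiter.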
import re
myString = "He got 10/19 questions right."
oldnumber = re.findall('[0-9]+/', myString) #find one or more digits followed by a slash.
newNumber = oldnumber[0].replace("/","") #get rid of the slash.
print(newNumber)  # prints: 10
import re
res = re.search(r'(\d+)/\d+', 'He got 10/19 questions right.')
res.groups()
('10',)
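Since the question mentions a whole column of strings, keep in mind that re.search returns None when a row contains no #/# pattern, so calling .groups() on the result directly would raise an AttributeError. A rough sketch of one way to guard against that, assuming the column is available as a plain list (the helper name first_number and the sample rows are made up for illustration):

```python
import re

# Assumed pattern: digits, a slash, then more digits.
PATTERN = re.compile(r"(\d+)/\d+")

def first_number(text):
    """Return the number before the first digit/digit slash as an int, or None."""
    match = PATTERN.search(text)
    return int(match.group(1)) if match else None

# Hypothetical sample rows standing in for the column.
rows = [
    "He got 10/19 questions right.",
    "He/she got 7/99 questions right",
    "No score recorded",
]
results = [first_number(row) for row in rows]
print(results)
```

Requiring digits on both sides of the slash means text such as "He/she" is skipped without any extra logic.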
You can still use str.split() if you carefully construct logic around it:
t = "He got 10/19 questions right."
t2 = "He/she got 10/19 questions right"
for q in [t, t2]:
    # split the whole string at spaces, then split each part at /
    # keep only parts that contain a /, do not start with it,
    # and otherwise consist solely of digits
    numbers = [x.split("/") for x in q.split()
               if "/" in x and all(c in "0123456789/" for c in x)
               and not x.startswith("/")]
    if numbers:
        print(numbers[0][0])
Output:
10
10
Find all numbers that come before a forward slash, and exclude the slash itself from the result by wrapping the digits in a capturing group (parentheses).
>>> import re
>>> myString = 'He got 10/19 questions right.'
>>> stringNumber = re.findall('([0-9]+)/', myString)
>>> stringNumber
['10']
This returns every number that is followed by a forward slash, but as a list of strings. If you want integers, map int over the list, then turn the result into a list again.
>>> intNumber = list(map(int, stringNumber))
>>> intNumber
[10]