I want to find words that appear after a keyword (specified and searched by me) and print out the result. I know that i am suppose to use regex to do it, and i tried it out
Instead of using regexes you could just (for example) separate your string with str.partition(separator) like this:
mystring = "hi my name is ryan, and i am new to python and would like to learn more"
keyword = 'name'
before_keyword, keyword, after_keyword = mystring.partition(keyword)
>>> before_keyword
'hi my '
>>> keyword
'name'
>>> after_keyword
' is ryan, and i am new to python and would like to learn more'
You have to deal with the needless whitespaces separately, though.
This will work out for u : work name\s\w+\s(\w+)
>>> s = 'hi my name is ryan, and i am new to python and would like to learn more'
>>> m = re.search('name\s\w+\s(\w+)',s)
>>> m.group(0)
'name is ryan'
>>>> m.group(1)
'ryan'
Your example will not work, but as I understand the idea:
regexp = re.compile("name(.*)$")
print regexp.search(s).group(1)
# prints " is ryan, and i am new to python and would like to learn more"
This will print all after "name" and till end of the line.
import re
s = "hi my name is ryan, and i am new to python and would like to learn more"
m = re.search("^name: (\w+)", s)
print m.group(1)
Without using regex, you can
strip punctuation (consider making everything single case, including search term)
split your text into individual words
find index of searched word
get word from array (index + 1
for word after, index - 1
for word before )
Code snippet:
import string
s = 'hi my name is ryan, and i am new to python and would like to learn more'
t = 'name'
i = s.translate(string.maketrans("",""), string.punctuation).split().index(t)
print s.split()[i+1]
>> is
For multiple occurences, you need to save multiple indices:
import string
s = 'hi my NAME is ryan, and i am new to NAME python and would like to learn more'
t = 'NAME'
il = [i for i, x in enumerate(s.translate(string.maketrans("",""), string.punctuation).split()) if x == t]
print [s.split()[x+1] for x in il]
>> ['is', 'python']
What you have used regarding your output:
re.search("name (\w+)", s)
What you have to use (match all):
re.search("name (.*)", s)