I'd like to remove all characters before a designated character or set of characters. For example:
intro = "<>I'm Tom."
Now I'd like to remove the <> before I'm (or, more specifically, before I).
import re
date_div = "Blah blah\nblah, Updated: Aug. 23, 2012 Blah blah Updated: Feb. 13, 2019"
up_to_word = ":"
rx_to_first = r'^.*?{}'.format(re.escape(up_to_word))
rx_to_last = r'^.*{}'.format(re.escape(up_to_word))
# (Dot.) In the default mode, this matches any character except a newline.
# If the DOTALL flag has been specified, this matches any character including a newline.
print("Remove all up to the first occurrence of the word including it:")
print(re.sub(rx_to_first, '', date_div, flags=re.DOTALL).strip())
print("Remove all up to the last occurrence of the word including it:")
print(re.sub(rx_to_last, '', date_div, flags=re.DOTALL).strip())
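When the delimiter is a plain string rather than a pattern, the same two results can also be sketched without a regex, using str.partition and str.rpartition. This is only an illustrative alternative to the answer above; the helper names after_first and after_last are made up for this example, and returning the input unchanged when the delimiter is missing is an assumed policy.

```python
def after_first(text, sep):
    """Return what follows the first occurrence of sep, or text unchanged if sep is absent."""
    head, found, tail = text.partition(sep)
    return tail if found else text


def after_last(text, sep):
    """Return what follows the last occurrence of sep, or text unchanged if sep is absent."""
    head, found, tail = text.rpartition(sep)
    return tail if found else text


date_div = "Blah blah\nblah, Updated: Aug. 23, 2012 Blah blah Updated: Feb. 13, 2019"
print(after_first(date_div, ":").strip())  # Aug. 23, 2012 Blah blah Updated: Feb. 13, 2019
print(after_last(date_div, ":").strip())   # Feb. 13, 2019
```

Unlike the regex version, this needs no escaping of the delimiter and no DOTALL flag, since newlines are treated like any other character.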
If you know the index of the first character you want to keep, you can use slice notation:
intro = intro[2:]
If you instead know which characters to remove, you can use the lstrip() method, which strips any leading characters that appear in the given set (not a literal prefix):
intro = intro.lstrip("<>")
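One caveat worth illustrating: lstrip() treats its argument as a set of characters, so it can remove more than an exact prefix would. The sketch below contrasts it with a small prefix-removal helper; the name remove_prefix and the sample strings are invented for this example (on Python 3.9 and newer, the built-in str.removeprefix() does the same job).

```python
def remove_prefix(text, prefix):
    """Remove prefix once if text starts with it; otherwise return text unchanged."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


intro = "<>I'm Tom."
print(intro.lstrip("<>"))            # I'm Tom.

# lstrip() keeps stripping while the leading character is '<' or '>'.
tagged = "<><<>>I'm Tom."
print(tagged.lstrip("<>"))           # I'm Tom.

# The helper removes only the exact leading "<>" and leaves the rest alone.
print(remove_prefix(tagged, "<>"))   # <<>>I'm Tom.
```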
intro = "These are unwanted characters <> I'm Tom"
indx = intro.find("I")  # position of the first 'I'
intro = intro[indx:]
print(intro)
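Note that find() returns -1 when the marker is missing, and slicing with intro[-1:] would then silently keep only the last character. A minimal sketch that guards against this follows; the function name drop_before is hypothetical, and returning the input unchanged when the marker is absent is an assumption about the desired behavior.

```python
def drop_before(text, marker):
    """Remove everything before the first occurrence of marker, keeping marker itself."""
    index = text.find(marker)
    if index == -1:
        # Marker not found: leave the string untouched rather than slicing from -1.
        return text
    return text[index:]


print(drop_before("These are unwanted characters <> I'm Tom", "I"))  # I'm Tom
print(drop_before("no marker here", "I"))                            # no marker here
```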
import re
intro = "<>I'm Tom."
intro = re.sub(r'<>I', 'I', intro)  # re.sub returns a new string, so assign the result
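text = "<>I'm Tom."  # named text rather than str, to avoid shadowing the built-in
temp = text.split("I", 1)  # split once, at the first "I"
temp[0] = temp[0].replace("<>", "")
text = "I".join(temp)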
Use re.sub. Just match all the characters up to I, then replace the matched characters with I.
re.sub(r'^.*?I', 'I', stri)
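A runnable variant of the call above, as a sketch: the variable stri is assumed to hold the input string, and the sample value is taken from the question. The second call swaps in a lookahead, (?=I), so the I is matched but not consumed and does not need to be put back by the replacement.

```python
import re

stri = "<>I'm Tom."  # assumed sample input

# Lazy match from the start of the string up to and including the first 'I',
# then restore the 'I' through the replacement string.
print(re.sub(r'^.*?I', 'I', stri))     # I'm Tom.

# Same result with a lookahead: the 'I' is not consumed, so replace with ''.
print(re.sub(r'^.*?(?=I)', '', stri))  # I'm Tom.
```

Without the DOTALL flag, the dot does not match newlines, so both patterns only work when the first I is on the first line of the string.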