I have fetched some data from Twitter using Python. Now I want to preprocess it. How can I remove usernames if the tweet has a username between two words and there is no space am
Code
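import re
import numpy as np

def remove_pattern(input_txt, pattern):
    # Collect every substring that matches the pattern, then delete each one.
    r = re.findall(pattern, input_txt)
    for i in r:
        # Escape the matched text so it is removed literally, not re-read as a regex.
        input_txt = re.sub(re.escape(i), '', input_txt)
    return input_txt

combi['tidy_tweet'] = np.vectorize(remove_pattern)(combi['tweet'], r"@[\w]*")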
Result:-
Thanks :)
import re
Tweet = "Hello@username"
# Raw string so that \w reaches the regex engine unchanged.
Tweet = re.sub(r'@[\w]+', '', Tweet)
Building on @NegiBabu's solution: Twitter handles may contain only letters, digits, and underscores, all of which \w covers, so @[\w]+ is a better regex for this task than @[^\s]+. For example, given @app#le it matches only @app rather than swallowing the whole token.
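To make the difference concrete, here is a small sketch comparing the two patterns from this thread on a few sample strings. The names HANDLE_WORD, HANDLE_NONSPACE, strip_handles and the sample inputs are made up for illustration; they are not from the question.

```python
import re

# Hypothetical names for the two patterns discussed in this thread.
HANDLE_WORD = re.compile(r'@\w+')         # '@' followed by word characters only
HANDLE_NONSPACE = re.compile(r'@[^\s]+')  # '@' followed by anything up to whitespace

def strip_handles(text, pattern):
    """Return text with every match of pattern removed."""
    return pattern.sub('', text)

samples = ["Hello@username", "Hello@username world", "see @app#le now"]

for s in samples:
    # Show what each pattern leaves behind for the same input.
    print(repr(s), '->',
          repr(strip_handles(s, HANDLE_WORD)), '|',
          repr(strip_handles(s, HANDLE_NONSPACE)))
```

On "see @app#le now" the \w version stops at the #, leaving "#le" in place, while the [^\s] version removes everything up to the next space. Which one you want depends on whether trailing punctuation or hashtags glued to a handle should survive.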
import re
Tweet = "Hello@username"
# Raw string so that \s reaches the regex engine unchanged.
Tweet = re.sub(r'@[^\s]+', '', Tweet)
This code removes @username and leaves Hello untouched.
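If the handle sits between words that are separated by spaces, removing it leaves a double space behind. A minimal sketch of a helper that also tidies that up, assuming the @\w+ pattern is the one you settle on (remove_handles and the sample tweets are hypothetical):

```python
import re

HANDLE = re.compile(r'@\w+')  # assumed pattern; swap in @[^\s]+ if preferred

def remove_handles(text):
    """Delete @handles, then collapse the whitespace they leave behind."""
    without_handles = HANDLE.sub('', text)
    # split() with no arguments drops leading, trailing and repeated whitespace.
    return ' '.join(without_handles.split())

tweets = ["Hello@username", "thanks @bob  for the tip", "@alice @bob"]
cleaned = [remove_handles(t) for t in tweets]
print(cleaned)
```

For a DataFrame column like the one in the question, the same function could probably be applied with combi['tweet'].apply(remove_handles) instead of np.vectorize.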