I'm trying to extract Twitter handles from tweets using R's stringr package. For example, suppose I want to get all words in a vector that begin with "A". I can do this
The answer above should suffice. This will remove the @ symbol in case you are trying to get the users' names only.
library(stringr)
str_extract_all(c("@tweeter tweet", "h@is", "tweet @tweeter2"), "(?<=\\B\\@)[^\\s]+")
[[1]]
[1] "tweeter"
[[2]]
character(0)
[[3]]
[1] "tweeter2"
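If a handle can be followed by punctuation, [^\\s]+ will keep that punctuation in the match. Here is a minimal sketch of one way around it, assuming handles consist only of letters, digits, and underscores so that \\w+ covers them. The sample tweets and the variable names are made up for illustration.

```r
library(stringr)

# Made-up sample data: two handles are followed by punctuation
tweets <- c("@tweeter, nice tweet", "h@is", "cc @tweeter2!")

# Same lookbehind as above, but \\w+ stops at the first non-word character
handles <- str_extract_all(tweets, "(?<=\\B@)\\w+")

# Flatten the per-tweet list into one character vector
unlist(handles)
# [1] "tweeter"  "tweeter2"
```

str_extract_all() returns one list element per input string, so unlist() is only appropriate when you no longer need to know which tweet each handle came from.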
While I am no expert with regex, the issue seems to be that the @ symbol is not a word character. \\b only matches the empty string between a word character and a non-word character, so it fails when @ is at the start of the string or follows a space: both sides of that position are non-word characters. \\B matches exactly those positions, which is why it is used above.
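To see the difference between the two anchors, this small sketch runs both on the same three strings as above. The variable names are only for illustration, and the patterns keep the @ in the match.

```r
library(stringr)

tweets <- c("@tweeter tweet", "h@is", "tweet @tweeter2")

# \\b needs a word character on one side of the position, so it only
# matches before the @ in "h@is" (between "h" and "@")
with_b <- str_extract(tweets, "\\b@\\w+")
with_b
# [1] NA    "@is" NA

# \\B matches where there is no such transition: at the start of the
# string before @, or between a space and @
with_B <- str_extract(tweets, "\\B@\\w+")
with_B
# [1] "@tweeter"  NA          "@tweeter2"
```

So \\b picks out exactly the case you want to exclude, and \\B picks out the handles.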
Here are two great regex resources in case you hadn't seen them:
Stringr's Regex page, also available as a vignette:
vignette("regular-expressions", package = "stringr")