R regex - extract words beginning with @ symbol


太阳男子 2021-01-18 04:09

I'm trying to extract Twitter handles from tweets using R's stringr package. For example, suppose I want to get all words in a vector that begin with "A". I can do this

3 Answers

傲寒 (original poster)

2021-01-18 04:28
It looks like you probably mean:
```
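library(stringr)  # provides str_extract_all()

# "@" must be at the start of the string or preceded by whitespace
str_extract_all(c("h@i", "hi @hello @me", "@twitter"), "(?<=^|\\s)@[^\\s]+")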
# [[1]]
# character(0)
# [[2]]
# [1] "@hello" "@me" 
# [[3]]
# [1] "@twitter"
```
The \b in a regular expression is a word boundary. It occurs "Between two characters in the string, where one is a word character and the other is not a word character." Since a space and "@" are both non-word characters, there is no boundary before the "@". The same holds at the start of the string: \b matches there only if the first character is a word character, and "@" is not.
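To make the boundary behaviour concrete, here is a small sketch. The pattern \b@\w+ is only my guess at the kind of regex the question was using, since the original attempt is not shown. Base R's regmatches() and gregexpr() with perl = TRUE are used so the snippet runs without extra packages; that choice is mine, not the answer's.

```r
x <- c("h@i", "hi @hello @me", "@twitter")

# Hypothetical first attempt: require a word boundary before "@"
word_boundary <- regmatches(x, gregexpr("\\b@\\w+", x, perl = TRUE))

print(word_boundary[[1]])  # "@i": "h" is a word character, so a boundary sits before "@"
print(word_boundary[[2]])  # character(0): space and "@" are both non-word characters
print(word_boundary[[3]])  # character(0): no boundary before a leading "@"
```

So \b before "@" matches exactly the case that should be rejected ("h@i") and misses the two cases that should be kept.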

With this revision, the lookbehind (?<=^|\s) requires the "@" to sit either at the start of the string or immediately after a whitespace character.
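As a variation that is not part of the answer, the same condition can be written as a negative lookbehind, (?<!\S), meaning "not preceded by a non-whitespace character". The sketch below also compares \S+ with \w+ for the handle itself, on the assumption that handles consist only of letters, digits, and underscores. The fourth test string is made up for illustration.

```r
x <- c("h@i", "hi @hello @me", "@twitter", "thanks @bob, see you")

# (?<!\\S) succeeds at the start of the string or right after whitespace
loose <- regmatches(x, gregexpr("(?<!\\S)@\\S+", x, perl = TRUE))

# \\w+ stops at the first character that is not a letter, digit, or underscore
strict <- regmatches(x, gregexpr("(?<!\\S)@\\w+", x, perl = TRUE))

print(loose[[2]])   # "@hello" "@me"
print(loose[[4]])   # "@bob,"  (trailing comma is kept)
print(strict[[4]])  # "@bob"
```

Which variant fits depends on the data: \S+ keeps everything up to the next whitespace, while \w+ drops trailing punctuation but would also cut a handle short at any character outside that set.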