I have data where the location strings look as follows:
location <- c("xyz, sss, New Zealand", "USA", "Pris,France")
id <- c(1, 2, 3)
df <- data.frame(location, id)
I would like to extract the country name from the data. The tricky part is that if I extract just the last word, I get the right result for only one record (France).
library(stringr)
df$country<- word(df$location,-1)
Any ideas on how to extract the country from this data? The expected output is:
id  location               country
1   xyz, sss, New Zealand  New Zealand
2   USA                    USA
3   Pris,France            France
You can try sub, which removes everything up to and including the last comma, plus any spaces that follow it:
df$country <- sub('.*,\\s*', '', df$location)
df$country
#[1] "New Zealand" "USA" "France"
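If you need this in more than one place, the sub() call can be wrapped in a small helper. This is only a sketch: the name extract_country is made up for illustration, and the extra trimws() call (base R 3.2.0 or later) is my own addition to drop stray trailing spaces, which the regex alone leaves in place.

```r
# Hypothetical helper (the name is an assumption, not part of any package).
extract_country <- function(x) {
  # The greedy '.*' runs up to the LAST comma; '\\s*' eats the spaces after it.
  # Strings without a comma do not match and are returned unchanged.
  out <- sub(".*,\\s*", "", x)
  # Added assumption: also strip leading/trailing whitespace.
  trimws(out)
}

location <- c("xyz, sss, New Zealand", "USA", "Pris,France")
extract_country(location)
#[1] "New Zealand" "USA"         "France"
```

Because sub() leaves non-matching strings alone, single-field values such as "USA" pass through as they are.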
Or, with stringr:
library(stringr)
str_extract(df$location, '\\b[^,]+$')
#[1] "New Zealand" "USA" "France"
A stringi solution:
require(stringi)
location<- c("xyz, sss, New Zealand", "USA", "Pris,France")
stri_trim(stri_match_first_regex(location, "(^|,)([^,]*?)$")[,3])
## [1] "New Zealand" "USA" "France"
stri_trim removes unnecessary spaces before and after the country name.
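If you would rather avoid regular expressions altogether, the same idea (take the last comma-separated field, then trim it) can be sketched in base R with strsplit(). The helper name last_field is hypothetical, and returning NA for an element that yields no pieces is my own choice rather than anything from the answers above.

```r
# Hypothetical helper: last comma-separated field of each string, trimmed.
last_field <- function(x) {
  # fixed = TRUE treats "," as a literal separator, not a regex.
  parts <- strsplit(x, ",", fixed = TRUE)
  last <- vapply(
    parts,
    # Guard against elements that split into zero pieces.
    function(p) if (length(p) > 0) p[length(p)] else NA_character_,
    character(1)
  )
  trimws(last)
}

location <- c("xyz, sss, New Zealand", "USA", "Pris,France")
country <- last_field(location)
country
#[1] "New Zealand" "USA"         "France"
```

Note that strsplit() drops a trailing empty field, so a string ending in a comma returns the field before that comma.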
Source: https://stackoverflow.com/questions/31148828/extract-last-word-in-a-string-after-comma-if-there-are-multiple-words-else-the-f