Question
I have a column of names of the form "Hobs, Mr. jack", i.e. lastname, title. firstname. The title is one of four values: "Mr.", "Mrs.", "Miss.", "Master." How can I search each item in the column and return the title, so that I can store it in another column?
Name <- c("Hobs, Mr. jack","Hobs, Master. John","Hobs, Mrs. Nicole",........)
Desired output: a column "title" with values ("Mr", "Master", "Mrs", .....)
I have tried something like this:
f <- function(d) {
if (grep("Mr", d$title)) {
gsub("$Mr$", "Mr", d$title, ignore.case = T)
}
}
no success >.<
Answer 1:
Maybe something like this:
library(stringr)
> Name <- c("Hobs, Mr. jack","Hobs, Master. John","Hobs, Mrs. Nicole")
> str_extract(string = Name,pattern = "(Mr|Master|Mrs)\\.")
[1] "Mr." "Master." "Mrs."
A fancier regex might exclude the period up front, or you could remove the periods in a second step.
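Both variants mentioned above can be sketched in base R, in case stringr is not available. This is only an illustration under a few assumptions: "Miss" is added to the alternation because the question lists it, every name is assumed to contain one of the four titles, and the variable names title_lookahead and title_two_step are made up for this example.

```r
Name <- c("Hobs, Mr. jack", "Hobs, Master. John", "Hobs, Mrs. Nicole")

# Variant 1: the lookahead (?=\\.) requires a period after the title
# but keeps it out of the match. Lookaheads need perl = TRUE in base R.
# Note: regmatches() silently drops elements that have no match.
m <- regexpr("(Mr|Mrs|Miss|Master)(?=\\.)", Name, perl = TRUE)
title_lookahead <- regmatches(Name, m)

# Variant 2: match the title together with its period,
# then strip the trailing period in a second step.
m2 <- regexpr("(Mr|Mrs|Miss|Master)\\.", Name)
title_two_step <- sub("\\.$", "", regmatches(Name, m2))

print(title_lookahead)
print(title_two_step)
```

Because regmatches() returns only the matched elements, the result can be shorter than Name when some rows lack a title. In that case it is safer to check which(m == -1) before assigning the result to a column.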
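Answer 2:
Assuming the data frame is called df and the column is Name, the new column will be Title. The pattern deletes everything up to and including the ", " as well as everything from the first period onward, which leaves only the title.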
df$Title <- gsub('(.*, )|(\\..*)', '', df$Name)
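To see the one-liner at work, here is a small self-contained sketch. The data frame df is built from the three sample names in the question, so its contents are an assumption for illustration only.

```r
# Hypothetical data frame built from the sample names in the question
df <- data.frame(
  Name = c("Hobs, Mr. jack", "Hobs, Master. John", "Hobs, Mrs. Nicole"),
  stringsAsFactors = FALSE
)

# First alternative removes "lastname, "; second removes ". firstname"
df$Title <- gsub('(.*, )|(\\..*)', '', df$Name)

print(df)
```

This approach does not depend on a fixed list of titles, so it also picks up values such as "Miss". The flip side is that a name with an extra comma or an earlier period (for example an initial in the last name) would give a different result than intended.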
Source: https://stackoverflow.com/questions/34099119/how-can-i-extract-from-title-from-name-in-a-column