I have a column in a dataframe like this:
npt2$name
# [1] "Andreas Groll, M.D."
# [2] ""
# [3] "Pan-Chyr Yang, PHD"
# [4] "Suh-Fang Jeng, Sc.D"
# [5
Here's a variant that removes the extra ", " too. It does not require toupper
either, but if you want case-insensitive matching, just pass ignore.case=TRUE
to gsub.
test <- c("Andreas Groll, M.D.",
          "",
          "Pan-Chyr Yang, PHD",
          "Suh-Fang Jeng, Sc.D",
          "Peter S Sebel, MB BS, PhD Chantal Kerssens, PhD",
          "Lawrence Currie, MD")
# Drop an optional comma, any spaces, then MD, M.D., PhD or PHD
gsub(",? *(MD|M\\.D\\.|P[hH]D)", "", test)
#[1] "Andreas Groll" ""
#[3] "Pan-Chyr Yang" "Suh-Fang Jeng, Sc.D"
#[5] "Peter S Sebel, MB BS Chantal Kerssens" "Lawrence Currie"
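For the ignore.case=TRUE route mentioned above, here is a minimal sketch. The vectors x and y and the lower-case "md" and "phd" entries are made up for illustration and are not from the question. With case ignored, a single PhD alternative is enough in place of P[hH]D.

```r
# Illustrative input (not the asker's data): titles in mixed case
x <- c("Andreas Groll, M.D.",
       "Pan-Chyr Yang, phd",
       "Suh-Fang Jeng, Sc.D",
       "Lawrence Currie, md")

# Same pattern idea as above, but matched case-insensitively
cleaned <- gsub(",? *(MD|M\\.D\\.|PhD)", "", x, ignore.case = TRUE)
print(cleaned)
# [1] "Andreas Groll"       "Pan-Chyr Yang"       "Suh-Fang Jeng, Sc.D"
# [4] "Lawrence Currie"

# Caveat: the comma and spaces are optional, so "md" inside a name matches too
y <- gsub(",? *(MD|M\\.D\\.|PhD)", "", "Gene Amdahl, PhD", ignore.case = TRUE)
print(y)
# [1] "Gene Aahl"
```

As the last call shows, the pattern is not tied to the end of a name or to a preceding comma, so it is worth checking the result against real data before relying on the case-insensitive version.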