I have a vector s of strings (or NAs), and would like to get a vector of the same length containing everything before the first occurrence of punctuation (.).
You can remove everything from the first dot onward (newlines included) with the following Perl-like regex:
s <- c("ABC1.2", "22A.2", NA)
gsub("[.][\\s\\S]*$", "", s, perl=TRUE)
## => [1] "ABC1" "22A" NA
The regex matches:
[.] - a literal dot
[\\s\\S]* - any characters, incl. a newline
$ - end of string.
Every match is replaced with the empty string "". As the regex engine analyzes the string from left to right, the first dot is matched with [.], and the greedy * quantifier applied to [\\s\\S] then matches everything up to the end of the string.
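To see the newline handling in action, here is a small sketch on made-up input (the vector s2 is an assumption for illustration, not part of the question). It uses sub rather than gsub, which should be equivalent here because a single match already runs from the first dot to the end of the string:

```r
# Hypothetical input: the first element contains a newline after the first dot
s2 <- c("ABC1.2\nline2.x", NA)

# [\\s\\S] matches any character, newlines included, so the whole tail is removed
res <- sub("[.][\\s\\S]*$", "", s2, perl = TRUE)
print(res)
```

The NA element stays NA, so the result has the same length as the input.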
If the strings contain no newlines, a simpler regex will do, [.].*$:
gsub("[.].*$", "", s)
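As a quick check of the simpler pattern, the sketch below runs it on a few made-up strings (the vector s3 is an assumption for illustration): one with several dots, one with no dot at all, and an NA.

```r
# Hypothetical input: multiple dots, no dot, and a missing value
s3 <- c("ABC1.2.3", "nodot", NA)

# Everything from the first dot onward is dropped; strings without a dot
# have no match and are returned unchanged
res3 <- sub("[.].*$", "", s3)
print(res3)
```

Only the part before the first dot is kept, dot-free strings pass through untouched, and the output has one element per input element.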