Question
I'm trying to extract the characters before and after "/", with no success. The sentences look like this:
XXXX YYY ZZZ - AV HAHEHRS, 3061 - SDDW ASDA DDSF - SAO JOSE DOS CAMPOS / SP - CEP: 00000-000
Output should be
SAO JOSE DOS CAMPOS / SP
I'm trying str_extract(str, "- [a-zA-Z]{1,} / [a-zA-Z]{1,}")
but it only returns
CAMPOS / SP
Answer 1:
Your character class is missing the space, so it cannot match a name made of several words. Try:
str_extract(str, "- [a-zA-Z ]+ / [a-zA-Z ]+")
Note the space in the character class. Also, {1,} is the long form of +.
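The match will be "- SAO JOSE DOS CAMPOS / SP ", with the leading "- " and a trailing space, because the second character class also consumes the space before the next hyphen. You must remove both in a second step, or use a zero-width look-behind for the hyphen and trim the result (for example with str_trim):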
str_extract(str, "(?<=- )[a-zA-Z ]+ / [a-zA-Z ]+")
Look-behinds are supported by stringr's regex engine, and by base R's gregexpr when perl = TRUE is set.
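As a minimal sketch, the same look-behind pattern can be run in base R through gregexpr. The variable names (pattern, m, city) are illustrative and not part of the original answer, and the input is the sample sentence from the question.

```r
# Sample sentence from the question
str <- "XXXX YYY ZZZ - AV HAHEHRS, 3061 - SDDW ASDA DDSF - SAO JOSE DOS CAMPOS / SP - CEP: 00000-000"

# Look-behind skips the leading "- " without including it in the match
pattern <- "(?<=- )[a-zA-Z ]+ / [a-zA-Z ]+"

# perl = TRUE is needed; the default regex engine rejects look-behinds
m <- gregexpr(pattern, str, perl = TRUE)

# regmatches returns a list with one element per input string;
# trimws removes the trailing space captured by the second character class
city <- trimws(regmatches(str, m)[[1]])
print(city)
```

With stringr, wrapping the str_extract call in str_trim should give the same cleaned-up value.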
For the sake of completeness, you could do this without regex: split the input by '-', find the part that contains '/', and trim it. This might be faster than regex, too.
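The non-regex approach could look roughly like this in base R. This is a sketch under the assumption that the wanted part contains no hyphen of its own; the names parts and city are illustrative.

```r
# Sample sentence from the question
str <- "XXXX YYY ZZZ - AV HAHEHRS, 3061 - SDDW ASDA DDSF - SAO JOSE DOS CAMPOS / SP - CEP: 00000-000"

# Split on the literal hyphen; fixed = TRUE avoids regex matching entirely
parts <- strsplit(str, "-", fixed = TRUE)[[1]]

# Keep the part(s) containing a slash, then strip surrounding whitespace
city <- trimws(parts[grepl("/", parts, fixed = TRUE)])
print(city)
```

If city names can themselves contain hyphens, splitting on " - " (with the surrounding spaces) would be the safer delimiter.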
Source: https://stackoverflow.com/questions/48088845/extract-character-before-and-after