I am using R on Ubuntu and trying to go over a list of files. Some of them I need and some I don't.
I am trying to pick out the ones I need by finding the substring 'a' but not 'aa'.
You can use the following TRE regex:
^[^a]*a[^a]*$
It matches the start of the string (^), 0+ chars other than 'a' ([^a]*), an 'a', again 0+ non-'a's, and the end of the string ($). See this IDEONE demo:
a <- c("aca","cac","a", "abab", "ab-ab", "ab-cc-ab")
grep("^[^a]*a[^a]*$", a, value=TRUE)
## => [1] "cac" "a"
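Since the question is about file names, here is a minimal sketch of the same pattern passed to list.files() through its pattern argument. The directory and file names are hypothetical and are created in a temporary folder only so the example can run on its own.

```r
# Hypothetical file names, created in a throwaway temp directory
demo_dir <- tempfile("one_a_demo_")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir, c("run_a.txt", "run_aa.txt", "run_b.txt"))))

# list.files() applies the regex to each file name in the directory
found <- list.files(demo_dir, pattern = "^[^a]*a[^a]*$")
print(found)
## => [1] "run_a.txt"
```

Note that the pattern counts every 'a' in the whole file name, extension included, so a name such as "data.txt" would be rejected because it contains two.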
If you need to match words that contain exactly one 'a', and not two or more 'a's in any position, use this PCRE regex:
\b(?!\w*a\w*a)\w*a\w*\b
See this regex demo.
Explanation:
\b - word boundary
(?!\w*a\w*a) - a negative lookahead failing the match if there are 0+ word chars, an 'a', 0+ word chars and an 'a' again right after the word boundary
\w* - 0+ word chars
a - an 'a'
\w* - 0+ word chars
\b - trailing word boundary.
NOTE: Since \w matches letters, digits and underscores, you might want to change it to \p{L} or [^\W\d_] (only matches letters).
See this demo:
a <- c("aca","cac","a")
grep("\\b(?!\\w*a\\w*a)\\w*a\\w*\\b", a, perl=TRUE, value=TRUE)
## => [1] "cac" "a"
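The two patterns answer slightly different questions, which may matter when names contain separators such as hyphens. The sketch below runs both on the sample vector from the first demo. The variable names whole_string and per_word are only illustrative.

```r
x <- c("aca", "cac", "a", "abab", "ab-ab", "ab-cc-ab")

# Anchored TRE pattern: the entire string must contain exactly one 'a'
whole_string <- grepl("^[^a]*a[^a]*$", x)
## => FALSE TRUE TRUE FALSE FALSE FALSE

# PCRE pattern: at least one word (run of \w chars) contains exactly one 'a'
per_word <- grepl("\\b(?!\\w*a\\w*a)\\w*a\\w*\\b", x, perl = TRUE)
## => FALSE TRUE TRUE FALSE TRUE TRUE

print(data.frame(x, whole_string, per_word))
```

"ab-ab" and "ab-cc-ab" pass the word-level check because the hyphen ends each word, so every "ab" is tested separately. If the whole file name must contain a single 'a', the anchored pattern is probably the safer choice.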