Case:
I want to match line 1 (only has 'hello' once!) DO NOT w
A simple option is this (using the multiline flag and not dot-all):
^(?!.*\bhello\b.*\bhello\b).*\bhello\b.*$
The negative lookahead first checks that 'hello' doesn't appear twice on the line; the rest of the pattern then checks that it appears at least once.
There are other ways to check for the same thing, but I think this one is pretty simple.
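As a rough illustration, here is how that pattern might be applied in Python. The sample lines are made up for the example, and `ONCE` is just a name chosen here; `re.MULTILINE` plays the role of the multiline flag, and `re.DOTALL` is deliberately left off.

```python
import re

# Pattern from the answer above. re.MULTILINE makes ^ and $ anchor at each line;
# re.DOTALL is not set, so "." never crosses a newline.
ONCE = re.compile(r"^(?!.*\bhello\b.*\bhello\b).*\bhello\b.*$", re.MULTILINE)

# Hypothetical sample input, one case per line.
text = "\n".join([
    "hello world",         # exactly one "hello": should match
    "hello and hello",     # two occurrences: rejected by the lookahead
    "no greeting here",    # zero occurrences: rejected by the rest of the pattern
    "othello says hello",  # "othello" is not a whole word, so "hello" counts once
])

matches = ONCE.findall(text)
print(matches)
```

Only the first and last sample lines should come back, since each contains the whole word exactly once.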
Of course, you can simply match \bhello\b and count the number of matches.
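A minimal sketch of the counting approach in Python, assuming you process one line at a time. `count_word` is a hypothetical helper name, and the sample strings are only illustrative.

```python
import re

def count_word(line: str, word: str = "hello") -> int:
    """Count whole-word occurrences of `word` in `line`."""
    # re.escape guards against regex metacharacters in `word`.
    return len(re.findall(rf"\b{re.escape(word)}\b", line))

# "ehello" and "hellot" are not whole-word matches, so they are not counted.
print(count_word("ehello goodbye hellot hello goodbye"))  # 1
print(count_word("ehello goodbye hello hello goodbye"))   # 2
```

A line then qualifies when `count_word(line) == 1`, which keeps the regex itself trivial and moves the "exactly once" logic into ordinary code.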
Since you're only concerned with words (i.e., tokens separated by whitespace), you can just split on whitespace and count how often "hello" appears. Since you didn't mention a language, here's an implementation in Perl:
use strict;
use warnings;

my $a1 = "ehello goodbye hellot hello goodbye";
my $a2 = "ehello goodbye hello hello goodbye";

# Split each string into whitespace-separated tokens.
my @arr1 = split(/\s+/, $a1);
my @arr2 = split(/\s+/, $a2);

# Count the tokens that are exactly "hello".
my $num_hello1 = scalar(grep { $_ eq "hello" } @arr1);
my $num_hello2 = scalar(grep { $_ eq "hello" } @arr2);

print "$num_hello1, $num_hello2\n";
The output is
1, 2
A generic regex, which matches only when no word in the string is repeated, would be:
^(?:\b(\w+)\b\W*(?!.*?\b\1\b))*\z
Although it could be cleaner to invert the result of this match:
\b(\w+)\b(?=.*?\b\1\b)
This works by matching a word and capturing it, then using a lookahead with a backreference to check that the same word does (or, in the first pattern, does not) appear again later in the string.
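To make the two patterns concrete, here is a small Python sketch that runs both and shows they agree on a few sample strings. The function names are hypothetical, and the end-of-string anchor is written `\Z` here because that is how Python's `re` module spells it.

```python
import re

# Finds a word that appears again later in the same string.
REPEATED = re.compile(r"\b(\w+)\b(?=.*?\b\1\b)")

# Direct form: every word must not be followed by another copy of itself.
# \Z is Python's spelling of the end-of-string anchor.
NO_REPEATS = re.compile(r"^(?:\b(\w+)\b\W*(?!.*?\b\1\b))*\Z")

def has_repeated_word(s: str) -> bool:
    """True if some whole word occurs more than once in `s`."""
    return REPEATED.search(s) is not None

def all_words_unique(s: str) -> bool:
    """True if the direct pattern accepts `s` (no word repeated)."""
    return NO_REPEATS.match(s) is not None

for sample in ["hello world", "hello world hello", "hello hellot"]:
    print(sample, has_repeated_word(sample), all_words_unique(sample))
```

Note that the direct pattern expects the string to begin with a word character, so for input with leading whitespace or punctuation, inverting the second pattern is the more forgiving option.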