For example, reassesses will match. It contains exactly 4 different characters: 'r', 'e', 'a' and 's'.
My attempt is: /^([a-z
Definitely works.
This should match only a string of length >= 4 that is composed of
exactly 4 distinct characters.
# ^(?=.*(.).*(?!\1)(.).*(?!\1|\2)(.).*(?!\1|\2|\3)(.))(?:\1|\2|\3|\4)+$
^
(?=
     .*
     ( . )
     .*
     (?! \1 )
     ( . )
     .*
     (?! \1 | \2 )
     ( . )
     .*
     (?! \1 | \2 | \3 )
     ( . )
)
(?: \1 | \2 | \3 | \4 )+
$
Perl test case:
if ("upepipipeu" =~ /^(?=.*(.).*(?!\1)(.).*(?!\1|\2)(.).*(?!\1|\2|\3)(.))(?:\1|\2|\3|\4)+$/)
{
print "unique chars: '$1' '$2' '$3' '$4'\n";
print "matched: '$&'\n";
}
Output >>
unique chars: 'i' 'p' 'e' 'u'
matched: 'upepipipeu'
Test case for @aliteralmind:
@Ary = ("aabbccdd", "dictionary", "reassess", "aaaa");
for( @Ary )
{
    if ("$_" =~ /^(?=.*(.).*(?!\1)(.).*(?!\1|\2)(.).*(?!\1|\2|\3)(.))(?:\1|\2|\3|\4)+$/)
    {
        print "unique chars: '$1' '$2' '$3' '$4'\n";
        print "matched: '$&'\n\n";
    }
    else
    {
        print "Failed-> '$_'\n\n";
    }
}
Output >>
unique chars: 'a' 'b' 'c' 'd'
matched: 'aabbccdd'
Failed-> 'dictionary'
unique chars: 'r' 'a' 'e' 's'
matched: 'reassess'
Failed-> 'aaaa'
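For anyone who wants to try the same lookahead pattern outside Perl, here is a minimal Java sketch. The class and method names (FourDistinctLookahead, hasExactlyFourDistinct) are made up for illustration, and it assumes that java.util.regex keeps the groups captured inside the lookahead, as Perl does. The only change to the pattern is that each backslash is doubled for the Java string literal.

```java
import java.util.regex.Pattern;

class FourDistinctLookahead {
    // Same pattern as in the Perl test case; backslashes are doubled
    // because this is a Java string literal.
    static final Pattern FOUR_DISTINCT = Pattern.compile(
        "^(?=.*(.).*(?!\\1)(.).*(?!\\1|\\2)(.).*(?!\\1|\\2|\\3)(.))(?:\\1|\\2|\\3|\\4)+$");

    // Hypothetical helper: true if the whole string matches the pattern.
    static boolean hasExactlyFourDistinct(String s) {
        return FOUR_DISTINCT.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Same inputs as the Perl test case above.
        for (String s : new String[] {"aabbccdd", "dictionary", "reassess", "aaaa"}) {
            System.out.println(s + ": " + hasExactlyFourDistinct(s));
        }
    }
}
```

The lookahead only has to find some four distinct characters; the final (?:\1|\2|\3|\4)+$ then rejects any string that contains a fifth one.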
Something like this:
^([a-z])\1*+([a-z])(?:\1|\2)*+([a-z])(?:\1|\2|\3)*+([a-z])(?:\1|\2|\3|\4)*$
The possessive quantifiers are essential in this pattern: they forbid backtracking, which prevents the next capturing group from matching a letter that has already been captured.
The possessive quantifier feature is available in Java (don't forget to double-escape the backreferences), but if you need to use the pattern in a language that doesn't have this feature, you can find several options to "translate" the pattern in my comment.
The above pattern is built to check a whole string, but if you want to find words in a larger string, you can use this (optionally with the case-insensitive option):
(?<![a-z])([a-z])\1*+([a-z])(?:\1|\2)*+([a-z])(?:\1|\2|\3)*+([a-z])(?:\1|\2|\3|\4)*(?![a-z])
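A rough sketch of how the two patterns above might be used from Java, with the backreferences double-escaped. The class and method names (FourDistinctPossessive, matchesWholeString, findWords) are hypothetical; only the patterns themselves come from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FourDistinctPossessive {
    // Whole-string pattern from the answer, backslashes doubled for Java.
    static final Pattern WHOLE_STRING = Pattern.compile(
        "^([a-z])\\1*+([a-z])(?:\\1|\\2)*+([a-z])(?:\\1|\\2|\\3)*+([a-z])(?:\\1|\\2|\\3|\\4)*$");

    // Word-in-text pattern from the answer, delimited by lookarounds.
    static final Pattern WORD_IN_TEXT = Pattern.compile(
        "(?<![a-z])([a-z])\\1*+([a-z])(?:\\1|\\2)*+([a-z])(?:\\1|\\2|\\3)*+([a-z])(?:\\1|\\2|\\3|\\4)*(?![a-z])");

    // Hypothetical helper: checks that the entire string has exactly 4 distinct letters.
    static boolean matchesWholeString(String s) {
        return WHOLE_STRING.matcher(s).matches();
    }

    // Hypothetical helper: collects every lowercase word with exactly 4 distinct letters.
    static List<String> findWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD_IN_TEXT.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(matchesWholeString("reassesses"));
        System.out.println(matchesWholeString("aaaa"));
        System.out.println(findWords("say reassesses loudly"));
    }
}
```

For mixed-case input, Pattern.CASE_INSENSITIVE can be passed as a second argument to Pattern.compile.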
As far as regex goes, this is a brain-buster. Here is a non-regex solution: a function that uses a map to keep track of unique characters, and returns false as soon as the number of unique characters exceeds the maximum allowed (true otherwise).
import java.util.Map;
import java.util.TreeMap;

/**
   <P>{@code java ExactlyFourDiffChars}</P>
 **/
public class ExactlyFourDiffChars  {
   public static final void main(String[] ignored)  {
      System.out.println("aabbccdd: " + hasMoreThanXUniqueChars(4, "aabbccdd"));
      System.out.println("dictionary: " + hasMoreThanXUniqueChars(4, "dictionary"));
      System.out.println("reassesses: " + hasMoreThanXUniqueChars(4, "reassesses"));
   }
   /**
      Despite its name, returns true when str contains at most
      maxAllowedChars unique characters, and false as soon as one
      more unique character is found.
    **/
   public static final boolean hasMoreThanXUniqueChars(int maxAllowedChars, String str)  {
      // The map is used only as a set of characters seen so far; the values are unused.
      Map<Character,Object> charMap = new TreeMap<Character,Object>();
      for(int i = 0; i < str.length(); i++)  {
         Character C = str.charAt(i);
         if(!charMap.containsKey(C))  {
            charMap.put(C, null);
            // A new unique character: fail once the allowance is used up.
            if(maxAllowedChars-- == 0)  {
               return false;
            }
         }
      }
      return true;
   }
}
Output:
[C:\java_code\]java ExactlyFourDiffChars
aabbccdd: true
dictionary: false
reassesses: true
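Note that the function above only enforces an upper bound, so a string such as "aaaa" would also return true. If the count has to be exactly four, a variation along these lines might work; DistinctCharCount and hasExactlyNDistinctChars are names invented for this sketch, and a HashSet is swapped in for the TreeMap.

```java
import java.util.HashSet;
import java.util.Set;

class DistinctCharCount {
    // Hypothetical helper: true only if str contains exactly n distinct characters.
    static boolean hasExactlyNDistinctChars(int n, String str) {
        Set<Character> seen = new HashSet<>();
        for (int i = 0; i < str.length(); i++) {
            seen.add(str.charAt(i));
            // Stop early once there are more than n distinct characters.
            if (seen.size() > n) {
                return false;
            }
        }
        // At most n were found; require that all n are present.
        return seen.size() == n;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"aabbccdd", "dictionary", "reassesses", "aaaa"}) {
            System.out.println(s + ": " + hasExactlyNDistinctChars(4, s));
        }
    }
}
```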
Try
^([a-z])\1*([a-z])(\1*\2*)*([a-z])(\1*\2*\4*)*([a-z])(\1*\2*\4*\6*)*$
Edit: to avoid matching strings with fewer than 4 unique characters (e.g. aaaa):
^([a-z])\1*(?!\1)([a-z])(\1*\2*)*(?!\1)(?!\2)([a-z])(\1*\2*\4*)*(?!\1)(?!\2)(?!\4)([a-z])(\1*\2*\4*\6*)*$