I have a pool of characters and I want to match all the words which are anagrams of those chars or of a subset of those chars using a regular expression.
Example: given the string "ACNE" the regex should give me these results:
- ACNE [T]
- CENA [T]
- CAN [T]
- CAAN [F]
- CANEN [F]
I've tried this solution: \b[acne]{1,4}\b
but it accepts multiple repetitions of a single char.
What can I do to take each char at most one time?
The sub-anagrams of the word "acne" are the words that
- consist only of the letters acne
- do not contain a more than once
- do not contain c more than once
- do not contain n more than once
- do not contain e more than once
Compiling this into a regex:
^(?!.*a.*a)(?!.*c.*c)(?!.*n.*n)(?!.*e.*e)[acne]*$
Test: regexpal
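To check the pattern against the examples from the question, here is a minimal Java sketch. The class name SubAnagramLookahead and the method isSubAnagram are made up for illustration, and the CASE_INSENSITIVE flag is an assumption added so that the uppercase examples match.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class SubAnagramLookahead {
    // One negative lookahead per letter forbids a second occurrence of that letter;
    // [acne]* then restricts the word to the pool itself.
    static final Pattern ACNE = Pattern.compile(
            "^(?!.*a.*a)(?!.*c.*c)(?!.*n.*n)(?!.*e.*e)[acne]*$",
            Pattern.CASE_INSENSITIVE); // assumption: case should be ignored

    static boolean isSubAnagram(String candidate) {
        // matches() requires the whole input to fit the pattern.
        return ACNE.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        for (String w : new String[] {"ACNE", "CENA", "CAN", "CAAN", "CANEN"}) {
            System.out.println(w + " -> " + isSubAnagram(w));
        }
    }
}
```

Because the pattern ends in [acne]*, the empty string is accepted as well; replacing * with + would reject it.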
Alternatively, since "acne" does not contain any letter more than once, the sub-anagrams of the word "acne" are the words that
- consist only of the letters acne
- do not contain any letter more than once.
Compiling this into a regex:
^(?!.*(.).*\1)[acne]*$
Test: regexpal
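The backreference version can be tried the same way. This is only a sketch: the names SubAnagramBackreference and isSubAnagram are hypothetical, and lowercasing the input before matching is an assumption, chosen so the uppercase examples from the question can be tested.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class SubAnagramBackreference {
    // (.) captures any character and \1 looks for the same character again later,
    // so the negative lookahead rejects every word with a repeated letter.
    static final Pattern NO_REPEATS = Pattern.compile("^(?!.*(.).*\\1)[acne]*$");

    static boolean isSubAnagram(String candidate) {
        // Assumption: matching should ignore case, so normalize the input first.
        return NO_REPEATS.matcher(candidate.toLowerCase(Locale.ROOT)).matches();
    }

    public static void main(String[] args) {
        for (String w : new String[] {"ACNE", "CENA", "CAN", "CAAN", "CANEN"}) {
            System.out.println(w + " -> " + isSubAnagram(w));
        }
    }
}
```

This form only works when the pool itself has no repeated letters, which is why it is offered as an alternative for "acne" specifically.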
Note: the sub-anagrams of the word "magmoid" can be matched as
^(?!.*([agoid]).*\1)(?!(.*m){3})[magoid]*$
(do not contain any of agoid more than once, and do not contain m more than twice)
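The same idea can be generalized to any pool by counting each letter and forbidding one occurrence more than the pool provides. The sketch below is one possible way to do that, not the answer's own code: the class SubAnagramPatternBuilder and the methods forPool and isSubAnagram are hypothetical names, the pool is assumed to consist only of the letters a-z, and it emits one lookahead per distinct letter instead of the combined backreference form shown above.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class SubAnagramPatternBuilder {

    // Builds a pattern that accepts exactly the sub-anagrams of the pool.
    static Pattern forPool(String pool) {
        String letters = pool.toLowerCase(Locale.ROOT);
        // Assumption: only a-z is supported, so nothing needs regex escaping.
        if (!letters.matches("[a-z]+")) {
            throw new IllegalArgumentException("pool must contain only the letters a-z");
        }
        // Count how often each letter occurs in the pool, keeping first-seen order.
        Map<Character, Integer> counts = new LinkedHashMap<>();
        for (char c : letters.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        StringBuilder regex = new StringBuilder("^");
        StringBuilder allowed = new StringBuilder();
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            // A letter available k times must not occur k + 1 times.
            regex.append("(?!(?:.*").append(e.getKey()).append("){")
                 .append(e.getValue() + 1).append("})");
            allowed.append(e.getKey());
        }
        regex.append('[').append(allowed).append("]*$");
        return Pattern.compile(regex.toString());
    }

    static boolean isSubAnagram(Pattern pattern, String candidate) {
        return pattern.matcher(candidate.toLowerCase(Locale.ROOT)).matches();
    }

    public static void main(String[] args) {
        Pattern magmoid = forPool("magmoid");
        System.out.println(magmoid.pattern());
        for (String w : new String[] {"dogma", "mom", "magma", "mmm"}) {
            System.out.println(w + " -> " + isSubAnagram(magmoid, w));
        }
    }
}
```

For "magmoid" this yields (?!(?:.*m){3}) for the doubled m and a {2} lookahead for each of the other letters, which is equivalent to the hand-written pattern above.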
CODE TO FIND THE NUMBER OF ANAGRAMS OF A WORD IN A GIVEN STRING USING A REGULAR EXPRESSION
package com.techsqually.java.library.util.regularexpression;

import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class anagramStrings {

    public static void main(String[] args) {
        int count = findAnagramsInAGivenStrings("arpan", "Hi arpan Aarpn we are testing rapan rranp anagram");
        System.out.println(count); // prints 3
    }

    /**
     * Counts the anagrams of a word in a given string, ignoring case.
     * Two words are anagrams when every character occurs the same number of
     * times in both, in any order, e.g. "arpan" and "rapan".
     *
     * For the example in main the result is 3: "arpan", "Aarpn" and "rapan".
     *
     * @param word        the word whose anagrams are counted (assumed to contain letters only)
     * @param givenString the string that is searched for anagrams of word
     * @return the number of anagrams of word found in givenString
     */
    public static int findAnagramsInAGivenStrings(String word, String givenString) {
        word = word.toLowerCase();
        givenString = givenString.toLowerCase();

        // Sort the letters of the target word once; every candidate is compared against this.
        char[] givenWordArray = word.toCharArray();
        Arrays.sort(givenWordArray);

        // The regex only pre-filters: whole words of the right length built from the
        // letters of word. The \b anchors keep it from matching inside longer words.
        Matcher matcher = Pattern
                .compile("\\b[" + word + "]{" + word.length() + "}\\b")
                .matcher(givenString);

        int count = 0;
        while (matcher.find()) {
            // The sorted comparison verifies the exact letter counts.
            char[] matchWordArray = matcher.group().toCharArray();
            Arrays.sort(matchWordArray);
            if (Arrays.equals(matchWordArray, givenWordArray)) count++;
        }
        return count;
    }
}
Source: https://stackoverflow.com/questions/14561614/regex-find-anagrams-and-sub-anagrams