Regex - find anagrams and sub-anagrams

守給你的承諾、 submitted on 2019-12-07 04:18:59

Question


I have a pool of characters and I want to match all the words which are anagrams of those chars or of a subset of those chars using a regular expression.

Example: given the string "ACNE" the regex should give me these results:

  • ACNE [T]
  • CENA [T]
  • CAN [T]
  • CAAN [F]
  • CANEN [F]

I've tried \b[acne]{1,4}\b, but it accepts words that repeat a character. How can I make sure each character is used at most once?


Answer 1:


The sub-anagrams of the word "acne" are the words that

  • consist only of the letters acne
  • do not contain a more than once
  • do not contain c more than once
  • do not contain n more than once
  • do not contain e more than once

Compiling this into a regex:

^(?!.*a.*a)(?!.*c.*c)(?!.*n.*n)(?!.*e.*e)[acne]*$

Test: regexpal
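
As a quick check, the regex above can be run against the five examples from the question. This is only a minimal sketch: the class name SubAnagramCheck and the helper isSubAnagram are made up for illustration, and the CASE_INSENSITIVE flag is an added assumption so that the upper-case examples match the lower-case pattern.

```java
import java.util.regex.Pattern;

class SubAnagramCheck {
    // Regex from the answer above. CASE_INSENSITIVE is an assumption added here
    // so that "ACNE" is treated the same as "acne".
    static final Pattern SUB_ANAGRAM = Pattern.compile(
            "^(?!.*a.*a)(?!.*c.*c)(?!.*n.*n)(?!.*e.*e)[acne]*$",
            Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true if candidate uses only a, c, n, e, each at most once.
    static boolean isSubAnagram(String candidate) {
        return SUB_ANAGRAM.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        for (String w : new String[] {"ACNE", "CENA", "CAN", "CAAN", "CANEN"}) {
            System.out.println(w + " -> " + isSubAnagram(w));
        }
    }
}
```

Because the pattern ends in [acne]*, the empty string also matches; replacing * with + requires at least one letter.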

Alternatively, since "acne" does not contain any letter more than once, the sub-anagrams of the word "acne" are the words that

  • consist only of the letters acne
  • do not contain any letter more than once.

Compiling this into a regex:

^(?!.*(.).*\1)[acne]*$

Test: regexpal
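
The ^ and $ anchors test a single word. To pull sub-anagrams out of running text, closer to the word-boundary attempt in the question, one possible adaptation is sketched below. It is not part of the answer: it swaps .* for \w* so the lookahead cannot look past the current word, and the class name SubAnagramFinder and method find are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SubAnagramFinder {
    // \b ... \b selects whole words. Inside the lookahead, \w* (instead of .*)
    // keeps the repeated-letter check confined to the current word.
    static final Pattern WORD = Pattern.compile(
            "\\b(?!\\w*(\\w)\\w*\\1)[acne]+\\b",
            Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: collects every whole word in text that is a sub-anagram of "acne".
    static List<String> find(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(find("ACNE CENA CAN CAAN CANEN"));
    }
}
```

For the question's five words this should yield ACNE, CENA and CAN, and skip CAAN and CANEN.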

Note: the sub-anagrams of the word "magmoid" can be matched as

^(?!.*([agoid]).*\1)(?!(.*m){3})[magoid]*$

(do not contain any of agoid more than once, and do not contain m more than twice)
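
The same idea extends to any pool: a letter that occurs k times in the pool must not occur k + 1 times in the candidate. The sketch below generates one lookahead per distinct letter rather than grouping the single-occurrence letters behind a backreference as the answer does. The class name SubAnagramPattern and method forPool are made up for illustration, and the pool is assumed to contain only ASCII letters so that nothing needs escaping.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

class SubAnagramPattern {
    // Hypothetical helper: builds a sub-anagram regex for an arbitrary pool of letters.
    static Pattern forPool(String pool) {
        // Assumption: letters only, so they can go into the regex unescaped.
        if (!pool.matches("[A-Za-z]+")) {
            throw new IllegalArgumentException("pool must contain letters only");
        }
        // Count how often each letter occurs in the pool.
        Map<Character, Integer> counts = new TreeMap<>();
        for (char ch : pool.toLowerCase(Locale.ROOT).toCharArray()) {
            counts.merge(ch, 1, Integer::sum);
        }
        StringBuilder lookaheads = new StringBuilder();
        StringBuilder letters = new StringBuilder();
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            // A letter with count k must not appear k + 1 times: (?!(?:.*x){k+1})
            lookaheads.append("(?!(?:.*").append(e.getKey()).append("){")
                      .append(e.getValue() + 1).append("})");
            letters.append(e.getKey());
        }
        return Pattern.compile("^" + lookaheads + "[" + letters + "]*$",
                Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        Pattern p = forPool("magmoid");
        for (String w : new String[] {"dogma", "mom", "magma", "mmm"}) {
            System.out.println(w + " -> " + p.matcher(w).matches());
        }
    }
}
```

With the pool "magmoid", "dogma" and "mom" should be accepted, while "magma" (two a) and "mmm" (three m) should be rejected.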




Answer 2:


CODE TO FIND THE NUMBER OF ANAGRAMS OF A WORD IN A GIVEN STRING USING A REGULAR EXPRESSION

The code below comes from the following repository of Java, data structure, algorithm, and company interview question practice material. Feel free to fork it and contribute.

https://github.com/arpans2112/techsqually-java8-best-practices/blob/master/src/com/techsqually/java/library/util/regularexpression/anagramStrings.java

package com.techsqually.java.library.util.regularexpression;

import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class anagramStrings {

    public static void main(String[] args) {
        int count = findAnagramsInAGivenStrings("arpan", "Hi arpan Aarpn we are testing rapan rranp anagram");
        System.out.println(count); // prints 3
    }

    /**
     * Counts the words in a string that are anagrams of a given word.
     * Two words are anagrams when they contain the same characters with the
     * same counts, in any order, e.g. "arpan" and "rapan". Case is ignored.
     *
     * For the example in main the result is 3: "arpan", "Aarpn" and "rapan"
     * are anagrams of "arpan"; "rranp" is not.
     *
     * @param word the word whose anagrams are counted; assumed to consist of
     *             letters only, because it is placed unescaped in a character class
     * @param givenString the text in which to look for anagrams of word
     * @return the number of anagrams of word found in givenString
     */
    public static int findAnagramsInAGivenStrings(String word, String givenString) {

        word = word.toLowerCase();
        givenString = givenString.toLowerCase();

        // Sort the target letters once; every candidate is compared against this.
        char[] givenWordArray = word.toCharArray();
        Arrays.sort(givenWordArray);

        // Candidates: whole words of the right length built only from the word's letters.
        // The \b anchors keep the pattern from matching inside a longer word.
        Matcher matcher = Pattern.compile("\\b[" + word + "]{" + word.length() + "}\\b").matcher(givenString);

        int count = 0;
        while (matcher.find()) {
            // The regex only checks the letter set; sorting verifies the letter counts.
            char[] matchWordArray = matcher.group().toCharArray();
            Arrays.sort(matchWordArray);

            if (Arrays.equals(matchWordArray, givenWordArray)) count++;
        }

        return count;
    }
}


Source: https://stackoverflow.com/questions/14561614/regex-find-anagrams-and-sub-anagrams
