Question
I’m working through a permutation/anagram problem and wanted input on the most efficient way of checking. I’m doing this in Java land, and as such there is a library for EVERYTHING, including sorting. The first way of checking whether two strings are anagrams of each other is to check their lengths, sort them in some manner, then compare the sorted strings index by index. Code below:
private boolean validAnagram(String str, String pair) {
    if (str.length() != pair.length()) {
        return false;
    }
    char[] strArr = str.toCharArray();
    char[] pairArr = pair.toCharArray();
    Arrays.sort(strArr);
    str = new String(strArr);
    Arrays.sort(pairArr);
    pair = new String(pairArr);
    for (int i = 0; i < str.length(); i++) {
        if (str.charAt(i) != pair.charAt(i)) {
            return false;
        }
    }
    return true;
}
Alternatively, I figured it would be easier to compare the sums of the ASCII values and avoid a check on every possible character. Code below:
private boolean validAnagram(String str, String pair) {
    if (str.length() != pair.length()) {
        return false;
    }
    char[] strArr = str.toCharArray();
    char[] pairArr = pair.toCharArray();
    int strValue = 0;
    int pairValue = 0;
    for (int i = 0; i < strArr.length; i++) {
        strValue += (int) strArr[i];
        pairValue += (int) pairArr[i];
    }
    if (strValue != pairValue) {
        return false;
    }
    return true;
}
So, which is the better solution? I don’t know much about the sort that Arrays gives me, but it’s the more common answer when I look around the old internets. Makes me wonder if I’m missing something.
Answer 1:
There are several ways to check whether two strings are anagrams. Your question is which solution is better. Your first solution sorts both strings, and sorting generally costs O(n log n). Your second solution uses a single loop, which is O(n).
So on complexity alone, your second solution looks better than the first one. Note, however, that equal sums do not guarantee an anagram: "ad" and "bc" have the same length and the same sum (97 + 100 = 98 + 99 = 197), so the second solution returns true for them. Counting each character is also O(n) and avoids this problem.
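Before picking the sum-based version, it is worth checking whether equal sums really imply an anagram. Here is a minimal sketch of that check; the class name SumCheckDemo and the method name sumCheck are made up for illustration, and the method only mirrors the summing idea from the question.

```java
// Hypothetical demo class; sumCheck mirrors the sum-based idea from the question.
class SumCheckDemo {
    // Returns true when both strings have the same length and the same sum of char values.
    static boolean sumCheck(String str, String pair) {
        if (str.length() != pair.length()) {
            return false;
        }
        int strValue = 0;
        int pairValue = 0;
        for (int i = 0; i < str.length(); i++) {
            strValue += str.charAt(i);
            pairValue += pair.charAt(i);
        }
        return strValue == pairValue;
    }

    public static void main(String[] args) {
        // 'a' + 'd' = 97 + 100 = 197 and 'b' + 'c' = 98 + 99 = 197
        System.out.println(sumCheck("ad", "bc"));   // prints true, although these are not anagrams
        System.out.println(sumCheck("tea", "eat")); // prints true
    }
}
```

So the sum can only rule anagrams out (different sums mean no anagram); it cannot confirm one.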
One possible solution:
private boolean checkAnagram(String stringOne, String stringTwo) {
    char[] first = stringOne.toLowerCase().toCharArray();
    char[] second = stringTwo.toLowerCase().toCharArray();
    // if the lengths differ, the strings cannot be anagrams
    if (first.length != second.length)
        return false;
    // one counter per letter; assumes both strings contain only the letters a-z
    int[] counts = new int[26];
    for (int i = 0; i < first.length; i++) {
        counts[first[i] - 'a']++;
        counts[second[i] - 'a']--;
    }
    // every counter is back at zero only if the letter counts match
    for (int i = 0; i < 26; i++)
        if (counts[i] != 0)
            return false;
    return true;
}
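The 26-slot array above only works when every character is a letter a-z after lowercasing; anything else (digits, spaces, accented letters) would index outside the array. If that assumption does not hold, the same counting idea could be sketched with a HashMap instead. The class name CountingAnagram is hypothetical, and this version counts UTF-16 char values as they are, without lowercasing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name; same counting idea as above, but keyed by char in a map.
class CountingAnagram {
    static boolean isAnagram(String first, String second) {
        if (first.length() != second.length()) {
            return false;
        }
        Map<Character, Integer> counts = new HashMap<>();
        for (int i = 0; i < first.length(); i++) {
            // +1 for each char of the first string, -1 for each char of the second
            counts.merge(first.charAt(i), 1, Integer::sum);
            counts.merge(second.charAt(i), -1, Integer::sum);
        }
        // the strings are anagrams only if every counter ended at zero
        for (int count : counts.values()) {
            if (count != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("a1 b2", "2b 1a"));
        System.out.println(isAnagram("ad", "bc"));
    }
}
```

This stays O(n) on average, at the price of boxing and hashing overhead compared with the plain int array.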
Answer 2:
Here is a very simple implementation.
// requires: import java.util.regex.Pattern;
public boolean isAnagram(String strA, String strB) {
    // Clean the strings (remove white space and convert to lowercase)
    strA = strA.replaceAll("\\s+", "").toLowerCase();
    strB = strB.replaceAll("\\s+", "").toLowerCase();
    // For every char of strA, remove its first occurrence in strB
    for (int i = 0; i < strA.length(); i++) {
        if (strB.equals("")) return false; // strB is already empty: not an anagram
        strB = strB.replaceFirst(Pattern.quote("" + strA.charAt(i)), "");
    }
    // if strB is empty we have an anagram
    return strB.equals("");
}
And finally :
System.out.println(isAnagram("William Shakespeare", "I am a weakish speller")); // true
Answer 3:
This is a much simpler, easy-to-read solution I was able to compile...
static boolean isAnagram(String a, String b) {
    if (a.length() != b.length()) {
        return false;
    }
    char[] arr1 = a.toLowerCase().toCharArray();
    char[] arr2 = b.toLowerCase().toCharArray();
    Arrays.sort(arr1);
    Arrays.sort(arr2);
    return Arrays.equals(arr1, arr2);
}
Best, Justin
Answer 4:
I tried a few solutions using Sets, and made each one run 10 million times to test using your example array of:
private static String[] input = {"tea", "ate", "eat", "apple", "java", "vaja", "cut", "utc"};
Firstly, the method I used to call these algorithms:
public static void main(String[] args) {
    long startTime = System.currentTimeMillis();
    for (int x = 0; x < 10000000; x++) {
        Set<String> confirmedAnagrams = new HashSet<>();
        for (int i = 0; i < (input.length / 2) + 1; i++) {
            if (!confirmedAnagrams.contains(input[i])) {
                for (int j = i + 1; j < input.length; j++) {
                    if (isAnagrams1(input[i], input[j])) {
                        confirmedAnagrams.add(input[i]);
                        confirmedAnagrams.add(input[j]);
                    }
                }
            }
        }
        // output is assumed to be a static String[] field (its declaration is not shown)
        output = confirmedAnagrams.toArray(new String[confirmedAnagrams.size()]);
    }
    long endTime = System.currentTimeMillis();
    System.out.println("Total time: " + (endTime - startTime));
    System.out.println("Average time: " + ((endTime - startTime) / 10000000D));
}
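I then used algorithms based on a HashSet of characters. I add each character of both words to the HashSet, and if the size of the HashSet is not equal to the length of the initial words, the words are treated as not being anagrams. Note that this is only a heuristic: words with repeated letters can be misclassified.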
My algorithms and their runtimes:
Algorithm 1:
private static boolean isAnagrams1(String x, String y) {
    if (x.length() != y.length()) {
        return false;
    } else if (x.equals(y)) {
        return true;
    }
    Set<Character> anagramSet = new HashSet<>();
    for (int i = 0; i < x.length(); i++) {
        anagramSet.add(x.charAt(i));
        anagramSet.add(y.charAt(i));
    }
    // heuristic: the number of distinct characters must equal the word length
    return anagramSet.size() == x.length();
}
This has the runtime of:
Total time: 6914
Average time: 6.914E-4
Algorithm 2
private static boolean isAnagrams2(String x, String y) {
    if (x.length() != y.length()) {
        return false;
    } else if (x.equals(y)) {
        return true;
    }
    Set<Character> anagramSet = new HashSet<>();
    char[] xAr = x.toCharArray();
    char[] yAr = y.toCharArray();
    for (int i = 0; i < xAr.length; i++) {
        anagramSet.add(xAr[i]);
        anagramSet.add(yAr[i]);
    }
    // heuristic: the number of distinct characters must equal the word length
    return anagramSet.size() == x.length();
}
Has the runtime of:
Total time: 8752
Average time: 8.752E-4
Algorithm 3
For this algorithm, I decided to pass the Set in as a parameter, so that I only create it once for every cycle and clear it after each test.
private static boolean isAnagrams3(Set<Character> anagramSet, String x, String y) {
    if (x.length() != y.length()) {
        return false;
    } else if (x.equals(y)) {
        return true;
    }
    // the caller is responsible for clearing anagramSet between calls
    for (int i = 0; i < x.length(); i++) {
        anagramSet.add(x.charAt(i));
        anagramSet.add(y.charAt(i));
    }
    // heuristic: the number of distinct characters must equal the word length
    return anagramSet.size() == x.length();
}
Has the runtime of:
Total time: 8251
Average time: 8.251E-4
Algorithm 4
This algorithm is not mine; it belongs to Pratik Upacharya, who answered the question as well. I include it here in order to compare:
private static boolean isAnagrams4(String stringOne, String stringTwo) {
    char[] first = stringOne.toLowerCase().toCharArray();
    char[] second = stringTwo.toLowerCase().toCharArray();
    // if length of strings is not same
    if (first.length != second.length) {
        return false;
    }
    int[] counts = new int[26];
    for (int i = 0; i < first.length; i++) {
        counts[first[i] - 97]++;
        counts[second[i] - 97]--;
    }
    for (int i = 0; i < 26; i++) {
        if (counts[i] != 0) {
            return false;
        }
    }
    return true;
}
Has the runtime of:
Total time: 5707
Average time: 5.707E-4
Of course, these runtimes differ on every test run. Proper testing needs a larger example set, and maybe more iterations as well.
*Edited, as I made a mistake in my initial method. Pratik Upacharya's algorithm does seem to be the faster one.
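Since speed only matters once the answers are right, it may help to compare the set-size test with the counting approach on a few inputs. Below is a rough sketch under my own assumptions: the class SetHeuristicDemo and both method names are hypothetical, setSizeTest follows the rule as described in the text (the set size must equal the word length), and countTest assumes lowercase a-z input.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class comparing the set-size rule with per-letter counting.
class SetHeuristicDemo {
    // Set-size rule: collect the distinct characters of both words and
    // accept when their number equals the word length.
    static boolean setSizeTest(String x, String y) {
        if (x.length() != y.length()) {
            return false;
        } else if (x.equals(y)) {
            return true;
        }
        Set<Character> seen = new HashSet<>();
        for (int i = 0; i < x.length(); i++) {
            seen.add(x.charAt(i));
            seen.add(y.charAt(i));
        }
        return seen.size() == x.length();
    }

    // Reference check: one counter per letter, assuming lowercase a-z only.
    static boolean countTest(String x, String y) {
        if (x.length() != y.length()) {
            return false;
        }
        int[] counts = new int[26];
        for (int i = 0; i < x.length(); i++) {
            counts[x.charAt(i) - 'a']++;
            counts[y.charAt(i) - 'a']--;
        }
        for (int count : counts) {
            if (count != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[][] pairs = {{"tea", "eat"}, {"java", "vaja"}, {"abc", "aaa"}};
        for (String[] p : pairs) {
            System.out.println(p[0] + "/" + p[1]
                    + " set=" + setSizeTest(p[0], p[1])
                    + " count=" + countTest(p[0], p[1]));
        }
    }
}
```

The two checks agree on "tea"/"eat", but the set-size rule rejects "java"/"vaja" (only three distinct letters in a four-letter word) and accepts "abc"/"aaa", so any timing comparison between them should be read with that in mind.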
Answer 5:
The best solution depends on your objective: code size, memory footprint, or least computation.
A very cool solution with as little code as possible, using Java 8 streams. It is not the fastest, at O(n log n), and it is pretty memory inefficient:
import java.util.stream.Collectors;

public class Anagram {
    public static void main(String[] argc) {
        String str1 = "gody";
        String str2 = "dogy";
        // sort the characters of each string into a list and compare the lists
        boolean isAnagram =
            str1.chars().mapToObj(c -> (char) c).sorted().collect(Collectors.toList())
                .equals(str2.chars().mapToObj(c -> (char) c).sorted().collect(Collectors.toList()));
        System.out.println(isAnagram);
    }
}
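If least computation matters more than code size, a stream version that counts instead of sorting could look like the sketch below. This is my own variation, not part of the answer above: the class name StreamAnagram and the helper histogram are hypothetical, and it groups the char values with Collectors.groupingBy and Collectors.counting.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical class name; counts characters with streams instead of sorting them.
class StreamAnagram {
    // Maps each char value of s to the number of times it occurs.
    static Map<Integer, Long> histogram(String s) {
        return s.chars()
                .boxed()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    static boolean isAnagram(String a, String b) {
        // equal histograms mean the same characters with the same counts
        return a.length() == b.length() && histogram(a).equals(histogram(b));
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("gody", "dogy"));
    }
}
```

This avoids the sort, so it is O(n) on average, though the boxing and the two maps still cost memory.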
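Answer 6:
My solution. The time complexity is O(n^2) rather than O(n), because the search and the string rebuild inside the loop are each O(n):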
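public static boolean isAnagram(String str1, String str2) {
    if (str1.length() != str2.length()) {
        return false;
    }
    for (int i = 0; i < str1.length(); i++) {
        char ch = str1.charAt(i);
        int index = str2.indexOf(ch);
        if (index == -1)
            return false;
        // remove the matched character so it cannot be matched twice;
        // substring avoids the regex interpretation that replaceFirst applies
        str2 = str2.substring(0, index) + str2.substring(index + 1);
    }
    return true;
}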
Test case :
@Test
public void testIsPermutationTrue() {
    assertTrue(Anagram.isAnagram("abc", "cba"));
    assertTrue(Anagram.isAnagram("geeksforgeeks", "forgeeksgeeks"));
    assertTrue(Anagram.isAnagram("anagram", "margana"));
}
@Test
public void testIsPermutationFalse() {
    assertFalse(Anagram.isAnagram("abc", "caa"));
    assertFalse(Anagram.isAnagram("anagramm", "marganaa"));
}
Answer 7:
// a simple O(n^2) anagram check: match every character of str1
// against a not-yet-used character of str2
import java.util.*;
class Anagram {
    public static void main(String arg[]) {
        Scanner sc = new Scanner(System.in);
        String str1 = sc.nextLine();
        String str2 = sc.nextLine();
        int i, j;
        boolean flag = true;
        i = str1.length();
        j = str2.length();
        if (i == j) {
            // used[n] marks characters of str2 that are already matched
            boolean[] used = new boolean[j];
            for (int m = 0; m < i && flag; m++) {
                flag = false;
                for (int n = 0; n < j; n++) {
                    if (!used[n] && str1.charAt(m) == str2.charAt(n)) {
                        used[n] = true;
                        flag = true;
                        break;
                    }
                }
            }
        }
        else {
            flag = false;
        }
        if (flag)
            System.out.println("String is Anagram");
        else
            System.out.println("String is not Anagram");
    }
}
Answer 8:
A recruiter asked me to solve this problem recently. While studying the problem I came up with a solution that handles two types of anagram issues.
Issue 1: Determine if an anagram exists within a body of text.
Issue 2: Determine if a formal anagram exists within a body of text. In this case the anagram must be of the same size as the text you are comparing it against. In the former case, the two texts need not be the same size; one just needs to contain the other.
My approach was as follows:
Setup phase: First create an Anagram class. This just converts the text to a Map whose key is the character in question and whose value is the number of occurrences of that character. Building one map takes O(n) time. Since at most two maps are needed, the total work is about 2n steps, which is still O(n) because asymptotic notation drops constant factors.
Processing phase: All you need to do is loop through the smaller of the two Maps and look up each entry in the larger Map. If the entry does not exist, or if it exists but with a different occurrence count, the text fails the test to be an anagram.
Here is the loop that determines if we have an anagram or not:
boolean looking = true;
for (Anagram ele : smaller.values()) {
    Anagram you = larger.get(ele);
    if (you == null || you.getCount() != ele.getCount()) {
        looking = false;
        break;
    }
}
return looking;
Note that I create an ADT to contain the strings being processed. They are converted to a Map first.
Here is a snippet of the code that creates the Anagram object:
private void init(String teststring2) {
    StringBuilder sb = new StringBuilder(teststring2);
    for (int i = 0; i < sb.length(); i++) {
        Anagram a = new AnagramImpl(sb.charAt(i));
        Anagram tmp = map.putIfAbsent(a, a);
        if (tmp != null) {
            tmp.updateCount();
        }
    }
}
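For readers who want to try the two phases without the Anagram and AnagramImpl types (which are not shown in full here), a rough stand-in with a plain HashMap could look like this. Everything in it is my assumption rather than the original code: the class LetterCounts, the methods countLetters, matches and isFormalAnagram, and the choice of Map<Character, Integer> as the representation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Anagram ADT: a plain character -> count map.
class LetterCounts {
    // Setup phase: count how often each character occurs in the text.
    static Map<Character, Integer> countLetters(String text) {
        Map<Character, Integer> counts = new HashMap<>();
        for (int i = 0; i < text.length(); i++) {
            counts.merge(text.charAt(i), 1, Integer::sum);
        }
        return counts;
    }

    // Processing phase: every entry of the smaller map must exist in the
    // larger map with the same occurrence count.
    static boolean matches(Map<Character, Integer> smaller, Map<Character, Integer> larger) {
        for (Map.Entry<Character, Integer> entry : smaller.entrySet()) {
            Integer other = larger.get(entry.getKey());
            if (other == null || !other.equals(entry.getValue())) {
                return false;
            }
        }
        return true;
    }

    // Formal anagram: same text length, then the map comparison above.
    static boolean isFormalAnagram(String a, String b) {
        if (a.length() != b.length()) {
            return false;
        }
        Map<Character, Integer> first = countLetters(a);
        Map<Character, Integer> second = countLetters(b);
        return first.size() <= second.size() ? matches(first, second) : matches(second, first);
    }

    public static void main(String[] args) {
        System.out.println(isFormalAnagram("listen", "silent"));
    }
}
```

With equal text lengths, matching every entry of the smaller map is enough: the matched counts already add up to the full length, so the larger map cannot hold extra characters.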
Answer 9:
I came up with a solution that does not use a 26-element count array. Note that indexOf is itself a linear scan, so the loop is O(n^2) in the worst case rather than O(n). Check this out:
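// fragment: runs inside a loop over test cases; sc is a java.util.Scanner
StringBuffer a = new StringBuffer();
a.append(sc.next().toLowerCase());
StringBuffer b = new StringBuffer();
b.append(sc.next().toLowerCase());
if (a.length() != b.length())
{
    System.out.println("NO");
    continue;
}
int o = 0;
for (int i = 0; i < b.length(); i++)
{
    int index = a.indexOf(String.valueOf(b.charAt(i)));
    if (index < 0)
    {
        System.out.println("NO");
        o = 1; break;
    }
    // consume the match so that repeated letters are counted correctly
    a.deleteCharAt(index);
}
if (o == 0)
    System.out.println("Yes");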
Answer 10:
Consider using HashMap and Arrays.sort:
private static Map<String, String> getAnagrams(String[] data) {
    Map<String, String> anagrams = new HashMap<>();
    Map<String, String> results = new HashMap<>();
    for (int i = 0; i < data.length; i++) {
        char[] chars = data[i].toLowerCase().toCharArray();
        Arrays.sort(chars);
        String sorted = String.copyValueOf(chars);
        String item = anagrams.get(sorted);
        if (item != null) {
            anagrams.put(sorted, item + ", " + i);
            results.put(sorted, anagrams.get(sorted));
        } else {
            anagrams.put(sorted, String.valueOf(i));
        }
    }
    return results;
}
I like it because you traverse the array only once.
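The same single pass can also collect the words themselves instead of a comma-separated string of indexes. Here is a small sketch of that variation; the class name AnagramGroups and the method group are hypothetical, and it keeps every word, including those without a partner.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name; groups words by their sorted, lowercased letters.
class AnagramGroups {
    static Map<String, List<String>> group(String[] words) {
        // LinkedHashMap keeps the keys in first-seen order
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String word : words) {
            char[] chars = word.toLowerCase().toCharArray();
            Arrays.sort(chars);
            // words with the same sorted letters land in the same list
            groups.computeIfAbsent(new String(chars), key -> new ArrayList<>()).add(word);
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] input = {"tea", "ate", "eat", "apple", "java", "vaja", "cut", "utc"};
        System.out.println(group(input));
    }
}
```

Groups with a single entry (such as "apple" here) have no anagram partner in the input and can be filtered out afterwards if only matches are wanted.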
Source: https://stackoverflow.com/questions/38229648/best-solution-for-an-anagram-check