Anagram algorithm in java

58,104

Solution 1

An easier way might be to just sort the chars in both strings, and compare whether they are equal:

public static boolean isAnagram(String s1, String s2){

        // Early termination check, if strings are of unequal lengths,
        // then they cannot be anagrams
        if ( s1.length() != s2.length() ) {
            return false;
        }
        s1=s1.toLowerCase();
        s2=s2.toLowerCase();
        char[] c1 = s1.toCharArray();
        char[] c2 = s2.toCharArray();
        Arrays.sort(c1);
        Arrays.sort(c2);
        String sc1 = new String(c1);
        String sc2 = new String(c2);
        return sc1.equals(sc2);
}

Personally I think it's more readable than nested for-loops =p

This has O(n log n) run-time complexity, where n is the length of the longer string.

Edit: this is not the optimal solution. See @aam1r's answer for the most efficient approach (i.e. what you should actually say in an interview)

Solution 2

This can be done in linear time using constant space. Here is pseudo-code to help you get started:

// Create new hashtable/hashmap to keep track of how many times each character
// is being used
character_map -> new hash map

// Initial check. If lengths are not the same, they can't be anagrams.
if s1.length != s2.length:
    throw exception "Not anagrams"

// Add all characters from s1 to hashmap. Increment the value to keep track of
// number of occurences
foreach character c1 in s1:
    character_map[c1]++

// Iterate through all character in s2 and decrement count of each character.
foreach character c2 in s2:
    character_map[c2]--

// If they are anagrams, each character should be at "0" count at the point.
// If we come across a character that is not, it means that they are not anagrams
foreach key k, value v in character_map:
    if v != 0:
            throw exception "Not anagrams"

This code does not sort and hence can be done using simple loops. The overall runtime is O(n) and overall space is O(1) -- hence being the fastest solution. The number of elements you can have in the hash map is constant (i.e. you know how many items there are in your alphabet set).

Solution 3

if(s1.charAt(i)==s2.charAt(j))
        delStr=s1.substring(i,i+1);
        newStr=s2.replace(delStr,"");

This code is a nice demonstration of why you should always have curly braces around your if, even if there is only a single statement. Your 2nd assignment is actually outside the if-condition, and will always happen.

The best way to check for two strings to be Anagram is to convert them to a character array (String#toCharArray). Then sort them using Arrays.sort method. And do the comparison on them.


Updated : -

If you just want to use String methods, then you don't actually need a nested loop there. You can do it with just one.

Here's the modified code of yours: -

public static boolean isAnagram(String s1 , String s2){

    if (s1.length() != s2.length()) {
        return false;
    }

    for(int i = 0; i < s2.length(); i++) {

            if( !s1.contains("" + s2.charAt(i))) {
                return false;
            }

            s1 = s1.replaceFirst("" + s2.charAt(i), "");
            s2 = s2.replaceFirst("" + s2.charAt(i), "");
    }
    return true;
}

Solution 4

What would be more efficient is to compare the Strings in sorted order.

public static boolean isAnagram(String s1 , String s2) {
    return s1.length() == s2.length() 
        && checkSum(s1) == checkSum(s2)
        && Arrays.equals(lettersSorted(s1), lettersSorted(s2));
}

static long checkSum(String s) {
    long sqrSum = 0;
    for(int i = 0; i < s.length(); s++) {
       char ch = s.charAt(i);
       sqrSum += ch + (1L << ch);
    }
}

static char[] lettersSorted(String s) {
    char[] chars = s.toCharArray();
    Arrays.sort(chars);
    return chars;
}

This is an O(N ln N) algorithm, but will be O(N) on an average if the Strings are typically not anagrams.

Solution 5

I'm not sure what you're trying to do, but I'm pretty sure it won't work (and it runs in O(n^2). Try this (which runs in O(n log n)) instead:

public static boolean isAnagram(String s1, String s2){
  if (s1.length() != s2.length()) return false;

  char[] c1 = s1.toCharArray();
  char[] c2 = s2.toCharArray();

  Arrays.sort(c1);
  Arrays.sort(c2);

  for(int i = 0; i < c1.length; i++) {
    if(c1[i] != c2[i]) return false;
  }

  return true;
}
Share:
58,104
SemihY
Author by

SemihY

#SOreadytohelp

Updated on November 09, 2021

Comments

  • SemihY
    SemihY over 2 years

    I would like to make anagram algorithm but This code doesn't work. Where is my fault ? For example des and sed is anagram but output is not anagram Meanwhile I have to use string method. not array. :)

    public static boolean isAnagram(String s1 , String s2)
    {
        String delStr="";
        String newStr="";
    
        for(int i=0;i<s1.length();i++)
        {
            for(int j=0 ; j < s2.length() ; j++)
            {
                if(s1.charAt(i)==s2.charAt(j))
                {
                    delStr=s1.substring(i,i+1);
                    newStr=s2.replace(delStr,"");
                }
            }           
        }
    
        if(newStr.equals(""))
            return true;
        else
            return false;
    }
    
  • andreih
    andreih over 11 years
    Yeah, i agree. It is always recommended to have curly braces around an if.
  • Rohit Jain
    Rohit Jain over 11 years
    @user1873885. May be. But to get it to work, you need to tell what is an Anagram? I can't understand the logic of the code.
  • PermGenError
    PermGenError over 11 years
    @durron597 yepp, the code still doesnt work, thats why its not accepted yet :P, upvotes are for pointing the blunder in the code .
  • Rohit Jain
    Rohit Jain over 11 years
    @durron597.. Ah! Actually, this was the only logical error that I could notice in the code, because I don't know what Anagram OP is talking about.
  • Vishy
    Vishy over 11 years
    lol. +1 You can use Arrays.equals() to compare the sorted arrays.
  • Vishy
    Vishy over 11 years
    +1 If you are going to check the length, I would do it before calling toCharArray.
  • durron597
    durron597 over 11 years
    @GanGnaMStYleOverFlowErroR: There are now three correct answers in this thread, all of which have fewer upvotes than this answer. Rohit, you should give yourself the disciplined badge now ;)
  • sampson-chen
    sampson-chen over 11 years
    Haha, that's 3 of us so far - adage about great minds and all, eh? ;)
  • durron597
    durron597 over 11 years
    @PeterLawrey: Or use a simple loop, which is equivalent. But not this
  • Rohit Jain
    Rohit Jain over 11 years
    @durron597.. Lol.. Do you really want me to have that? :( I have updated my answer, not with some code though, as OP has already got 3 of them.
  • PermGenError
    PermGenError over 11 years
    @durron597 haha, yeah, (upvotes!=accepted). and i did upvote all the 3 answers(including yours) :), however i think Rohits answer is quite useful:)
  • Vishy
    Vishy over 11 years
    @sampson-chen So I added something different. ;)
  • durron597
    durron597 over 11 years
    @PeterLawrey lol that would be faster, too bad I already upvoted you ;)
  • user1103205
    user1103205 over 11 years
    opps, didnt know so many people were posting the same thing. My bad.
  • sampson-chen
    sampson-chen over 11 years
    @user1873885 Unfortunately "isn't working" does not give me any information that I can use to help you with =(
  • SemihY
    SemihY over 11 years
    @sampson-chen I don't understand problem. I write des and sed but input is not anagram. Meanwhile I dont use array class. Only string methods
  • sampson-chen
    sampson-chen over 11 years
    @user1873885 you need to show us the full code, particularly 1) where you are using isAnagram, 2) what is expected input / output and 3) what you are actually getting
  • Rohit Jain
    Rohit Jain over 11 years
    @durron597. There you go. I added a code, for it to be a valid answer. ;)
  • durron597
    durron597 over 11 years
    @RohitJain: What, you don't want a Disciplined badge? :) Also, that code doesn't work, consider the strings "abbbccc" and "aaaaabc"
  • Ted Hopp
    Ted Hopp over 11 years
    @user1873885 - Assuming you have fixed the braces on the if, your code is not working because, after each loop iteration, newStr is set to the result of deleting only the last match from s2.
  • Rohit Jain
    Rohit Jain over 11 years
    @durron597.. No :( I have always had an image of in-disciplined guy. I don't want to loose that image.
  • Rohit Jain
    Rohit Jain over 11 years
    @durron597.. Let me modify it.
  • durron597
    durron597 over 11 years
    Ahh bucket sort, my hero. +1. By the way, this is O(1) space, not O(n)
  • Undo
    Undo about 11 years
    Maybe some more explanation?
  • Koray Tugay
    Koray Tugay over 10 years
    If lengths are not the same, they can't be anagrams. -> First remove the whitespaces.
  • goroncy
    goroncy about 10 years
    Calling a map 'table' :(. Using continue and break when not necessary. Eh.
  • Sahil Madan
    Sahil Madan about 10 years
    @goroncy I would like you to modify and post if you want to make it better. "table" is just a random name, I am not using this code in some real-life product. Was just practicing.
  • goroncy
    goroncy about 10 years
    Yeah. Continue and break still remains though.
  • realPK
    realPK about 10 years
    @sampson-chen: I would assume abuse by user while using this function and add this check: code if (s1 == null || s2 == null) { return false; } code Else it will throw NPE when either of s1 or s2 is null.
  • Mani
    Mani about 10 years
    One suggestion. you can get anagram.get(c) and check null instead of containsKey ( you are iterating two times just to check )
  • durron597
    durron597 almost 10 years
    Dear all future readers, this is by far one of the least performant answers. Do not use
  • sampson-chen
    sampson-chen almost 10 years
    @durron597 - yup, aam1r's answer is the optimal solution, vote that one up instead.
  • Ömer Faruk Almalı
    Ömer Faruk Almalı about 9 years
    @sampson-chen could you please explain where does logn comes from?
  • sampson-chen
    sampson-chen about 9 years
    @ÖmerFarukAlmalı O(n log n) is from comparison-based sorting: en.wikipedia.org/wiki/Comparison_sort
  • marlon
    marlon almost 9 years
    So the space complexity is O(n) due to the additional char arrays?
  • greybeard
    greybeard over 8 years
    (I don't quite like the name empty_map.) You can use old = char_map.put(c, 1); if (null != old) char_map.put(c, old + 1) for one lookup less per character.
  • arsenal
    arsenal about 8 years
    what is the purpose of checkSum method here?
  • Vishy
    Vishy about 8 years
    @lining the purpose of the checkSum is to perform an O(n) check which typically will detect which strings don't match.
  • Tom
    Tom over 7 years
    You should test your code before posting it as an answer.
  • xenteros
    xenteros almost 7 years
    @SahilMadan I did a basic refactor of your code. I think that you can do the rest.
  • Ihor M.
    Ihor M. about 5 years
    this algorithm has time complexity of O(N) and space complexity of O(N)
  • kaya3
    kaya3 over 4 years
    It's a little bit simpler to use Arrays.equals here rather than converting back to strings. docs.oracle.com/javase/8/docs/api/java/util/…
  • kaya3
    kaya3 over 4 years
    The for loop here can be replaced with return Arrays.equals(c1, c2);. docs.oracle.com/javase/8/docs/api/java/util/…
  • kaya3
    kaya3 over 4 years
    In addition to checking the sum, you could also check the XOR in the same loop. The combination of sum and XOR should make "collisions" considerably rarer.
  • kaya3
    kaya3 over 4 years
    If you want to test for anagrams in a case-insensitive way, you should convert both strings to lowercase (or uppercase) before sorting. Otherwise {'Z', 'a'} and {'A', 'z'} are both in sorted order and the strings won't be equal ignoring case. The second "iterative" algorithm isn't correct for the strings "aab" and "abb".
  • Reejesh PK
    Reejesh PK over 3 years
    you can also add a break statement inside of index check if statement to make it more efficient
  • Mykyta Chelombitko
    Mykyta Chelombitko over 2 years
    The only really worth of reading answer and solution.
  • goroncy
    goroncy over 2 years
    Hehe. Thanks :).