Anagram Algorithm in PHP

13,508

Solution 1

(I'm adding this as a separate answer, as it's a different way of dealing with the issue than I mentioned in my first issue)

This is a more complex way of working out which words in the dictionary are part of the word that you're looking for; I'll leave it up to the reader to work out how it works.

It's using factorisation to work out whether a word is an anagram of another. What it will do is assign each letter a unique, prime value; you can calculate the value of the letters in a given word by multiplying all the values together. CAT, for example, is 37 * 5 * 3, or 510. If your target word factors to the same number, you can be sure that the one is an anagram of the other.

I've ordered the prime numbers by how common they are in UK English, to keep the factors generated smaller.

<?php

function factorise($word)
{
    // Take a number, split it into individual letters, and multiply those values together
    // So long as both words use the same value, you can amend the ordering of the factors 
    // as you like

    $factors = array("e" => 2, "t" => 3, "a" => 5, "o" => 7, "i" => 11,
        "n" => 13, "s" => 17, "h" => 19, "r" => 23, "d" => 29,
        "l" => 31, "c" => 37, "u" => 41, "m" => 43, "w" => 47,
        "f" => 53, "g" => 59, "y" => 61, "p" => 67, "b" => 71,
        "v" => 73, "k" => 79, "j" => 83, "x" => 89, "q" => 97,
        "z" => 101);

    $total = 1;

    $letters = str_split($word);

    foreach ($letters as $thisLetter) {
        if (isset($factors[$thisLetter])) {
            // This will skip any non-alphanumeric characters.
            $total *= $factors[$thisLetter];
        }
    }

    return $total;
}

$searchWord = "hasted";

$dict = array("abde", "des", "klajsd", "ksj", "hat", "hats");

$searchWordFactor = factorise($searchWord);

foreach ($dict as $thisWord) {
    // Factorise each word that we're looking for
    // If the word we've just factored is an exact divisor of the target word, then all the 
    // letters in that word are also present in the target word
    // If you want to do an exact anagram, then check that the two totals are equal

    $dictWordFactor = factorise($thisWord);

    if (($searchWordFactor % $dictWordFactor) == 0) {
        print ($thisWord . " is an anagram of " . $searchWord . "<br/>");
    }
}

For what it's worth, I think this is a much more elegant solution - you can speed it up by pre-calculating the values in your dictionary. If you go through and work out the factors for every word in your dictionary, you can do the searching direct in the database:

SELECT word FROM dictionary WHERE wordFactor='$factorOfThisWord'

Solution 2

I can't quite follow what your code is doing; but if you want a simple anagram checker, the pseudocode would be something like:

get array of letters in my anagram
for each word in the dictionary
    get array of letters in this word
    for each letter in my anagram
        is this letter also in the word?
            if no, move on to the next word
    if we get here, it's an anagram

There are a couple of extra things you can do - you can make sure that both the anagram and the dictionary word are the same length (if they're not, they can't be anagrams); and you'll also need to figure out how to deal with letters that occur multiple times in the dictionary word but only once in the anagram word (the above code would report 'aa' as an anagram of 'a', for example)

Share:
13,508

Related videos on Youtube

khiemnn
Author by

khiemnn

totally newbie

Updated on June 04, 2022

Comments

  • khiemnn
    khiemnn almost 2 years

    I'm totally a newbie with PHP. Today I just got a problem that I can't know how to solve, even after searching google and digging SOF. It's the Anagram algorithm.

    So basically, I understand the problem here : When user input a string, I split it and compare with my library (a given array), then I will have to join it by 2-3-...etc characters to compare again, it's exactly where I'm stuck now, I don't know how to join the elements of the array.

    Here is the code that I'm implementing, and also a sample dictionary.

    I have a self-made dictionary with these elements in the array $dict. And i have a form for users to input a string, the string inputted will be passed to the code below and declared as $anagram. I have to split the string inputted to compare with my dictionary. But I don't know how to join them like comparing 2 letters, 3 letters...etc...and so on, to the dictionary.

    <?php
    
    $dict = array(
    'abde',
    'des',
    'klajsd',
    'ksj',
    'hat',
    'good',
    'book',
    'puzzle',
    'local',
    'php',
    'e');
    
    $anagram = $_POST['anagram'];
    //change to lowercase
    $anagram = strtolower($anagram);
    
    //split the string
    $test = str_split($anagram);
    
    //compare with $dict for the first split without joining
    for ($i=0; $i<strlen($anagram); $i++) {
        if ($test[$i]==$dict[$i]) {
            echo $test[$i]."<br />";
        }
    }
    
    //problem: how to join elements of the array in the loops
    //like user inputs "hellodes"
    //after echo "e", how to join the elements like: h-e,h-l,h-l,h-o,h-d,h-e,h-s
    //and then h-e-l,h-e-l,h-e-o...etc...
    ?>
    

    I hope to get the algorith as simple as possible because I'm totally a newbie. And I'm sorry because my english is not so good. Best regards, Khiem Nguyen.

    • Admin
      Admin almost 12 years
    • khiemnn
      khiemnn almost 12 years
      Thanks Gerep, I've read through them but it's like useless because it's too complicated that I cannot understand. I expect to have a simpler algorithm, by just joining elements of the string by using the loops and compare it with the library.
    • gunnx
      gunnx almost 12 years
      would it not be better to sort the anagram characters alphabetically and in the loop do the same for each dictionary word. if the anagram is a substring of the dictionary word then its an anagram
    • Jeff Puckett
      Jeff Puckett over 7 years
      Here's a one-line answer to this question stackoverflow.com/a/32156857/4233593
  • khiemnn
    khiemnn almost 12 years
    I'm sorry i think i put you guys in the middle of the trouble. From the start, there is a form for the users to input an arbitrary word, that explains why there is a $_POST there. @andrewsi I think your pseudo code has something wrong isn't it? Because you have to split the string user inputted, then join them back to compare, because maybe in the $dict just got something 1 letter only, like "a", "e",etc...
  • andrewsi
    andrewsi almost 12 years
    Why do you need to join the string back together to compare them? The logic above will split both the search word and the dictionary words into arrays, and compare the contents of each array; it doesn't matter if the dictionary words is one letter - you'll end up with an array that just has one item in it.
  • khiemnn
    khiemnn almost 12 years
    I have to split because of this : for example, my dictionary above contains 'hat' and 'e', and the string user inputs is 'hatedes'. The main goal is to print out the anagram matched with the dict, so this time it will print out 'hat' 'e' and 'des' because the dict contains it. If you compare the contents of each array, how if the array the user inputs is longer than the array of the dictionary?
  • khiemnn
    khiemnn almost 12 years
    Thanks gunnx, but i have this to wonder about. For example my dictionary have the word 'hat', then you sort it, it become 'aht', the string the user inputs is 'ath'. So if you sort both of them, they match! But have a look, the word the user inputted does not match the dict (ath and hat).
  • andrewsi
    andrewsi almost 12 years
    It doesn't matter how long the arrays are; you just need to check to make sure that the contents of the one match the contents of the other. So if the dictionary word is 'a', and the word entered is 'aardvark', you'd match because the a is there. It doesn't matter what the other letters in the word are.
  • andrewsi
    andrewsi almost 12 years
    I didn't say anything about sorting - did you mean that for someone else?
  • khiemnn
    khiemnn almost 12 years
    I'm pretty sorry I'm all messed up in my head. Your pseudocode seems to be correct. I will try implementing it in the morning because it's late here and I'm fully messed up with both making algorithm and coding :( Thank you indeed !
  • gunnx
    gunnx almost 12 years
    You sort the input word too, as shown in the code $anagramSorted
  • khiemnn
    khiemnn almost 12 years
    It's wrong with the algorithm if the length of the inputted string is longer than the length of one of the word in the dict. The dict has 'hat', if the user input 'hats' then it will never get an anagram. Still crazy with the algorithm.
  • khiemnn
    khiemnn almost 12 years
    If you sort both the inputted string and the word in the dict, It's totally changed! Like my example above I can give you more : the dict has 'good', the user inputs 'doog', If you sort both, they totally match. But the inputted string does not match and it's not in the dict.
  • khiemnn
    khiemnn almost 12 years
    It's like the inputted string is the substring of a word in the dict, and also if a word in the dict is a substring of the inputted string, the part that exactly match the word (of the inputted string) is printed out as an anagram. I still have no algorithm right and no code was implemented, the algorithm you mentioned above is only right when the inputted string's length is lower than the length of the word in the dict.
  • andrewsi
    andrewsi almost 12 years
    In that case, you just need to swap around the comparisons, so you're looking for letters in the input string that are also in dict, instead of the other way round.
  • khiemnn
    khiemnn almost 12 years
    I don't get it yet. Do you have some sample or code that are easy for me to understand?
  • andrewsi
    andrewsi almost 12 years
    I've added a separate answer with some code in it. It's a different way of working out anagrams than I mentioned above, but the code works well.
  • khiemnn
    khiemnn almost 12 years
    Can I respectfully ask you to add comment for the code above? I don't know what the function factorise does.
  • andrewsi
    andrewsi almost 12 years
    Actually, I deliberately left the comments out; it's not that complicated a piece of code, so you should be able to figure out what it's doing. Try adding lots of var_dump calls to see what variables are being set, and take it from there.
  • gunnx
    gunnx almost 12 years
    sorry not sure what you mean but I see you've accepted an answer so hope it works for you.
  • josephtikva1
    josephtikva1 almost 12 years
    Some of us are not looking to implement this, but would still like to understand how this works. Please post comments for our sake...
  • mujaffars
    mujaffars over 8 years
    @andrewsi I am trying for long word -- Where the script is failing. eg. for $searchWord = "hastededrs"; Result are not displaying. :(
  • andrewsi
    andrewsi over 8 years
    @mujaffars - it shouldn't have any problems with something that short. All I can suggest is that you add plenty of debugging to your code, and try to figure out why it's failing. Try adding an else to the if - does that show anything, or are you getting a completely blank screen?
  • mujaffars
    mujaffars over 8 years
    @andrewsi I am getting completely blank screen. There is not have any PHP error. I have ini_set('display_errors', '1'); on top. Also I have putted echo 'end of script'; at the end. Which is displaying only and no any other words. Though if I provide small word then it is working fine.
  • andrewsi
    andrewsi over 8 years
    @mujaffars - the only thing I can think of is that maybe the result of the multiplication is large enough that PHP can't cope with it; I've not seen that issue in PHP directly, but if it works when you feed in an 8 letter word and fails when you do nothing more than add an extra letter, I can't think what else it could be
  • andrewsi
    andrewsi over 8 years
    @mujaffars - I can't think off hand of a way to fix it if that is the problem; I guess you'll have to try a different way to do things :s
  • Amrit Shrestha
    Amrit Shrestha over 3 years
    for me, the modules didn't work. had to compare directly if ($searchWordFactor == $dictWordFactor)