Anagram Algorithm in PHP
Solution 1
(I'm adding this as a separate answer, as it's a different way of dealing with the issue than I mentioned in my first issue)
This is a more complex way of working out which words in the dictionary are part of the word that you're looking for; I'll leave it up to the reader to work out how it works.
It's using factorisation to work out whether a word is an anagram of another. What it will do is assign each letter a unique, prime value; you can calculate the value of the letters in a given word by multiplying all the values together. CAT, for example, is 37 * 5 * 3, or 510. If your target word factors to the same number, you can be sure that the one is an anagram of the other.
I've ordered the prime numbers by how common they are in UK English, to keep the factors generated smaller.
<?php
function factorise($word)
{
// Take a number, split it into individual letters, and multiply those values together
// So long as both words use the same value, you can amend the ordering of the factors
// as you like
$factors = array("e" => 2, "t" => 3, "a" => 5, "o" => 7, "i" => 11,
"n" => 13, "s" => 17, "h" => 19, "r" => 23, "d" => 29,
"l" => 31, "c" => 37, "u" => 41, "m" => 43, "w" => 47,
"f" => 53, "g" => 59, "y" => 61, "p" => 67, "b" => 71,
"v" => 73, "k" => 79, "j" => 83, "x" => 89, "q" => 97,
"z" => 101);
$total = 1;
$letters = str_split($word);
foreach ($letters as $thisLetter) {
if (isset($factors[$thisLetter])) {
// This will skip any non-alphanumeric characters.
$total *= $factors[$thisLetter];
}
}
return $total;
}
$searchWord = "hasted";
$dict = array("abde", "des", "klajsd", "ksj", "hat", "hats");
$searchWordFactor = factorise($searchWord);
foreach ($dict as $thisWord) {
// Factorise each word that we're looking for
// If the word we've just factored is an exact divisor of the target word, then all the
// letters in that word are also present in the target word
// If you want to do an exact anagram, then check that the two totals are equal
$dictWordFactor = factorise($thisWord);
if (($searchWordFactor % $dictWordFactor) == 0) {
print ($thisWord . " is an anagram of " . $searchWord . "<br/>");
}
}
For what it's worth, I think this is a much more elegant solution - you can speed it up by pre-calculating the values in your dictionary. If you go through and work out the factors for every word in your dictionary, you can do the searching direct in the database:
SELECT word FROM dictionary WHERE wordFactor='$factorOfThisWord'
Solution 2
I can't quite follow what your code is doing; but if you want a simple anagram checker, the pseudocode would be something like:
get array of letters in my anagram
for each word in the dictionary
get array of letters in this word
for each letter in my anagram
is this letter also in the word?
if no, move on to the next word
if we get here, it's an anagram
There are a couple of extra things you can do - you can make sure that both the anagram and the dictionary word are the same length (if they're not, they can't be anagrams); and you'll also need to figure out how to deal with letters that occur multiple times in the dictionary word but only once in the anagram word (the above code would report 'aa' as an anagram of 'a', for example)
Related videos on Youtube
Comments
-
khiemnn almost 2 years
I'm totally a newbie with PHP. Today I just got a problem that I can't know how to solve, even after searching google and digging SOF. It's the Anagram algorithm.
So basically, I understand the problem here : When user input a string, I split it and compare with my library (a given array), then I will have to join it by 2-3-...etc characters to compare again, it's exactly where I'm stuck now, I don't know how to join the elements of the array.
Here is the code that I'm implementing, and also a sample dictionary.
I have a self-made dictionary with these elements in the array $dict. And i have a form for users to input a string, the string inputted will be passed to the code below and declared as $anagram. I have to split the string inputted to compare with my dictionary. But I don't know how to join them like comparing 2 letters, 3 letters...etc...and so on, to the dictionary.
<?php $dict = array( 'abde', 'des', 'klajsd', 'ksj', 'hat', 'good', 'book', 'puzzle', 'local', 'php', 'e'); $anagram = $_POST['anagram']; //change to lowercase $anagram = strtolower($anagram); //split the string $test = str_split($anagram); //compare with $dict for the first split without joining for ($i=0; $i<strlen($anagram); $i++) { if ($test[$i]==$dict[$i]) { echo $test[$i]."<br />"; } } //problem: how to join elements of the array in the loops //like user inputs "hellodes" //after echo "e", how to join the elements like: h-e,h-l,h-l,h-o,h-d,h-e,h-s //and then h-e-l,h-e-l,h-e-o...etc... ?>
I hope to get the algorith as simple as possible because I'm totally a newbie. And I'm sorry because my english is not so good. Best regards, Khiem Nguyen.
-
Admin almost 12 yearsfound two links: sourceforge.net/projects/phpag and phpclasses.org/browse/file/12539.html
-
khiemnn almost 12 yearsThanks Gerep, I've read through them but it's like useless because it's too complicated that I cannot understand. I expect to have a simpler algorithm, by just joining elements of the string by using the loops and compare it with the library.
-
gunnx almost 12 yearswould it not be better to sort the anagram characters alphabetically and in the loop do the same for each dictionary word. if the anagram is a substring of the dictionary word then its an anagram
-
Jeff Puckett over 7 yearsHere's a one-line answer to this question stackoverflow.com/a/32156857/4233593
-
-
khiemnn almost 12 yearsI'm sorry i think i put you guys in the middle of the trouble. From the start, there is a form for the users to input an arbitrary word, that explains why there is a $_POST there. @andrewsi I think your pseudo code has something wrong isn't it? Because you have to split the string user inputted, then join them back to compare, because maybe in the $dict just got something 1 letter only, like "a", "e",etc...
-
andrewsi almost 12 yearsWhy do you need to join the string back together to compare them? The logic above will split both the search word and the dictionary words into arrays, and compare the contents of each array; it doesn't matter if the dictionary words is one letter - you'll end up with an array that just has one item in it.
-
khiemnn almost 12 yearsI have to split because of this : for example, my dictionary above contains 'hat' and 'e', and the string user inputs is 'hatedes'. The main goal is to print out the anagram matched with the dict, so this time it will print out 'hat' 'e' and 'des' because the dict contains it. If you compare the contents of each array, how if the array the user inputs is longer than the array of the dictionary?
-
khiemnn almost 12 yearsThanks gunnx, but i have this to wonder about. For example my dictionary have the word 'hat', then you sort it, it become 'aht', the string the user inputs is 'ath'. So if you sort both of them, they match! But have a look, the word the user inputted does not match the dict (ath and hat).
-
andrewsi almost 12 yearsIt doesn't matter how long the arrays are; you just need to check to make sure that the contents of the one match the contents of the other. So if the dictionary word is 'a', and the word entered is 'aardvark', you'd match because the a is there. It doesn't matter what the other letters in the word are.
-
andrewsi almost 12 yearsI didn't say anything about sorting - did you mean that for someone else?
-
khiemnn almost 12 yearsI'm pretty sorry I'm all messed up in my head. Your pseudocode seems to be correct. I will try implementing it in the morning because it's late here and I'm fully messed up with both making algorithm and coding :( Thank you indeed !
-
gunnx almost 12 yearsYou sort the input word too, as shown in the code $anagramSorted
-
khiemnn almost 12 yearsIt's wrong with the algorithm if the length of the inputted string is longer than the length of one of the word in the dict. The dict has 'hat', if the user input 'hats' then it will never get an anagram. Still crazy with the algorithm.
-
khiemnn almost 12 yearsIf you sort both the inputted string and the word in the dict, It's totally changed! Like my example above I can give you more : the dict has 'good', the user inputs 'doog', If you sort both, they totally match. But the inputted string does not match and it's not in the dict.
-
khiemnn almost 12 yearsIt's like the inputted string is the substring of a word in the dict, and also if a word in the dict is a substring of the inputted string, the part that exactly match the word (of the inputted string) is printed out as an anagram. I still have no algorithm right and no code was implemented, the algorithm you mentioned above is only right when the inputted string's length is lower than the length of the word in the dict.
-
andrewsi almost 12 yearsIn that case, you just need to swap around the comparisons, so you're looking for letters in the input string that are also in dict, instead of the other way round.
-
khiemnn almost 12 yearsI don't get it yet. Do you have some sample or code that are easy for me to understand?
-
andrewsi almost 12 yearsI've added a separate answer with some code in it. It's a different way of working out anagrams than I mentioned above, but the code works well.
-
khiemnn almost 12 yearsCan I respectfully ask you to add comment for the code above? I don't know what the function factorise does.
-
andrewsi almost 12 yearsActually, I deliberately left the comments out; it's not that complicated a piece of code, so you should be able to figure out what it's doing. Try adding lots of
var_dump
calls to see what variables are being set, and take it from there. -
gunnx almost 12 yearssorry not sure what you mean but I see you've accepted an answer so hope it works for you.
-
josephtikva1 almost 12 yearsSome of us are not looking to implement this, but would still like to understand how this works. Please post comments for our sake...
-
mujaffars over 8 years@andrewsi I am trying for long word -- Where the script is failing. eg. for $searchWord = "hastededrs"; Result are not displaying. :(
-
andrewsi over 8 years@mujaffars - it shouldn't have any problems with something that short. All I can suggest is that you add plenty of debugging to your code, and try to figure out why it's failing. Try adding an
else
to theif
- does that show anything, or are you getting a completely blank screen? -
mujaffars over 8 years@andrewsi I am getting completely blank screen. There is not have any PHP error. I have
ini_set('display_errors', '1');
on top. Also I have puttedecho 'end of script';
at the end. Which is displaying only and no any other words. Though if I provide small word then it is working fine. -
andrewsi over 8 years@mujaffars - the only thing I can think of is that maybe the result of the multiplication is large enough that PHP can't cope with it; I've not seen that issue in PHP directly, but if it works when you feed in an 8 letter word and fails when you do nothing more than add an extra letter, I can't think what else it could be
-
andrewsi over 8 years@mujaffars - I can't think off hand of a way to fix it if that is the problem; I guess you'll have to try a different way to do things :s
-
Amrit Shrestha over 3 yearsfor me, the modules didn't work. had to compare directly
if ($searchWordFactor == $dictWordFactor)