How do I check if a string contains a specific word?

6,007,026

Solution 1

Now with PHP 8 you can do this using str_contains:

if (str_contains('How are you', 'are')) { 
    echo 'true';
}

RFC

Before PHP 8

You can use the strpos() function which is used to find the occurrence of one string inside another one:

$a = 'How are you?';

if (strpos($a, 'are') !== false) {
    echo 'true';
}

Note that the use of !== false is deliberate (neither != false nor === true will return the desired result); strpos() returns either the offset at which the needle string begins in the haystack string, or the boolean false if the needle isn't found. Since 0 is a valid offset and 0 is "falsey", we can't use simpler constructs like !strpos($a, 'are').
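The difference between these comparisons is easiest to see when the needle sits at the very start of the haystack. The following is a minimal sketch; the sample string is chosen only for illustration:

```php
<?php
$haystack = 'are you sure?';

// The needle starts at offset 0, so strpos() returns int(0), not false.
var_dump(strpos($haystack, 'are'));               // int(0)

// Strict comparison distinguishes offset 0 from "not found".
var_dump(strpos($haystack, 'are') !== false);     // bool(true)

// Loose checks treat 0 as false and wrongly report "not found".
var_dump(strpos($haystack, 'are') != false);      // bool(false)
var_dump((bool) strpos($haystack, 'are'));        // bool(false)

// === true never succeeds, because strpos() never returns true.
var_dump(strpos('How are you?', 'are') === true); // bool(false)
```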

Solution 2

You could use regular expressions, which are better suited to word matching than strpos, as mentioned by other users. A strpos check for are will also return true for strings such as fare, care, stare, etc. These unintended matches can be avoided in a regular expression by using word boundaries.

A simple match for are could look something like this:

$a = 'How are you?';

if (preg_match('/\bare\b/', $a)) {
    echo 'true';
}

On the performance side, strpos is about three times faster. When I did one million compares at once, it took preg_match 1.5 seconds to finish and for strpos it took 0.5 seconds.
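If you want to repeat this kind of measurement yourself, a rough sketch could look like the following. The helper name timeCheck is made up for this example, the two checks are not strictly equivalent (substring versus whole word), and the timings will vary with machine and PHP version:

```php
<?php
// Hypothetical helper: runs $check $iterations times and returns elapsed seconds.
function timeCheck(callable $check, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $check();
    }
    return microtime(true) - $start;
}

$a = 'How are you?';
$iterations = 1000000;

// Plain substring search.
$strposTime = timeCheck(function () use ($a) {
    return strpos($a, 'are') !== false;
}, $iterations);

// Whole-word search with a regular expression.
$pregTime = timeCheck(function () use ($a) {
    return preg_match('/\bare\b/', $a) === 1;
}, $iterations);

printf("strpos:     %.3f s\n", $strposTime);
printf("preg_match: %.3f s\n", $pregTime);
```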

Edit: In order to search any part of the string, not just word by word, I would recommend using a regular expression like

$a = 'How are you?';
$search = 'are y';
if(preg_match("/{$search}/i", $a)) {
    echo 'true';
}

The i at the end of the regular expression makes it case-insensitive; if you do not want that, you can leave it out.

Note that this can be problematic because the $search string isn't sanitized in any way. If $search comes from user input, any regular expression metacharacters it contains are interpreted as pattern syntax, so the check may fail or behave like a different regular expression. Escaping the string with preg_quote() before building the pattern avoids this.
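To make the problem concrete, here is a small sketch of an unescaped search term next to the same term passed through preg_quote(). The sample strings are made up for this example:

```php
<?php
$a = 'How much is 1+1?';
$search = '1+1'; // imagine this came from user input

// Unescaped, "+" is a quantifier: the pattern means "one or more 1s, then a 1".
var_dump(preg_match("/{$search}/i", $a)); // int(0)

// preg_quote() escapes regex metacharacters; the second argument is the delimiter.
$pattern = '/' . preg_quote($search, '/') . '/i';
var_dump(preg_match($pattern, $a));       // int(1)
```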

Also, here's a great tool for testing and seeing explanations of various regular expressions: Regex101

To combine both sets of functionality into a single multi-purpose function (including with selectable case sensitivity), you could use something like this:

function FindString($needle,$haystack,$i,$word)
{   // $i should be "" or "i" for case insensitive
    if (strtoupper($word)=="W")
    {   // if $word is "W" then word search instead of string in string search.
        if (preg_match("/\b{$needle}\b/{$i}", $haystack)) 
        {
            return true;
        }
    }
    else
    {
        if(preg_match("/{$needle}/{$i}", $haystack)) 
        {
            return true;
        }
    }
    return false;
    // Put quotes around true and false above to return them as strings instead of as bools/ints.
}

One more thing to keep in mind is that \b will not work as expected for languages other than English.

The explanation for this and the solution is taken from here:

\b represents the beginning or end of a word (Word Boundary). This regex would match apple in an apple pie, but wouldn’t match apple in pineapple, applecarts or bakeapples.

How about “café”? How can we extract the word “café” in regex? Actually, \bcafé\b wouldn’t work. Why? Because “café” contains non-ASCII character: é. \b can’t be simply used with Unicode such as समुद्र, 감사, месяц and 😉 .

When you want to extract Unicode characters, you should directly define characters which represent word boundaries.

The answer: (?<=[\s,.:;"']|^)UNICODE_WORD(?=[\s,.:;"']|$)

So in order to use the answer in PHP, you can use this function:

function contains($str, $word) {
    // Works in Hebrew and any other unicode characters
    // Thanks https://medium.com/@shiba1014/regex-word-boundaries-with-unicode-207794f6e7ed
    // Thanks https://www.phpliveregex.com/
    // preg_quote() escapes any regex metacharacters in $word
    return (bool) preg_match('/(?<=[\s,.:;"\']|^)' . preg_quote($word, '/') . '(?=[\s,.:;"\']|$)/', $str);
}

And if you want to search for array of words, you can use this:

function arrayContainsWord($str, array $arr)
{
    foreach ($arr as $word) {
        // Works in Hebrew and any other unicode characters
        // Thanks https://medium.com/@shiba1014/regex-word-boundaries-with-unicode-207794f6e7ed
        // Thanks https://www.phpliveregex.com/
        if (preg_match('/(?<=[\s,.:;"\']|^)' . $word . '(?=[\s,.:;"\']|$)/', $str)) return true;
    }
    return false;
}
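The limitation described above applies to \b in its default, byte-oriented mode. As an alternative sketch (not the approach from the quoted article), PHP's u pattern modifier treats the pattern and subject as UTF-8 and makes \b and \w Unicode-aware, which is enough for accented letters. The function name containsUnicodeWord is made up for this example, and it assumes both the source file and the input are UTF-8. Symbols such as emoji are not word characters, so they would still need the explicit lookaround list:

```php
<?php
// Assumes this source file and the subject strings are UTF-8 encoded.
function containsUnicodeWord(string $str, string $word): bool
{
    // The "u" modifier switches the pattern to UTF-8 mode, so "é" counts
    // as a word character when \b looks for a boundary.
    return preg_match('/\b' . preg_quote($word, '/') . '\b/u', $str) === 1;
}

var_dump(containsUnicodeWord('un café noir', 'café'));     // bool(true)
var_dump(containsUnicodeWord('deux cafés noirs', 'café')); // bool(false)
```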

As of PHP 8.0.0 you can now use str_contains

<?php
    if (str_contains('abc', '')) {
        echo "Checking the existence of the empty string will always return true";
    }
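For code that still has to run on PHP 7, a fallback along these lines is one option. This is a sketch rather than an official polyfill; it uses the byte-based strpos() and deliberately mirrors the empty-needle behaviour shown above:

```php
<?php
// Fallback for PHP 7; skipped on PHP 8, where str_contains() is built in.
if (!function_exists('str_contains')) {
    function str_contains(string $haystack, string $needle): bool
    {
        // Mirror the native behaviour: an empty needle is always "found".
        return $needle === '' || strpos($haystack, $needle) !== false;
    }
}

var_dump(str_contains('abc', ''));            // bool(true)
var_dump(str_contains('How are you', 'are')); // bool(true)
var_dump(str_contains('How are you', 'Are')); // bool(false), the check is case-sensitive
```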

Solution 3

Here is a little utility function that is useful in situations like this

// returns true if $needle is a substring of $haystack
function contains($needle, $haystack)
{
    return strpos($haystack, $needle) !== false;
}

Solution 4

To determine whether a string contains another string you can use the PHP function strpos().

int strpos ( string $haystack , mixed $needle [, int $offset = 0 ] )
<?php

$haystack = 'how are you';
$needle = 'are';

if (strpos($haystack,$needle) !== false) {
    echo "$haystack contains $needle";
}

?>

CAUTION:

If the needle you are searching for is at the beginning of the haystack, strpos() returns position 0. A == comparison against false treats that 0 as false and reports no match, so you need to use === (or !==) instead.

A == sign is a comparison and tests whether the variable / expression / constant to the left has the same value as the variable / expression / constant to the right.

A === sign is a comparison to see whether two variables / expressions / constants are equal AND have the same type - i.e. both are strings or both are integers.
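A few comparisons illustrate the two operators. This is a minimal sketch; the values are chosen only to show the type juggling:

```php
<?php
var_dump(0 == false);   // bool(true): after type juggling the values are equal
var_dump(0 === false);  // bool(false): int and bool are different types
var_dump('1' == 1);     // bool(true): the numeric string is compared as a number
var_dump('1' === 1);    // bool(false): string and int are different types

$pos = strpos('are you ok', 'are'); // int(0)
var_dump($pos == false);  // bool(true), so a == test wrongly says "not found"
var_dump($pos === false); // bool(false), the needle was found at offset 0
```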

Solution 5

While most of these answers will tell you if a substring appears in your string, that's usually not what you want if you're looking for a particular word, and not a substring.

What's the difference? Substrings can appear within other words:

  • The "are" at the beginning of "area"
  • The "are" at the end of "hare"
  • The "are" in the middle of "fares"

One way to mitigate this would be to use a regular expression coupled with word boundaries (\b):

function containsWord($str, $word)
{
    return !!preg_match('#\\b' . preg_quote($word, '#') . '\\b#i', $str);
}

This method doesn't have the same false positives noted above, but it does have some edge cases of its own. Word boundaries match on non-word characters (\W), which are going to be anything that isn't a-z, A-Z, 0-9, or _. That means digits and underscores are going to be counted as word characters and scenarios like this will fail:

  • The "are" in "What _are_ you thinking?"
  • The "are" in "lol u dunno wut those are4?"

If you want anything more accurate than this, you'll have to start doing English language syntax parsing, and that's a pretty big can of worms (and assumes proper use of syntax, anyway, which isn't always a given).
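Short of full parsing, one partial workaround for the two cases listed above is to replace \b with lookarounds that treat only letters as word characters. This is a sketch under that assumption; the function name containsWordLettersOnly is made up, and it only considers ASCII letters:

```php
<?php
// Hypothetical helper: only ASCII letters count as "word" characters here,
// so digits and underscores next to the word no longer block a match.
function containsWordLettersOnly(string $str, string $word): bool
{
    $pattern = '#(?<![A-Za-z])' . preg_quote($word, '#') . '(?![A-Za-z])#i';
    return preg_match($pattern, $str) === 1;
}

var_dump(containsWordLettersOnly('What _are_ you thinking?', 'are'));    // bool(true)
var_dump(containsWordLettersOnly('lol u dunno wut those are4?', 'are')); // bool(true)
var_dump(containsWordLettersOnly('Computer hardware', 'are'));           // bool(false)
```

Whether "are4" should count as the word "are" depends on your data, so treat this as a trade-off rather than a strict improvement.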

Charles Yeung
Author by

Charles Yeung

Updated on July 08, 2022

Comments

  • Charles Yeung
    Charles Yeung almost 2 years

    Consider:

    $a = 'How are you?';
    
    if ($a contains 'are')
        echo 'true';
    

    Suppose I have the code above, what is the correct way to write the statement if ($a contains 'are')?

  • jwueller
    jwueller over 13 years
    I doubt it. The docs state int preg_match ( string $pattern , string $subject [, array &$matches [, int $flags = 0 [, int $offset = 0 ]]] ).
  • Breezer
    Breezer over 13 years
    I would argue that i find it to be the total opposite, it's bad to use it for complicated operation if an alternative is present, but for really simple word matching it's great you can set different delimiters to make it case insensitive and what not
  • Breezer
    Breezer over 13 years
    @Alexander.Plutov second of all you're giving me a -1 and not the question ? cmon it takes 2 seconds to google the answer google.com/…
  • SamGoody
    SamGoody over 12 years
    +1 Its a horrible way to search for a simple string, but many visitors to SO are looking for any way to search for any of their own substrings, and it is helpful that the suggestion has been brought up. Even the OP might have oversimplified - let him know of his alternatives.
  • AKS
    AKS over 11 years
    strstr() returns FALSE if the needle was not found. So a strlen is not necessary.
  • jsherk
    jsherk over 11 years
    @DTest - well yes of course it will return true because the string contains 'are'. If you are looking specifically for the word ARE then you would need to do more checks like, for example, check if there is a character or a space before the A and after the E.
  • Melsi
    Melsi over 11 years
    Very good comments above! I never use != or ==, after all !== and === is best option (in my opinion) all aspect considered (speed, accuracy etc).
  • Giulio Muscarello
    Giulio Muscarello over 11 years
    @jsherk Why not regexes, then? Something like " are ".
  • erdomester
    erdomester over 11 years
    best way is if ((strpos($form_email,'@') === false) || (strpos($form_email,'.') === false)) { $error = 'Invalid email<br>'; }
  • James P.
    James P. almost 11 years
    Might be in it's place it a utility class though and would improve readability.
  • Xaqq
    Xaqq almost 11 years
    @RobinvanBaalen Actually, it can improves code readability. Also, downvotes are supposed to be for (very) bad answers, not for "neutral" ones.
  • Robin van Baalen
    Robin van Baalen almost 11 years
    @Xaqq In my opinion, it's very bad to write functions that don't actually do anything else than improve readability. Therefore the -1.
  • Martijn
    Martijn almost 11 years
    @Guiulio: Regex is slower :) If you only need to check if it exists, in any way ( so 'are' or 'care' doesnt matter), then use basic functions for better performance
  • Brandin
    Brandin almost 11 years
    @RobinvanBaalen functions are nearly by definition for readability (to communicate the idea of what you're doing). Compare which is more readable: if ($email->contains("@") && $email->endsWith(".com")) { ... or if (strpos($email, "@") !== false && substr($email, -strlen(".com")) == ".com") { ...
  • James P.
    James P. almost 11 years
    @RobinvanBaalen "it's very bad to write functions that don't actually do anything else than improve readability" It depends. This is not the best of examples but if you're maintaining someone else's code it can be helpful.
  • Admin
    Admin over 10 years
    Technically, the question asks how to find words not a substring. This actually helped me as I can use this with regex word boundries. Alternatives are always useful.
  • Robin van Baalen
    Robin van Baalen over 10 years
    @JamesPoulson don't forget to quote the 'in my opinion' part please. It is after all just my opinion and I'm never suggesting that it's a golden rule. That being said, you are completely right that there might be some useful cases for this. But even then, in my opinion, it's still bad practice.
  • James P.
    James P. over 10 years
    @RobinvanBaalen in the end rules are meant to be broken. Otherwise people wouldn't come up with newer inventive ways of doing things :) . Plus have to admit I have trouble wrapping the mind around stuff like on martinfowler.com. Guess the right thing to do is to try things out yourself and find out what approaches are the most convenient.
  • sg3s
    sg3s over 10 years
    Could you please tell me why in the world you would use a function like this, when strpos is a perfectly viable solution?...
  • Jason OOO
    Jason OOO over 10 years
    @sg3s: you are totally right, however, strpos also based on something like that, also, I didn't posted it for rep just for sharing a bit of knowledge
  • Wouter
    Wouter over 10 years
    As for not catching 'care' and such things, it is better to check for (strpos(' ' . strtolower($a) . ' ', ' are ') !== false)
  • Avatar
    Avatar over 10 years
    I wanted to check if a string does not contain a word. I tried to change false to true if (strpos($a,'are')!==true) {...} but it does not work. Instead I am using now: if(! (strpos($a,'are')!==false)) { ... } which looks awkward. Anyone?
  • Jo Smo
    Jo Smo over 10 years
    A note on the php.net/manual/en/function.strstr.php page: Note: If you only want to determine if a particular needle occurs within haystack, use the faster and less memory intensive function strpos() instead.
  • meda
    meda about 10 years
    @EchtEinfachTV that function always return false or the postion but never true
  • Tino
    Tino about 10 years
    Another opinion: Having an utility function which you can easily wrap can help debugging. Also it loundens the cry for good optimizers which eliminate such overhead in production services. So all opinions have valid points. ;)
  • Tino
    Tino about 10 years
    @EchtEinfachTV try strpos($a,'are')===false. === is the complementary operator of !==.
  • Adam Merrifield
    Adam Merrifield about 10 years
    This is backwards. The i in stristr stands for insensitive.
  • Vinod Joshi
    Vinod Joshi about 10 years
    Do not use preg_match() if you only want to check if one string is contained in another string. Use strpos() or strstr() instead as they will be faster.
  • trejder
    trejder about 10 years
    @Melsi What speed got to do with the fact, whether you use compare with (===) or without (==) type checking? Do you suggest, that === is faster that ==? I doubt so...
  • Mr Lister
    Mr Lister about 10 years
    @LegoStormtroopr While the question does ask that, this answer doesn't actually show a solution which checks only for whole words. If would have been a much better answer if it had.
  • equazcion
    equazcion about 10 years
    I tend to avoid this issue by always using strpos($a, 'are') > -1 to test for true. From a debugging perspective, I find my brain wastes fewer clock cycles determining if the line is written correctly when I don't have to count contiguous equals signs.
  • Wayne Whitty
    Wayne Whitty almost 10 years
    @tastro Are there any reputable benchmarks on this?
  • code_monk
    code_monk over 9 years
    this should be the canonical answer. Because we're looking for words and not substrings, regex is appropriate. I'll also add that \b matches two things that \W doesn't, which makes it great for finding words in a string: It matches beginning of string (^) and end of string ($)
  • user610342
    user610342 over 9 years
    For those of us also writing C# this function is a nice addition. Note: it needs to be coded as return (strpos($haystack, $needle) !== false);
  • minipif
    minipif over 9 years
    @Wouter Padding with spaces and searching for " are " is not the solution, as it's not necessarily followed by a space (eg. " You are. ")
  • albanx
    albanx over 9 years
    +1 for the answer and -1 to the @plutov.by comment because , strpos is just a single check meanwhile regexp you can check many words in the same time ex: preg_match(/are|you|not/)
  • Bryan
    Bryan over 9 years
    You never know if the needle comes before or after the haystace. This helper function changes needle to the more common position. +1
  • JoeMoe1984
    JoeMoe1984 over 9 years
    Like many before me have said this improves readability, but lets say later on in your project you realize this function is returning false all over the place. Maybe because a word started with a capital letter when it normally started with a
  • yentsun
    yentsun about 9 years
    Regular Expressions should be the last resort method. Their use in trivial tasks should be discouraged. I insist on this from the height of many years of digging bad code.
  • slownage
    slownage about 9 years
    you probably meant: $found = false at the beginning
  • Bono
    Bono about 9 years
    While this code snippet may solve the question, including an explanation really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion.
  • lightbringer
    lightbringer about 9 years
    your function may not work if the word is linked with comma, question mark or dot. e.g. "what you see is what you get." and you want to determine if "get" is in the sentence. Notice the full stop next to "get". In this case, your function returns false. it is recommended to use regular expression or substr(I think it uses regular expression anyway) to search/replace strings.
  • Decebal
    Decebal about 9 years
    @lightbringer you could not be more wrong with your recommendation, what does it mean for you "it is recommended" ? there is no supreme person that recommends or aproves. It's about the use of regular expression engine in php that is a blackhole in the language itself, you may want to try putting a regex match in a loop and benchmark the results.
  • Sunny
    Sunny about 9 years
    last var_dump is false
  • Sunny
    Sunny about 9 years
    it will fail if $string is Are are, are?
  • Jason OOO
    Jason OOO about 9 years
    @Sunny: it was typo: var_dump(is_str_contain("mystringss", "strings")); //true
  • Cosmin
    Cosmin almost 9 years
    Of course this is usefull. You should encourage this. What happens if in PHP 100 there is a new and faster way to find string locations ? Do you want to change all your places where you call strpos ? Or do you want to change only the contains within the function ??
  • Jasom Dotnet
    Jasom Dotnet over 8 years
    This comment is totally lost in nowhere but anyway: preg_match works when you need to check the result of json request (does it contain error word?) before json_decode. strpos didn't work for me.
  • Shapeshifter
    Shapeshifter over 8 years
    good, but preg_match is risky since it can return false or 0. You should be testing for ===1 in #3
  • xDaizu
    xDaizu over 8 years
    @DTest is kinda right. I don't want to be "that guy" but either this answer is incomplete or the question should be rephrased to not specify it's looking for "words" ^^U
  • T30
    T30 about 8 years
    Crashes if you search the first word.
  • Robert Sinclair
    Robert Sinclair almost 8 years
    this should be the correct answer.. the rest of the answers will find "are" in a string like "do you care".. As mentioned by @Dtest
  • Paul
    Paul almost 8 years
    @RobertSinclair Is that so bad? If you asked me if the string "do you care" contains the word "are" I would say "yes". The word "are" is clearly a substring of that string. That's a separate question from """Is "are" one of the words in the string "do you care"""".
  • Robert Sinclair
    Robert Sinclair almost 8 years
    @Paulpro Eventhough OP didn't specify the $a is a phrase, I'm pretty sure it was implied. So his question was how to detect the Word inside the Phrase. Not if a Word contains a Word inside of it, which I would assume would be irrelevant more often than not.
  • Pathros
    Pathros almost 8 years
    I am getting the following warning: WARNING preg_match(): Delimiter must not be alphanumeric or backslash
  • Pathros
    Pathros almost 8 years
    If $a='Computer hardware', I want it to return false when looking for are. Nevertheless, in this case it returns true. How do you do it to only look for entire words???
  • afarazit
    afarazit over 7 years
    can be simplified to return strpos($str, $character) !== false
  • Michael
    Michael over 7 years
    The regular expression should be '/\bare\b/'. The \b is a marker for word boundary, so this regular expression won't match 'hardware'.
  • simhumileco
    simhumileco over 7 years
    This is not a good answer. If the search string will be at the begin of the searched string, then the function mb_strpos(...) return zero, which evolves into false.
  • LIGHT
    LIGHT about 7 years
    why not simply if (strpos($a, 'are') > -1) {//found}else{//not found}
  • Andrejs Gubars
    Andrejs Gubars about 7 years
    Well... partially true, in php 0 == false is true, but 0 === false is false
  • Spotlight
    Spotlight about 7 years
    Using a regex to do a simple operation is overkill. Use strpos instead
  • Djave
    Djave almost 7 years
    I've landed on this specific answer hundreds of times in my career, and every time I read it, my brain hurts. Seeings as this question has been view 2.5 million times could we possibly change the example to something like $subject = 'How are you?';$query = 'are';if (strpos($subject, $query) !== false).... No worries if not, the fact it has been viewed 2.5 million times is probably also a good reason not to change it. I'd also opt for return true rather than echo 'true' but I understand that is really starting to deviate from the question.
  • Marko
    Marko over 6 years
    Yes, using '===' is faster than '==' because there is no type coercion necessary.
  • Tim Visée
    Tim Visée over 6 years
    Note, that because many frameworks agree this is dumb, most of them have helper functions available. For example, Laravel has str_contains($a, 'are'); as seen here: laravel.com/docs/5.5/helpers#method-str-contains.
  • Ron
    Ron over 6 years
    If I get $1 everytime I visit this page just to copy paste this solution, I could have bought ramen noodles for a whole week
  • Code4R7
    Code4R7 about 6 years
    If you are interested in words rather than bytes, use grapheme_strpos(). Or, if you really can not use Intl, use mb_strpos() instead.
  • Paul Spiegel
    Paul Spiegel about 6 years
    This might be slower, but IMHO strstr($a, 'are') is much more elegant than the ugly strpos($a, 'are') !== false. PHP really needs a str_contains() function.
  • MetalWeirdo
    MetalWeirdo almost 6 years
    @Jimbo it does works, you're just missing the `\` 3v4l.org/ZRpYi
  • mickmackusa
    mickmackusa almost 5 years
    This answer is poorly demonstrated and fails with many extended scenarios. I don't see any benefit in entertaining this technique. Here is the refined custom function and iterated call: 3v4l.org/E9dfD I have no interest in editing this wiki because I find it to be wasteful of researchers time.
  • kurdtpage
    kurdtpage over 4 years
    It blows my mind that this is not the accepted answer
  • Asad Ullah
    Asad Ullah over 4 years
    this strpos() is not working for mw sometimes this answer helps +1
  • Hafenkranich
    Hafenkranich over 4 years
    It returns false for "are you sure?" since the position for strpos is 0
  • Jahirul Islam Mamun
    Jahirul Islam Mamun almost 4 years
    If i use "care" its return true as well :(
  • MaXi32
    MaXi32 over 3 years
    was looking for this alternative. really helpful
  • dearsina
    dearsina about 3 years
    Please be wary that str_contains() will return true on partial matches also (ox will match fox). 3v4l.org/ARJqh
  • mickmackusa
    mickmackusa about 3 years
    Simpler? There is never a need to write ? true : false; when the expression returns a boolean value.
  • squarecandy
    squarecandy about 3 years
    If you're still on PHP 7 but want to start using the PHP 8 function, you can use this polyfill: if (!function_exists('str_contains')) { function str_contains($haystack, $needle) { return $needle !== '' && mb_strpos($haystack, $needle) !== false; } } -- taken from this php.net comment; based on Laravel.
  • baptx
    baptx almost 3 years
    So the advantage of str_contains is just that it is easier to read?
  • ToolmakerSteve
    ToolmakerSteve almost 3 years
    @baptx - equally importantly, its easier to not screw up. As the answer describes, slight variations on the test will give the wrong result on "falsey" strings. Because of this, the sane thing to do, is to define a function.
  • stmax
    stmax almost 2 years
    Funnily strpos('foo', 'bar') > -1 gives false, strpos('foo', 'bar') >= 0 gives true. Even funnier php getting a sane str_contains only in version 8.. about time.