Regular expression as delimiter in explode()


Simply put, you need to use preg_split instead of explode.

explode splits only on a fixed, literal delimiter string, whereas preg_split splits wherever a regular expression matches.
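To make the difference concrete, here is a minimal sketch comparing the two functions on the same input. The sample string and variable names are assumptions chosen for illustration; they do not come from the question.

```php
<?php
// Hypothetical input with mixed delimiters, chosen for illustration.
$text = 'cat, dog;bird';

// explode() splits only on the exact literal delimiter ", ".
$byLiteral = explode(', ', $text);
// $byLiteral is ['cat', 'dog;bird']

// preg_split() splits wherever the pattern matches: here, any run of
// commas, semicolons, or whitespace.
$byPattern = preg_split('/[,;\s]+/', $text);
// $byPattern is ['cat', 'dog', 'bird']
```

Note that preg_split expects a delimited pattern (the surrounding slashes), unlike the plain string passed to explode.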

In your case, it would probably be best to split on runs of non-word characters (\W+), then filter the resulting words by length yourself.
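A minimal sketch of that approach might look like the following. The function name words_by_length and the sample sentence are hypothetical; the 5-7 character range is taken from the question. It assumes PHP 7.4 or later for the arrow function, and it uses strlen, which counts bytes, so multibyte text would need mb_strlen instead.

```php
<?php
// Hypothetical helper; the default 5-7 range comes from the question.
function words_by_length(string $haystack, int $min = 5, int $max = 7): array
{
    // Split on runs of non-word characters. PREG_SPLIT_NO_EMPTY drops the
    // empty strings that leading or trailing delimiters would produce.
    $words = preg_split('/\W+/', $haystack, -1, PREG_SPLIT_NO_EMPTY);

    // Keep only words whose length is within the range, then reindex.
    return array_values(array_filter(
        $words,
        fn (string $word): bool => strlen($word) >= $min && strlen($word) <= $max
    ));
}

$result = words_by_length('The animals, a zebra and an elephant!');
// $result is ['animals', 'zebra']
```

Filtering after the split keeps the pattern simple; the length rule lives in ordinary PHP rather than in the regular expression.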

Author: Jessie Stalk

Updated on June 07, 2022

Comments

  • Jessie Stalk, almost 2 years ago

    I have a string that I'm turning into an array, and I want to separate the words using a regex. I'm currently matching whole words with the function below.

    function substr_count_array($haystack, $needle)
    {
         $initial = 0;
         $bits = explode(' ', $haystack);
    
         foreach ($needle as $substring) 
         {
            if (!in_array($substring, $bits))
            {
                continue;
            }
    
            $initial += substr_count($haystack, $substring);
         }
    
         return $initial;
    }
    

    The problem is that it matches the string animal, for example, but not animals. If I do a partial match instead, like this:

    function substr_count_array2($haystack, $needle)
    {
         $initial = 0;
    
         foreach ($needle as $substring) 
         {
              $initial += substr_count($haystack, $substring);
         }
    
         return $initial;
    }
    

    It also matches, let's say, a, since it's contained within the word animals, and returns 2. How do I explode() using a regular expression as a delimiter so that I can, for example, match every string that is 5-7 characters long?

    Put more simply:

    $animals = array('cat','dog','bird');
    $toString = implode(' ', $animals);
    $data = array('a');
    
    echo substr_count_array($toString, $data);
    

    If I search for a single character such as a, it passes the check and counts as a legitimate value because a is contained within the first element. But if I match whole words exploded on a space, words are missed whenever they are not separated by a space. So I need to split with a regular expression that matches anything AFTER the string that is to be matched.
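
One possible way to get the whole-word counting described in the question is sketched below, assuming exact, case-sensitive word matches are what is wanted. The function name word_count_array is hypothetical, and the sketch does not try to treat animal and animals as the same word.

```php
<?php
// Hypothetical rewrite of substr_count_array(): counts whole-word hits only.
function word_count_array(string $haystack, array $needles): int
{
    // Tokenize on non-word characters instead of a single space.
    $words = preg_split('/\W+/', $haystack, -1, PREG_SPLIT_NO_EMPTY);

    // Map each distinct word to the number of times it occurs.
    $counts = array_count_values($words);

    $total = 0;
    foreach ($needles as $needle) {
        // A needle contributes only if it appears as a complete word.
        $total += $counts[$needle] ?? 0;
    }

    return $total;
}

$toString = implode(' ', ['cat', 'dog', 'bird']);
$partial = word_count_array($toString, ['a']);
// $partial is 0, because "a" never occurs as a standalone word

$whole = word_count_array('cat,dog;cat bird', ['cat', 'bird']);
// $whole is 3: "cat" twice plus "bird" once
```

Because the haystack is tokenized first, a needle such as a no longer matches inside cat, and words separated by punctuation rather than spaces are still found.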