Regex for matching CSS hex colors

Solution 1

Since a hex color code may also consist of 3 characters, you can define a mandatory group and an optional group of letters and digits, so the long and elaborate notation would be:

/#([a-f]|[A-F]|[0-9]){3}(([a-f]|[A-F]|[0-9]){3})?\b/

Or if you want a nice and short version, you can say that you want either 1 or 2 groups of 3 alphanumeric characters, and that they should be matched case insensitively (/i).

/#([a-f0-9]{3}){1,2}\b/i

Instead of [a-f0-9] you can also write [[:xdigit:]], if the regex engine supports this POSIX character class. In that case you can drop the /i at the end; the whole pattern is only a few characters longer, but arguably more descriptive.

/#([[:xdigit:]]{3}){1,2}\b/
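
As a rough sketch of how the short pattern behaves on the stylesheet from the question, the snippet below wraps it in a small helper. The function name extractHexColors is made up for illustration and is not part of any answer here.

```php
<?php
// Hypothetical helper: returns every hex color found in a CSS string.
function extractHexColors(string $css): array
{
    // One or two groups of 3 hex digits, case-insensitive,
    // which must end at a word boundary.
    preg_match_all('/#([a-f0-9]{3}){1,2}\b/i', $css, $matches);

    return $matches[0]; // index 0 holds the full matches
}

// Sample input adapted from the question.
$css = <<<CSS
/* Do not match me: #abcdefgh; I am longer than needed. */
.foo
{
    color: #cccaaa; background-color:#ababab;
}
#bar
{
    background-color:#123456
}
CSS;

print_r(extractHexColors($css));
```

With this input the call should yield #cccaaa, #ababab and #123456; the over-long #abcdefgh and the #bar selector are skipped. Note that an ID selector consisting only of hex characters, such as #bad, would still be matched, because the pattern knows nothing about CSS syntax.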

Solution 2

The accepted answer shows you how to do it with regex, because that was your question. But if you only need to check whether a single string is a 6-digit hex color (without the leading #), you really don't need regex. Normally this is how I would do it:

if(ctype_xdigit($color) && strlen($color)==6){
    // yay, it's a hex color!
}

Timing for 100,000 iterations:

Regex solution *: 0.0802619457245 seconds

Xdigit with strlen: 0.0277080535889 seconds

*: hex: ([a-fA-F0-9]{6})
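
If the value to check includes the leading # or may use the 3-digit form, the same idea can be extended along these lines. This is only a sketch: isHexColor is a hypothetical helper name, and it validates one complete string rather than extracting colors from CSS.

```php
<?php
// Hypothetical helper: true for strings like "#fff" or "#A1B2C3".
function isHexColor(string $color): bool
{
    // Require the leading "#" explicitly.
    if ($color === '' || $color[0] !== '#') {
        return false;
    }

    $digits = substr($color, 1);
    $length = strlen($digits);

    // Cheap length check first, then the hex-digit check.
    return ($length === 3 || $length === 6) && ctype_xdigit($digits);
}

var_dump(isHexColor('#A1B2C3'));
var_dump(isHexColor('A1B2C3'));
```

Putting the length test before ctype_xdigit follows the suggestion in the comments: invalid input of the wrong length is rejected without scanning the string.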

Solution 3

Shorter version of GolezTrol's answer that avoids writing the character set twice:

/#([a-fA-F0-9]{3}){1,2}\b/

Solution 4

Despite this question's age, I'd like to add the following:

^#([[:xdigit:]]{3}){1,2}$, where [[:xdigit:]] is a shorthand for [a-fA-F0-9].

So:
<?php preg_match_all("/^#(?>[[:xdigit:]]{3}){1,2}$/", $css, $matches) ?>

Also noteworthy here is the use of an atomic group (?>...), which, like (?:...), does not capture, so we don't store data in memory that we never wanted to store in the first place.
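
To illustrate the difference in what gets stored, here is a small comparison of the atomic form with a capturing one. The variable names are chosen for this example only.

```php
<?php
$atomic    = '/^#(?>[[:xdigit:]]{3}){1,2}$/';
$capturing = '/^#([[:xdigit:]]{3}){1,2}$/';

preg_match($atomic, '#aabbcc', $a);
preg_match($capturing, '#aabbcc', $c);

var_dump($a); // only the full match is stored
var_dump($c); // full match plus the last repetition of the group

// Five hex digits are neither one nor two groups of three.
var_dump(preg_match($atomic, '#aabbc'));
```

Because ^ and $ anchor the pattern to the whole subject, this form suits validating a single value. For pulling colors out of a full stylesheet, the unanchored \b form from the other answers is probably the better fit.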

Solution 5

I'm not entirely sure if I got this right, but if you only want to match hex colors at the end of a CSS line:

preg_match_all('/#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})[\s;]*\n/',$css,$matches);

should work. All I did was add the optional [\s;]* character class (optional semicolons and whitespace) followed by a mandatory line-break character, and it seemed to work.
And as @GolezTrol pointed out, #FFF; is valid, too.

When tested on this:

$css = '/* Do not match me: #abcdefgh; I am longer than needed. */
.foo
{
    color: #CAB;
    background-color:#ababab;
}';
preg_match_all('/#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})[\s;]*\n/',$css,$matches);
var_dump($matches);

The output was (simplified; each full match also includes the trailing newline):

array (array('#CAB;','#ababab;'))
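
If the trailing semicolon and line break are not wanted in the results, one possible variation (not part of the original answer) is to move that tail into a lookahead, so it is required but not consumed:

```php
<?php
$css = '/* Do not match me: #abcdefgh; I am longer than needed. */
.foo
{
    color: #CAB;
    background-color:#ababab;
}';

// (?=...) asserts that optional whitespace/semicolons and a newline follow,
// without making them part of the match.
preg_match_all('/#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})(?=[\s;]*\n)/', $css, $matches);

$colors = $matches[0];
print_r($colors);
```

On the test input above this should leave just #CAB and #ababab in $colors.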

Author: Hemaulo

Updated on April 30, 2021

Comments

  • Hemaulo
    Hemaulo about 3 years

    I'm trying to write regex that extracts all hex colors from CSS code.

    This is what I have now:

    Code:

    $css = <<<CSS
    
    /* Do not match me: #abcdefgh; I am longer than needed. */
    
    .foo
    {
        color: #cccaaa; background-color:#ababab;
    }
    
    #bar
    {
        background-color:#123456
    }
    CSS;
    
    preg_match_all('/#(?:[0-9a-fA-F]{6})/', $css, $matches);
    

    Output:

    Array
    (
        [0] => Array
            (
                [0] => #abcdef
                [1] => #cccaaa
                [2] => #ababab
                [3] => #123456
            )
    
    )
    

    I don't know how to specify that only colors followed by punctuation, whitespace, or a newline should be matched.

  • Hemaulo
    Hemaulo over 11 years
    Thanks, \b is what was needed. Not sure why there is a "?" though. Anyway, this works as needed: /#(?:[0-9a-fA-F]{6})\b/ I forgot to mention that 3-character codes are not needed.
  • Asad Saeeduddin
    Asad Saeeduddin over 11 years
    A question mark matches zero or one occurrence of the preceding element, making the second captured group optional.
  • Synchro
    Synchro over 10 years
    Those alternations are pointless. Here's a simpler version: /#([a-fA-F0-9]){3}(([a-fA-F0-9]){3})?\b/
  • GolezTrol
    GolezTrol almost 10 years
    @HamZa I reverted your change. The extra explanation was nice, but you also changed the regex itself to a completely different one. If you want to make big changes like that, it's better to supply a separate answer than to completely rebuild the accepted answer (or any other answer, for that matter).
  • HamZa
    HamZa almost 10 years
    @GolezTrol I just stumbled on this Q&A when I was searching for a duplicate. This one stood out; I was a bit scared when I saw this regex. I'm sure your regex skills have improved a lot in 2 years, but there were a lot of redundant things, so I decided to give it a polish. I know that it's a bit rude on my part to make this edit, but with more than 2.5K views I really thought the accepted answer should look a bit more elegant. As a quick googler, I tend to scroll to the accepted answer first. Note that I don't make such big changes that often. It's quite rare. Sorry for the interruption.
  • GolezTrol
    GolezTrol almost 10 years
    @HamZa Thanks, no problem. Actually my regex skills haven't improved that much, since I use them sparsely. One reason for that is the poor readability. I'm happy with the one I wrote, because it is very readable even if it is a bit redundant. Just as with 'normal' code, I think that shorter isn't necessarily better. I would have left your version though, if it would have been an addition rather than a complete replacement of my answer.
  • Tushar
    Tushar over 8 years
    You can make it even shorter by using the i (case-insensitive) flag: /#([a-f0-9]{3}){1,2}\b/i
  • nkkollaw
    nkkollaw about 7 years
    Who's going to call this function 100,000 times?
  • nkkollaw
    nkkollaw about 7 years
    Sorry, but this kind of thing is crazy. That function will be called what, 5 times at the most in any given PHP file? So we're talking about a fraction of a millisecond?
  • Dan Bray
    Dan Bray over 6 years
    This answer deserves way more upvotes. The code is not just over 3 times faster, it's also shorter and easier to understand.
  • Sachin Sarola
    Sachin Sarola almost 6 years
    But I want to check with the #. Would you please help me?
  • Gökhan Mete ERTÜRK
    Gökhan Mete ERTÜRK almost 6 years
    @SachinSarola if that's the case, it's easier to use the regex solution. This is how you can do it without regex: if(ctype_xdigit(substr($color,1)) && strlen(ltrim($color,"#"))==6){ }
  • Sachin Sarola
    Sachin Sarola almost 6 years
    @modu Thanks for the reply, but if I'm not wrong, putting the length condition first gives a faster result.
  • Gökhan Mete ERTÜRK
    Gökhan Mete ERTÜRK almost 6 years
    @SachinSarola That is right if the input is not a valid hex code. If it is, both will give you similar performance. Still, the regex solution would be faster than both if you want to check for the #. You can also ltrim the hash sign first and add a condition (to the one in my answer) checking that the difference in strlen is equal to 1. That will perform better than the one in my answer. But regex is still the way to go, unless you will use the trimmed version later in your code (storing it, etc.).
  • evolross
    evolross over 5 years
    On the updated short example, why does it break if I remove the #? It starts matching four and five digits: "FFFa", "000aF", etc. It matches the three characters in the middle, whereas with the # it must be three or six. BTW, this regex matches "#FFF#FF", which it probably shouldn't.
  • GolezTrol
    GolezTrol over 5 years
    @evolross It matches color codes in a string. It will match the #FFF in #FFF#FF, but not the last part. If you want to match the exact string, you could add string boundary 'anchors' to the regex, making it something like ^#([a-f0-9]{3}){1,2}\b$. See https://regex101.com/r/LZJr63/1 for a breakdown.
  • alexey-novikov
    alexey-novikov over 4 years
    Definitely, this answer deserves many more upvotes. Functions are much easier to understand and support in the future than regex.
  • GolezTrol
    GolezTrol over 3 years
    Despite my answer I'm reluctant to use regex normally. But the question was to extract the color codes from a string. That's quite simple with regex, but would require a parser that takes at least a couple of lines if written out in code. This function checks if a string is an exact color code, which can be very useful, but does not answer this question.
  • divHelper11
    divHelper11 about 3 years
    Shouldn't it be 7 characters?