Regex difference: (\w+)? and (\w*)


(\w+)? and (\w*) both match the same strings: zero or more word characters.
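As a quick sanity check of that claim, here is a minimal sketch that compares only the overall match text (index 0) of the two patterns. The helper name sameOverallMatch, the ^ anchor, and the sample inputs are my own choices for illustration, not part of the question.

```javascript
// Hypothetical helper: compares the overall match (index 0) of both patterns.
// The ^ anchor is an assumption added so both patterns start at the same place.
function sameOverallMatch(input) {
  const optionalGroup = /^(\w+)?/.exec(input); // one or more word chars, made optional
  const starGroup = /^(\w*)/.exec(input);      // zero or more word chars
  // Neither pattern can fail here, since both are allowed to match "".
  return optionalGroup[0] === starGroup[0];
}

// Sample inputs chosen for illustration only.
const inputs = ["", "abc", "abc def", "!!", "x1_y"];

for (const input of inputs) {
  console.log(JSON.stringify(input), sameOverallMatch(input));
}
```

This only looks at what text the whole pattern consumed, not at what the capturing group reports.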

However, there is a slight difference:

With (\w+)?, if this part of the regex matches "", the capturing group does not participate in the match and is reported as absent. With (\w*), the group does participate and captures an empty string. In some languages, the former manifests as null (or undefined), while the latter is always "".

In JavaScript, for example:

/(\w*)/.exec("")  // ["", ""]
/(\w+)?/.exec("") // ["", undefined]

In PHP (preg_match), with (\w+)? the corresponding key is simply absent from the matches array: http://3v4l.org/DB6p3#v430
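If the distinction matters in your code, one option is to normalize the group value before using it. The following is only a sketch: the helper firstGroupOrEmpty and the two surrounding patterns (with the -(\d+) tail) are made up for illustration, and it assumes a runtime that supports the ?? operator.

```javascript
// Hypothetical helper: returns group 1 as a string, treating a group that
// did not participate in the match (undefined) like an empty capture.
function firstGroupOrEmpty(regex, input) {
  const match = regex.exec(input);
  if (match === null) return null; // the whole regex failed to match
  return match[1] ?? "";           // undefined becomes ""
}

// Illustrative patterns: an optional word prefix followed by "-" and digits.
const optional = /^(\w+)?-(\d+)$/;
const star = /^(\w*)-(\d+)$/;

console.log(optional.exec("-42")[1]); // undefined (group did not participate)
console.log(star.exec("-42")[1]);     // "" (group matched the empty string)

// After normalizing, both patterns report the same value for group 1.
console.log(firstGroupOrEmpty(optional, "-42") === firstGroupOrEmpty(star, "-42")); // true
```

Note that a plain truthiness check hides the difference, since both undefined and "" are falsy; a strict comparison against undefined is what tells them apart.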



Author by

爱国者


Updated on July 09, 2022

Comments

  • 爱国者
    爱国者 almost 2 years

    Is there any difference between (\w+)? and (\w*) in regex?

    It seems the same, doesn't it?

    • John Dvorak
      John Dvorak over 11 years
      It does seem the same, except if you care about "" vs. null
    • Rohit Jain
      Rohit Jain over 11 years
      (\w+)? seems odd. Where did you see that? Any link to external resource please?
    • 爱国者
      爱国者 over 11 years
      I saw (\w+)? in my company project
  • Cozzamara
    Cozzamara over 11 years
    In which language does capturing "" result in null rather than an empty string?
  • John Dvorak
    John Dvorak over 11 years
    @Cozzamara In the first case, an empty match is not captured.
  • Cozzamara
    Cozzamara over 11 years
    By which engine? Both Perl and sed capture an empty string with both patterns.
  • Bergi
    Bergi over 11 years
    Thanks, I never realized that non-matched groups result in undefined instead of empty strings (since I always only check for truthiness)