Extract SubString based on Regular Expression Match

You have anchors in place that prevent your pattern from matching:

^\(S[0-9]{4}-[0-9]{6}\)$
^                      ^

^ is matching the start of the string

$ is matching the end of the string

Since there is other text before and after the part you want to match, your pattern will not match. Remove those anchors (and the escaped parentheses \( and \), which not all of your sample strings contain) and it should be fine.
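A minimal sketch of the difference, run against the sample strings from the question. The module and variable names are illustrative only, and the unanchored pattern here also drops the escaped parentheses so that it can match inputs where the ID is not parenthesised.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AnchorDemo
    Sub Main()
        ' Sample inputs taken from the question.
        Dim inputs() As String = New String() { _
            "Blogs, Joe (S0003-000292).html", _
            "bla bla bla S0003-000292 & so on", _
            "RE: S0003-000292"}

        ' Anchored: the whole input must be exactly "(S0000-000000)".
        Dim anchored As New Regex("^\(S[0-9]{4}-[0-9]{6}\)$")
        ' Unanchored: the ID may appear anywhere in the input.
        Dim unanchored As New Regex("S[0-9]{4}-[0-9]{6}")

        For Each sample As String In inputs
            Console.WriteLine("{0} | anchored: {1} | unanchored: {2}", _
                sample, anchored.IsMatch(sample), unanchored.Match(sample).Value)
        Next
    End Sub
End Module
```

For all three inputs the anchored pattern reports False, while the unanchored pattern returns S0003-000292.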

Or use word boundaries instead

\bS[0-9]{4}-[0-9]{6}\b

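\b matches only at a word boundary, that is, between a "word" character (a letter, digit, or underscore) and a "non-word" character or the start/end of the string. Here it ensures that the ID is not part of a longer run of letters or digits.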
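As a sketch of how the word-boundary pattern could be used to extract the ID, or to raise an error when none is present (the question asks to "flag exception if not found"): the helper name ExtractId and the choice of ArgumentException are assumptions made for illustration, not part of the original answer.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module BoundaryDemo
    ' Hypothetical helper: returns the first ID found, or throws if there is none.
    Function ExtractId(ByVal text As String) As String
        Dim m As Match = Regex.Match(text, "\bS[0-9]{4}-[0-9]{6}\b")
        If Not m.Success Then
            Throw New ArgumentException("No ID of the form S0000-000000 found in: " & text)
        End If
        Return m.Value
    End Function

    Sub Main()
        ' The parentheses are non-word characters, so \b matches on both sides.
        Console.WriteLine(ExtractId("Blogs, Joe (S0003-000292).html"))

        ' A 7-digit tail leaves no word boundary after the 6th digit, so \b rejects it.
        Try
            Console.WriteLine(ExtractId("RE: S0003-0002921"))
        Catch ex As ArgumentException
            Console.WriteLine("Not found: " & ex.Message)
        End Try
    End Sub
End Module
```

Without the \b on either side, the second input would still yield S0003-000292 as a partial match, which may or may not be what you want.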

Author: HeavenCore

Updated on August 02, 2022

Comments

  • HeavenCore
    HeavenCore almost 2 years

    Quick RegExp problem (I hope).

    I need to identify a substring within any string based on a regular expression.

    For Example, take the following strings:

    "Blogs, Joe (S0003-000292).html"
    "bla bla bla S0003-000292 & so on"
    "RE: S0003-000292"
    

    I need to extract the 'S0003-000292' portion (or flag an exception if it is not found).

    As for what I have tried, I've written a rough pattern to identify S0000-000000:

    ^\(S[0-9]{4}-[0-9]{6}\)$
    

    And I have tried testing for it as follows:

    Dim regex As New Regex("Blogs, Joe (S0003-000292) Lorem Ipsum!")
    Dim match As Match = regex.Match("^S[0-9]{4}-[0-9]{6}$")
    
    If match.Success Then
        Console.WriteLine("Found: " & match.Value)
    Else
        Console.WriteLine("Not Found")
    End If
    

    However, this always results in Not Found.

    So, two questions really: what is wrong with my pattern, and how can I use a revised pattern to extract the substring?

    (Working with .NET 2)

    EDIT: stema pointed me in the right direction (i.e. to drop the ^ and $). However, that alone did not solve the problem: my main mistake was that I had passed the input string to the Regex constructor instead of the pattern. I swapped these over and it worked fine (I blame lack of caffeine):

    Dim regex As New Regex("S[0-9]{4}-[0-9]{6}")
    Dim match As Match = regex.Match("Joe, Blogs (S0003-000292).html")
    
    If match.Success Then
        Console.WriteLine("Found: " & match.Value)
    Else
        Console.WriteLine("Not Found")
    End If
    
  • Red
    Red about 12 years
    Dim reg As New Regex("(.)*S[0-9]{4}-[0-9]{6}(.)*")
    Dim str As String = "Blogs, Joe (S0003-000292) Lorem Ipsum!"
    MessageBox.Show(reg.IsMatch(str).ToString())

    I am not sure about the syntax, but this may be the right conversion of my C# code.
  • stema
    stema about 12 years
    You are able to edit your own answer, no need to post additions as comment or as second answer. So, please edit your first answer and delete this one.
  • HeavenCore
    HeavenCore about 12 years
    +1 Cheers Stema - that set me in the right direction - See Edit on my question.
  • stema
    stema about 12 years
    @HeavenCore shame on me, I should have noticed that "Blogs, Joe (S0003-000292) Lorem Ipsum!" is not a regex!
  • Red
    Red about 12 years
    As I am new, I am not able to find the delete option :(
  • stema
    stema about 12 years
    You can delete your own post. Between your post and the comments on it there are some buttons: link, edit, and the third one is delete. Then press OK on the "Vote to delete ..." popup; because the post is yours, only your vote is needed.
  • stema
    stema about 12 years
    OK, see this answer on meta: you have to register your account to be able to delete your own posts. Currently you are an unregistered user (meaning Stack Overflow knows you only through the cookies on your system), so please register your account.