ruby regex .scan


Solution 1

/.._...._[0-9][0-9][0-9][0-9][0-9][0-9](?:[A-Z][A-Z])?/

You can also use {} to make the regex shorter:

/.{2}_.{4}_[0-9]{6}(?:[A-Z]{2})?/

Explanation: ? makes the preceding pattern optional. () groups expressions together, so Ruby knows the ? applies to both letters rather than only the last one. The ?: after the opening ( makes the group non-capturing. This matters because when a pattern contains capturing groups, scan yields the captured groups instead of the whole match.
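As a rough sketch of how this fits the workflow described in the question (scan, then join with commas), the snippet below runs the shortened pattern over a made-up sample string. The sample text and the variable names are assumptions for illustration; only the code format comes from the question.

```ruby
# Sample input is invented for illustration.
text = "Shipped AB_ABCD_123456 and AB_ABCD_123456UK, then XY_WXYZ_654321DE and AB_ABCD_123456 again."

# Two characters, underscore, four characters, underscore, six digits,
# then an optional two-letter suffix in a non-capturing group.
pattern = /.{2}_.{4}_[0-9]{6}(?:[A-Z]{2})?/

# With no capturing groups in the pattern, scan returns the full matches.
codes = text.scan(pattern)

# Drop duplicates, sort, and build the comma-separated string.
result = codes.uniq.sort.join(', ')
puts result
# prints: AB_ABCD_123456, AB_ABCD_123456UK, XY_WXYZ_654321DE
```

Note that scan returns a new array and does not modify text, so the result has to be stored in its own variable before calling uniq and sort.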

Solution 2

 /.._...._\d{6}([A-Z]{2})?/
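One caveat when using this pattern with scan: the group here is a capturing group, so scan yields only what the group captured. A small comparison, using an invented sample string, shows the difference from the non-capturing form:

```ruby
# Sample input is invented for illustration.
text = "AB_ABCD_123456 AB_ABCD_123456UK AB_ABCD_123456DE"

capturing     = /.._...._\d{6}([A-Z]{2})?/
non_capturing = /.._...._\d{6}(?:[A-Z]{2})?/

# One capturing group: scan returns an array of one-element arrays holding
# only the group's text, or nil where the optional group did not match.
suffixes = text.scan(capturing)
p suffixes  # prints [[nil], ["UK"], ["DE"]]

# Non-capturing group: scan returns the full matched strings.
full = text.scan(non_capturing)
p full      # prints ["AB_ABCD_123456", "AB_ABCD_123456UK", "AB_ABCD_123456DE"]
```

If the goal is the list of complete codes, the (?:...) form is probably the one to use; \d is simply shorthand for a digit.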

Solution 3

Why not just use split?

"AB_ABCD_123456".split(/_/).join(',')

Handles the cases you listed without modification.
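For reference, here is what split does with a single code, with and without the country suffix. This is only a sketch on strings that already contain exactly one code; it breaks a code into its fields and does not locate codes inside a larger text.

```ruby
# Each input is a single code, as in the examples from the question.
plain   = "AB_ABCD_123456".split(/_/)
with_cc = "AB_ABCD_123456UK".split(/_/)

p plain    # prints ["AB", "ABCD", "123456"]
p with_cc  # prints ["AB", "ABCD", "123456UK"]

joined = with_cc.join(',')
puts joined  # prints AB,ABCD,123456UK
```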

Solution 4

Try this:

text.scan(/\w{2}_\w{4}_\d{6}\w{0,2}/)
# matches AB_ABCD_123456UK, ab_abcd_123456uk, and so on

or

text.scan(/[A-Z]{2}_[A-Z]{4}_\d{6}[A-Z]{0,2}/)
# tighter: matches only uppercase codes such as AB_ABCD_123456UK,
# and not ab_aBCd_123456UK, ab_abcd_123456uk, or similar
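To make the difference between the two patterns concrete, here is a minimal comparison on an invented sample string (the variable names are also just for illustration):

```ruby
# Sample input is invented for illustration.
text = "AB_ABCD_123456UK ab_abcd_123456uk AB_ABCD_123456"

# \w accepts letters in either case (as well as digits and underscores).
loose = text.scan(/\w{2}_\w{4}_\d{6}\w{0,2}/)
p loose   # prints ["AB_ABCD_123456UK", "ab_abcd_123456uk", "AB_ABCD_123456"]

# [A-Z] accepts uppercase ASCII letters only, so the lowercase code is skipped.
strict = text.scan(/[A-Z]{2}_[A-Z]{4}_\d{6}[A-Z]{0,2}/)
p strict  # prints ["AB_ABCD_123456UK", "AB_ABCD_123456"]
```

Keep in mind that {0,2} also accepts a single trailing character, and that \w{0,2} would accept trailing digits or underscores. If the suffix must be either absent or exactly two letters, (?:[A-Z]{2})? is the stricter choice.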

If you want to learn more about regex, see these links:

Ruby gsub / regex modifiers?

http://ruby-doc.org/docs/ruby-doc-bundle/Manual/man-1.4/syntax.html#regexp

Author by

michaelmichael

Updated on June 04, 2022

Comments

  • michaelmichael
    michaelmichael almost 2 years

    I'm using Ruby's scan() method to find text in a particular format. I then output it into a string separated by commas. The text I'm trying to find would look like this:

    AB_ABCD_123456

    Here's what I've come up with so far to find the above. It works fine:

    codes = text.scan(/.._...._[0-9][0-9][0-9][0-9][0-9][0-9]/)
    puts codes.uniq.sort.join(', ')
    

    Now I need a regex that will find the above with or without a two-letter country designation at the end. For example, I would like to be able to find all three of the below:

    AB_ABCD_123456
    AB_ABCD_123456UK
    AB_ABCD_123456DE

    I know I could use two or three different scans to achieve my result, but I'm wondering if there's a way to get all three with one regex.

  • sepp2k
    sepp2k over 14 years
    If you don't make the group non-capturing, scan will only yield the country codes (or nil for the strings that didn't include one), not the entire string that was matched.
  • Robert K
    Robert K over 14 years
    AFAIK, the OP is trying to find a list of these codes ... not work with just one.
  • michaelmichael
    michaelmichael over 14 years
    i like that second regex example. thanks for the links. i've gone through them, though not as thoroughly as i should. real life problems help my understanding a lot.
  • ezpz
    ezpz over 14 years
    Yes; I saw the example and jumped past the details - a terrible habit. Sorry for the confusion.