Find a number with Oracle regexp that has an exact length of 3

31,234

Solution 1

I'm not quite clear on whether you want exact matches or sub-string matches so I've included various options:

SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE MY_TABLE (MY_COLUMN) AS
          SELECT '456' FROM DUAL
UNION ALL SELECT '12345678' FROM DUAL
UNION ALL SELECT 'abc 123 def 456' FROM DUAL;

Query 1:

If you only want rows where MY_COLUMN contains exactly a 3-digit number then you can just use your regular expression wrapped in the start-string (^) and end-string ($) anchors:

    SELECT MY_COLUMN
    FROM   MY_TABLE
    WHERE  REGEXP_LIKE( MY_COLUMN, '^[[:digit:]]{3}$' )

Results:

| MY_COLUMN |
|-----------|
|       456 |

Query 2:

Or, if you are using Oracle 11g then you can use the less verbose PERL syntax:

    SELECT MY_COLUMN
    FROM   MY_TABLE
    WHERE  REGEXP_LIKE( MY_COLUMN, '^\d{3}$' )

Results:

| MY_COLUMN |
|-----------|
|       456 |

Query 3:

If you want to extract the first 3-digit number from the column (where it can have surrounding text or more digits), then:

    SELECT MY_COLUMN,
           REGEXP_INSTR( MY_COLUMN, '\d{3}' ), 
           REGEXP_SUBSTR( MY_COLUMN, '\d{3}' )
    FROM   MY_TABLE
    WHERE  REGEXP_LIKE( MY_COLUMN, '\d{3}' )

Results:

|       MY_COLUMN | REGEXP_INSTR(MY_COLUMN,'\D{3}') | REGEXP_SUBSTR(MY_COLUMN,'\D{3}') |
|-----------------|---------------------------------|----------------------------------|
|             456 |                               1 |                              456 |
|        12345678 |                               1 |                              123 |
| abc 123 def 456 |                               5 |                              123 |

Query 4:

If you want to extract the first exactly 3-digit number from the column then:

    SELECT MY_COLUMN,
           REGEXP_SUBSTR( REGEXP_SUBSTR( MY_COLUMN, '(^|\D)\d{3}(\D|$)' ), '\d{3}' ) AS match
    FROM   MY_TABLE
    WHERE  REGEXP_LIKE( MY_COLUMN, '(^|\D)\d{3}(\D|$)' )

Results:

|       MY_COLUMN | MATCH |
|-----------------|-------|
|             456 |   456 |
| abc 123 def 456 |   123 |

Query 5:

If you want to extract all the non-overlapping 3-digit numbers from the column (where it can have surrounding text), then:

    WITH re_counts AS (
      SELECT MY_COLUMN,
             REGEXP_COUNT( MY_COLUMN, '\d{3}' ) AS re_count
      FROM   MY_TABLE
    )
    ,indexes AS (
      SELECT LEVEL AS "index"
      FROM   DUAL
      CONNECT BY LEVEL <= (SELECT MAX( re_count)  FROM re_counts)
    )
    SELECT MY_COLUMN,
           "index", 
           REGEXP_SUBSTR( MY_COLUMN, '\d{3}', 1, "index" )
    FROM   re_counts
           INNER JOIN
           indexes
           ON ("index" <= re_count)
    ORDER BY MY_COLUMN, "index"

Results:

|       MY_COLUMN | INDEX | REGEXP_SUBSTR(MY_COLUMN,'\D{3}',1,"INDEX") |
|-----------------|-------|--------------------------------------------|
|        12345678 |     1 |                                        123 |
|        12345678 |     2 |                                        456 |
|             456 |     1 |                                        456 |
| abc 123 def 456 |     1 |                                        123 |
| abc 123 def 456 |     2 |                                        456 |

Query 6:

If you want to extract all the sub-matches which are exactly 3-digit numbers then:

    WITH re_counts AS (
      SELECT MY_COLUMN,
             REGEXP_COUNT( MY_COLUMN, '(^|\D)\d{3}(\D|$)' ) AS re_count
      FROM   MY_TABLE
    )
    ,indexes AS (
      SELECT LEVEL AS "index"
      FROM   DUAL
      CONNECT BY LEVEL <= (SELECT MAX( re_count)  FROM re_counts)
    )
    SELECT MY_COLUMN,
           "index", 
           REGEXP_SUBSTR( REGEXP_SUBSTR( MY_COLUMN, '(^|\D)\d{3}(\D|$)', 1, "index" ), '\d{3}' ) AS match
    FROM   re_counts
           INNER JOIN
           indexes
           ON ("index" <= re_count)
    ORDER BY MY_COLUMN, "index"

Results:

|       MY_COLUMN | INDEX | MATCH |
|-----------------|-------|-------|
|             456 |     1 |   456 |
| abc 123 def 456 |     1 |   123 |
| abc 123 def 456 |     2 |   456 |

Query 7:

If you want to extract all 3-digit numbers from a column regardless of whether those matches partially overlap or not then:

    WITH positions AS (
      SELECT LEVEL AS pos
      FROM   DUAL
      CONNECT BY LEVEL <= (SELECT MAX( LENGTH( MY_COLUMN ) - 2 )  FROM MY_TABLE )
    )
    SELECT MY_COLUMN,
           pos, 
           SUBSTR( MY_COLUMN, pos, 3 )
    FROM   MY_TABLE
           INNER JOIN
           positions
           ON (pos <= LENGTH( MY_COLUMN ) - 2 )
    WHERE  REGEXP_LIKE( SUBSTR( MY_COLUMN, pos, 3 ), '^\d{3}$' )
    ORDER BY MY_COLUMN, pos

Results:

|       MY_COLUMN | POS | SUBSTR(MY_COLUMN,POS,3) |
|-----------------|-----|-------------------------|
|        12345678 |   1 |                     123 |
|        12345678 |   2 |                     234 |
|        12345678 |   3 |                     345 |
|        12345678 |   4 |                     456 |
|        12345678 |   5 |                     567 |
|        12345678 |   6 |                     678 |
|             456 |   1 |                     456 |
| abc 123 def 456 |   5 |                     123 |
| abc 123 def 456 |  13 |                     456 |

Solution 2

You're sub-stringing the number... if you're searching for a number that is exactly 3 digits then use LENGTH():

 select * from my_table where length(my_column) = 3

This presupposes that you've stored your numbers in a NUMBER column. If there might be characters in there as well use REGEXP_LIKE() and ensure you tie the regular expression to the beginning (with the ^ operator) and end ($) of the string

select * from my_table where regexp_like( my_column, '^[[:digit:]]{3}$')

You're not checking to see whether your number is at both the beginning and end of the string so you're getting everything with 3 numbers.

Share:
31,234
robert lennon
Author by

robert lennon

gwt gxt java spring maven apache-tomcat STS etc.

Updated on November 25, 2020

Comments

  • robert lennon
    robert lennon over 3 years

    I want to get only 456.

    I tried this regular expression, but I got all numbers which contain 3 digit characters, 456, 46354376, etc.

    select regexp_substr(MY_COLUMN,'([[:digit:]]{3})') from MY_TABLE
    
  • robert lennon
    robert lennon over 10 years
    my db table contains non numeric characters for ex: </br> row1->> im column456test </b> row2->> column4567839asdf </b>row3->> ddffgg987qwert from these rows i want to get from first row 456 and from third row 987 thank to reply
  • Ben
    Ben over 10 years
    So REGEXP_LIKE() then... it's just a little slower that's all.
  • robert lennon
    robert lennon over 10 years
    456 not exact, but also i want to get number that's length equal to 3. in my question i got at least has 3 length of number. but i want to find equal to 3 length that numbre has.
  • Casimir et Hippolyte
    Casimir et Hippolyte over 10 years
    @robert: Yes, I see, refresh your page.
  • robert lennon
    robert lennon over 10 years
    it worked correcly but my numbers left ant right side has double quote your pattern gets "456" ,is there any way to get only 456. thanks @casimire et Hippolyte
  • Casimir et Hippolyte
    Casimir et Hippolyte over 10 years
    @robertlennon: replace all non-digit characters by an empty string.
  • androboy
    androboy about 10 years
    Good one. I have another doubt. How do I get all the digits out of a string. I am trying ----select REGEXP_SUBSTR('somestring-051-a92', '[0-9]+..[0-9]+', 1) ---- This gives me 051-a92.... How do I get 05192 out?
  • MT0
    MT0 about 10 years
    @androboy Replace all the non-digit characters - REGEXP_REPLACE('somestring-051-a92','\D') or, equivalently using POSIX style, REGEXP_REPLACE('somestring-051-a92','[^[:digit:]]')
  • androboy
    androboy about 10 years
    Yeah, but the trouble is redshift is not having regexp replace currently. so I was thinking of doing regexp substr 20 times and append as my string is fixed length 20 characters