Pattern Matching with Like Clause

11,303

Solution 1

Try this:

SELECT c1
FROM   t1
WHERE  substr(c1,1,1) IN ('A','B','C','D',
                          'E','F','G','H',
                          'I','J','K','L',
                          'M','N','O','P',
                          'Q','R','S','T',
                          'U','V','W','X',
                          'Y','Z')
       AND substr(c1,2,1) IN ('A','B','C','D',
                              'E','F','G','H',
                              'I','J','K','L',
                              'M','N','O','P',
                              'Q','R','S','T',
                              'U','V','W','X',
                              'Y','Z')
       AND substr(c1,3,1) IN ('1','2','3','4')
/

If you want also to match lower letters, then apply the upper() function to the 2 first substr(): eg. where upper(substr(c1,1,1)) in ...

Performance

I tested the Performance of the query with regexp_like and without. As you can see the query without the regexp_like function was 100% faster. (Note. Both querys did a soft-parse)

SQL> select count(*) from t1;                                                                                                                     

  COUNT(*)
----------
    458752

Elapsed: 00:00:00.02
SQL> set timing off;                                                                                                                                  
SQL> select count(*) from t1;                                                                                                                     

  COUNT(*)
----------
    458752

SQL> set timing on;                                                                                                                                   
SQL> select count(*) from t1 where regexp_like(c1, '[A-Z][A-Z][1-4].*');                                                                       

  COUNT(*)
----------
     65536

Elapsed: 00:00:02.66
SELECT count(*)
FROM   t1
WHERE  substr(c1,1,1) IN ('A','B','C','D',
                          'E','F','G','H',
                          'I','J','K','L',
                          'M','N','O','P',
                          'Q','R','S','T',
                          'U','V','W','X',
                          'Y','Z')
       AND substr(c1,2,1) IN ('A','B','C','D',
                              'E','F','G','H',
                              'I','J','K','L',
                              'M','N','O','P',
                              'Q','R','S','T',
                              'U','V','W','X',
                              'Y','Z')
       AND substr(c1,3,1) IN ('1','2','3','4')
 18  /                                                                                                                                                

  COUNT(*)
----------
     65536

Elapsed: 00:00:01.15
SQL> 

Second Method

Get the ascii values of A,Z,1 and 4

SQL> select ascii('A') from dual;                                                                                                                     

ASCII('A')
----------
        65


SQL> select ascii('Z') from dual;                                                                                                                     

ASCII('Z')
----------
        90


SQL> select ascii('1') from dual;                                                                                                                     

ASCII('1')
----------
        49

SQL> select ascii('4') from dual;                                                                                                                     

ASCII('4')
----------
        52

Now you can write your statement a lot shorter

SELECT count(* )
FROM   t1
WHERE  ascii(substr(c1,1,1)) BETWEEN 65 AND 90
       AND ascii(substr(c1,2,1)) BETWEEN 65 AND 90
       AND ascii(substr(c1,3,1)) BETWEEN 49 AND 52
/

Solution 2

Clunky, but perhaps:

select *
  from <your_table>
 where TRANSLATE(SUBSTR(<blah>,1,3),'ABCDEFGHIJKLMNOPQRSTUVWXYZ1234',
                                    'AAAAAAAAAAAAAAAAAAAAAAAAAA1111') = 'AA1';

might suit your needs....

EDIT: Incorporated xlnt suggestion by @Hobo to translate the substring, rather than taking the substring of the translated string...

Solution 3

Total wild card but will suggest it anyway.

Does your 9i Database installation have the PL/SQL Web Toolkit installed? The reason for asking is that one of our customers pointed out that there is limited regular expression support using one of the supplied packages that comes with it.

The package is called owa_pattern and the only 9i link I could find is this one

I have never used it and am still trying to get to grips with Regular Expressions so can't tell you if it would fit your purpose but thought that it may be of use.

Solution 4

If you really want to use reg exps you can develop a java stored proc and a accompanying pl/sql wrapper. (I assume that the Java release supported in Oracle 9 supports reg exps, I am not 100% sure). You can call that java stored proc via the pl/sql wrapper in your select statement.

But more easy and faster:

SELECT c1
FROM   t1
WHERE  substr(c1,1,1) between  'A' and 'Z'
AND    substr(c1,2,1) between  'A' and 'Z'
AND    substr(c1,3,1) IN ('1','2','3','4')

A variant of zürigschnäzlets 's solution without use of the ascci function.

Solution 5

I recommend using INSTR:

INSTR(t.column, '1', 3, 1) > 0 OR 
INSTR(t.column, '2', 3, 1) > 0 OR
INSTR(t.column, '3', 3, 1) > 0 OR
INSTR(t.column, '4', 3, 1) > 0

The problem with using a wildcard in a LIKE is there's no control over where in column value the 1/2/3/4/etc is going to turn up - it could be at the end.

DCookie is right - this answer doesn't have a way of handling if there's numeric data in that place. But it's still better than using LIKE with wildcards.

Share:
11,303
Admin
Author by

Admin

Updated on July 26, 2022

Comments

  • Admin
    Admin almost 2 years

    I am attempting to use a LIKE clause in a SQL statement to match a certain pattern within Oracle.

    I would like to do something as such:

    LIKE '[A-Z][A-Z][1-4]%'
    

    ..but I can't use a regex because this is on Oracle9i (regex support came in 10g).

    I am attempting to match something that has two characters before it, then a number between 1 and 4 and that whatever beyond that. I have attempted this, but it doesn't seem to work. The only way I have been able to get it to work is by doing:

    WHERE ...
    LIKE '%1__' OR 
    LIKE '%2__' OR 
    LIKE '%3__' OR 
    LIKE '%4__'
    

    I am not sure if the way I would like to do it is possible or the correct way as I have never attempted patterns with the LIKE clause.

    Any help you could give would be greatly appreciated.