Oracle: a query that counts occurrences of all non-alphanumeric characters in a string

Solution 1

The best option, as you discovered, is to use a PL/SQL procedure. I don't think there's any way to write a single regular expression that will return multiple counts like you're expecting (at least, not in Oracle).

One way to get around this is to use a hierarchical (CONNECT BY) query to examine each character individually, which returns one row per character in the string; those rows can then be filtered and grouped. The following example will work for a single row:

with d as (
   select '(1(2)3)' as str_value
   from dual)
select char_value, count(*)
from (select substr(str_value,level,1) as char_value
      from d
      connect by level <= length(str_value))
where regexp_instr(upper(char_value), '[^A-Z0-9]', 1) <> 0
group by char_value;
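
To apply the same idea to every row of a table rather than a single literal, one possible approach is to generate the character positions once and join them to the table. This is only a sketch: it assumes the TABLE_NAME table and TITLE column from the question, and the `positions` and `occurrences` names are made up for illustration.

```sql
-- Sketch: count each non-alphanumeric character across all rows.
-- TABLE_NAME and TITLE come from the question; positions/occurrences are illustrative names.
with positions as (
   -- one row per character position, up to the length of the longest title
   select level as pos
   from dual
   connect by level <= (select max(length(title)) from table_name))
select char_value, count(*) as occurrences
from (select substr(t.title, p.pos, 1) as char_value
      from table_name t
      join positions p on p.pos <= length(t.title))
where regexp_like(char_value, '[^A-Za-z0-9]')
group by char_value
order by occurrences desc;
```

Generating the positions separately avoids running CONNECT BY directly against a multi-row table, which would otherwise need extra conditions to keep rows from multiplying. Note that spaces count as non-alphanumeric here.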

Solution 2

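Oracle's lesser-known TRANSLATE function lets you do this without a regular expression. Note that it gives the total number of non-alphanumeric characters per row, not a separate count for each character: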

select a.*,
       length(translate(lower(title),'.0123456789abcdefghijklmnopqrstuvwxyz','.')) 
from table_name a
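
To see what the TRANSLATE trick does, it may help to run it against a literal. The sketch below uses the "a (string)" example from the question; the NVL wrapper and the `non_alnum_count` alias are additions, not part of the original answer. NVL is there because Oracle treats an empty string as NULL, so LENGTH returns NULL when nothing is left after translation.

```sql
-- Every character in the "from" list except the leading '.' has no counterpart
-- in the "to" list, so TRANSLATE removes it. The '.' maps to itself and is kept.
-- What remains is only the non-alphanumeric characters.
select nvl(length(translate(lower('a (string)'),
                            '.0123456789abcdefghijklmnopqrstuvwxyz',
                            '.')), 0) as non_alnum_count
from dual;
```

For this input the remaining characters should be the space and the two parentheses.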

Solution 3

Try this:

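-- Strip the alphanumeric characters; the length of what is left is the number of non-alphanumeric characters.
SELECT  a.*, LENGTH(REGEXP_REPLACE(TITLE, '[a-zA-Z0-9]', ''))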
FROM    TABLE_NAME a
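
On Oracle 11g and later, REGEXP_COUNT can express the per-row total directly. A minimal sketch, again assuming the TABLE_NAME and TITLE names from the question (the `non_alnum_count` alias is illustrative):

```sql
-- REGEXP_COUNT returns the number of matches of the pattern in TITLE,
-- and 0 (rather than NULL) when a non-null TITLE has no matches.
SELECT  a.*, REGEXP_COUNT(a.TITLE, '[^a-zA-Z0-9]') AS non_alnum_count
FROM    TABLE_NAME a;
```

Like the other single-expression approaches, this gives one total per row rather than a breakdown by character.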
Author by

Moz

Updated on July 29, 2022

Comments

  • Moz

    What would be the best way to count occurrences of all non-alphanumeric characters that appear in a string in an Oracle database column?

    While looking for a solution, I realised I had a query written for an unrelated problem that I could modify in the hope of solving this one. I came up with this:

    SELECT  COUNT (*), SUBSTR(TITLE, REGEXP_INSTR(UPPER(TITLE), '[^A-Z,^0-9]'), 1)
    FROM    TABLE_NAME
    WHERE   REGEXP_LIKE(UPPER(TITLE), '[^A-Z,^0-9]')
    GROUP BY    SUBSTR(TITLE, REGEXP_INSTR(UPPER(TITLE), '[^A-Z,^0-9]'), 1)
    ORDER BY COUNT(*) DESC;
    

    This works to find the FIRST non-alphanumeric character, but I would like to count the occurrences throughout the entire string, not just the first occurrence. E.g., currently my query analysing "a (string)" would find one open parenthesis, but I need it to find one open parenthesis and one closed parenthesis.