Substring-indexing in Oracle


Solution 1

You're looking for a function-based index:

create index ix_substring on TABLE (substr(COLUMN, 4, 9))
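
Applied to the tables in the question, this could look like the sketch below. The index name ix_delegations_abi and the bind variable :abi are made up for illustration; DELEGATIONS and IBAN come from the question's typical query. The optimizer can only consider the index when the query uses the same expression that was indexed, and whether it actually picks the index still depends on statistics and selectivity.

```sql
-- Hypothetical index name; table and column names are taken from the question.
CREATE INDEX ix_delegations_abi ON DELEGATIONS (SUBSTR(IBAN, 6, 5));

-- The predicate repeats the indexed expression exactly, so the index is a candidate.
-- :abi is a hypothetical bind variable holding a five-character bank code.
SELECT *
  FROM DELEGATIONS
 WHERE SUBSTR(IBAN, 6, 5) = :abi;
```

A query written as SUBSTR(IBAN, 6) or with different offsets would not match this index, so it helps to settle on one expression per field and reuse it everywhere.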

Solution 2

If you're using Oracle 11g or later, you could use virtual columns and index those. This is pretty much equivalent to the function-based index in René's answer (Solution 1), but it adds a bit more formality around the use of the data: the substring positions are defined once on the table, so every query reads the correct characters.

Virtual columns won't take up any additional space in the database, although any index you create on that virtual column will.
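
A minimal sketch of that approach, assuming the DELEGATIONS table and the IBAN layout from the question. The column name IT_ABI is borrowed from the question's list of candidate columns; the index name and the bind variable :abi are hypothetical.

```sql
-- Oracle 11g or later: define the substring once as a virtual column.
ALTER TABLE DELEGATIONS
  ADD (IT_ABI GENERATED ALWAYS AS (SUBSTR(IBAN, 6, 5)) VIRTUAL);

-- Index the virtual column; only the index occupies storage.
CREATE INDEX ix_delegations_it_abi ON DELEGATIONS (IT_ABI);

-- Queries can then refer to the column by name instead of repeating SUBSTR.
SELECT *
  FROM DELEGATIONS
 WHERE IT_ABI = :abi;
```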

Author by

usr-local-ΕΨΗΕΛΩΝ

Updated on June 04, 2022

Comments

  • usr-local-ΕΨΗΕΛΩΝ
    usr-local-ΕΨΗΕΛΩΝ almost 2 years

    I just found that our current database design is a little inefficient for the SELECT queries we perform the most. IBANs are positional: each field sits at fixed offsets, according to nation-specific formats.

    Since we mostly perform JOINs and WHEREs on a precise substring of IBAN columns in some tables, my question is about assigning an index to the substring(s) of a column.

    Are we forced to add redundant, indexed columns to the table? I.e., add columns NATION_CODE, IBAN_CIN, IT_CIN, IT_ABI, IT_CAB, IT_ACCOUNT (where the IT_ fields are considered only for accounts starting with ITXX), each one with appropriate secondary indexing? Or is there a special kind of secondary index that can be applied only to a substring of a column?

    The first solution could make the DB more complex, since IBAN accounts are used throughout the database (and, obviously, I have no full control over the design).

    Thank you

    [Edit] Typical query

    SELECT * FROM DELEGATIONS WHERE SUBSTR(IBAN, 6, 5) IN (SELECT ABI FROM BANKS WHERE ANY_CONDITION)
    

    This extracts all payment delegations where the target account belongs to any of the banks that match ANY_CONDITION. It should be changed to

    SELECT * FROM DELEGATIONS WHERE SUBSTR(IBAN, 1, 2) = 'IT' AND SUBSTR(IBAN, 6, 5) IN (SELECT ABI FROM BANKS WHERE ANY_CONDITION)
    

    to make sure that the BBAN really holds the bank code in characters 6 to 10 of the IBAN (the five characters read by SUBSTR(IBAN, 6, 5)).

  • Yahia
    Yahia over 12 years
    The OP writes that the substring positions are DIFFERENT depending on how the string starts, which would mean you need to add some conditions like DECODE or CASE WHEN into the index. From what I understand, that won't work the way needed.
  • René Nyffenegger
    René Nyffenegger over 12 years
    @Yahia, as far as I can see from the OP's question, there are only fixed substrings (although there is one for each of NATION_CODE, IBAN_CIN and so forth, which entails creating multiple function-based indexes). I am ready to stand corrected if I am wrong on this. If the OP posted a typical query (one that needs a bit of a performance boost), that could help.
  • usr-local-ΕΨΗΕΛΩΝ
    usr-local-ΕΨΗΕΛΩΝ over 12 years
    Actually, I care more about the fact that this kind of indexing is feasible than about your (rightful) observation about CASEs. I'm going to post an example query. The truth behind everything is that our designer (maybe someone whose rules were posted on TDWTF.com) doesn't care about foreign IBANs, so we extract the bank/branch information from every IBAN in the DB, not only from those that start with IT.
  • Mike Meyers
    Mike Meyers over 12 years
    Even if the substrings are in different positions given various prefixes or something, this could easily be handled by a function-based index on a user-defined function. I think the only requirement for a user-defined function is that it must be defined as DETERMINISTIC.
  • usr-local-ΕΨΗΕΛΩΝ
    usr-local-ΕΨΗΕΛΩΝ over 12 years
    Thanks a lot, but unfortunately this is not 11g. When I wrote the question I really hoped a mechanism like virtual columns existed :)
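
Without 11g, the function-based index from Solution 1 is still an option, and Mike Meyers' suggestion of a DETERMINISTIC user-defined function covers the case where the substring only makes sense for some prefixes. A minimal sketch, assuming the Italian layout from the question (bank code in SUBSTR(IBAN, 6, 5) for IBANs starting with IT); the function name iban_it_abi, the index name and the bind variable :abi are hypothetical.

```sql
-- Hypothetical helper: returns the bank code for Italian IBANs, NULL otherwise.
-- DETERMINISTIC is required before the function can be used in an index.
CREATE OR REPLACE FUNCTION iban_it_abi (p_iban IN VARCHAR2)
  RETURN VARCHAR2
  DETERMINISTIC
IS
BEGIN
  IF SUBSTR(p_iban, 1, 2) = 'IT' THEN
    RETURN SUBSTR(p_iban, 6, 5);
  END IF;
  RETURN NULL;
END iban_it_abi;
/
-- The lone "/" above runs the PL/SQL block in SQL*Plus-style clients.

-- Function-based index on the helper. The outer SUBSTR bounds the key length,
-- since a VARCHAR2 function result is otherwise treated as up to 4000 bytes.
CREATE INDEX ix_delegations_it_abi_fn
  ON DELEGATIONS (SUBSTR(iban_it_abi(IBAN), 1, 5));

-- The query has to repeat the indexed expression for the index to be considered.
SELECT *
  FROM DELEGATIONS
 WHERE SUBSTR(iban_it_abi(IBAN), 1, 5) = :abi;
```

Rows for which the function returns NULL get no entry in this single-column B-tree index, so foreign IBANs add nothing to its size. If the function body is changed later, the index needs to be rebuilt so that its stored values match the new logic.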