Optimize Oracle SQL with large 'IN' clause


Solution 1

This looks like the right way in Java: http://knol.google.com/k/oracle-passing-a-list-as-bind-variable#

It is similar to the C# solution. Your list of values stays in memory (no temporary table), so it is not persisted to disk, and because you use a parameterized query, the query executor doesn't have to reparse every query. I have not tried it with Java, but I think it will be fast.

Solution 2

  1. Create an index that covers 'field' and 'value'.

  2. Place those IN values in a temp table and join on it.
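A minimal sketch of both steps in Oracle SQL. The names my_table, my_table_value_idx and in_values are placeholders (the question only says "table", "field" and "value"), and the VARCHAR2(100) length is an assumption:

```sql
-- Step 1: an index that covers both columns, with the filtered column first,
-- so the query can be answered from the index alone.
CREATE INDEX my_table_value_idx ON my_table (value, field);

-- Step 2: a global temporary table to hold the IN values.
-- Rows are private to the session and are removed on commit.
CREATE GLOBAL TEMPORARY TABLE in_values (
  val VARCHAR2(100) PRIMARY KEY
) ON COMMIT DELETE ROWS;

-- Load the values (from the client this would normally be a batched insert).
INSERT INTO in_values (val) VALUES ('val1');
INSERT INTO in_values (val) VALUES ('val2');
INSERT INTO in_values (val) VALUES ('val3');

-- Join instead of using a long IN list.
SELECT t.field
FROM   my_table t
JOIN   in_values v ON v.val = t.value;
```

The temporary table is created once, not per query. Only the inserts and the join run each time, and they must happen in the same transaction because of ON COMMIT DELETE ROWS.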

Solution 3

SELECT field
FROM table
WHERE value IN (SELECT somevalue FROM sometable)

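As far as I know, you will face another problem: the limit on the size of an 'IN' list (Oracle has traditionally rejected a literal list of more than 1000 expressions with ORA-01795). Using a subquery avoids that limit and will hopefully speed up your query.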

Solution 4

You can join a normal table with a memory table that is filled with the list of values.

I don't know exactly how to do that with Java, but I do know how to do it with C#. I think something similar should be possible with Java.

Read here: http://forums.oracle.com/forums/thread.jspa?threadID=892457&tstart=375

Let's use a collection of User Defined Types (UDT's). First create a table with 1 million rows:

create table employees (id number(10) not null primary key, name varchar2(100) );

insert into employees 
select level l, 'MyName'||to_char(level) 
from dual connect by level <= 1e6;

1000000 rows created

commit;

exec dbms_stats.gather_schema_stats(USER, cascade=>TRUE);

Now we turn to the C# code:

Let's select employees with id's 3 and 4.

Collection type MDSYS.SDO_ELEM_INFO_ARRAY is used because if we use this already predefined Oracle type we don't have to define our own Oracle type. You can fill collection MDSYS.SDO_ELEM_INFO_ARRAY with max 1048576 numbers.

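using System;                 // Array, Decimal
using System.Data;            // ParameterDirection
using System.Windows.Forms;   // MessageBox
using Oracle.DataAccess.Client;
using Oracle.DataAccess.Types;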

    [OracleCustomTypeMappingAttribute("MDSYS.SDO_ELEM_INFO_ARRAY")]
    public class NumberArrayFactory : IOracleArrayTypeFactory
    {
      public Array CreateArray(int numElems)
      {
        return new Decimal[numElems];
      }

      public Array CreateStatusArray(int numElems)
      {
        return null;
      }
    }


    private void Test()
    {
      OracleConnectionStringBuilder b = new OracleConnectionStringBuilder();
      b.UserID = "sna";
      b.Password = "sna";
      b.DataSource = "ora11";
      using (OracleConnection conn = new OracleConnection(b.ToString()))
      {
        conn.Open();
        using (OracleCommand comm = conn.CreateCommand())
        {
          comm.CommandText =
              @" select  /*+ cardinality(tab 10) */ *  " +
              @" from employees, table(:1) tab " +
              @" where employees.id = tab.column_value";

          OracleParameter p = new OracleParameter();
          p.OracleDbType = OracleDbType.Array;
          p.Direction = ParameterDirection.Input;
          p.UdtTypeName = "MDSYS.SDO_ELEM_INFO_ARRAY";
          p.Value = new Decimal[] { 3, 4 };

          comm.Parameters.Add(p);

          int numPersons = 0;
          using (OracleDataReader reader = comm.ExecuteReader())
          {
            while (reader.Read())
            {
              MessageBox.Show("Name " + reader[1].ToString());
              numPersons++;
            }
          }
          conn.Close();
        }
      }
    }

The index on employees.id isn't used when one omits hint /*+ cardinality(tab 10) */. This index is created by Oracle because id is the primary key column.
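One way to check this on your own database is to compare the execution plans with and without the hint. The sketch below is an assumption for illustration: it uses the predefined SYS.ODCINUMBERLIST collection with literal values in place of the bind variable, whereas the C# code binds MDSYS.SDO_ELEM_INFO_ARRAY.

```sql
-- Plan with the hint: the optimizer is told to expect about 10 rows from the collection.
EXPLAIN PLAN FOR
SELECT /*+ cardinality(tab 10) */ e.*
FROM   employees e, TABLE(SYS.ODCINUMBERLIST(3, 4)) tab
WHERE  e.id = tab.column_value;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Plan without the hint: the optimizer falls back to its default estimate
-- for the size of the collection.
EXPLAIN PLAN FOR
SELECT e.*
FROM   employees e, TABLE(SYS.ODCINUMBERLIST(3, 4)) tab
WHERE  e.id = tab.column_value;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the statement above holds on your version, the hinted plan should access employees through the primary key index, while the unhinted plan may choose a full table scan because the collection is assumed to be much larger than it is.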

This means that you don't have to fill a temporary table. The list of values stays in RAM, and you join your table employees with this in-memory list of values, table(:1) tab.
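If you would rather not depend on an MDSYS type, the same join should work with a collection type of your own. A sketch, where id_list is a hypothetical type name and the literal constructor stands in for the bind variable:

```sql
-- A schema-level nested table type of numbers (hypothetical name).
CREATE OR REPLACE TYPE id_list AS TABLE OF NUMBER;
/

-- Join form, equivalent in shape to the query in the C# code.
SELECT /*+ cardinality(tab 10) */ e.*
FROM   employees e, TABLE(id_list(3, 4)) tab
WHERE  e.id = tab.column_value;

-- IN form, closer to the query in the question.
SELECT e.*
FROM   employees e
WHERE  e.id IN (SELECT column_value FROM TABLE(id_list(3, 4)));
```

With your own type, the UdtTypeName and the type-mapping attribute in the C# code would presumably have to name that type (schema-qualified) instead of MDSYS.SDO_ELEM_INFO_ARRAY.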

(wateenmooiedag=TTT)

Author: NeoNosliw

Updated on June 04, 2022

Comments

  • NeoNosliw
    NeoNosliw almost 2 years

    Here I have a query like below:

    SELECT field
    FROM table
    WHERE value IN ('val1', 'val2', 'val3', ... 'valn')
    

    Let's say there are 2000 values inside the IN clause, and the values don't exist in any other table. Do you have any idea how to speed up this operation?

    I'm open to any kind of method.

    Thanks!

  • StuartLC
    StuartLC over 13 years
    Agree with 1. Do you have any info or links with performance stats for 2 - seems a bit non-intuitive? Any guidance as to the tilting point as to IN becomes inefficient vs insert to temp and then JOIN to temp? Thnx
  • NeoNosliw
    NeoNosliw over 13 years
    That is what I answered during the interview. The interviewer then commented that this answer must involve some follow-up work. I guess he was talking about concurrent access to the table and housekeeping.
  • Nick Pierpoint
    Nick Pierpoint over 13 years
    Perhaps you could include an example?
  • lubosdz
    lubosdz over 8 years
    The link above is dead.