How to optimize an update SQL that runs on an Oracle table with 700M rows


Solution 1

First of all, is this a one-time query or a recurring one? If you only have to run it once, you may want to look into running the query in parallel mode. You will have to scan all rows anyway, so you could either divide the workload yourself with ranges of ROWID (do-it-yourself parallelism) or use Oracle's built-in features.
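
For the do-it-yourself route, Oracle 11g Release 2 and later ship the DBMS_PARALLEL_EXECUTE package, which splits a table into ROWID ranges and runs one statement per chunk. A minimal sketch, assuming a table named TEST_TABLE in the current schema with a column FIELD; the task name, chunk size and parallel level are illustrative values, not tuned recommendations:

```sql
DECLARE
  l_task VARCHAR2(30) := 'NULL_TO_ZERO';  -- hypothetical task name
BEGIN
  DBMS_PARALLEL_EXECUTE.CREATE_TASK(task_name => l_task);

  -- Split the table into ROWID ranges of roughly 10,000 rows each.
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(
    task_name   => l_task,
    table_owner => USER,
    table_name  => 'TEST_TABLE',
    by_row      => TRUE,
    chunk_size  => 10000);

  -- Each chunk runs the statement with its own :start_id / :end_id
  -- bind values and is committed when it completes.
  DBMS_PARALLEL_EXECUTE.RUN_TASK(
    task_name      => l_task,
    sql_stmt       => 'UPDATE test_table SET field = 0
                        WHERE field IS NULL
                          AND rowid BETWEEN :start_id AND :end_id',
    language_flag  => DBMS_SQL.NATIVE,
    parallel_level => 4);
END;
/
```

The chunks run as scheduler jobs, so the user needs the CREATE JOB privilege. DBMS_PARALLEL_EXECUTE.TASK_STATUS reports whether the task finished, and DBMS_PARALLEL_EXECUTE.DROP_TASK removes it once you have checked the result.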

Assuming you want to run it frequently and want to optimize this query, the number of rows where field is NULL will eventually be small compared to the total number of rows. In that case an index could speed things up. However, Oracle doesn't index rows in which all indexed columns are NULL, so a plain index on field won't be used by your query (since you want to find all rows where field is NULL).

Either:

  • create an index on (FIELD, 0). The 0 acts as a non-NULL pseudocolumn, so every row of the table will be indexed, including those where FIELD is NULL.
  • create a function-based index on (CASE WHEN field IS NULL THEN 1 END). This indexes only the rows where field is NULL, so the index would be very compact. In that case you would have to rewrite your query:

    UPDATE [TABLE] SET [FIELD]=0 WHERE (CASE WHEN field IS NULL THEN 1 END)=1
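
The two index options could be created roughly as follows. This is a sketch that assumes a table named test_table with a column field; both index names are made up for illustration, and you would create only one of the two:

```sql
-- Option 1: composite index with a constant, so rows where FIELD is NULL
-- still get an index entry.
CREATE INDEX test_table_field_ix
  ON test_table (field, 0);

-- Option 2: function-based index that contains only the rows where
-- FIELD is NULL (the CASE expression is NULL for every other row).
CREATE INDEX test_table_field_null_ix
  ON test_table (CASE WHEN field IS NULL THEN 1 END);
```

With option 1 the original WHERE field IS NULL predicate can use the index as written. With option 2 the predicate has to repeat the indexed expression exactly.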

Edit:

Since this is a one-time scenario, you may want to use the PARALLEL hint:

SQL> EXPLAIN PLAN FOR
  2  UPDATE /*+ PARALLEL(test_table 4)*/ test_table
  3     SET field=0
  4   WHERE field IS NULL;

Explained

SQL> select * from table( dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 4026746538
--------------------------------------------------------------------------------
| Id  | Operation             | Name       | Rows  | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT      |            | 22793 |   289K|    12   (9)| 00:00:
|   1 |  UPDATE               | TEST_TABLE |       |       |            |
|   2 |   PX COORDINATOR      |            |       |       |            |
|   3 |    PX SEND QC (RANDOM)| :TQ10000   | 22793 |   289K|    12   (9)| 00:00:
|   4 |     PX BLOCK ITERATOR |            | 22793 |   289K|    12   (9)| 00:00:
|*  5 |      TABLE ACCESS FULL| TEST_TABLE | 22793 |   289K|    12   (9)| 00:00:
--------------------------------------------------------------------------------
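
In the plan above the UPDATE step sits above PX COORDINATOR, which suggests that only the full scan runs in parallel while the update itself is applied serially. Parallel DML has to be enabled in the session for the update step to be parallelized as well. A sketch, assuming the same test_table and a degree of 4:

```sql
-- Parallel DML is disabled by default and must be enabled per session.
ALTER SESSION ENABLE PARALLEL DML;

UPDATE /*+ PARALLEL(test_table 4) */ test_table
   SET field = 0
 WHERE field IS NULL;

-- After a parallel DML statement, the session cannot read or modify
-- the table again until the transaction is committed or rolled back.
COMMIT;
```

Rerunning EXPLAIN PLAN after enabling parallel DML is a quick way to check whether the UPDATE step has moved below the PX COORDINATOR.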

Solution 2

Are other users updating the same rows in the table at the same time?

If so, you could be hitting lots of concurrency issues (waiting for locks) and it may be worth breaking it into smaller transactions.

DECLARE
  v_cnt number := 1;
BEGIN
 WHILE v_cnt > 0 LOOP
   UPDATE [TABLE] SET [FIELD]=0 WHERE [FIELD] IS NULL AND ROWNUM < 50000;
   v_cnt := SQL%ROWCOUNT;
   COMMIT;
 END LOOP;
END;
/

The smaller the ROWNUM limit, the fewer concurrency/locking issues you'll hit, but the more time you'll spend scanning the table, because each iteration scans again to find the remaining NULL rows.

Solution 3

Vincent already answered your question perfectly, but I'm curious about the "why" behind this action. Why are you updating all NULLs to 0?

Regards, Rob.

Solution 4

Some suggestions:

  1. Drop any indexes that contain FIELD before running your UPDATE statement, and then re-add them later.

  2. Write a PL/SQL procedure to do this that commits after every 1000 or 10000 rows.
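
Suggestion 2 could be sketched as an anonymous block that reads the ROWIDs of the affected rows once and updates them in batches. The table name test_table, the column field and the batch size of 10,000 are assumptions for illustration:

```sql
DECLARE
  CURSOR c_nulls IS
    SELECT rowid FROM test_table WHERE field IS NULL;
  TYPE t_rowids IS TABLE OF ROWID;
  l_rowids t_rowids;
BEGIN
  OPEN c_nulls;
  LOOP
    -- Fetch the next batch of up to 10,000 ROWIDs.
    FETCH c_nulls BULK COLLECT INTO l_rowids LIMIT 10000;
    EXIT WHEN l_rowids.COUNT = 0;

    -- Update the whole batch in one call to the SQL engine.
    FORALL i IN 1 .. l_rowids.COUNT
      UPDATE test_table SET field = 0 WHERE rowid = l_rowids(i);

    COMMIT;
  END LOOP;
  CLOSE c_nulls;
END;
/
```

This scans the table only once, but it fetches from a cursor across commits, so a long run can fail with ORA-01555 (snapshot too old) if the undo the cursor needs has been overwritten. The block can simply be rerun, since it only picks up rows that are still NULL.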

Hope this helps.

Author: Aubrey

Updated on July 23, 2022

Comments

  • Aubrey
    Aubrey almost 2 years
    UPDATE [TABLE] SET [FIELD]=0 WHERE [FIELD] IS NULL
    

    [TABLE] is an Oracle database table with more than 700 million rows. I cancelled the SQL execution after it had been running for 6 hours.

    Is there any SQL hint that could improve performance? Or any other solution to speed that up?

    EDIT: This query will be run once and then never again.

  • Aubrey
    Aubrey almost 14 years
    Hi Vincent, this is a one-time query. But thanks for covering both scenarios (one time query / recurrent query) in your answer.
  • Aubrey
    Aubrey almost 14 years
    Would that be any faster though?
  • Mark Baker
    Mark Baker almost 14 years
    Nice use of the EXPLAIN PLAN example
  • Mark Baker
    Mark Baker almost 14 years
    It would be faster because it doesn't update the table data in any way, just the table definition
  • Aubrey
    Aubrey almost 14 years
    Ok, but in this case I have to update the existing rows and not only the table definition.
  • Aubrey
    Aubrey almost 14 years
    Good question, Rob. That's because nulls are not tracked in normal indexes.
  • Rob van Wijk
    Rob van Wijk almost 14 years
    In that case, doing an update and changing the semantics of your data seems rather drastic. You can create a function-based index, or a regular one on ([field],1).
  • Aubrey
    Aubrey almost 14 years
    Yeah, that's an approach that we've been considering. Thanks
  • Ivan_Bereziuk
    Ivan_Bereziuk almost 14 years
    No, you are telling Oracle to pass the default value zero instead of a NULL value whenever it encounters a NULL. All of this takes place during select processing; there is no update to the actual table.
  • Jeffrey Kemp
    Jeffrey Kemp about 11 years
    -1 Sorry, this does not work. Changing the default does NOT update existing rows. (This "cheap update" trick only works if you ADD a new column with both a DEFAULT and a NOT NULL constraint in one operation - in Oracle 11g.)