Oracle 11g - How to optimize slow parallel insert select?

30,834

Solution 1

Try using more bind variables, especially where nested loops might happen. I've noticed that you can use it in cases like

CREATE_DT >= :YOUR_DATE instead of CREATE_DT >= sysdate - 60 

I think this would explain why you have 180 million executions in the lowest part of your execution plan even though the whole other part of the update query is still at 8 million out of your 79 million.

Solution 2

Improve statistics. The estimated number of rows is 1, but the actual number of rows is over 7 million and counting. This causes the execution plan to use a nested loop instead of a hash join. A nested loop works better for small amounts of data and a hash join works better for large amounts of data. Fixing that may be as easy as ensuring the relevant tables have accurate, current statistics. This can usually be done by gathering statistics with the default settings, for example: exec dbms_stats.gather_table_stats('SIRS_UATC1', 'TBL_RECON_PM');.

If that doesn't improve the cardinality estimate try using a dynamic sampling hint, such as /*+ dynamic_sampling(5) */. For such a long-running query it is worth spending a little extra time up-front sampling data if it leads to a better plan.

Use statement-level parallelism instead of object-level parallelism. This is probably the most common mistake with parallel SQL. If you use object-level parallelism the hint must reference the alias of the object. Since 11gR2 there is no need to worry about specifying objects. This statement only needs a single hint: INSERT /*+ PARALLEL(16) APPEND */ .... Note that NOLOGGING is not a real hint.

Share:
30,834
robertdi
Author by

robertdi

Updated on July 09, 2022

Comments

  • robertdi
    robertdi over 1 year

    we want to speed up the run of the parallel insert statement below. We are expecting to insert around 80M records and it is taking around 2 hours to finish.

    INSERT /*+ PARALLEL(STAGING_EX,16) APPEND NOLOGGING */ INTO STAGING_EX (ID, TRAN_DT, 
    RECON_DT_START, RECON_DT_END, RECON_CONFIG_ID, RECON_PM_ID) 
    SELECT /*+PARALLEL(PM,16) */ SEQ_RESULT_ID.nextval, sysdate, sysdate, sysdate, 
    '8a038312403e859201405245eed00c42', T1.ID FROM PM T1 WHERE STATUS = 1 and not 
    exists(select 1 from RESULT where T1.ID = RECON_PM_ID and CREATE_DT >= sysdate - 60) and 
    UPLOAD_DT >= sysdate - 1 and (FUND_SRC_TYPE = :1) 
    

    We think that caching the results of the not exist column will speed up the inserts. How do we perform the caching? Any ideas how else to speed up the insert?

    Please see below for plan statistics from Enterprise Manager. Also we noticed that the statements are not being run in parallel. Is this normal?

    insert-select plan statistics from oracle em

    Edit: btw, the sequence is already cached to 1M

  • robertdi
    robertdi almost 10 years
    thanks. after doing this, execution was down to just 11 minutes!
  • Jon Heller
    Jon Heller almost 10 years
    I don't understand why adding a bind variable would improve this query. In general bind variables provide less information to the optimizer, and lead to worse estimates. I have a feeling there's something else going on here. It would be interesting to see the dbms_xplan.display_cursor for both versions of the query.
  • grassbl8d
    grassbl8d almost 10 years
    amazing, worked for me as well. Replaced everything with bind variables. thanks
  • AnBisw
    AnBisw almost 9 years
    Does Statement level prallelism work on insert with column list like INSERT /*+ parallel(16) */ tab (col1,col2) SELECT a1,b2 from tab2; ?? It doesn't seem to be working for me. On the other hand when I do parallelize the SELECT as well, it does work.
  • Jon Heller
    Jon Heller almost 9 years
    @Annjawn I've seen it work many times. Here's an example, although it's not directly runnable because SQL Fiddle doesn't have enough features: sqlfiddle.com/#!4/be5ddf/2
  • AnBisw
    AnBisw almost 9 years
    Interesting. I am not certain as to why it isn't working for me then. All other Parallel DMLs are working just fine except this when there is a column list.
  • Jon Heller
    Jon Heller almost 9 years
    @Annjawn Even with the same example in my SQL Fiddle? That might make an interesting question.