Creating Indexes for Group By Fields?

50,111

Solution 1

It could be correct, but that would depend on how much data you have. Typically I would create an index for the columns I was using in a GROUP BY, but in your case the optimizer may have decided that after using the field_two index that there wouldn't be enough data returned to justify using the other index for the GROUP BY.

Solution 2

No, this can be incorrect.

If you have a large table, Oracle can prefer deriving the fields from the indexes rather than from the table, even there is no single index that covers all values.

In the latest article in my blog:

, there is a query in which Oracle does not use full table scan but rather joins two indexes to get the column values:

SELECT  l.id, l.value
FROM    t_left l
WHERE   NOT EXISTS
        (
        SELECT  value
        FROM    t_right r
        WHERE   r.value = l.value
        )

The plan is:

SELECT STATEMENT
 HASH JOIN ANTI
  VIEW , 20090917_anti.index$_join$_001
   HASH JOIN
    INDEX FAST FULL SCAN, 20090917_anti.PK_LEFT_ID
    INDEX FAST FULL SCAN, 20090917_anti.IX_LEFT_VALUE
  INDEX FAST FULL SCAN, 20090917_anti.IX_RIGHT_VALUE

As you can see, there is no TABLE SCAN on t_left here.

Instead, Oracle takes the indexes on id and value, joins them on rowid and gets the (id, value) pairs from the join result.

Now, to your query:

SELECT  *
FROM    some_table
WHERE   field_one is not null and field_two = ?
GROUP BY
        field_three, field_four, field_five

First, it will not compile, since you are selecting * from a table with a GROUP BY clause.

You need to replace * with expressions based on the grouping columns and aggregates of the non-grouping columns.

You will most probably benefit from the following index:

CREATE INDEX ix_sometable_23451 ON some_table (field_two, field_three, field_four, field_five, field_one)

, since it will contain everything for both filtering on field_two, sorting on field_three, field_four, field_five (useful for GROUP BY) and making sure that field_one is NOT NULL.

Solution 3

Do you need to create an index for fields of group by fields in an Oracle database?

No. You don't need to, in the sense that a query will run irrespective of whether any indexes exist or not. Indexes are provided to improve query performance.

It can, however, help; but I'd hesitate to add an index just to help one query, without thinking about the possible impact of the new index on the database.

...the only relevant index for this query is an index created for field_two. Other single-field or composite indexes created on any of the other fields will not be used for the above query. Does this sound correct?

Not always. Often a GROUP BY will require Oracle to perform a sort (but not always); and you can eliminate the sort operation by providing a suitable index on the column(s) to be sorted.

Whether you actually need to worry about the GROUP BY performance, however, is an important question for you to think about.

Share:
50,111
onejigtwojig
Author by

onejigtwojig

Updated on September 22, 2020

Comments

  • onejigtwojig
    onejigtwojig over 3 years

    Do you need to create an index for fields of group by fields in an Oracle database?

    For example:

    select * 
    from some_table
    where field_one is not null and field_two = ?
    group by field_three, field_four, field_five
    

    I was testing the indexes I created for the above and the only relevant index for this query is an index created for field_two. Other single-field or composite indexes created on any of the other fields will not be used for the above query. Does this sound correct?

  • Ben S
    Ben S over 14 years
    +1 for mentioning the optimizer, it's probably the most likely cause of this.
  • onejigtwojig
    onejigtwojig over 14 years
    Thank you for the response. I didn't realize the explain plan is dependent on the amount of data in the table. There is currently no data in the table, which explains why the optimizer may have skipped the other indexes. On another note, would a composite index on just field_three and field_four without field_five still be used in query like the one above? This would not include all the fields in the group by clause.
  • Eric Petroelje
    Eric Petroelje over 14 years
    @Mark - that explains it. See my edits for more info about what indexes Oracle could use for the group by.
  • Eric Petroelje
    Eric Petroelje over 14 years
    Very interesting - I don't think I've ever seen that before (where Oracle would join two indexes and avoid the table entirely)
  • Quassnoi
    Quassnoi over 14 years
    @Eric Petroelje: there is a special hint, INDEX_JOIN, to force this method.
  • Eric Petroelje
    Eric Petroelje over 14 years
    @jva - Not sure what you mean about grouping order being relevant to the optimizer. It's only relevant in terms of what indexes can be used. Basicly the order of the columns in the index should be the same as the order of the columns in your group by clause.
  • E_the_Shy
    E_the_Shy over 14 years
    @Eric: It seems unlikely to me that the GROUP BY order matters to Oracle. Changing the order of columns in the GROUP BY does not change the semantics of the query and so should not change the plan. Proof?
  • Eric Petroelje
    Eric Petroelje over 14 years
    @WW & jva - Now that I think about it some more, seems like you are right. I was thinking the semantics of GROUP BY would be the same as ORDER By, but that just isn't the case. As WW mentioned, the order of the GROUP BY columns makes no difference whatsoever. Thanks for pointing that out! - I'll delete the second part of my answer to remove the misinformation.
  • waltersu
    waltersu almost 7 years
    upvote for "can eliminate the sort operation..". BTW this article has more details use-the-index-luke.com/sql/sorting-grouping/indexed-group-by
  • Saran
    Saran almost 3 years
    Which would be a better option in this scenario? Create 3 indices for the 3 columns in the group by or 1 composite index for all the 3 columns together?