Why isn't index likely to do much if it's DATETIME or DATETIME2 because they include the time portion?

10,615

I've read the other question, no idea what OMG ponies means

3 points:

  • It shouldn't matter if an index is clustered or non-clustered:
  • It doesn't matter whether time is included too
  • It just has to be useful

Seek or scan:

Based on statistics, if LaunchDate > @date means, say, 90% of the rows, then most likely a scan will happen. If it is quite selective, then a seek is more likely.

Regardless of clustered or non-clustered!

What index?

A query like this would require an index on LaunchDate and primaryKeyColumn

SELECT COUNT(primaryKeyColumn) 
FROM   MarketPlan 
WHERE  LaunchDate > @date

Now, any non-clustered index refers to the clustered index which is assumed to the PK by default. So primaryKeyColumn is implicitly included already.

Superstition

However, COUNT(primaryKeyColumn) is a superstition. Because PKs do not allow NULL, it is equivalent to

SELECT COUNT(*) 
FROM   MarketPlan 
WHERE  LaunchDate > @date

SELECT COUNT(1) 
FROM   MarketPlan 
WHERE  LaunchDate > @date

So you only need an index on LaunchDate, whether clustered or non-clustered

Share:
10,615
Gennady Vanin Геннадий Ванин
Author by

Gennady Vanin Геннадий Ванин

Business Analyst, ex- programmer, biz analyst, Q&A developer in Test, chemical cybernetics engineer, scientist, US patent attorney, translator, tourist guide, traveller, parachuter, tennis player, futebolista, military officer etc.

Updated on June 28, 2022

Comments

  • Gennady Vanin Геннадий Ванин
    Gennady Vanin Геннадий Ванин almost 2 years

    Comments to question "How to decrease response time of a simple select query?" tell:

    • "What is the data type on LaunchDate? An index isn't likely to do much if it's DATETIME or DATETIME2 because they include the time portion – OMG Ponies"

    • "@OMG - Why wouldn't a Clustered Index on a DateTime column improve performance? The query is a range scan which would allow for a fast range index lookup as all data would be in sequential blocks? Semi-related...msdn.microsoft.com/en-us/library/ms177416.aspx – Calgary Coder"

    • "Calgary Coder: DATETIME/2 includes time -- an index, clustered or non-clustered, would be good for dates with duplicate times but not ranges. – OMG Ponies"

    I created a test table with clustered index on DATETIME type column LaunchDate and observe index seeks for queries similar to cited in above question:

    SELECT COUNT(primaryKeyColumn) 
    FROM   MarketPlan 
    WHERE  LaunchDate > @date
    

    instead of table or index scans.

    Why wouldn't a clustered index on a DateTime column improve performance?
    Why isn't index likely to do much if it's DATETIME or DATETIME2 because they include the time portion?

    I'd appreciate a script illustrating that indexing of DATETIME column does not improve performance.

    Update: Also, Did OMG imply that index on DATE type column would be helpful but not DATETIME and DATETIME2?

  • Gennady Vanin Геннадий Ванин
    Gennady Vanin Геннадий Ванин over 13 years
    The comment to question told about having clustered index on LaunchDate but not primaryKeyColumn. In such a table with 10 rows of data (and 5 columns) I observe Clustered Index Seek if LaunchDate > @date means, say, any number of rows 0%, 10%, 20%, 30%, 90%, 100%
  • ZygD
    ZygD over 13 years
    @vgv8: 10 rows is peanuts. You'd need 10k rows. As I said, it's how useful an index would be, not C or NC. There is no need to use primaryKeyColumn which simplifies the index
  • Gennady Vanin Геннадий Ванин
    Gennady Vanin Геннадий Ванин over 13 years
    Why/how does primaryKeyColumn simplify the index?
  • ZygD
    ZygD over 13 years
    @vgv8: sorry, not using it in the index: see my COUNT(*) vs COUNT(1) stuff
  • Damien_The_Unbeliever
    Damien_The_Unbeliever about 10 years
    I don't know what you're talking about here, but the only timestamp data type defined for SQL Server (for which this question is tagged) has nothing whatsoever to do with dates and times.