Why isn't index likely to do much if it's DATETIME or DATETIME2 because they include the time portion?

sql-server performance database-design indexing clustered-index

10,615

I've read the other question, no idea what OMG ponies means

3 points:

It shouldn't matter if an index is clustered or non-clustered:
It doesn't matter whether time is included too
It just has to be useful

Seek or scan:

Based on statistics, if LaunchDate > @date means, say, 90% of the rows, then most likely a scan will happen. If it is quite selective, then a seek is more likely.

Regardless of clustered or non-clustered!

What index?

A query like this would require an index on LaunchDate and primaryKeyColumn

SELECT COUNT(primaryKeyColumn) 
FROM   MarketPlan 
WHERE  LaunchDate > @date

Now, any non-clustered index refers to the clustered index which is assumed to the PK by default. So primaryKeyColumn is implicitly included already.

Superstition

However, COUNT(primaryKeyColumn) is a superstition. Because PKs do not allow NULL, it is equivalent to

SELECT COUNT(*) 
FROM   MarketPlan 
WHERE  LaunchDate > @date

SELECT COUNT(1) 
FROM   MarketPlan 
WHERE  LaunchDate > @date

So you only need an index on LaunchDate, whether clustered or non-clustered

10,615

Author by

Gennady Vanin Геннадий Ванин

Business Analyst, ex- programmer, biz analyst, Q&A developer in Test, chemical cybernetics engineer, scientist, US patent attorney, translator, tourist guide, traveller, parachuter, tennis player, futebolista, military officer etc.

Updated on June 28, 2022

Comments

Gennady Vanin Геннадий Ванин almost 2 years
Comments to question "How to decrease response time of a simple select query?" tell:
- "What is the data type on LaunchDate? An index isn't likely to do much if it's DATETIME or DATETIME2 because they include the time portion – OMG Ponies"
- "@OMG - Why wouldn't a Clustered Index on a DateTime column improve performance? The query is a range scan which would allow for a fast range index lookup as all data would be in sequential blocks? Semi-related...msdn.microsoft.com/en-us/library/ms177416.aspx – Calgary Coder"
- "Calgary Coder: DATETIME/2 includes time -- an index, clustered or non-clustered, would be good for dates with duplicate times but not ranges. – OMG Ponies"
I created a test table with clustered index on DATETIME type column LaunchDate and observe index seeks for queries similar to cited in above question:
```
SELECT COUNT(primaryKeyColumn) 
FROM   MarketPlan 
WHERE  LaunchDate > @date
```
instead of table or index scans.

Why wouldn't a clustered index on a DateTime column improve performance?
Why isn't index likely to do much if it's DATETIME or DATETIME2 because they include the time portion?

I'd appreciate a script illustrating that indexing of DATETIME column does not improve performance.

Update: Also, Did OMG imply that index on DATE type column would be helpful but not DATETIME and DATETIME2?
Gennady Vanin Геннадий Ванин over 13 years

The comment to question told about having clustered index on LaunchDate but not primaryKeyColumn. In such a table with 10 rows of data (and 5 columns) I observe Clustered Index Seek if LaunchDate > @date means, say, any number of rows 0%, 10%, 20%, 30%, 90%, 100%
ZygD over 13 years

@vgv8: 10 rows is peanuts. You'd need 10k rows. As I said, it's how useful an index would be, not C or NC. There is no need to use primaryKeyColumn which simplifies the index
Gennady Vanin Геннадий Ванин over 13 years

Why/how does primaryKeyColumn simplify the index?
ZygD over 13 years

@vgv8: sorry, not using it in the index: see my COUNT(*) vs COUNT(1) stuff
Damien_The_Unbeliever about 10 years

I don't know what you're talking about here, but the only timestamp data type defined for SQL Server (for which this question is tagged) has nothing whatsoever to do with dates and times.