SQL Server: Clustered index on datetime, ASC or DESC

24,018

Solution 1

Doesn't really matter - but is the DateTime really guaranteed to be unique?? I would AVOID putting a clustered index on just a DateTime - I would use a INT IDENTITY or BIGINT IDENTITY instead, and put a regular non-clustered index on DateTime (since that's really not guaranteed to be unique......)

Marc

PS: Like a primary key, the general consensus on what a clustered key should be is:

unique (otherwise SQL Server will "uniquify" it by adding a 4-byte uniqueifier to it)
as narrow as possible
static (never change)
ever increasing

The column(s) that make up the clustered key (including that 4-byte uniqueifier) are added to EVERY ENTRY in EVERY non-clustered index - so you want to keep those as slim as possible.

PS 2: the clustering key(s) are added to each non-clustered index because that's the way that SQL Server will retrieve the whole rows once it's found the search value in the non-clustered index. It's the row's "location" in the database, so to speak. Therefore, it should be unique and narrow.

Solution 2

Read this http://www.sqlskills.com/BLOGS/KIMBERLY/post/GUIDs-as-PRIMARY-KEYs-andor-the-clustering-key.aspx

If read was frequently based on datetime field, the good choice is a composite key of date and identity - in that order (date, identity).

24,018

Author by

Eyvind

Updated on July 09, 2022

Comments

Eyvind almost 2 years

If I have an SQL Server table with a clustered index on a datetime field, that is set to DateTime.Now (from C#) before inserts, should the index be ascending or descending to avoid reorganization of the table?

Thanks.
marc_s almost 15 years

If it's NOT unique, then SQL Server will automagically add a 4-byte "uniqueifier" - if ever possible, try to avoid that!
Eyvind almost 15 years

Thanks, didn't know that. So, given that the table in question has a PK that is a uniqueidentifier, would it be better to create the clustered index on the datetime field AND the PK?
marc_s almost 15 years

OH GOD NO!! The PK is a GUID = 16 Byte, and the DATETIME is another 8 BYTE - I would really just insert a INT IDENTITY(1,1) (4-byte) and be done with it!
marc_s almost 15 years

Also, GUID's make for very bad performance when used as clustering keys, since they're by nature totally random. This causes lots of index fragmentation and poor performance as a rule of thumb. Avoid GUID's in clustered indices!
ses011 over 11 years

A clustered index doesn't need to be unique. SQL Server includes a clustered index with a primary key column by default, so it is the PK that is unique.
marc_s over 11 years

@Spacemoses: a clustered index must be unique - if you don't make it unique, SQL Server will add a 4-byte uniquifier to make it unique.
ses011 over 11 years

Interesting, thank you for the distinction. So by definition it must be unique, but SQL Server gives you a convenience of having it on a non-unique column and managing the uniqueness behind the scenes.
marc_s over 11 years

@Spacemoses: SQL Server will save your butt if you're too lazy to make sure the clustering key chosen is really unique - yes. But it comes at a price of an extra 4 byte overhead (per duplicated row) - not necessarily something you want to "rely on" too many times!
ses011 over 11 years

Yes, I can't see too many people creating a clustered index though without a primary key along with it anyway. You would actually have to make an effort to do that.
AndyClaw about 11 years

I can't stress how helpful it can be to actually have your DateTime as the first field in a composite clustered key. It might not mesh with the perfect model of always having an int identity, but when you are constantly querying, say, log entries within the last 1 hour, the problem lends itself to having a DateTime as part of the primary key. Especially if your DateTime field is less than 95% selective (meaning lots of non-unique values) its the way you are going to hit the index and get your performance you would expect.
Chris over 10 years

I realize this is a rather old question and answer, but I'm in a similar situation to the OP. I'm storing post history to a table. Old posts are never deleted, so when a post is edited, a new one is created and the old one can be viewed in the change_log. Would it be ideal to create a clustered index on the date_time (if I'm selecting by a foreignkey and most recent datetime)?
marc_s over 10 years

@Chris: don't use DATETIME alone - it's not precise enough (only down to 3.33ms - there could be duplicates!). If your history table has an (BIG)INT IDENTITY - use that as your clustering key. Or if you insist - use ID + DATETIME - but I'd recommend against using DATETIME alone.