Clustered Indexes without Primary key

sql-server sql-server-2008 sql-server-2005 clustered-index

24,216

Solution 1

No - why?

The clustered index will still store the actual data pages at its leaf level, (initially) physically sorted by the name column.

The index navigation structure above the leaf level will contain the name column values for all rows.

So overall: nothing changes.

The primary key is a logical construct, designed to uniquely identify each row in your table. That's why it has to be unique and non-null.

The clustering index is a physical construct that will (initially) phyiscally sort your data by the clustering key and arrange the SQL Server pages accordingly.

While in SQL Server, the primary is used by default as the clustering key, the two do not have to fall together - nor does one have to exist with the other. You can have a table with a non-clustered primary key, or a clustered table without primary key. Both is possible. Whether it's sensible to have that is another discussion - but it's technically possible.

Update: if your primary key is your clustering key, uniqueness is guaranteed (since the primary key must be unique). If you're choosing some column that is not the primary key as your clustering key, and that column does not guarantee uniqueness, SQL Server will - behind the scenes - add a 4-byte (INT) uniqueifier column to those duplicates values to make them unique. So you might have Smith, Smith1, Smith2 and so forth in your clustered index navigation structure for your Smith's.

See:

Solution 2

If the clustered index is not unique, SQL Server creates a 4-byte uniqueifier and adds it to the clustered index value. The uniqueifier is added only if the clustered index value is duplicate, not for all clustered index values. All nonclustered indexes will contain this value in its leaf level, and non-unique nonclustered index will also have this uniqueifier value in its non-leaf level entry, as a part of bookmark.

Solution 3

Difference between a Primary key and a unique index (or constraint) is that Null values are not allowed in a the primary key column. There is no need to have a primary key on a table but it make things easier for external application to edit the rows in the table and even then, it's not really a necessity with most external applications.

In term of performance, this change nothing. The important is the presence or absence of indexes (either unique or not, clustered or not and with null values or not) and the primary key is essentially simply one more unique index without null value.

For the clustered index, the column doesn't need to be unique and/or without null. A column with duplicates and null values is fine for creating a clustered index.

For a foreign key, it must reference a column with a unique index on it but not necessarily a primary key or without null value. It's perfectly legal to reference a column that is not a primary key and is allowing null value a long as there is a unique index on it. Notice that because there must be an unique index on it, this column cannot have more than a single null value.

There is no limitation on the foreign key column itself (the column on the foreign table) but performance wise, setting an index on it is often a good thing.

24,216

Author by

Nilish

SQL Server - 15M

Updated on July 09, 2022

Comments

Nilish almost 2 years

A clustered index stores the actual data rows at the leaf level of the index. Returning to the example above, that would mean that the entire row of data associated with the primary key value of 123 would be stored in that leaf node.

Question - in case the primary key does not exists and I set the Name column as clustered index. In this case, will the above statement becomes contradictory?