No indexes on small tables?


Solution 1

The value of indexes is in speeding reads. For instance, if you are doing lots of SELECTs based on a range of dates in a date column, it makes sense to put an index on that column. And of course, you generally add indexes on any column you're going to be JOINing on with any significant frequency. The efficiency gain is also related to the ratio of the size of your typical result sets to the number of records (e.g. grabbing 20 of 2,000 records benefits more from indexing than grabbing 90 of 100). A lookup on an unindexed column is essentially a linear search.

The cost of indexes comes on writes, because every INSERT (and any UPDATE or DELETE that touches an indexed column) also requires an internal update to each affected index.

So, the answer depends entirely on your application -- if it's something like a dynamic website where the number of reads can be 100x or 1000x the writes, and you're doing frequent, disparate lookups based on data columns, indexing may well be beneficial. But if writes greatly outnumber reads, then your tuning should focus on speeding up the writes.

It takes very little time to identify and benchmark a handful of your app's most frequent operations both with and without indexes on the JOIN/WHERE columns; I suggest you do that. It's also smart to monitor your production app, identify the most expensive and most frequent queries, and focus your optimization efforts on the intersection of those two sets (which could mean indexes or something totally different, like allocating more or less memory for query or join caches).
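A minimal version of such a benchmark (SQLite from Python; the table, sizes, and repeat counts are arbitrary placeholders, and your real numbers will depend entirely on your own schema and queries) just times the same lookup against an indexed and an unindexed copy of a synthetic table:

```python
import sqlite3
import time

def bench_lookup(with_index, rows=5000, repeats=200):
    """Time `repeats` runs of one lookup on a synthetic table."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val INTEGER)")
    con.executemany("INSERT INTO t (val) VALUES (?)",
                    [(i % 100,) for i in range(rows)])
    if with_index:
        con.execute("CREATE INDEX idx_t_val ON t (val)")
    start = time.perf_counter()
    for _ in range(repeats):
        con.execute("SELECT COUNT(*) FROM t WHERE val = 42").fetchone()
    return time.perf_counter() - start

unindexed = bench_lookup(with_index=False)
indexed = bench_lookup(with_index=True)
print(f"unindexed: {unindexed:.4f}s  indexed: {indexed:.4f}s")
```

The same harness, pointed at your actual tables and most frequent queries, gives you the with/without comparison the answer recommends.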

Solution 2

Knuth's wise words are not applicable to the creation (or not) of indexes, since by adding indexes you are not optimising anything directly: you are providing an index that the DBMS's optimiser may use to optimise some queries. In fact, you could better argue that deciding not to index a small table is premature optimisation, since by doing so you restrict the DBMS optimiser's options!

Different DBMSs will have different guidelines for choosing whether or not to index columns based on various factors including table size, and it is these that should be considered.

An example of premature optimisation in databases: "denormalising for performance" before any benchmarking has shown that the normalised database actually has any performance issues.

Solution 3

Primary key columns will be indexed for the unique constraint. I would still index all foreign key columns. The optimizer can choose to ignore your index if it is irrelevant.
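For illustration (SQLite from Python; the schema is hypothetical), here is an explicit index on a foreign-key column — which SQLite, for one, does not create automatically — and a query plan showing the optimizer picking it up for a lookup on that column:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, "
            "country_id INTEGER REFERENCES country (id), name TEXT)")

# The primary keys are indexed, but the foreign-key column is not,
# so it gets an explicit index here.
con.execute("CREATE INDEX idx_city_country ON city (country_id)")

plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM city WHERE country_id = 1"
).fetchall()
print(plan[0][3])  # the lookup goes through idx_city_country
```

If the index ever turns out to be irrelevant, the optimizer simply won't use it, as the answer says.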

If you only have a little bit of data then the extra cost for insert/update should not be significant either.

Solution 4

Absolutely incorrect. 100% incorrect. Don't create a million pointless indexes, but you do want a Primary Key (in most cases), and you do want it CLUSTERED correctly.

Here's why:

SELECT * FROM MySmallTable <-- No worries... Index won't help

SELECT
    *
FROM
    MyBigTable INNER JOIN MySmallTable ON... <-- Ahh, now I'm glad I have my index.

Here's a good rule to go by.

"Since I have a TABLE, I'm likely going to want to query it at some time... If I'm going to query it, I'm likely going to do so in a consistent way..." <-- That's how you should index the table.
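The contrast above can be reproduced with SQLite from Python (table names as in the example; the join column is invented): the bare SELECT is a scan no matter what, while the join reaches the small table through its primary-key index.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE MySmallTable (id INTEGER PRIMARY KEY, label TEXT)")
con.execute("CREATE TABLE MyBigTable (id INTEGER PRIMARY KEY, small_id INTEGER)")

# SELECT * on the small table: an index can't help, it's a scan.
full = con.execute("EXPLAIN QUERY PLAN SELECT * FROM MySmallTable").fetchall()

# Joining to the small table: the planner searches it via its key index.
join = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM MyBigTable "
    "JOIN MySmallTable ON MySmallTable.id = MyBigTable.small_id").fetchall()

print(full[0][3])
print(" | ".join(row[3] for row in join))
```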

EDIT: I'm adding this line: if you have a concrete example in mind, I'll show you how to index it, and how much of a saving you'll get from doing so. Please supply a table, and an example of how you plan on using that table.

Solution 5

It depends. Is the table a reference table?

There are tables of a thousand rows where the absence of an index, and the resulting table scans, can make the difference between a fairly simple operation delaying the user by 5 minutes instead of 5 seconds. I have seen exactly this problem, using a DBMS other than SQL Server.

Generally, if the table is a reference table, updates on it will be relatively rare. This means that the performance hit for updating the index will also be relatively rare. If the optimizer passes over the index, the performance hit on the optimizer will be negligible. The space needed to store the index will also be negligible.

If you declare a primary key, you should get an automatic index on that key. That automatic index will almost always do you enough good to justify its cost. Leave it in there. If you create a reference table without a primary key, there are other problems in your design methodology.
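In SQLite, for instance (the table name here is invented), declaring a non-integer primary key visibly creates that automatic index in the schema catalogue:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# A TEXT primary key is backed by an automatically created index.
con.execute("CREATE TABLE colour (code TEXT PRIMARY KEY, name TEXT)")

# The automatic index appears in sqlite_master without any CREATE INDEX.
rows = con.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index'").fetchall()
print(rows)
```

(SQLite names such indexes `sqlite_autoindex_<table>_<n>`; other DBMSs have their own naming schemes, but the principle — declare the key, get the index — is the same.)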

If you do frequent searches or frequent joins on some set of columns other than the primary key, an additional index might pay for itself. Don't fix that problem unless it is a problem.

Here's the general rule of thumb: go with the default behavior of the DBMS, unless you find a reason not to. Anything else is a premature preoccupation with optimization on your part.

Author: onedaywhen (.NET SQL.)

Updated on September 08, 2020

Comments

  • onedaywhen
    onedaywhen over 3 years

    "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." (Donald Knuth). My SQL tables are unlikely to contain more than a few thousand rows each (and those are the big ones!). SQL Server Database Engine Tuning Advisor dismisses the amount of data as irrelevant. So I shouldn't even think about putting explicit indexes on these tables. Correct?

  • Mitch Wheat
    Mitch Wheat over 15 years
If the tables are small enough, have a clustered index (usually the primary key), and no covering index satisfies a query, the SQL Server optimiser will not use an index; it will table scan instead. This is because bookmark lookups into the clustered index are expensive.
  • joelhardi
    joelhardi over 15 years
    That makes sense ... row data is stored in leaf nodes of the clustered index, so SQL is searching over the same pages with an index lookup or table scan (roughly speaking). I was answering more generally -- nonclustered indexes on non-PK columns that enable a b-tree lookup in place of a linear scan.
  • Sten Vesterli
    Sten Vesterli over 15 years
    If you add a UNIQUE constraint, the database will always (not just "often") add an index.
  • Tyrael
    Tyrael almost 11 years
Indexes slow down write operations (as the indexes also have to be updated, and those updates also need to be flushed to disk at some point); the index also occupies space on disk and in memory, and the latter can cause more swapping, which in turn can cause performance degradation.
  • Jonathan Shields
    Jonathan Shields almost 8 years
It depends how many logical reads are generated by the statements using these tables. If the tables are queried badly, say for the sake of argument by calling a function which reads the table inline, you may end up with loads of reads even for a small table. I would look at the query plans and use SET STATISTICS IO ON to examine how many disk reads are being generated on the small table. If it's not many, and these parts of the query have a low cost in the query plan, there is probably little point in indexes. Bottom line: it depends how the tables are queried.
  • deFreitas
    deFreitas over 5 years
Even if you have a high number of reads on a small table, an index might not be necessary, because the database will cache it.
  • deFreitas
    deFreitas over 5 years
I think the answer is to run the query explain, then test it on production using a canary release, feature toggle, etc. The less error-prone route is to create an index, though.