How many tables is "too many" in a single SQL SELECT?

19,145

Solution 1

I do not think there is a hard and fast rule for how many joins are considered bad. A lot of the time you can alleviate the visual smell by creating helper views.
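For example, a helper view can hide a join that many reporting queries repeat. A minimal sketch, using hypothetical `Orders`/`Customers`/`Regions` tables (not from the original question):

```sql
-- Hypothetical schema: several reports repeat the same three-way join,
-- so we wrap it in a view to keep the outer queries readable.
CREATE VIEW dbo.vw_OrderDetails AS
SELECT o.OrderID, o.OrderDate, c.CustomerName, r.RegionName
FROM dbo.Orders    AS o
JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID
JOIN dbo.Regions   AS r ON r.RegionID   = c.RegionID;
GO

-- Callers now query the view instead of restating the joins.
SELECT OrderDate, CustomerName
FROM dbo.vw_OrderDetails
WHERE RegionName = 'West';
```

Note the view only tidies the SQL text; the optimiser still expands it inline, so performance is the same as writing the joins out by hand.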

Unlike procedural coding, breaking down SQL into little bits and pieces can result in inefficient queries.

The SQL Server query optimiser copes just fine with large numbers of table joins, and if you hit a corner case you can specify the join order or style using hints. In reality I think it is quite rare to see queries that join more than, say, 10 tables, but it is quite feasible that this could happen in a reporting-type scenario.
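As a sketch of what such hints look like (hypothetical table names; `FORCE ORDER` and join hints are standard T-SQL):

```sql
-- FORCE ORDER tells the optimiser to keep the join order as written;
-- the HASH join hint pins the physical join style for this one join.
SELECT o.OrderID, li.ProductID
FROM dbo.Orders AS o
INNER HASH JOIN dbo.LineItems AS li
    ON li.OrderID = o.OrderID
OPTION (FORCE ORDER);
```

Hints should be a last resort: they freeze decisions the optimiser would normally revisit as data volumes change.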

If you discover a situation where you have lots of joins AND have discovered that this particular query is a bottleneck AND you have all the correct indexes in place, you probably need to refactor. However, keep in mind that the large number of joins may be only a symptom, not the root cause of the issue. The standard practice for query optimisation should be followed: look at the profiler trace, the query plan, the database structure, the logic, etc.
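For the query-plan step, a common starting point is to turn on the I/O and timing statistics before running the suspect query (these are standard SQL Server session options):

```sql
-- Report logical/physical reads per table and CPU/elapsed time
-- in the Messages pane for every statement that follows.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Run the suspect query here, then compare reads per table
-- against what the indexes should allow.

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The actual execution plan (Ctrl+M in Management Studio, or `SET STATISTICS PROFILE ON` in older tooling) then shows where the estimated and actual row counts diverge.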

SQL Server uses tempdb for merge joins anyway, so there is usually no need to create a temp table just to refactor a single SELECT query.

Solution 2

I also see mammoth queries joining 7–10 tables, but from what I've seen the query optimiser always seems to find the most efficient plan. Certainly, all the performance issues I see with these sorts of complex queries are usually related to some other problem, such as conditional WHERE clauses or nested subqueries.

Solution 3

It really depends on how big your tables are. Even if you are joining only two tables, if one has 100M records the query is going to be slow anyway.

If you have X records in table A and Y records in table B, joining them can return up to X*Y records. At that scale the process can spill out of memory into swap, which is slow; by comparison, small queries can run largely out of the CPU cache, which gives the best performance.

However, if you feel you really need to join a lot of tables to achieve your goal, your database may be over-normalised. Third normal form works really well in most scenarios; don't try to split the information up too much, as that is recognised to be inefficient for querying.

Yes, if necessary, create a table to cache the results of the heavy query, and update its fields only when necessary, or even just once a day.
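A minimal sketch of such a cache table, with hypothetical names (the refresh would typically be driven by a scheduled job, e.g. SQL Server Agent):

```sql
-- Hypothetical summary table that caches a heavy aggregation.
CREATE TABLE dbo.DailySalesSummary (
    RegionID    int      NOT NULL PRIMARY KEY,
    TotalSales  money    NOT NULL,
    RefreshedAt datetime NOT NULL
);

-- Daily refresh step: rebuild the cache from the expensive join.
TRUNCATE TABLE dbo.DailySalesSummary;

INSERT INTO dbo.DailySalesSummary (RegionID, TotalSales, RefreshedAt)
SELECT r.RegionID, SUM(o.Amount), GETDATE()
FROM dbo.Regions AS r
JOIN dbo.Orders  AS o ON o.RegionID = r.RegionID
GROUP BY r.RegionID;
```

Readers then query the small summary table instead of the large join; the trade-off is that results can be up to one refresh interval stale.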

Author by

topski

SQL Server DBA, Developer.

Updated on June 13, 2022

Comments

  • topski about 2 years

    As a DBA for MS SQL 2000 and 2005, I regularly see giant select queries JOINing 7-10 or even more tables. I find, though, that there is a certain point past which performance tends to suffer, and the query becomes very difficult to debug and/or improve.

    So is there a "rule of thumb" for when I should be considering other query methods, like temp tables to hold preliminary results? Or is there a point after which the SQL query optimizer just doesn't do a very good job of figuring out the best plan?