SQL formatting standards
Solution 1
I am of the opinion that so long as you can read the source code easily, the formatting is secondary. So long as this objective is achieved, there are a number of good layout styles that can be adopted.
The only other aspect that is important to me is that whatever coding layout/style you choose to adopt in your shop, ensure that it is consistently used by all coders.
Just for your reference, here is how I would present the example you provided, just my layout preference. Of particular note, the ON
clause is on the same line as the join
, only the primary join condition is listed in the join (i.e. the key match) and other conditions are moved to the where
clause.
select
ST.ColumnName1,
JT.ColumnName2,
SJT.ColumnName3
from
SourceTable ST
inner join JoinTable JT on
JT.SourceTableID = ST.SourceTableID
inner join SecondJoinTable SJT on
ST.SourceTableID = SJT.SourceTableID
where
ST.SourceTableID = X
and JT.ColumnName3 = Y
and JT.Column3 = SJT.Column4
One tip, get yourself a copy of SQL Prompt from Red Gate. You can customise the tool to use your desired layout preferences, and then the coders in your shop can all use it to ensure the same coding standards are being adopted by everyone.
Solution 2
Late answer, but hopefully useful.
My experience working as part of the larger development team is that you can go ahead and define any standards you like, but the problem is actually enforcing these or making it very easy for developers to implement.
As developers we sometimes create something that works and then say “I’ll format it later”, but that later never comes.
Initially, we used SQL Prompt (it was great) for this, but then switched to ApexSQL Refactor, because it’s a free tool.
Solution 3
I'm late to the party, but I'll just add my preferred formatting style, which I must've learned from books and manuals: it's compact. Here's the sample SELECT
statement:
SELECT st.column_name_1, jt.column_name_2,
sjt.column_name_3
FROM source_table AS st
INNER JOIN join_table AS jt USING (source_table_id)
INNER JOIN second_join_table AS sjt ON st.source_table_id = sjt.source_table_id
AND jt.column_3 = sjt.column_4
WHERE st.source_table_id = X
AND jt.column_name_3 = Y
In short: 8-space indentation, keywords in caps (although SO colours them better when in lowercase), no camelcase (pointless on Oracle), and line wraps when needed.
The UPDATE
:
UPDATE target_table
SET column_name_1 = @value,
column_name_2 = @value2
WHERE condition_1 = @test
And the INSERT
:
INSERT INTO target_table (column_name_1, column_name_2,
column_name_3)
VALUES (@value1, @value2, @value3)
Now, let me be the first to admit that this style has it's problems. The 8-space indent means that ORDER BY
and GROUP BY
either misalign the indent, or split the word BY
off by itself. It would also be more natural to indent the entire predicate of the WHERE
clause, but I usually align following AND
and OR
operators at the left margin. Indenting after wrapped INNER JOIN
lines is also somewhat arbitrary.
But for whatever reason, I still find it easier to read than the alternatives.
I'll finish with one of my more complex creations of late using this formatting style. Pretty much everything you'd encounter in a SELECT
statement shows up in this one. (It's also been altered to disguise its origins, and I may have introduced errors in so doing.)
SELECT term, student_id,
CASE
WHEN ((ft_credits > 0 AND credits >= ft_credits) OR (ft_hours_per_week > 3 AND hours_per_week >= ft_hours_per_week)) THEN 'F'
ELSE 'P'
END AS status
FROM (
SELECT term, student_id,
pm.credits AS ft_credits, pm.hours AS ft_hours_per_week,
SUM(credits) AS credits, SUM(hours_per_week) AS hours_per_week
FROM (
SELECT e.term, e.student_id, NVL(o.credits, 0) credits,
CASE
WHEN NVL(o.weeks, 0) > 5 THEN (NVL(o.lect_hours, 0) + NVL(o.lab_hours, 0) + NVL(o.ext_hours, 0)) / NVL(o.weeks, 0)
ELSE 0
END AS hours_per_week
FROM enrollment AS e
INNER JOIN offering AS o USING (term, offering_id)
INNER JOIN program_enrollment AS pe ON e.student_id = pe.student_id AND e.term = pe.term AND e.offering_id = pe.offering_id
WHERE e.registration_code NOT IN ('A7', 'D0', 'WL')
)
INNER JOIN student_history AS sh USING (student_id)
INNER JOIN program_major AS pm ON sh.major_code_1 = pm._major_code AND sh.division_code_1 = pm.division_code
WHERE sh.eff_term = (
SELECT MAX(eff_term)
FROM student_history AS shi
WHERE sh.student_id = shi.student_id
AND shi.eff_term <= term)
GROUP BY term, student_id, pm.credits, pm.hours
)
ORDER BY term, student_id
This abomination calculates whether a student is full-time or part-time in a given term. Regardless of the style, this one's hard to read.
Solution 4
Late, but I'll throw my hat in the ring. It takes a bit longer to write, but I find patterns emerge with the vertical alignment that make it very readable once you're used to it.
SELECT ST.ColumnName1,
JT.ColumnName2,
SJT.ColumnName3,
CASE WHEN condition1 = True
AND condition2 = True Then DoSomething
Else DoSomethingElse
END ColumnName4
FROM SourceTable AS ST
INNER
JOIN JoinTable AS JT
ON JT.SourceTableID = ST.SourceTableID
INNER
JOIN SecondJoinTable AS SJT
ON ST.SourceTableID = SJT.SourceTableID
AND JT.Column3 = SJT.Column4
LEFT
JOIN (SELECT Column5
FROM Table4
QUALIFY row_number() OVER
( PARTITION BY pField1,
pField2
ORDER BY oField1
) = 1
) AS subQry
ON SJT.Column5 = subQry.Column5
WHERE ST.SourceTableID = X
AND JT.ColumnName3 = Y
Solution 5
SELECT
a.col1 AS [Column1]
,b.col2 AS [Column2]
,c.col1 AS [Column3]
FROM
Table1 a
INNER JOIN Table2 b ON b.Id = a.bId
INNER JOIN Table3 c ON c.Id = a.cId
WHERE
a.col = X
AND b.col = Y
Uses much more lines than a lot of the examples here, but I feel it's a lot easier to understand, enables quick removal of columns/clauses/tables. It helps to take advantage of a vertically-oriented monitor.
Timbo
Mostly a web developer, MS stack. Also a photographer and cyclist.
Updated on January 14, 2021Comments
-
Timbo over 3 years
In my last job, we worked on a very database-heavy application, and I developed some formatting standards so that we would all write SQL with a common layout. We also developed coding standards, but these are more platform-specific so I'll not go into them here.
I'm interested to know what other people use for SQL formatting standards. Unlike most other coding environments, I haven't found much of a consensus online for them.
To cover the main query types:
select ST.ColumnName1, JT.ColumnName2, SJT.ColumnName3 from SourceTable ST inner join JoinTable JT on JT.SourceTableID = ST.SourceTableID inner join SecondJoinTable SJT on ST.SourceTableID = SJT.SourceTableID and JT.Column3 = SJT.Column4 where ST.SourceTableID = X and JT.ColumnName3 = Y
There was some disagreement about line feeds after
select
,from
andwhere
. The intention on the select line is to allow other operators such as "top X" without altering the layout. Following on from that, simply keeping a consistent line feed after the key query elements seemed to result in a good level of readability.Dropping the linefeed after the
from
andwhere
would be an understandable revision. However, in queries such as theupdate
below, we see that the line feed after thewhere
gives us good column alignment. Similarly, a linefeed aftergroup by
ororder by
keeps our column layouts clear and easy to read.update TargetTable set ColumnName1 = @value, ColumnName2 = @value2 where Condition1 = @test
Finally, an
insert
:insert into TargetTable ( ColumnName1, ColumnName2, ColumnName3 ) values ( @value1, @value2, @value3 )
For the most part, these don't deviate that far from the way MS SQL Server Managements Studio / query analyser write out SQL, however they do differ.
I look forward to seeing whether there is any consensus in the Stack Overflow community on this topic. I'm constantly amazed how many developers can follow standard formatting for other languages and suddenly go so random when hitting SQL.