SQL formatting standards

46,795

Solution 1

I am of the opinion that so long as you can read the source code easily, the formatting is secondary. So long as this objective is achieved, there are a number of good layout styles that can be adopted.

The only other aspect that is important to me is that whatever coding layout/style you choose to adopt in your shop, ensure that it is consistently used by all coders.

Just for your reference, here is how I would present the example you provided, just my layout preference. Of particular note, the ON clause is on the same line as the join, only the primary join condition is listed in the join (i.e. the key match) and other conditions are moved to the where clause.

select
    ST.ColumnName1,
    JT.ColumnName2,
    SJT.ColumnName3
from 
    SourceTable ST
inner join JoinTable JT on 
    JT.SourceTableID = ST.SourceTableID
inner join SecondJoinTable SJT on 
    ST.SourceTableID = SJT.SourceTableID
where
        ST.SourceTableID = X
    and JT.ColumnName3 = Y
    and JT.Column3 = SJT.Column4

One tip, get yourself a copy of SQL Prompt from Red Gate. You can customise the tool to use your desired layout preferences, and then the coders in your shop can all use it to ensure the same coding standards are being adopted by everyone.

Solution 2

Late answer, but hopefully useful.

My experience working as part of the larger development team is that you can go ahead and define any standards you like, but the problem is actually enforcing these or making it very easy for developers to implement.

As developers we sometimes create something that works and then say “I’ll format it later”, but that later never comes.

Initially, we used SQL Prompt (it was great) for this, but then switched to ApexSQL Refactor, because it’s a free tool.

Solution 3

I'm late to the party, but I'll just add my preferred formatting style, which I must've learned from books and manuals: it's compact. Here's the sample SELECT statement:

SELECT  st.column_name_1, jt.column_name_2,
        sjt.column_name_3
FROM    source_table AS st
        INNER JOIN join_table AS jt USING (source_table_id)
        INNER JOIN second_join_table AS sjt ON st.source_table_id = sjt.source_table_id
                AND jt.column_3 = sjt.column_4
WHERE   st.source_table_id = X
AND     jt.column_name_3 = Y

In short: 8-space indentation, keywords in caps (although SO colours them better when in lowercase), no camelcase (pointless on Oracle), and line wraps when needed.

The UPDATE:

UPDATE  target_table
SET     column_name_1 = @value,
        column_name_2 = @value2
WHERE   condition_1 = @test

And the INSERT:

INSERT  INTO target_table (column_name_1, column_name_2,
                column_name_3)
VALUES  (@value1, @value2, @value3)

Now, let me be the first to admit that this style has it's problems. The 8-space indent means that ORDER BY and GROUP BY either misalign the indent, or split the word BY off by itself. It would also be more natural to indent the entire predicate of the WHERE clause, but I usually align following AND and OR operators at the left margin. Indenting after wrapped INNER JOIN lines is also somewhat arbitrary.

But for whatever reason, I still find it easier to read than the alternatives.

I'll finish with one of my more complex creations of late using this formatting style. Pretty much everything you'd encounter in a SELECT statement shows up in this one. (It's also been altered to disguise its origins, and I may have introduced errors in so doing.)

SELECT  term, student_id,
        CASE
            WHEN ((ft_credits > 0 AND credits >= ft_credits) OR (ft_hours_per_week > 3 AND hours_per_week >= ft_hours_per_week)) THEN 'F'
            ELSE 'P'
        END AS status
FROM    (
        SELECT  term, student_id,
                pm.credits AS ft_credits, pm.hours AS ft_hours_per_week,
                SUM(credits) AS credits, SUM(hours_per_week) AS hours_per_week
        FROM    (
                SELECT  e.term, e.student_id, NVL(o.credits, 0) credits,
                        CASE
                            WHEN NVL(o.weeks, 0) > 5 THEN (NVL(o.lect_hours, 0) + NVL(o.lab_hours, 0) + NVL(o.ext_hours, 0)) / NVL(o.weeks, 0)
                            ELSE 0
                        END AS hours_per_week
                FROM    enrollment AS e
                        INNER JOIN offering AS o USING (term, offering_id)
                        INNER JOIN program_enrollment AS pe ON e.student_id = pe.student_id AND e.term = pe.term AND e.offering_id = pe.offering_id
                WHERE   e.registration_code NOT IN ('A7', 'D0', 'WL')
                )
                INNER JOIN student_history AS sh USING (student_id)
                INNER JOIN program_major AS pm ON sh.major_code_1 = pm._major_code AND sh.division_code_1 = pm.division_code
        WHERE   sh.eff_term = (
                        SELECT  MAX(eff_term)
                        FROM    student_history AS shi
                        WHERE   sh.student_id = shi.student_id
                        AND     shi.eff_term <= term)
        GROUP   BY term, student_id, pm.credits, pm.hours
        )
ORDER   BY term, student_id

This abomination calculates whether a student is full-time or part-time in a given term. Regardless of the style, this one's hard to read.

Solution 4

Late, but I'll throw my hat in the ring. It takes a bit longer to write, but I find patterns emerge with the vertical alignment that make it very readable once you're used to it.

SELECT ST.ColumnName1,
       JT.ColumnName2,
       SJT.ColumnName3,
       CASE WHEN condition1 = True 
             AND condition2 = True Then DoSomething
            Else DoSomethingElse
        END ColumnName4
  FROM SourceTable AS ST
 INNER
  JOIN JoinTable AS JT
    ON JT.SourceTableID = ST.SourceTableID
 INNER
  JOIN SecondJoinTable AS SJT
    ON ST.SourceTableID = SJT.SourceTableID
   AND JT.Column3 = SJT.Column4
  LEFT
  JOIN (SELECT Column5
          FROM Table4
       QUALIFY row_number() OVER
                 ( PARTITION BY pField1,
                                pField2
                       ORDER BY oField1
                 ) = 1
       ) AS subQry
    ON SJT.Column5 = subQry.Column5
 WHERE ST.SourceTableID = X
   AND JT.ColumnName3 = Y

Solution 5

SELECT
    a.col1                  AS [Column1]
    ,b.col2                 AS [Column2]
    ,c.col1                 AS [Column3]
FROM
    Table1 a
    INNER JOIN Table2 b     ON b.Id = a.bId
    INNER JOIN Table3 c     ON c.Id = a.cId
WHERE
    a.col     = X
    AND b.col = Y

Uses much more lines than a lot of the examples here, but I feel it's a lot easier to understand, enables quick removal of columns/clauses/tables. It helps to take advantage of a vertically-oriented monitor.

Share:
46,795
Timbo
Author by

Timbo

Mostly a web developer, MS stack. Also a photographer and cyclist.

Updated on January 14, 2021

Comments

  • Timbo
    Timbo over 3 years

    In my last job, we worked on a very database-heavy application, and I developed some formatting standards so that we would all write SQL with a common layout. We also developed coding standards, but these are more platform-specific so I'll not go into them here.

    I'm interested to know what other people use for SQL formatting standards. Unlike most other coding environments, I haven't found much of a consensus online for them.

    To cover the main query types:

    select
        ST.ColumnName1,
        JT.ColumnName2,
        SJT.ColumnName3
    from 
        SourceTable ST
    inner join JoinTable JT
        on JT.SourceTableID = ST.SourceTableID
    inner join SecondJoinTable SJT
        on ST.SourceTableID = SJT.SourceTableID
        and JT.Column3 = SJT.Column4
    where
        ST.SourceTableID = X
        and JT.ColumnName3 = Y
    

    There was some disagreement about line feeds after select, from and where. The intention on the select line is to allow other operators such as "top X" without altering the layout. Following on from that, simply keeping a consistent line feed after the key query elements seemed to result in a good level of readability.

    Dropping the linefeed after the from and where would be an understandable revision. However, in queries such as the update below, we see that the line feed after the where gives us good column alignment. Similarly, a linefeed after group by or order by keeps our column layouts clear and easy to read.

    update
        TargetTable
    set
        ColumnName1 = @value,
        ColumnName2 = @value2
    where
        Condition1 = @test
    

    Finally, an insert:

    insert into TargetTable (
        ColumnName1,
        ColumnName2,
        ColumnName3
    ) values (
        @value1,
        @value2,
        @value3
    )
    

    For the most part, these don't deviate that far from the way MS SQL Server Managements Studio / query analyser write out SQL, however they do differ.

    I look forward to seeing whether there is any consensus in the Stack Overflow community on this topic. I'm constantly amazed how many developers can follow standard formatting for other languages and suddenly go so random when hitting SQL.