Automatically match columns in INSERT INTO ... SELECT ... FROM

64,322

Solution 1

If T1 and T2 match exactly you have two choices. You can either select all columns from T2 for the insert into T1, or you can provide a column list to the insert statement.

Even though when you do a select MSSQL provides column headers that information is not used by an insert statement to match columns up.

Solution 2

Always use explicit columns both in the INSERT and in the SELECT projection. Even if you don't want to, you should:

INSERT INTO T1 (C1, c2)
SELECT C1, C2 FROM T2

Solution 3

Yes, you can omit the field names for the table that you insert to, and you can use select * to get all fields from the table, but I would not recommend this approach.

If you omit the field name the fields are matched by position, not by name. If the fields are not in the exact same order, they will be mixed up. Generally you should avoid relying on the exact layout of the tables, to minimise the risk that changes in the tables breaks the queries.

Solution 4

The other answers are good but they don't explain why it is bad to use:

INSERT INTO T1
SELECT * FROM T2

In a comment the OP talks about code duplication when specifying columns when using the safer approach:

INSERT INTO T1 (C1, c2)
SELECT C1, C2 FROM T2

Yet if you don't be specific you are relying upon the number of columns always matching and the ordering of the columns doing what you expect. That logic will break if one of the tables is altered to add a column.

Also you can get into trouble with silent bugs. If you use a tables with the same number of columns but different positions:


CREATE TABLE tab1 (col1 int, col2 string);

CREATE TABLE tab2 (col1 string, col2 int);

INSERT INTO tab1 values(1, 'aaa');

INSERT INTO TABLE tab2 select * FROM tab1;

Then you might have hoped that you did a copy such that tab1 and tab2 are the same. What I wanted was:

+-------------------+-------------------+
| tab2.col1         | tab2.col2         |
+-------------------+-------------------+
| 1                 | aaa               |
+-------------------+-------------------+

But it will load based on column position and cast the data so what I get is:

+-------------------+-------------------+
| tab2.col1         | tab2.col2         |
+-------------------+-------------------+
| 1                 | NULL              |
+-------------------+-------------------+

What happened was it could not convert a string into int so it set it to NULL. It could convert the int into a string as '1' which is no longer a number type.

Even if the columns do match anyone can do:

ALTER TABLE tab1 ADD COLUMNS (col3 string COMMENT 'a new column');

After that the query that does not specify columns will break saying that the number of columns in the two tables does not match. It will no longer be able to move data into tab2.

This means that the safe thing to do is to be explicit with SQL:

INSERT INTO T1 (C1, c2)
SELECT C1, C2 FROM T2

If someone is just trying to quickly take a copy of a table then some SQL engines support

CREATE TABLE tab3 AS SELECT * FROM tab1;

In that case sure typing columns is a waste of time and if someone was to add columns to tab1 before you cloned it being explicit would fail to clone the new column. All that counter example shows is that there are no absolute rules in programming only rules of thumb. The rule of thumb for SQL (and any other loosly typed language with implicit conversions) is to be as specific as you can if you don't want silent error at runtime and bugs when someone adds new functionality down the line.

Share:
64,322
Konstantin Spirin
Author by

Konstantin Spirin

Passionate software developer

Updated on March 20, 2020

Comments

  • Konstantin Spirin
    Konstantin Spirin about 4 years

    SQL Server question. When doing

    INSERT INTO T1 SELECT (C1, C2) FROM T2
    

    I don't want to specify column names of T1 because they are the same as in T2

    Is it possible to do so?

    Currently I'm getting error

    Msg 213, Level 16, State 1, Line 1

    Column name or number of supplied values does not match table definition.