Import multiple delimited text files into a SQL Server database and automatically create tables

11,972

Solution 1

This script allows you to import multiple delimited text files into a SQL database. The tables into which the data is imported, including all required columns, are created automatically. The script includes some documentation.

/*
**  This file was created by Laurens Bogaardt, Advisor Data Analytics at EY Amsterdam on 2016-11-03.
**  This script allows you to import multiple delimited text files into a SQL database. The tables 
**  into which the data is imported, including all required columns, are created automatically. This 
**  script uses tab-delimited (tsv) files and SQL Server Management Studio. The script may need some 
**  minor adjustments for other formats and tools. The scripts makes several assumptions which need 
**  to be valid before it can run properly. First of all, it assumes none of the output tables exist 
**  in the SQL tool before starting. Therefore, it may be necessary to clean the database and delete 
**  all the existing tables. Secondly, the script assumes that, if multiple text files are imported 
**  into the same output table, the number and order of the columns of these files is identical. If 
**  this is not the case, some manual work may need to be done to the text files before importing.
**  Finally, please note that this script only imports data as strings (to be precise, as NVARCHAR's
**  of length 255). It does not allow you to specify the datatype per column. This would need to be 
**  done using another script after importing the data as strings.
*/

-- 1.   Import Multiple Delimited Text Files into a SQL Database

-- 1.1  Define the path to the input and define the terminators

/*
**  In this section, some initial parameters are set. Obviously, the 'DatabaseName' refers to the 
**  database in which you want to create new tables. The '@Path' parameter sets the folder in 
**  which the text files are located which you want to import. Delimited files are defined by 
**  two characters: one which separates columns and one which separates rows. Usually, the 
**  row-terminator is the newline character CHAR(10), also given by '\n'. When files are created 
**  in Windows, the row-terminator often includes a carriage return CHAR(13), also given by '\r\n'. 
**  Often, a tab is used to separate each column. This is given by CHAR(9) or by the character '\t'. 
**  Other useful characters include the comma CHAR(44), the semi-colon CHAR(59) and the pipe 
**  CHAR(124).
*/

USE [DatabaseName]
DECLARE @Path NVARCHAR(255) = 'C:\\PathToFiles\\'
DECLARE @RowTerminator NVARCHAR(5) = CHAR(13) + CHAR(10)
DECLARE @ColumnTerminator NVARCHAR(5) = CHAR(9)

-- 1.2  Define the list of input and output in a temporary table

/*
**  In this section, a temporary table is created which lists all the filenames of the delimited 
**  files which need to be imported, as well as the names of the tables which are created and into 
**  which the data is imported. Multiple files may be imported into the same output table. Each row 
**  is prepended with an integer which increments up starting from 1. It is essential that this 
**  number follows this logic. The temporary table is deleted at the end of this script.
*/

IF OBJECT_ID('[dbo].[Files_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Files_Temporary];
CREATE TABLE [dbo].[Files_Temporary]
(
    [ID] INT
    , [FileName] NVARCHAR(255)
    , [TableName] NVARCHAR(255)
);

INSERT INTO [dbo].[Files_Temporary] SELECT 1,   'MyFileA.txt',  'NewTable1'
INSERT INTO [dbo].[Files_Temporary] SELECT 2,   'MyFileB.txt',  'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 3,   'MyFileC.tsv',  'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 4,   'MyFileD.csv',  'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 5,   'MyFileE.dat',  'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 6,   'MyFileF',      'NewTable3'
INSERT INTO [dbo].[Files_Temporary] SELECT 7,   'MyFileG.text', 'NewTable4'
INSERT INTO [dbo].[Files_Temporary] SELECT 8,   'MyFileH.txt',  'NewTable5'
INSERT INTO [dbo].[Files_Temporary] SELECT 9,   'MyFileI.txt',  'NewTable5'
INSERT INTO [dbo].[Files_Temporary] SELECT 10,  'MyFileJ.txt',  'NewTable5'
INSERT INTO [dbo].[Files_Temporary] SELECT 11,  'MyFileK.txt',  'NewTable6'

-- 1.3  Loop over the list of input and output and import each file to the correct table

/*
**  In this section, the 'WHILE' statement is used to loop over all input files. A counter is defined 
**  which starts at '1' and increments with each iteration. The filename and tablename are retrieved 
**  from the previously defined temporary table. The next step of the script is to check whether the 
**  output table already exists or not.
*/

DECLARE @Counter INT = 1

WHILE @Counter <= (SELECT COUNT(*) FROM [dbo].[Files_Temporary])
BEGIN
    PRINT 'Counter is ''' + CONVERT(NVARCHAR(5), @Counter) + '''.'

    DECLARE @FileName NVARCHAR(255)
    DECLARE @TableName NVARCHAR(255)
    DECLARE @Header NVARCHAR(MAX)
    DECLARE @SQL_Header NVARCHAR(MAX)
    DECLARE @CreateHeader NVARCHAR(MAX) = ''
    DECLARE @SQL_CreateHeader NVARCHAR(MAX)

    SELECT @FileName = [FileName], @TableName = [TableName] FROM [dbo].[Files_Temporary] WHERE [ID] = @Counter

    IF OBJECT_ID('[dbo].[' + @TableName + ']', 'U') IS NULL
    BEGIN
/*
**  If the output table does not yet exist, it needs to be created. This requires the list of all 
**  columnnames for that table to be retrieved from the first line of the text file, which includes 
**  the header. A piece of SQL code is generated and executed which imports the header of the text 
**  file. A second temporary table is created which stores this header as a single string.
*/
        PRINT 'Creating new table with name ''' + @TableName + '''.'

        IF OBJECT_ID('[dbo].[Header_Temporary]', 'U') IS NOT NULL
        DROP TABLE [dbo].[Header_Temporary];
        CREATE TABLE [dbo].[Header_Temporary]
        (
            [Header] NVARCHAR(MAX)
        );

        SET @SQL_Header = '
            BULK INSERT [dbo].[Header_Temporary]
            FROM ''' + @Path + @FileName + '''
            WITH
            (
                FIRSTROW = 1,
                LASTROW = 1,
                MAXERRORS = 0,
                FIELDTERMINATOR = ''' + @RowTerminator + ''',
                ROWTERMINATOR = ''' + @RowTerminator + '''
            )'
        EXEC(@SQL_Header)

        SET @Header = (SELECT TOP 1 [Header] FROM [dbo].[Header_Temporary])
        PRINT 'Extracted header ''' + @Header + ''' for table ''' + @TableName + '''.'
/*
**  The columnnames in the header are separated using the column-terminator. This can be used to loop 
**  over each columnname. A new piece of SQL code is generated which will create the output table 
**  with the correctly named columns.
*/
        WHILE CHARINDEX(@ColumnTerminator, @Header) > 0
        BEGIN          
            SET @CreateHeader = @CreateHeader + '[' + LTRIM(RTRIM(SUBSTRING(@Header, 1, CHARINDEX(@ColumnTerminator, @Header) - 1))) + '] NVARCHAR(255), '
            SET @Header = SUBSTRING(@Header, CHARINDEX(@ColumnTerminator, @Header) + 1, LEN(@Header)) 
        END
        SET @CreateHeader = @CreateHeader + '[' + @Header + '] NVARCHAR(255)'

        SET @SQL_CreateHeader = 'CREATE TABLE [' + @TableName + '] (' + @CreateHeader + ')'
        EXEC(@SQL_CreateHeader)
    END

/*
**  Finally, the data from the text file is imported into the newly created table. The first line, 
**  including the header information, is skipped. If multiple text files are imported into the same 
**  output table, it is essential that the number and the order of the columns is identical, as the 
**  table will only be created once, using the header information of the first text file.
*/
    PRINT 'Inserting data from ''' + @FileName + ''' to ''' + @TableName + '''.'
    DECLARE @SQL NVARCHAR(MAX)
    SET @SQL = '
        BULK INSERT [dbo].[' + @TableName + ']
        FROM ''' + @Path + @FileName + '''
        WITH
        (
            FIRSTROW = 2,
            MAXERRORS = 0,
            FIELDTERMINATOR = ''' + @ColumnTerminator + ''',
            ROWTERMINATOR = ''' + @RowTerminator + '''
        )'
    EXEC(@SQL)

    SET @Counter = @Counter + 1
END;

-- 1.4  Cleanup temporary tables

/*
**  In this section, the temporary tables which were created and used by this script are deleted. 
**  Alternatively, the script could have used 'real' temporary table (identified by the '#' character 
**  in front of the name) or a table variable. These would have deleted themselves once they were no 
**  longer in use. However, the end result is the same.
*/

IF OBJECT_ID('[dbo].[Files_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Files_Temporary];

IF OBJECT_ID('[dbo].[Header_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Header_Temporary];

Solution 2

Inserting whole folder with delimited .txt or .csv files into SQL server

Note: Don't be scared of lenghty script you'll see here. There is only 3 variables that needs changing and whole script should work fine.

This solution is upgrade from accepted answer (@LBogaardt) and it also implement suggestion from @Chendur Mar for getting all files from folder.

My additions:

  • this solution works with UTF-8 files
  • imports in NVARCHAR(MAX) instead in NVARCHAR(255) - you can change that if you wish
  • has error logging implemented so you can see on which row import breaks in case that happens

STEP 1 - ENABLE xp_cmdshell

See here on how to do that.

STEP 2 - ENABLE permissions for everyone on Import folder

Keep on your mind - Import folder is remote folder on server. So you need to create folder on server and upload your files there. Set permissions on that folder following this.

STEP 3 - Edit script parameters and execute it

You only have to change first 4 lines:

  1. Line - instead of yourDatabase, put name of your database

  2. Line - define location of your Import folder where your .txt .csv files are

  3. Line - define row terminator which is most likely new line(\n) so keep it as is

  4. Line - define separator for your files - if you use comma than put CHAR(44) or ','. CHAR(9) is TAB.

Script:

USE yourDatabase
DECLARE @Location NVARCHAR(MAX) = 'C:\Users\username\Desktop\Import\';
DECLARE @RowTerminator NVARCHAR(5) = '\n';
DECLARE @ColumnTerminator NVARCHAR(5) = CHAR(9);

DECLARE @SQLINSERT NVARCHAR(MAX);


-- 1.2  Define the list of input and output in a temporary table

/*
**  In this section, a temporary table is created which lists all the filenames of the delimited
**  files which need to be imported, as well as the names of the tables which are created and into
**  which the data is imported. Multiple files may be imported into the same output table. Each row
**  is prepended with an integer which increments up starting from 1. It is essential that this
**  number follows this logic. The temporary table is deleted at the end of this script.
*/

IF OBJECT_ID('[dbo].[Files_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Files_Temporary];
CREATE TABLE [dbo].[Files_Temporary]
(
    [ID] INT identity (1,1) primary key
    , [FileName] NVARCHAR(max)
    , [TableName] NVARCHAR(max)
);

--insert names into  [dbo].[Files_Temporary] 
SET @SQLINSERT = 'INSERT INTO [dbo].[Files_Temporary] (filename) exec master.dbo.xp_cmdshell' + char(39) + ' dir ' + @Location + ' /b /a-d' + char(39)
EXEC(@SQLINSERT)
------Update table names eliminating the file extension-------
update [dbo].[Files_Temporary] set [TableName]= SUBSTRING(filename,0, CHARINDEX('.',filename))

-- 1.3  Loop over the list of input and output and import each file to the correct table

/*
**  In this section, the 'WHILE' statement is used to loop over all input files. A counter is defined
**  which starts at '1' and increments with each iteration. The filename and tablename are retrieved
**  from the previously defined temporary table. The next step of the script is to check whether the
**  output table already exists or not.
*/

DECLARE @Counter INT = 1

WHILE @Counter <= (SELECT COUNT(*) FROM [dbo].[Files_Temporary])
BEGIN
    PRINT 'Counter is ''' + CONVERT(NVARCHAR(5), @Counter) + '''.'

    DECLARE @FileName NVARCHAR(MAX)
    DECLARE @TableName NVARCHAR(MAX)
    DECLARE @Header NVARCHAR(MAX)
    DECLARE @SQL_Header NVARCHAR(MAX)
    DECLARE @CreateHeader NVARCHAR(MAX) = ''
    DECLARE @SQL_CreateHeader NVARCHAR(MAX)

    SELECT @FileName = [FileName], @TableName = [TableName] FROM [dbo].[Files_Temporary] WHERE [ID] = @Counter

    IF OBJECT_ID('[dbo].[' + @TableName + ']', 'U') IS NULL
    BEGIN
/*
**  If the output table does not yet exist, it needs to be created. This requires the list of all
**  columnnames for that table to be retrieved from the first line of the text file, which includes
**  the header. A piece of SQL code is generated and executed which imports the header of the text
**  file. A second temporary table is created which stores this header as a single string.
*/
        PRINT 'Creating new table with name ''' + @TableName + '''.'

        IF OBJECT_ID('[dbo].[Header_Temporary]', 'U') IS NOT NULL
        DROP TABLE [dbo].[Header_Temporary];
        CREATE TABLE [dbo].[Header_Temporary]
        (
            [Header] NVARCHAR(MAX)
        );

        SET @SQL_Header = '
            BULK INSERT [dbo].[Header_Temporary]
            FROM ''' + @Location + @FileName + '''
            WITH
            (
                FIRSTROW = 1,
                LASTROW = 1,
                MAXERRORS = 0,
                FIELDTERMINATOR = ''' + @RowTerminator + ''',
                ROWTERMINATOR = ''' + @RowTerminator + '''
            )'
        EXEC(@SQL_Header)

        SET @Header = (SELECT TOP 1 [Header] FROM [dbo].[Header_Temporary])
        PRINT 'Extracted header ''' + @Header + ''' for table ''' + @TableName + '''.'
/*
**  The columnnames in the header are separated using the column-terminator. This can be used to loop
**  over each columnname. A new piece of SQL code is generated which will create the output table
**  with the correctly named columns.
*/
        WHILE CHARINDEX(@ColumnTerminator, @Header) > 0
        BEGIN
            SET @CreateHeader = @CreateHeader + '[' + LTRIM(RTRIM(SUBSTRING(@Header, 1, CHARINDEX(@ColumnTerminator, @Header) - 1))) + '] NVARCHAR(MAX), '
            SET @Header = SUBSTRING(@Header, CHARINDEX(@ColumnTerminator, @Header) + 1, LEN(@Header))
        END
        SET @CreateHeader = @CreateHeader + '[' + @Header + '] NVARCHAR(MAX)'

        SET @SQL_CreateHeader = 'CREATE TABLE [ESCO].[' + @TableName + '] (' + @CreateHeader + ')'
        EXEC(@SQL_CreateHeader)
    END

/*
**  Finally, the data from the text file is imported into the newly created table. The first line,
**  including the header information, is skipped. If multiple text files are imported into the same
**  output table, it is essential that the number and the order of the columns is identical, as the
**  table will only be created once, using the header information of the first text file.
*/
    --bulk insert

    PRINT 'Inserting data from ''' + @FileName + ''' to ''' + @TableName + '''.'
    DECLARE @SQL NVARCHAR(MAX)
    SET @SQL = '
        BULK INSERT [dbo].[' + @TableName + ']
        FROM ''' + @Location + @FileName + '''
        WITH
        (
            FIRSTROW = 2,
            MAXERRORS = 0,
            FIELDTERMINATOR = ''' + @ColumnTerminator + ''',
            ROWTERMINATOR = ''' + @RowTerminator + ''',
            CODEPAGE = ''65001'',
            DATAFILETYPE = ''Char'',
            ERRORFILE = ''' + @Location + 'ImportLog.log''
        )'
    EXEC(@SQL)

    SET @Counter = @Counter + 1
END;

-- 1.4  Cleanup temporary tables

/*
**  In this section, the temporary tables which were created and used by this script are deleted.
**  Alternatively, the script could have used 'real' temporary table (identified by the '#' character
**  in front of the name) or a table variable. These would have deleted themselves once they were no
**  longer in use. However, the end result is the same.
*/

IF OBJECT_ID('[dbo].[Files_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Files_Temporary];

IF OBJECT_ID('[dbo].[Header_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Header_Temporary];

On the end disable xp_cmdshell and delete Import folder.

Share:
11,972
LBogaardt
Author by

LBogaardt

Updated on June 05, 2022

Comments

  • LBogaardt
    LBogaardt almost 2 years

    I have multiple delimited text files (e.g. .csv files), each containing columns, rows and a header.

    I want to import all of these input files into SQL Server with as much ease as possible. Specifically, I want to create the output tables into which I will import these files on the fly.

    Some of these input files will need to be imported into one and the same output table, while others will need to be imported to different tables. You can assume that all files which will be imported into the same table have the same header.

    SQL Server Management Studio has an Import Wizard which allows you to import delimited text files (and other formats) and automatically create the output table. However, this does not allow you to import multiple files at the same time. Furthermore, it requires a lot of manual work and is not replicable.

    One can find many scripts online which import multiple text files into a table. However, most of these require the output table to be created first. This too, requires additional work per table.

    Is there a way to list all relevant input files and their corresponding output table, such that the tables are created automatically followed by the importing of the data?

  • Chendur
    Chendur about 7 years
    This is such a well written answer. A small tweak I would suggest is rather than inserting the file and table names manually the below statement could be used to populate the table automatically, ------Insert file names ------ insert into Files_Temporary (filename) exec master..xp_cmdshell 'dir <<Folder path where the files reside>> /b /a-d' ------Update table names eliminating the file extension------- update Files_Temporary set [TableName]= SUBSTRING(filename,0, CHARINDEX('.',filename))