Break up a SQL Server 2008 query into batches
Solution 1
SET NOCOUNT ON;
CREATE TABLE [dbo].[SyncAudit] ( PkId INT, BatchNumber INT)
DECLARE @batchsize INT
,@rowcount INT
,@batchcount INT
,@rootdir VARCHAR(2048)
,@saveas VARCHAR(2048)
,@query VARCHAR(2048)
,@bcpquery VARCHAR(2048)
,@bcpconn VARCHAR(64)
,@bcpdelim VARCHAR(2)
SET @rootdir = '\\SERVER1\SHARE1\FOLDER\'
SET @batchsize = 2000
SET @bcpdelim = '|'
SET @bcpconn = '-T' -- Trusted
--SET @bcpconn = '-U <username> -P <password>' -- SQL authentication
SELECT @rowcount = COUNT(1),
@batchcount = CEILING(COUNT(1)/@batchsize) FROM <@TableName, string, 'SomeTable'>
SELECT [BatchSize] = @BatchSize, [BatchCount] = @Batchcount
INSERT INTO SyncAudit
SELECT
<@TableKey, string, 'PKField'>
,groupnum = NTILE(@batchcount) OVER ( ORDER BY <@TableKey, string, 'PKField'>)
FROM
<@TableName, string, 'SomeTable'>
WHILE (@batchcount > 0)
BEGIN
SET @saveas = @rootdir + 'batchnumber-' + cast(@batchcount as varchar) + '.txt'
SET @query = ' SELECT [<@TableName, string, 'SomeTable'>].*
FROM [' + db_name() + '].[dbo].[<@TableName, string, 'SomeTable'>]
JOIN [' + db_name() + '].[dbo].[SyncAudit]
ON [<@TableName, string, 'SomeTable'>].<@TableKey, string, 'PKField'> = [SyncAudit].PkId
AND [SyncAudit].BatchNumber = ' + cast(@batchcount as varchar) + ''
SET @bcpquery = 'bcp "' + replace(@query, char(10), '') + '" QUERYOUT "' + @saveas + '" -c -t^' + @bcpdelim + ' ' + @bcpconn + ' -S ' + @@servername
EXEC master..xp_cmdshell @bcpquery
--EXEC (@query)
SET @batchcount = @batchcount -1
END
DROP TABLE [dbo].[SyncAudit] -- or leave for reference
Solution 2
I think you can take advantage of using ROW_NUMBER and then using BETWEEN to specify a range of rows that you like. Alternatively you could use PKID if you knew there wasn't gaps, or didn't care about the gaps
e.g.
SELECT ...
FROM
(SELECT ...
ROW_NUMBER() OVER(ORDER BY PKID ) as RowNum
FROM Sometable e
) t
WHERE RowNum BETWEEN @startRowIndex AND (@startRowIndex + @maximumRows) - 1
This is often used for paging results. 4GuysFromRolla have a good article on it
Solution 3
You could work out the ranges in a while @@ROWCOUNT loop to target the rows required. It may work better than ROW_NUMBER() which would have to keep numbering from the start.
declare @startid int
declare @endid int
-- get one range, these are efficient as they go over the PKID key by range
select top(1) @startid = pkid from sometable order by pkid -- 1 key visited
select top(2000) @endid = pkid from sometable order by pkid -- 2000 keys visited
-- note: top 2000 may end up with the 514th id if that is the last one
while @@ROWCOUNT > 0
begin
insert otherdb.dbo.backupcopy
select * from sometable
where pkid between @startid and @endid
select top(1) @startid = pkid from sometable
WHERE pkid > @endid -- binary locate
order by pkid
select top(2000) @endid = pkid from sometable
WHERE pkid > @endid -- binary locate, then forward range lookup, max 2000 keys
order by pkid
end
Mass Dot Net
Updated on June 04, 2022Comments
-
Mass Dot Net about 2 years
I'm trying to prepare some data for deletion by a 3rd party, and unfortunately they can only process data in batches of 2000 records. I have 100k records and may need to divide-and-export this data several more times, so I'd like to automate the process somehow.
Is there a reasonably easy way to do this using SQL Server 2008? I'm not running a complex query -- it's not too far off from
SELECT PKID FROM Sometable ORDER BY PKID
-- and while I can probably do this using a cursor, I'd like to know if there's a better way.