SQL Import skip duplicates

sql duplicates bulkinsert

10,775

Solution 1

Try to bulk insert the data to the temporary table and then SELECT DISTINCT as @madcolor suggested or

INSERT INTO yourTable
SELECT * FROM #tempTable tt
WHERE NOT EXISTS (SELECT 1 FROM youTable yt WHERE yt.id = tt.id)

or other field in WHERE clause.

Solution 2

If you're doing this through some SQL tool like SQL Plus or DBVis or Toad, then I suspect not. If you're doing this programatically in a language, then you need to divide and conquer. Presumably executing an update line by line and catching each exception would be too lengthy a process, so instead you could do a batch operation first on the whole SQL block, and if it fails, do it on the first half, and if that fails, do it on the first half of the first half. Iterate this way until you have a block that succeeds. Discard the block and do the same procedure on the rest of the SQL. Anything that violates a constraint will eventually end up as a sole SQL statement which you know to log and discard. This should import with as much bulk processing as is possible while still throwing out the invalid lines.

10,775

Author by

Karl

Updated on June 04, 2022

Comments

Karl almost 2 years

I am trying to do a bulk upload into a SQL server DB. The source file has duplicates which I want to remove, so I was hoping that the operation would automatically upload the first one, then discard the rest. (I've set a unique key constraint). Problem is, the moment a duplicate upload is attempted the whole thing fails and gets rolled back. Is there any way I can just tell SQL to keep going?