How do I optimize Upsert (Update and Insert) operation within SSIS package?
The two things I'd look at are your inserts (ensure you are using either the "Table or View - fast load" or "Table name or view name variable - fast load") and your updates.
As you have correctly determined, the update logic is usually where performance falls down and that is due to the OLE DB component firing singleton updates for each row flowing through it. The usual approach people take to overcome this is to write all the updates to a staging table, much as your Insert logic does. Then follow up your Data Flow Task
with an Execute SQL Task
to perform a bulk Update.
If you are in the mind of acquiring 3rd party tools, PragmaticWorks offers an Upsert destination
TCCV
Someone that asks questions that they should already know the answers to - and will probably figure it out 10 minutes after posting :)
Updated on August 07, 2022Comments
-
TCCV over 1 year
I am not a DBA but I do work for a small company as the IT person. I have to replicate a database from staging to production. I have created an SSIS package to do this but it takes hours to run. This isn't a large data warehouse type of project, either, it's a pretty straightforward
Upsert
. I'm assuming that I am the weak link in how I designed it.Here's my procedure:
- Truncate staging tables (
EXECUTE SQL TASK
) - Pull data from a development table into staging (
Data Flow Task
) - Run a data flow task
OLE DB Source
Conditional Split Transformation
(Condition used:[!]ISNULL(is_new_flag)
)- If new insert, if existing update
The data flow task is mimicked a few times to change tables/values but the flow is the same. I've read several things about OLE DB components being slow to updates being slow and have tried a few things but haven't gotten it to run very quickly.
I'm not sure what other details to give, but I can give anything that's asked for.
- Truncate staging tables (
-
TCCV about 11 yearsThis is interesting. To be a bit dumb, how would I go about doing the batch update? I only know of the singletons.
-
billinkc about 11 yearsThe OLE DB Component doesn't offer a batch update option. If you use it, you're firing single statements. The OLE DB Destination can also operate in the same fashion if you do not select the fast load option. I'll try and update with pictures after these meetings
-
TCCV about 11 yearsI think I get it, It'll be a bit before I can test it, but I think this'll do it.
-
billinkc about 11 yearsPersonally, I'd accept Siva's answer. Far more detailed as this meeting is sucking my will to live
-
Edmund Schweppe about 11 years@billinkc I wish there was some way we could downvote your meeting :-(
-
TCCV about 11 yearsYou both deserve major Kudos. I just tested it with one data flow and it's screaming fast compared to before. I gave you both upvotes but I didn't see Siva's answer until now, so I'll keep the check with billinkc