Horrible performance using SqlCommand Async methods with large data

16,895

On a system without significant load, an async call has a slightly bigger overhead. While the I/O operation itself is asynchronous regardless, blocking can be faster than thread-pool task switching.

How much overhead? Let's look at your timing numbers. 30ms for a blocking call, 450ms for an asynchronous call. 32 kiB packet size means you need you need about fifty individual I/O operations. That means we have roughly 8ms of overhead on each packet, which corresponds pretty well with your measurements over different packet sizes. That doesn't sound like overhead just from being asynchronous, even though the asynchronous versions need to do a lot more work than the synchronous. It sounds like the synchronous version is (simplified) 1 request -> 50 responses, while the asynchronous version ends up being 1 request -> 1 response -> 1 request -> 1 response -> ..., paying the cost over and over again.

Going deeper. ExecuteReader works just as well as ExecuteReaderAsync. The next operation is Read followed by a GetFieldValue - and an interesting thing happens there. If either of the two is async, the whole operation is slow. So there's certainly something very different happening once you start making things truly asynchronous - a Read will be fast, and then the async GetFieldValueAsync will be slow, or you can start with the slow ReadAsync, and then both GetFieldValue and GetFieldValueAsync are fast. The first asynchronous read from the stream is slow, and the slowness depends entirely on the size of the whole row. If I add more rows of the same size, reading each row takes the same amount of time as if I only have one row, so it's obvious that the data is still being streamed row by row - it just seems to prefer to read the whole row at once once you start any asynchronous read. If I read the first row asynchronously, and the second synchronously - the second row being read will be fast again.

So we can see that the problem is a big size of an individual row and/or column. It doesn't matter how much data you have in total - reading a million small rows asynchronously is just as fast as synchronously. But add just a single field that's too big to fit in a single packet, and you mysteriously incur a cost at asynchronously reading that data - as if each packet needed a separate request packet, and the server couldn't just send all the data at once. Using CommandBehavior.SequentialAccess does improve the performance as expected, but the massive gap between sync and async still exists.

The best performance I got was when doing the whole thing properly. That means using CommandBehavior.SequentialAccess, as well as streaming the data explicitly:

using (var reader = await cmd.ExecuteReaderAsync(CommandBehavior.SequentialAccess))
{
  while (await reader.ReadAsync())
  {
    var data = await reader.GetTextReader(0).ReadToEndAsync();
  }
}

With this, the difference between sync and async becomes hard to measure, and changing the packet size no longer incurs the ridiculous overhead as before.

If you want good performance in edge cases, make sure to use the best tools available - in this case, stream large column data rather than relying on helpers like ExecuteScalar or GetFieldValue.

Share:
16,895

Related videos on Youtube

hcd
Author by

hcd

Updated on June 10, 2022

Comments

  • hcd
    hcd almost 2 years

    I'm having major SQL performance problems when using async calls. I have created a small case to demonstrate the problem.

    I have create a database on a SQL Server 2016 which resides in our LAN (so not a localDB).

    In that database, I have a table WorkingCopy with 2 columns:

    Id (nvarchar(255, PK))
    Value (nvarchar(max))
    

    DDL

    CREATE TABLE [dbo].[Workingcopy]
    (
        [Id] [nvarchar](255) NOT NULL, 
        [Value] [nvarchar](max) NULL, 
    
        CONSTRAINT [PK_Workingcopy] 
            PRIMARY KEY CLUSTERED ([Id] ASC)
                        WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
                              IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, 
                              ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
    

    In that table, I have inserted a single record (id='PerfUnitTest', Value is a 1.5mb string (a zip of a larger JSON dataset)).

    Now, if I execute the query in SSMS :

    SELECT [Value] 
    FROM [Workingcopy] 
    WHERE id = 'perfunittest'
    

    I immediately get the result, and I see in SQL Servre Profiler that the execution time was around 20 milliseconds. All normal.

    When executing the query from .NET (4.6) code using a plain SqlConnection :

    // at this point, the connection is already open
    var command = new SqlCommand($"SELECT Value FROM WorkingCopy WHERE Id = @Id", _connection);
    command.Parameters.Add("@Id", SqlDbType.NVarChar, 255).Value = key;
    
    string value = command.ExecuteScalar() as string;
    

    The execution time for this is also around 20-30 milliseconds.

    But when changing it to async code :

    string value = await command.ExecuteScalarAsync() as string;
    

    The execution time is suddenly 1800 ms ! Also in SQL Server Profiler, I see that the query execution duration is more than a second. Although the executed query reported by the profiler is exactly the same as the non-Async version.

    But it gets worse. If I play around with the Packet Size in the connection string, I get the following results :

    Packet size 32768 : [TIMING]: ExecuteScalarAsync in SqlValueStore -> elapsed time : 450 ms

    Packet Size 4096 : [TIMING]: ExecuteScalarAsync in SqlValueStore -> elapsed time : 3667 ms

    Packet size 512 : [TIMING]: ExecuteScalarAsync in SqlValueStore -> elapsed time : 30776 ms

    30,000 ms!! That's over a 1000x slower than the non-async version. And SQL Server Profiler reports that the query execution took over 10 seconds. That doesn't even explain where the other 20 seconds are gone to!

    Then I've switched back to the sync version and also played around with the Packet Size, and although it did impact a little the execution time, it was nowhere as dramatic as with the async version.

    As a sidenote, if it put just a small string (< 100bytes) into the value, the async query execution is just as fast as the sync version (result in 1 or 2 ms).

    I'm really baffled by this, especially since I'm using the built-in SqlConnection, not even an ORM. Also when searching around, I found nothing which could explain this behavior. Any ideas?

    • Panagiotis Kanavos
      Panagiotis Kanavos about 7 years
      Post the query. If the query is slow, you should check the query for problems. Async methods aren't slow - they don't affect the query execution time at all.
    • Panagiotis Kanavos
      Panagiotis Kanavos about 7 years
      Post your query, the execution plan and the schema of the underlying table(s). Are there any indexes? How many rows are involved? What does the execution plan show? Are you trying to query a 'varchar' with an 'nvarchar` parameter?
    • Panagiotis Kanavos
      Panagiotis Kanavos about 7 years
      BTW ExecuteScalar doesn't mean that a TOP 1 query will be executed. If your query returns multiple values, it wil still send all of them to the client. The client though will simply keep the first one and discard the rest. That's probably why reducing the packet size increases the time - your query has to return a lot more packets
    • Fildor
      Fildor about 7 years
      But he states, he has only one row in the table @PanagiotisKanavos
    • Panagiotis Kanavos
      Panagiotis Kanavos about 7 years
      @Fildor does he? ExecuteScalar doesn't mean one row, it means read only one of the returned rows. The use of a key-value table means that there could be duplicates - which is why key-value tables are considered a very bad design.
    • Fildor
      Fildor about 7 years
      @PanagiotisKanavos "In that table, I have inserted 1 record" ... I read that as "I have only one row in that table" Can you confirm that hcd?
    • Panagiotis Kanavos
      Panagiotis Kanavos about 7 years
      @Fildor yet the actual evidence of the packet sizes and timings shows that there are a lot of packets to move
    • hcd
      hcd about 7 years
      The schema is SUPER simple : 2 columns Id and Value : `CREATE TABLE [dbo].[Workingcopy]( [Id] [nvarchar](255) NOT NULL, [Value] [nvarchar](max) NULL, CONSTRAINT [PK_Workingcopy] PRIMARY KEY CLUSTERED ( [Id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] And in the table there is only 1 record. Albeit one with a large "Value" (1.5mb)
    • Panagiotis Kanavos
      Panagiotis Kanavos about 7 years
      @hcd no repro then. Where are the indexes? How many key/values have you stored in there? What is the execution plan? Post code that can actually reproduce the problem - that means post the creation script and the script that inserts the data
    • Fildor
      Fildor about 7 years
      @PanagiotisKanavos Yes, it's a 1.5 MB value ... But still, that was only my interpretation of that cited sentence. I could be mistaken.
    • Panagiotis Kanavos
      Panagiotis Kanavos about 7 years
      @hcd 1.5 MB ????? And you ask why retrieving that gets slower with decreasing packet sizes? Especially when you use the wrong query for BLOBs?
    • Magnus
      Magnus about 7 years
      Any difference if you do command.ExecuteScalarAsync().Result instead of await ?
    • Fildor
      Fildor about 7 years
      @PanagiotisKanavos That was just playing around on behalf of OP. The actual question is why async is so much slower in comparison to sync with same package size.
    • hcd
      hcd about 7 years
      Yes, only 1 record in the whole table !. And no, I am not suprised it takes slower with smaller packages sizes. What I am surprised about is that with the sync version it takes 30ms, and with the async version it takes 30 seconds for the same package size ! (1000x slower)
    • Igor
      Igor about 7 years
      @hcd - when adding to the question (your DDL in this case) please update the question instead of adding it to the comments. I have done it for you this time.
    • hcd
      hcd about 7 years
      @Magnus : same timings when using .Result instead of await ...
    • Igor
      Igor about 7 years
      @hcd - also please update the question for column id. In your original question you state its varchar but your ddl and query has it listed as nvarchar. Could you please correct it?
    • Panagiotis Kanavos
      Panagiotis Kanavos about 7 years
      Check Modifying Large-Value (max) Data in ADO.NET for the correct way of retrieving CLOBs and BLOBs. Instead of trying to read them as one big value, use GetSqlChars or GetSqlBinary to retrieve them in a streaming fashion. Also consider storing them as FILESTREAM data - there's no reason to save 1.5MB of data in a table's data page
    • Fildor
      Fildor about 7 years
      @PanagiotisKanavos While that is true, it would still be interesting to know the reason for the huge difference in timing when you would expect it to perform rather similarly. At least I would.
    • Panagiotis Kanavos
      Panagiotis Kanavos about 7 years
      @Fildor because async isn't slower. The OP is loading a 1.5MB BLOB in an inefficient way. ExecuteReaderAsync doesn't mean that the data will be streamed, it means that once the data gets retrieved in the same way as before, it will become available to the original context. The 30 sec delay appears only when the OP reduces the packet size, resulting in a lot more network traffic
    • Panagiotis Kanavos
      Panagiotis Kanavos about 7 years
      @Fildor the OP changed the packet size. That means, a lot more packets have to go back and forth - 64 times more. 64 * 0.450 = 28.8 seconds
    • Fildor
      Fildor about 7 years
      @PanagiotisKanavos That's not correct. OP writes sync : 20-30 ms and async with everything else same 1800 ms. The effect of changing the packet size is totally clear and expected.
    • hcd
      hcd about 7 years
      @PanagiotisKanavos Well actually, the SQL profiler shows that the execution time for the async version is way slower than the sync version. And this is what puzzles me. I'm trying to figure out now how i can post screenshots here :)
    • Igor
      Igor about 7 years
      @hcd - there is a toolbar button to upload an image in your question (you have to edit the question to do this).
    • Luaan
      Luaan about 7 years
      Can you include the connection string (without the password if any, of course)? Is there something non-standard about the endpoint configuration?
    • Kuba Wyrostek
      Kuba Wyrostek about 7 years
      @hcd it seems that you could remove the part about your attempts to alter package sizes since it seem irrelevant to the problem and causes confusion among some commenters.
    • Admin
      Admin about 7 years
      Async is always going to add more time to a call, and isn't really advisable for operations that return quickly. The time it takes to spin up the task, get it scheduled, and start execution all increase the time to return the result. Having said this, the timings observed are definitely crazy.
    • Paulo Morgado
      Paulo Morgado about 7 years
      Is there a synchronization context present when the code is executing? If so, is ConfigureAwait being used everywhere it should?
    • Zenilogix
      Zenilogix about 7 years
      Wondering if the increase in elapsed time is an artifact of threading and thread scheduling. The overall task has been split between two threads - the application's thread and the worker thread asynchronously executing the query. They have to synchronize/rendezvous to exchange the result data, which is going to add some overhead, but there can also be latency - each thread doing a whole lot of extra waiting.
    • hcd
      hcd about 7 years
      Thnx for all the comments. I've just verified and accepted the answer below from @Luaan.
  • Wiktor Zychla
    Wiktor Zychla about 7 years
    Great answer. Reproduced the OP's scenario. For this 1.5m string OP is mentioning, I get 130ms for the sync version vs 2200ms for async. With your approach, the measured time for the 1.5m string is 60ms, not bad.
  • Adam Houldsworth
    Adam Houldsworth about 7 years
    Good investigations there, plus I learnt a handful of other tuning techniques for our DAL code.
  • hcd
    hcd about 7 years
    Just got back to the office and tried the code on my example instead of the ExecuteScalarAsync, but I still got 30seconds execution time with a 512byte packet size :(
  • hcd
    hcd about 7 years
    Aha, it did work after all :) But I have to add the CommandBehavior.SequentialAccess to this line : using (var reader = await command.ExecuteReaderAsync(CommandBehavior.SequentialAccess)‌​)
  • Luaan
    Luaan about 7 years
    @hcd My bad, I had it in the text but not in the sample code :)
  • Krishna Gupta
    Krishna Gupta almost 4 years
    The question is meaningful, but this answer is a gem! And a beautiful line: "If you want good performance in edge cases, make sure to use the best tools available". Thanks!
  • Bharat Bhushan
    Bharat Bhushan almost 3 years
    its working fine for single value, but how can we read/get complete row, as there are multiple scenario where we need to get multiple columns from multiple tables using join? Thanks in advance!
  • Luaan
    Luaan almost 3 years
    @BharatBhushan That's what the argument to (in this case) GetTextReader is - it's the ordinal of the column in the result set you want :)