Horrible performance using SqlCommand Async methods with large data

c# .net sql-server asynchronous async-await

16,895

On a system without significant load, an async call has a slightly bigger overhead. While the I/O operation itself is asynchronous regardless, blocking can be faster than thread-pool task switching.

How much overhead? Let's look at your timing numbers. 30ms for a blocking call, 450ms for an asynchronous call. 32 kiB packet size means you need you need about fifty individual I/O operations. That means we have roughly 8ms of overhead on each packet, which corresponds pretty well with your measurements over different packet sizes. That doesn't sound like overhead just from being asynchronous, even though the asynchronous versions need to do a lot more work than the synchronous. It sounds like the synchronous version is (simplified) 1 request -> 50 responses, while the asynchronous version ends up being 1 request -> 1 response -> 1 request -> 1 response -> ..., paying the cost over and over again.

Going deeper. ExecuteReader works just as well as ExecuteReaderAsync. The next operation is Read followed by a GetFieldValue - and an interesting thing happens there. If either of the two is async, the whole operation is slow. So there's certainly something very different happening once you start making things truly asynchronous - a Read will be fast, and then the async GetFieldValueAsync will be slow, or you can start with the slow ReadAsync, and then both GetFieldValue and GetFieldValueAsync are fast. The first asynchronous read from the stream is slow, and the slowness depends entirely on the size of the whole row. If I add more rows of the same size, reading each row takes the same amount of time as if I only have one row, so it's obvious that the data is still being streamed row by row - it just seems to prefer to read the whole row at once once you start any asynchronous read. If I read the first row asynchronously, and the second synchronously - the second row being read will be fast again.

So we can see that the problem is a big size of an individual row and/or column. It doesn't matter how much data you have in total - reading a million small rows asynchronously is just as fast as synchronously. But add just a single field that's too big to fit in a single packet, and you mysteriously incur a cost at asynchronously reading that data - as if each packet needed a separate request packet, and the server couldn't just send all the data at once. Using CommandBehavior.SequentialAccess does improve the performance as expected, but the massive gap between sync and async still exists.

The best performance I got was when doing the whole thing properly. That means using CommandBehavior.SequentialAccess, as well as streaming the data explicitly:

using (var reader = await cmd.ExecuteReaderAsync(CommandBehavior.SequentialAccess))
{
  while (await reader.ReadAsync())
  {
    var data = await reader.GetTextReader(0).ReadToEndAsync();
  }
}

With this, the difference between sync and async becomes hard to measure, and changing the packet size no longer incurs the ridiculous overhead as before.

If you want good performance in edge cases, make sure to use the best tools available - in this case, stream large column data rather than relying on helpers like ExecuteScalar or GetFieldValue.

16,895

hcd

Updated on June 10, 2022

Comments

hcd almost 2 years
I'm having major SQL performance problems when using async calls. I have created a small case to demonstrate the problem.

I have create a database on a SQL Server 2016 which resides in our LAN (so not a localDB).

In that database, I have a table WorkingCopy with 2 columns:
```
Id (nvarchar(255, PK))
Value (nvarchar(max))
```
DDL
```
CREATE TABLE [dbo].[Workingcopy]
(
    [Id] [nvarchar](255) NOT NULL, 
    [Value] [nvarchar](max) NULL, 

    CONSTRAINT [PK_Workingcopy] 
        PRIMARY KEY CLUSTERED ([Id] ASC)
                    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
                          IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, 
                          ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
```
In that table, I have inserted a single record (id='PerfUnitTest', Value is a 1.5mb string (a zip of a larger JSON dataset)).

Now, if I execute the query in SSMS :
```
SELECT [Value] 
FROM [Workingcopy] 
WHERE id = 'perfunittest'
```
I immediately get the result, and I see in SQL Servre Profiler that the execution time was around 20 milliseconds. All normal.

When executing the query from .NET (4.6) code using a plain SqlConnection :
```
// at this point, the connection is already open
var command = new SqlCommand($"SELECT Value FROM WorkingCopy WHERE Id = @Id", _connection);
command.Parameters.Add("@Id", SqlDbType.NVarChar, 255).Value = key;

string value = command.ExecuteScalar() as string;
```
The execution time for this is also around 20-30 milliseconds.

But when changing it to async code :
```
string value = await command.ExecuteScalarAsync() as string;
```
The execution time is suddenly 1800 ms ! Also in SQL Server Profiler, I see that the query execution duration is more than a second. Although the executed query reported by the profiler is exactly the same as the non-Async version.

But it gets worse. If I play around with the Packet Size in the connection string, I get the following results :

Packet size 32768 : [TIMING]: ExecuteScalarAsync in SqlValueStore -> elapsed time : 450 ms

Packet Size 4096 : [TIMING]: ExecuteScalarAsync in SqlValueStore -> elapsed time : 3667 ms

Packet size 512 : [TIMING]: ExecuteScalarAsync in SqlValueStore -> elapsed time : 30776 ms

30,000 ms!! That's over a 1000x slower than the non-async version. And SQL Server Profiler reports that the query execution took over 10 seconds. That doesn't even explain where the other 20 seconds are gone to!

Then I've switched back to the sync version and also played around with the Packet Size, and although it did impact a little the execution time, it was nowhere as dramatic as with the async version.

As a sidenote, if it put just a small string (< 100bytes) into the value, the async query execution is just as fast as the sync version (result in 1 or 2 ms).

I'm really baffled by this, especially since I'm using the built-in SqlConnection, not even an ORM. Also when searching around, I found nothing which could explain this behavior. Any ideas?
- Panagiotis Kanavos about 7 years
  
  Post the query. If the query is slow, you should check the query for problems. Async methods aren't slow - they don't affect the query execution time at all.
- Panagiotis Kanavos about 7 years
  
  Post your query, the execution plan and the schema of the underlying table(s). Are there any indexes? How many rows are involved? What does the execution plan show? Are you trying to query a 'varchar' with an 'nvarchar` parameter?
- Panagiotis Kanavos about 7 years
  
  BTW ExecuteScalar doesn't mean that a TOP 1 query will be executed. If your query returns multiple values, it wil still send all of them to the client. The client though will simply keep the first one and discard the rest. That's probably why reducing the packet size increases the time - your query has to return a lot more packets
- Fildor about 7 years
  
  But he states, he has only one row in the table @PanagiotisKanavos
- Panagiotis Kanavos about 7 years
  
  @Fildor does he? ExecuteScalar doesn't mean one row, it means read only one of the returned rows. The use of a key-value table means that there could be duplicates - which is why key-value tables are considered a very bad design.
- Fildor about 7 years
  
  @PanagiotisKanavos "In that table, I have inserted 1 record" ... I read that as "I have only one row in that table" Can you confirm that hcd?
- Panagiotis Kanavos about 7 years
  
  @Fildor yet the actual evidence of the packet sizes and timings shows that there are a lot of packets to move
- hcd about 7 years
  
  The schema is SUPER simple : 2 columns Id and Value : `CREATE TABLE [dbo].[Workingcopy]( [Id] [nvarchar](255) NOT NULL, [Value] [nvarchar](max) NULL, CONSTRAINT [PK_Workingcopy] PRIMARY KEY CLUSTERED ( [Id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] And in the table there is only 1 record. Albeit one with a large "Value" (1.5mb)
- Panagiotis Kanavos about 7 years
  
  @hcd no repro then. Where are the indexes? How many key/values have you stored in there? What is the execution plan? Post code that can actually reproduce the problem - that means post the creation script and the script that inserts the data
- Fildor about 7 years
  
  @PanagiotisKanavos Yes, it's a 1.5 MB value ... But still, that was only my interpretation of that cited sentence. I could be mistaken.
- Panagiotis Kanavos about 7 years
  
  @hcd 1.5 MB ????? And you ask why retrieving that gets slower with decreasing packet sizes? Especially when you use the wrong query for BLOBs?
- Magnus about 7 years
  
  Any difference if you do command.ExecuteScalarAsync().Result instead of await ?
- Fildor about 7 years
  
  @PanagiotisKanavos That was just playing around on behalf of OP. The actual question is why async is so much slower in comparison to sync with same package size.
- hcd about 7 years
  
  Yes, only 1 record in the whole table !. And no, I am not suprised it takes slower with smaller packages sizes. What I am surprised about is that with the sync version it takes 30ms, and with the async version it takes 30 seconds for the same package size ! (1000x slower)
- Igor about 7 years
  
  @hcd - when adding to the question (your DDL in this case) please update the question instead of adding it to the comments. I have done it for you this time.
- hcd about 7 years
  
  @Magnus : same timings when using .Result instead of await ...
- Igor about 7 years
  
  @hcd - also please update the question for column id. In your original question you state its varchar but your ddl and query has it listed as nvarchar. Could you please correct it?
- Panagiotis Kanavos about 7 years
  
  Check Modifying Large-Value (max) Data in ADO.NET for the correct way of retrieving CLOBs and BLOBs. Instead of trying to read them as one big value, use GetSqlChars or GetSqlBinary to retrieve them in a streaming fashion. Also consider storing them as FILESTREAM data - there's no reason to save 1.5MB of data in a table's data page
- Fildor about 7 years
  
  @PanagiotisKanavos While that is true, it would still be interesting to know the reason for the huge difference in timing when you would expect it to perform rather similarly. At least I would.
- Panagiotis Kanavos about 7 years
  
  @Fildor because async isn't slower. The OP is loading a 1.5MB BLOB in an inefficient way. ExecuteReaderAsync doesn't mean that the data will be streamed, it means that once the data gets retrieved in the same way as before, it will become available to the original context. The 30 sec delay appears only when the OP reduces the packet size, resulting in a lot more network traffic
- Panagiotis Kanavos about 7 years
  
  @Fildor the OP changed the packet size. That means, a lot more packets have to go back and forth - 64 times more. 64 * 0.450 = 28.8 seconds
- Fildor about 7 years
  
  @PanagiotisKanavos That's not correct. OP writes sync : 20-30 ms and async with everything else same 1800 ms. The effect of changing the packet size is totally clear and expected.
- hcd about 7 years
  
  @PanagiotisKanavos Well actually, the SQL profiler shows that the execution time for the async version is way slower than the sync version. And this is what puzzles me. I'm trying to figure out now how i can post screenshots here :)
- Igor about 7 years
  
  @hcd - there is a toolbar button to upload an image in your question (you have to edit the question to do this).
- Luaan about 7 years
  
  Can you include the connection string (without the password if any, of course)? Is there something non-standard about the endpoint configuration?
- Kuba Wyrostek about 7 years
  
  @hcd it seems that you could remove the part about your attempts to alter package sizes since it seem irrelevant to the problem and causes confusion among some commenters.
- Admin about 7 years
  
  Async is always going to add more time to a call, and isn't really advisable for operations that return quickly. The time it takes to spin up the task, get it scheduled, and start execution all increase the time to return the result. Having said this, the timings observed are definitely crazy.
- Paulo Morgado about 7 years
  
  Is there a synchronization context present when the code is executing? If so, is ConfigureAwait being used everywhere it should?
- Zenilogix about 7 years
  
  Wondering if the increase in elapsed time is an artifact of threading and thread scheduling. The overall task has been split between two threads - the application's thread and the worker thread asynchronously executing the query. They have to synchronize/rendezvous to exchange the result data, which is going to add some overhead, but there can also be latency - each thread doing a whole lot of extra waiting.
- hcd about 7 years
  
  Thnx for all the comments. I've just verified and accepted the answer below from @Luaan.
Wiktor Zychla about 7 years

Great answer. Reproduced the OP's scenario. For this 1.5m string OP is mentioning, I get 130ms for the sync version vs 2200ms for async. With your approach, the measured time for the 1.5m string is 60ms, not bad.
Adam Houldsworth about 7 years

Good investigations there, plus I learnt a handful of other tuning techniques for our DAL code.
hcd about 7 years

Just got back to the office and tried the code on my example instead of the ExecuteScalarAsync, but I still got 30seconds execution time with a 512byte packet size :(
hcd about 7 years

Aha, it did work after all :) But I have to add the CommandBehavior.SequentialAccess to this line : using (var reader = await command.ExecuteReaderAsync(CommandBehavior.SequentialAccess)‌)
Luaan about 7 years

@hcd My bad, I had it in the text but not in the sample code :)
Krishna Gupta almost 4 years

The question is meaningful, but this answer is a gem! And a beautiful line: "If you want good performance in edge cases, make sure to use the best tools available". Thanks!
Bharat Bhushan almost 3 years

its working fine for single value, but how can we read/get complete row, as there are multiple scenario where we need to get multiple columns from multiple tables using join? Thanks in advance!
Luaan almost 3 years

@BharatBhushan That's what the argument to (in this case) GetTextReader is - it's the ordinal of the column in the result set you want :)