How do you deal with transport-level errors in SqlConnection?

24,279

Solution 1

I posted an answer on another question on another topic that might have some use here. That answer involved SMB connections, not SQL. However it was identical in that it involved a low-level transport error.

What we found was that in a heavy load situation, it was fairly easy for the remote server to time out connections at the TCP layer simply because the server was busy. Part of the reason was the defaults for how many times TCP will retransmit data on Windows weren't appropriate for our situation.

Take a look at the registry settings for tuning TCP/IP on Windows. In particular you want to look at TcpMaxDataRetransmissions and maybe TcpMaxConnectRetransmissions. These default to 5 and 2 respectively, try upping them a little bit on the client system and duplicate the load situation.

Don't go crazy! TCP doubles the timeout with each successive retransmission, so the timeout behavior for bad connections can go exponential on you if you increase these too much. As I recall upping TcpMaxDataRetransmissions to 6 or 7 solved our problem in the vast majority of cases.

Solution 2

This blog post by Michael Aspengren explains the error message "A transport-level error has occurred when sending the request to the server."

Solution 3

To answer your original question:

A more elegant way to detect this particular error, without parsing the error message, is to inspect the Number property of the SqlException.

(This actually returns the error number from the first SqlError in the Errors collection, but in your case the transport error should be the only one in the collection.)

Solution 4

use Enterprise Services with transactional components

Solution 5

I have seen this happen in my own environment a number of times. The client application in this case is installed on many machines. Some of those machines happen to be laptops people were leaving the application open disconnecting it and then plugging it back in and attempting to use it. This will then cause the error you have mentioned.

My first point would be to look at the network and ensure that servers aren't on DHCP and renewing IP Addresses causing this error. If that isn't the case then you have to start trawlling through your event logs looking for other network related.

Unfortunately it is as stated above a network error. The main thing you can do is just monitor the connections using a tool like netmon and work back from there.

Good Luck.

Share:
24,279
Eric Z Beard
Author by

Eric Z Beard

I recently moved from Florida to Seattle, Washington, where I am working for Amazon as a solutions architect in the AWS partner network.

Updated on May 11, 2020

Comments

  • Eric Z Beard
    Eric Z Beard about 4 years

    Every now and then in a high volume .NET application, you might see this exception when you try to execute a query:

    System.Data.SqlClient.SqlException: A transport-level error has occurred when sending the request to the server.

    According to my research, this is something that "just happens" and not much can be done to prevent it. It does not happen as a result of a bad query, and generally cannot be duplicated. It just crops up maybe once every few days in a busy OLTP system when the TCP connection to the database goes bad for some reason.

    I am forced to detect this error by parsing the exception message, and then retrying the entire operation from scratch, to include using a new connection. None of that is pretty.

    Anybody have any alternate solutions?