KeepAlive with WCF and TCP?

c# wcf tcp keep-alive

16,308

No. This is widely misunderstood, and unfortunately there is much misinformation out there.

First, "Infinite" is a sort of semi-valid value. There are two special config serializers that convert "Infinite" to either TimeSpan.MaxValue or int.MaxValue (so they're not really "infinite" anyway), but not everything in WCF seems to recognize this. So it's always best to specify your timeouts explicitly with time values.

Second, you don't need a "keepalive" method in your service, since WCF provides what's called a "reliable session". If you add <reliableSession enabled="true" /> then WCF will provide its own keepalive mechanism through "infrastructure messages".

By having your own "keepalive" mechanism, you're effectively doubling the load on your service, and you can actually create more problems than you solve.

Third, when using a reliable session, you use the inactivityTimeout setting of reliableSession. This does two things. First, it controls how frequently infrastructure (keepalive) messages are sent. They are sent at half the timeout value, so if you set it to 18 minutes, they will be sent every 9 minutes. Second, if no infrastructure or operation messages (i.e. messages that are part of your data contract) are received within the inactivity timeout, the connection is aborted because there has likely been a problem (one side has crashed, there's a network problem, etc.).

receiveTimeout is the maximum amount of time in which no operation messages can be received before the connection is aborted (the default is 10 minutes). Setting this to a large value (Int32.MaxValue milliseconds is somewhere in the vicinity of 24 days) keeps the connection open. Setting inactivityTimeout to a smaller value (again, the default is 10 minutes), one that is less than twice the maximum amount of time before network routers will drop a connection for inactivity, keeps the connection alive.

WCF handles all this for you. You can then simply subscribe to the Connection Aborted messages to know when the connection is dropped for real reasons (app crashes, network timeouts, clients losing power, etc.), which allows you to recreate the connections.

Additionally, if you don't need ordered messages, set ordered="false", as this greatly reduces the overhead of reliable sessions. The default is true.
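As a rough sketch, the settings described above could look something like this in a netTcpBinding configuration. The binding name "longLivedTcp" is made up for illustration, the 18-minute inactivityTimeout is just the example figure used above, and the receiveTimeout is the Int32.MaxValue-milliseconds value; pick values that suit your own network. The reliable session has to be enabled on both the client and the service binding.

```xml
<system.serviceModel>
  <bindings>
    <netTcpBinding>
      <!-- "longLivedTcp" is a hypothetical name; reference it from your endpoint's bindingConfiguration -->
      <!-- receiveTimeout: explicit large value (about 24.8 days) instead of "Infinite" -->
      <binding name="longLivedTcp" receiveTimeout="24.20:31:23.6470000">
        <!-- inactivityTimeout of 18 minutes: infrastructure messages go out roughly every 9 minutes -->
        <!-- ordered="false" is only appropriate if you don't need ordered delivery -->
        <reliableSession enabled="true" inactivityTimeout="00:18:00" ordered="false" />
      </binding>
    </netTcpBinding>
  </bindings>
</system.serviceModel>
```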

Note: You may not receive a connection aborted event until the inactivityTimeout has expired (or you try to use the connection). Be aware of this, and set your timeouts accordingly.

Most recommendations on the internet are to set both receiveTimeout and inactivityTimeout to Infinite. This has two problems. First, infrastructure messages don't get sent in a timely manner, so routers will drop the connection, forcing you to do your own keepalives. Second, the large inactivity timeout means WCF won't recognize when a connection legitimately drops, and you have to rely on that ping aborting to know when a failure occurs. This is all completely unnecessary, and can in fact make your service even more unreliable.

Author by

Banshee

Updated on June 07, 2022

Comments

Banshee almost 2 years

I have a Windows Service hosting an advanced WCF service that communicates over TCP (netTCP) with protobuf.net, sometimes also with certificates.

The receiveTimeout is set to infinite to never drop the connection due to inactivity. But from what I understand the connection could be dropped anyway so I have created a simple two way keepalive service method that the client is calling every 9 min to keep the connection alive. It's very important that the connection never breaks.

Is this the correct way? Or could I simply remove my keepalive because the receiveTimeout is set to infinite?

Edit : Current app.config for WCF service : http://1drv.ms/1uEVKIt
Banshee over 9 years

Thanks! From what I read, the reliableSession does the same as the TCP protocol itself but on another level, which is why we have set it to enabled=false. If I'm not using reliableSession, I will need my own KeepAlive. This KeepAlive is very simple: all it does is call an empty service method every, let's say, 9 minutes just to make sure that the connection is alive. If it fails, then close the application. Where can I read more about all the information you posted? I have looked for this before but never found it. I'm still not sure exactly what settings to set and what overhead they could create.
Erik Funkenbusch over 9 years

@Banshee - The reliable session, if properly configured as I've mentioned, will have the least overhead and be the most reliable. By turning it off and doing it yourself, you're forcing what is essentially a low-level function into your application domain. By doing it yourself you lose important metadata, tracing and diagnostics reporting, and perfmon statistics. In other words, you're taking a high-performance race car, ripping out the engine and putting a go-kart engine in it, then wondering why it doesn't perform.
Erik Funkenbusch over 9 years

@Banshee - As I said, this is not well documented, and there is a lot of misinformation out there. What I've said here is hard learned by years of trial and error, tracing into the source code, and reading various contradictory documentation and figuring out what's missing. Reliable session, configured as I've shown here, works very well. But you have to understand exactly what it's doing because the documentation is incomplete, misleading, wrong, or non-existent.
Banshee over 9 years

I have attached a link to my app.config in my first post (edit). Could you please take a look and see whether the current configuration will work well with reliableSession enabled? If there might be problems, please point them out. The config is a bit complicated, but I hope you understand the setup (note that we are also using certificates in some cases).
Erik Funkenbusch over 9 years

@Banshee - No. You're still setting inactivityTimeout and receiveTimeout to infinite. You should set inactivityTimeout to either the maximum amount of time you want to elapse before the connection is noticed as being down, or twice the value of the ping time you want (bear in mind that the smaller this value, the more frequently pings will occur, and if you have a lot of clients connected, that increases the load from pinging, so setting it to 1 second is impractical). You should also set the receiveTimeout to an actual time value, such as 24.00:00:00 which is 24 hours.
Erik Funkenbusch over 9 years

@Banshee - Infinite isn't really infinite anyway; it's a little over 24 hours. But setting it to an actual time value makes this clear, and it works better for the cases that don't understand the infinite value. If you need it to stay up 24/7 then you just reconnect on abort. If you REALLY need it to stay connected uninterrupted (short of a network or app/computer crash), make sure there is at least one message sent within 24 hours.
Banshee over 9 years

So what you are saying is that I should set it up as you explained, but also keep my keepAlive service method and call it once every 23 hours? I don't have a reconnect function right now. I had one before, but it was too much trouble, so it's really important that the connection never dies because of a timeout.
Erik Funkenbusch over 9 years

@Banshee - Sorry, I meant 24 days, not 24 hours. You can't prevent disconnects from occurring. There are always going to be situations that are beyond your control, such as network routing glitches. Keeping a connection up, uninterrupted, becomes increasingly unlikely the longer you keep it open. Eventually, some condition will occur, even if it's just installing patches and rebooting the server. You still need a way to reconnect when the connection drops. If you don't make a single call across the connection in 24 days, I have to doubt how important keeping it up is.
motoDrizzt over 7 years

Downvoted. This answer is heavily wrong in every possible way, totally misleading, and ignores everything that is documented by Microsoft about reliable sessions. What you are suggesting is to use a really heavyweight feature that's not meant to be a keepalive substitute at all; in fact it doesn't even work the way you mentioned. Reliable sessions are an expensive layer to be used on unreliable communication channels, not a poor substitute for an application-level keepalive, and this is reflected in your explanation, in which you lump together inactivityTimeout and receiveTimeout.
motoDrizzt over 7 years

ReceiveTimeout is the official WCF parameter to control the time of inactivity before the session gets dropped, period. Reliable sessions work by guaranteeing that the connection has not been lost due to physical problems like SOAP routers, HTTP packet drops, and the like. If an application is not going to be deployed on problematic communication channels, enabling reliable sessions is just a huge overhead, and then you still have to rely on receiveTimeout to keep the connection from closing. This answer should be deleted, because as it stands it suggests doing something heavily wrong.
Erik Funkenbusch over 7 years

@motoDrizzt - You are totally incorrect. In fact, all the information I presented has been pulled directly from the MSDN documentation. While it is arguable whether or not such a mechanism should be used for long-term connections, the fact of the matter is that the information I have presented is 100% factually correct. For instance, your remarks about ReceiveTimeout are contradicted by the MSDN documentation. I suggest you read the remarks in the MSDN documentation: msdn.microsoft.com/en-us/library/…
Kyberias about 7 years

From MSDN documentation: TimeSpan.MaxValue is actually 10,675,199 days (30 000 years).
Erik Funkenbusch about 7 years

@Kyberias - Correct, I mistyped above. It should be Int32.MaxValue, not TimeSpan.MaxValue. The maximum value for receiveTimeout is 24.20:31:23.6470000. This is Int32.MaxValue, which represents the maximum number of milliseconds; if you calculate (2^31) / 1000 / 60 / 60 / 24, you get the number of days above.
Michael Freidgeim over 5 years

I couldn't find reliableSession enabled property in docs.microsoft.com/en-us/dotnet/framework/configure-apps/…. Is it not supported any more?
Erik Funkenbusch over 5 years

@MichaelFreidgeim - I think the enabled property is still there, just not documented. It's possible it was removed and merely the presence of the element is enough to enable it, but I don't see any specific mentions of it.