Detecting socket disconnection?
Solution 1
You have discovered that you need timers and heartbeats on a TCP connection.
If you unplug the network cable, the TCP connection may not be broken. If you have nothing to send, the TCP/IP stack has nothing to send either, so it doesn't know that a cable is gone somewhere, or that the peer PC suddenly burst into flames. That TCP connection could be considered open until you reboot your server years later.
Think of it this way: how can the TCP connection know that the other end dropped off the network? It's off the network, so it can't tell you that fact.
Some systems can detect this if you unplug the cable going into your server, and some will not. If you unplug the cable at the other end of e.g. an ethernet switch, that will not be detected.
That's why one always needs supervisor timers (which e.g. send a heartbeat message to the peer, or close a TCP connection after no activity for a given amount of time) for a TCP connection.
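A minimal sketch of such a supervisor timer, assuming the application records activity itself. The class `IdleTracker` and its methods are hypothetical names invented for this illustration, not part of any library, and the current time is passed in explicitly so the logic is easy to test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: remembers when each connection last showed activity.
class IdleTracker<K> {
    private final Map<K, Long> lastActivityMillis = new ConcurrentHashMap<>();
    private final long idleLimitMillis;

    IdleTracker(long idleLimitMillis) {
        this.idleLimitMillis = idleLimitMillis;
    }

    // Call whenever data is read from or written to the connection.
    void touch(K connection, long nowMillis) {
        lastActivityMillis.put(connection, nowMillis);
    }

    // Stop tracking a connection that has been closed.
    void forget(K connection) {
        lastActivityMillis.remove(connection);
    }

    // Returns the connections that have been silent for longer than the limit.
    List<K> expired(long nowMillis) {
        List<K> result = new ArrayList<>();
        for (Map.Entry<K, Long> entry : lastActivityMillis.entrySet()) {
            if (nowMillis - entry.getValue() > idleLimitMillis) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

In a selector loop, one option is to call `touch` from the read and write handlers and check `expired(System.currentTimeMillis())` each time `select` returns, then either send a heartbeat on the returned connections or close them. Which of the two to do, and how long the limit should be, is an application decision.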
One very cheap way to at least keep TCP connections that you only read from, and never write to, from staying up for years on end is to enable TCP keepalive on the socket. Be aware that the default TCP keepalive timeout is often 2 hours.
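As a rough sketch, keepalive can be switched on for an NIO channel through the standard `SO_KEEPALIVE` socket option. The class and method names below are made up for the example, and the probe timing still comes from the operating system's defaults.

```java
import java.io.IOException;
import java.net.StandardSocketOptions;
import java.nio.channels.SocketChannel;

// Hypothetical helper class for this illustration.
class KeepAliveExample {
    // Opens an unconnected channel with SO_KEEPALIVE switched on.
    // The interval between probes is decided by the OS, not by this code.
    static SocketChannel openWithKeepAlive() throws IOException {
        SocketChannel channel = SocketChannel.open();
        channel.setOption(StandardSocketOptions.SO_KEEPALIVE, true);
        return channel;
    }
}
```

For a plain `java.net.Socket`, the equivalent call is `socket.setKeepAlive(true)`.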
Solution 2
Neither of those answers applies. The first one concerns the case where the connection is broken, and the second one (mine) concerns the case where the peer closes the connection.
In a TCP connection, unless data is being sent or received, there is in principle nothing about pulling a cable that should break a connection, as TCP is deliberately designed to be robust across this sort of thing, and there is certainly nothing about it that should look to the local application like the peer closing.
The only way to detect a broken connection in TCP is to attempt to send data across it, or to interpret a read timeout as a lost connection after a suitable interval, which is an application decision.
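The read-timeout approach might look like the following sketch for a blocking socket. `ReadTimeoutExample`, `Outcome` and `readOnce` are hypothetical names chosen for the illustration, and the loopback server in `main` exists only to make the demo self-contained.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class for this illustration.
class ReadTimeoutExample {
    enum Outcome { DATA, PEER_CLOSED, TIMED_OUT }

    // Attempts one blocking read, giving up after timeoutMillis.
    static Outcome readOnce(Socket socket, int timeoutMillis) throws IOException {
        socket.setSoTimeout(timeoutMillis);
        InputStream in = socket.getInputStream();
        try {
            int b = in.read();
            // -1 means the peer closed the connection in an orderly way.
            return b == -1 ? Outcome.PEER_CLOSED : Outcome.DATA;
        } catch (SocketTimeoutException e) {
            // Nothing arrived in time. The socket is still usable; whether
            // this counts as a lost connection is up to the application.
            return Outcome.TIMED_OUT;
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // The peer is connected but silent, like a peer behind a pulled cable.
            System.out.println(readOnce(client, 200));
            // Now the peer closes properly, which the reader can see as end of stream.
            accepted.close();
            System.out.println(readOnce(client, 200));
        }
    }
}
```

The two outcomes are deliberately kept apart: a timeout only says that nothing arrived within the interval, while end of stream says the peer actually closed.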
You can also set TCP keep-alive on to enable detection of broken connections, and in some systems you can even control the timeout per socket. Not via Java however, so you are stuck with the system default, which should be two hours unless it has been modified.
Your code should call keyIterator.remove() after calling keyIterator.next().
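A sketch of that iteration pattern, using a `Pipe` only so the demo runs without a network. The class name, `drainSelectedKeys` and the placeholder dispatch are assumptions for the example and are not taken from the question's code.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;

// Hypothetical demo class for this illustration.
class SelectedKeysExample {
    // Processes one batch of selected keys and returns how many were handled.
    static int drainSelectedKeys(Selector selector, long timeoutMillis) throws IOException {
        int handled = 0;
        if (selector.select(timeoutMillis) == 0) {
            return handled;
        }
        Iterator<SelectionKey> keyIterator = selector.selectedKeys().iterator();
        while (keyIterator.hasNext()) {
            SelectionKey key = keyIterator.next();
            // The selector never removes keys from this set itself.
            keyIterator.remove();
            if (!key.isValid()) {
                continue;
            }
            // A real loop would dispatch on isConnectable/isReadable/isWritable here.
            handled++;
        }
        return handled;
    }

    public static void main(String[] args) throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open()) {
            pipe.source().configureBlocking(false);
            pipe.source().register(selector, SelectionKey.OP_READ);
            pipe.sink().write(ByteBuffer.wrap(new byte[] {1}));
            System.out.println("handled: " + drainSelectedKeys(selector, 1000));
            System.out.println("left in selected set: " + selector.selectedKeys().size());
        } finally {
            pipe.sink().close();
            pipe.source().close();
        }
    }
}
```

Clearing the selected-key set after the loop has the same effect on the set. Removing each key as it is taken keeps the removal next to the handling, so a key is not left behind if the loop exits early.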
Comments
-
neevek almost 2 years
I am kind of upset that this cannot be handled in an elegant way. After trying the different solutions mentioned in answers to several SO questions, I still cannot detect socket disconnection (by unplugging the cable).
I am using NIO non-blocking sockets, and everything works perfectly except that I find no way of detecting server disconnection.
I have the following code:
    while (true) {
        handlePendingChanges();
        // Wait up to 3 seconds for any registered channel to become ready.
        int selectedNum = selector.select(3000);
        if (selectedNum > 0) {
            SelectionKey key = null;
            try {
                Iterator<SelectionKey> keyIterator = selector.selectedKeys().iterator();
                while (keyIterator.hasNext()) {
                    key = keyIterator.next();
                    if (!key.isValid())
                        continue;
                    System.out.println("key state: " + key.isReadable() + ", " + key.isWritable());
                    if (key.isConnectable()) {
                        finishConnection(key);
                    } else if (key.isReadable()) {
                        onRead(key);
                    } else if (key.isWritable()) {
                        onWrite(key);
                    }
                }
            } catch (Exception e) {
                e.printStackTrace();
                System.err.println("I am happy that I can catch some errors.");
            } finally {
                // Empty the selected-key set so handled keys are not seen again.
                selector.selectedKeys().clear();
            }
        }
    }
While the SocketChannels are being read, I unplug the cable, and Selector.select() starts spinning and returning 0. Now I have no chance to read or write the channels, because the main reading and writing code is guarded by if (selectedNum > 0). This is my first point of confusion: according to another answer, when the channel is broken, select() will return and the selection key for the channel will indicate readable/writable. That is apparently not the case here: the keys are not selected, and select() still returns 0.
Also, from EJP's answer to a similar question:
If the peer closes the socket:
- read() returns -1
- readLine() returns null
- readXXX() throws EOFException, for any other X.
Not the case here either. I tried commenting out if (selectedNum > 0) and using selector.keys().iterator() to get all the keys regardless of whether or not they are selected. Reading from those keys does not return -1 (0 is returned instead), and writing to those keys does not throw EOFException. I only noted one thing: even though the keys are not selected, key.isReadable() returns true while key.isWritable() returns false (I guess this might be because I didn't register the keys for OP_WRITE).
My question is: why is the Java socket behaving like this, or is there something I did wrong?
-
neevek over 11 years
Hey, @EJP, I knew you would come to the rescue, thank you. You wrote "You can also set TCP keep-alive on to enable detection of broken connections". What role does keep-alive play here in broken connection detection? What's the difference between setting and not setting keep-alive? As for keyIterator.remove(), I already use selector.selectedKeys().clear() in the finally block.
-
neevek over 11 years
Your explanation completely clears up my confusion. But there's still one thing that I want to know: sometimes when I plug the cable back in, the connection is resumed, and sometimes it is not. Why is that?
-
user207421 over 11 years
@neevek Missed that. TCP keep-alive sends a packet every now and then, one that requires a response, and if the response doesn't arrive (taking retries and timeouts into account) the connection will be deemed broken: you will get a 'connection reset' on the next I/O.
-
neevek over 11 years
I am not implementing a custom protocol, I am using HTTP. So if a packet is sent over the wire every now and then, will that packet be interpreted as part of the HTTP header or body? And if I, as the client, receive the keep-alive packet, how do I handle that?
-
nos over 11 years
@neevek Likely the transmitting end times out. The transmitting end will detect that the other end is gone, as it doesn't receive any ACKs, so among other things it'll depend on whether you plug the cable back in before the TCP stack times out the connection.
-
user207421 over 11 years
@neevek The keep-alive packet is an ACK. It isn't seen by the application at all.
-
neevek over 11 years
Awesome~ I learnt something. Both answers are correct, but I have to choose one. Thank you!
-
user207421 about 10 years
I should correct my comment above. If keepalive trips a broken connection you will get ETIMEDOUT, 'connection timed out'. Note the different wording from 'connect timed out', which is a connect-time problem.
-
neevek about 10 years
Got it! But I am wondering how you still remember this comment after 1 year? :)
-
user207421 almost 7 years
One does not 'always need supervisor timers'. HTTP, the most used application protocol on the planet, doesn't have one, for example. Read timeouts and IOExceptions on write are sufficient.
-
nos almost 7 years
I consider read timeouts a supervisor timer, which you normally have to explicitly enable on sockets. And if you don't, you risk not detecting a stale connection ("you" in this case might be an implementor of an HTTP server/client).