SSH Connections freezing with "Write failed: Broken pipe"

Solution 1

It looks like sshd on the CentOS box is not configured to send keepalives to the client.

Add these two lines to the sshd config on your CentOS box (/etc/ssh/sshd_config), restart sshd, and enjoy!

KeepAlive yes
ClientAliveInterval 60
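
On newer OpenSSH releases the same idea might look like the sketch below. It assumes a current sshd, where the directive is called TCPKeepAlive and is already yes by default. The ClientAliveCountMax line is an added assumption for illustration, not part of the original suggestion.

```
# /etc/ssh/sshd_config (sketch; values are illustrative assumptions)
# TCP-level keepalives; "yes" is already the default.
TCPKeepAlive yes
# Send a keepalive request through the encrypted channel after
# 60 seconds without any data from the client.
ClientAliveInterval 60
# Assumed value (also the OpenSSH default): disconnect the client
# after 3 consecutive unanswered requests.
ClientAliveCountMax 3
```

With these values an unresponsive client would be dropped after roughly 180 seconds, while an idle but reachable one keeps producing traffic about once a minute.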

While you're at it, I'd recommend using GNU screen so that your session survives on the CentOS side even if the connection drops.

Solution 2

The actual answer is almost always that you have a NAT device of some sort in the path, usually a firewall, whose state tables have a fairly aggressive timeout. Because you leave your ssh connection idle for some periods of time, the NAT device "forgets" the mapping between your inside address and source port number, and your ephemeral outside NATted address and port number.

When you later try to do something in that ssh window, a new ephemeral address/port pair is assigned to you, which the destination ssh server has no knowledge of, and doesn't respond to; later, some local timeout is reached, and the connection is dropped by your local machine.

The practical fix for this is to do exactly what yuriismaster suggests: enable KeepAlives (which ensure regular traffic to "tickle" that state table entry), and use screen on the remote side (to preserve state in the event things do get dropped). I only post this answer because you asked what's happening, as well as what to do about it. Hopefully this clarifies why yuriismaster's suggestions are good ones.
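
If you cannot change the server, a similar effect can usually be had from the client side. A minimal sketch, assuming an OpenSSH client and its per-user config file; the Host pattern and the values are illustrative assumptions, not something taken from the answers above:

```
# ~/.ssh/config (sketch; values are illustrative assumptions)
Host *
    # Send a keepalive request through the encrypted channel after
    # 60 seconds without any data from the server.
    ServerAliveInterval 60
    # Assumed value (also the OpenSSH default): give up and disconnect
    # after 3 consecutive unanswered requests.
    ServerAliveCountMax 3
```

The same option can be tried for a single session by passing -o ServerAliveInterval=60 on the ssh command line.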

Author: Stephen RC

Senior developer at Defiant / Wordfence, security analyst, Tolkien fan, and general geek.

Updated on September 18, 2022

Comments

  • Stephen RC, over 1 year ago

    I am connecting to a CentOS 5.5 box via SSH from an Ubuntu 11.04 machine.

    The connection appears to work as expected when it is in active use (i.e. no lag or loss), but if it is left inactive for a while it will freeze up and become unresponsive. Eventually the error message "Write failed: Broken pipe" will be returned and I'll be back on my local machine's prompt.

    What sort of things can I do to help debug this, find out what is happening, and get this resolved? As a developer, having to reconnect constantly is making my life a pain.

  • Stephen RC, about 13 years ago
    That makes perfect sense! We do have a NAT with DMZ setup for this box. I'll give the timeout configuration a try and see if that works for me. Thanks :)
  • Stephen RC, about 13 years ago
    I'm accepting yours as you helped me understand the reasons behind the problem. But credit needs to go to @yuriismaster for the fix.
  • MadHatter, about 13 years ago
    Valorin: absolutely, it does, and he was first. Frankly, I think he deserves the accept more than me; but it's your question, so it should go as you see fit. Thanks for the feedback, either way.
  • ypid, almost 9 years ago
    KeepAlive has been renamed to TCPKeepAlive and can be left at its default value, which is yes. ClientAliveInterval should be sufficient. See man sshd_config.