ubuntu server, ssh, write failed: broken pipe

Solution 1

This post resolved the issue: massive packet loss when servers are brought online

Solution 2

Use mtr to check the network. Try a command like mtr -i 15 remotehost. Leave this running in a window, or use screen so you can detach. It should catch any problems with the network. Packet loss is typically 0% on most of my systems.

EDIT: What does the output of arp -n show for your IP address before and after ssh drops? You may want to try this on another server on the same subnet. There should be only one HW address for the IP address, and it should not change. If it does, you have an IP address conflict.
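
A minimal sketch of the ARP check, assuming the usual net-tools `arp -n` column layout (Address, HWtype, HWaddress, ...). The function name `distinct_macs` and the `arp.log` file are made up for illustration. Run it from another host on the same subnet.

```shell
#!/bin/sh
# Count the distinct hardware (MAC) addresses seen for one IP.
# Reads `arp -n` output on stdin (one or many samples appended together);
# $1 is the IP address to look for. Incomplete entries are ignored.
distinct_macs() {
    awk -v ip="$1" '$1 == ip && $2 == "ether" { print $3 }' | sort -u | wc -l | tr -d ' '
}

# Hypothetical usage: sample the ARP table every 15 seconds while you wait
# for ssh to drop (Ctrl-C to stop), then check the log.
#   while true; do arp -n; sleep 15; done > arp.log
#   distinct_macs 172.22.50.92 < arp.log
```

A result of 1 means the address never changed; anything higher points to two devices answering for the same IP.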

Solution 3

OK, so from what I can tell from glancing at this, you're basically getting extended dropouts. Possible causes:

1.) You have a bad network connection.

2.) The network the server is on has a bad connection, a bad router, or some other bad component.

3.) Your servers have conflicting addresses or faulty hardware.

My solution:

Run a ping overnight and see how many packets you have lost by morning (just to see if I'm heading in the right direction).

Hope this helps.
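
A rough sketch of the overnight ping, assuming the iputils-style summary line ("N packets transmitted, M received, X% packet loss"). The function name `loss_percent` and the `ping.log` file are made up for illustration.

```shell
#!/bin/sh
# Print the packet-loss percentage found in ping's summary line.
# Reads ping output on stdin; prints nothing if no summary line is present.
loss_percent() {
    sed -n 's/.*[ ,]\([0-9][0-9.]*\)% packet loss.*/\1/p'
}

# Hypothetical usage: one ping per second for about 8 hours, then read the result.
#   ping -c 28800 172.22.50.92 > ping.log
#   loss_percent < ping.log
```

Run it against one of the problem servers and against a known-good server on the same switch, so you have something to compare with.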

Solution 4

You can get flaky connections with certain NIC/switch combos when autonegotiation is turned on and the link negotiates to half-duplex.

Use "ethtool eth0" to verify that the speed and duplex settings are correct, and to change them if you need to.
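
A small sketch for pulling the relevant fields out of the ethtool report, assuming the usual "Speed:", "Duplex:" and "Auto-negotiation:" lines. The function name `link_summary` is made up for illustration, and eth0 is just the interface name used above.

```shell
#!/bin/sh
# Summarise speed, duplex and autonegotiation from `ethtool <iface>` output.
# Reads the ethtool report on stdin and prints a single line.
link_summary() {
    awk -F': *' '
        /^[ \t]*Speed:/            { speed = $2 }
        /^[ \t]*Duplex:/           { duplex = $2 }
        /^[ \t]*Auto-negotiation:/ { autoneg = $2 }
        END { printf "speed=%s duplex=%s autoneg=%s\n", speed, duplex, autoneg }
    '
}

# Hypothetical usage:
#   ethtool eth0 | link_summary
# Forcing a setting (as root); the values here are only an example:
#   ethtool -s eth0 speed 100 duplex full autoneg off
```

If you force speed and duplex on the server, set the switch port to match. Forcing only one side is a common cause of the duplex mismatch you are trying to rule out.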

Author: cmhobbs

Mercenary janitor and network cowboy.

Updated on September 17, 2022

Comments

  • cmhobbs
    cmhobbs over 1 year

    I'm getting some bizarre behavior with Ubuntu Server 10.04 64bit on two of our new servers (both fresh installs). I have ubuntu server (same version) deployed on 4-5 other servers without this issue.

    Initially I cannot ssh into a fresh server install until I manually set the address that the ssh server is listening on in /etc/ssh/sshd_config. Once I've connected, I seem to be kicked out at random intervals with the following error:

    Write failed: Broken pipe

    Using "ssh -vv" doesn't show any other information. When I'm kicked out in this manner, I cannot reconnect for another seemingly random period of time. Sometimes a few seconds, others a few minutes. If I run "netstat -nap|grep :22", I can see that my connection still exists after the write failed error. I can't seem to re-connect until that connection drops.

    After one of these errors, if I hop onto the server from the console, ssh into another machine, and then attempt to ssh back into the server, everything works fine.

    Using "-o TCPKeepAlive=yes" client side doesn't seem to affect anything. I've disabled both iptables and ufw on the server. AppArmor is not showing any enforced profiles and SELinux isn't installed.

    My logs aren't reporting any errors and I don't have any custom configs. This is a box-stock install. Note that when I try to get back in after the broken pipe error, this is the error I get:

    ssh: connect to host 172.22.50.92 port 22: Connection refused

    And nmap no longer shows port 22 as being open, though netstat on the server says it's still listening on port 22.

    EDIT - I'm not sure if it means anything, but I've installed KVM on these hosts and I can ssh into the guests (ubuntu server 64bit as well) without any issue.

    UPDATE - I've tried purging openssh and re-installing with apt. I've also purged and installed openssh from source with no luck. Traceroutes and pings overnight show no packet loss whatsoever.

    YET ANOTHER UPDATE - Dell seems to think that we've got a bad motherboard in the server. Having that replaced to see if it resolves the issue.

    • Admin
      Admin over 13 years
      Try including the options below in your /root/.ssh/config file on the client side:

          Host hostnameoftheserver
              User root
              Hostname ipoftheserver
              ServerAliveInterval 240
              ServerAliveCountMax 4

      It might help.
    • Admin
      Admin over 13 years
      I'm not using ssh as root. Will this apply for my local user account?
  • cmhobbs
    cmhobbs over 13 years
    I've already attempted this. 0% packet loss and the other functioning servers are on the same network (same switches, even).
  • Arenstar
    Arenstar over 13 years
    Boom, I'm stumped. Any complaints in dmesg on the problem servers?
  • Arenstar
    Arenstar over 13 years
    Unless you changed something in sshd_config (the standard install config would not give you these problems). Are they brand-new machines?
  • cmhobbs
    cmhobbs over 13 years
    Just got them a week ago, fresh install on new machines. The other ones that work are new machines as well. That's the real head scratcher here: these two boxes are from the same vendor (not the same model) as the other machines, and they are different models from one another, so I don't think it's a NIC issue. I think failing hardware would complain in syslog somewhere.
  • cmhobbs
    cmhobbs over 13 years
    I'll try that overnight. I really don't think packet loss is an issue.
  • Arenstar
    Arenstar over 13 years
    Are you running an IPMI/LOM type of management on the same port?
  • cmhobbs
    cmhobbs over 13 years
    None that I'm aware of, unless it shipped that way. They're all Dells. I'll dig through the BIOS.
  • cmhobbs
    cmhobbs over 13 years
    Additionally, DRAC isn't configured.
  • cmhobbs
    cmhobbs over 13 years
    It's currently full duplex, ethtool's report looks kosher.