Amazon EC2 - No SSH After Reboot, Connection Refused

Solution 1

I saw similar behavior on my EC2 instance today and tracked it down to this: when I run sudo reboot now, the machine hangs and I have to restart it manually from the AWS Management Console; when I run sudo reboot, it reboots just fine. Apparently "now" is not a valid option for reboot, as pointed out here: https://askubuntu.com/questions/397502/reboot-a-server-from-command-line

Thoughts?

Solution 2

From the AWS Developer Forum post on this topic:

Try stopping the broken instance, detaching the EBS volume, and attaching it as a secondary volume to another instance. Once you've mounted the broken volume somewhere on the other instance, check the /etc/sshd_config file (near the bottom). I had a few RHEL instances where Yum scrogged the sshd_config inserting duplicate lines at the bottom that caused sshd to fail on startup because of syntax errors.

Once you've fixed it, just unmount the volume, detach, reattach to your other instance and fire it back up again.
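
The quoted post names /etc/sshd_config, while OpenSSH packages on most distributions keep the file at /etc/ssh/sshd_config. A minimal sketch of locating whichever copy exists on the mounted volume follows. The function name locate_sshd_config and the /data mount point are illustrative assumptions, not part of the forum post.

```shell
# Print the path of the sshd_config found on a rescued volume.
# $1: directory where the broken volume is mounted (hypothetical example: /data)
locate_sshd_config() {
    mount_point=${1%/}   # drop a trailing slash, if any
    for candidate in etc/ssh/sshd_config etc/sshd_config; do
        if [ -f "$mount_point/$candidate" ]; then
            echo "$mount_point/$candidate"
            return 0
        fi
    done
    echo "no sshd_config found under $mount_point" >&2
    return 1
}

# Example use on the rescue instance. The device name is an assumption;
# check the output of lsblk for the actual name before mounting.
#   sudo mkdir -p /data
#   sudo mount /dev/xvdf1 /data
#   locate_sshd_config /data
```

Editing the path this prints, rather than /etc/ssh/sshd_config itself, avoids changing the rescue instance's own configuration by mistake.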

Let's break this down, with links to the AWS documentation:

  1. Stop the broken instance and detach the EBS (root) volume by going into the EC2 Management Console, clicking on "Elastic Block Store" > "Volumes", then right-clicking on the volume associated with the instance you stopped.
  2. Start a new instance in the same region and Availability Zone as the broken instance (an EBS volume can only be attached to an instance in its own Availability Zone), running the same OS, then attach the original EBS root volume as a secondary volume to your new instance. The commands in step 4 below assume you mount the volume to a folder called "data".
  3. Once you've mounted the broken volume somewhere on the other instance,
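  4. check the "/etc/ssh/sshd_config" file on the mounted volume for the duplicate entries by issuing these commands:
    • cd /data/etc/ssh (assuming the "data" folder is /data; a plain /etc/ssh is the rescue instance's own configuration, not the broken volume's)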
    • sudo nano sshd_config
    • ctrl-v a bunch of times to get to the bottom of the file
    • ctrl-k all the lines at the bottom mentioning "PermitRootLogin without-password" and "UseDNS no"
    • ctrl-x and Y to save and exit the edited file
  5. @Telegard points out (in his comment) that we've only fixed the symptom. We can fix the cause by commenting out the 3 related lines in the "/etc/rc.local" file. So:
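    • cd /data/etc (again the copy on the mounted volume, assuming the "data" folder is /data)
    • sudo nano rc.local
    • look for the "PermitRootLogin..." lines and comment them out or delete them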
    • ctrl-x and Y to save and exit the edited file
  6. Once you've fixed it, just unmount the volume,
  7. detach by going into the EC2 Management Console, clicking on "Elastic Block Store" > "Volumes", then right-clicking on the volume you attached to the new instance,
  8. reattach it to the original (broken) instance as its root volume and
  9. fire it back up again.
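
Instead of paging through the file in nano, the repeated entries from step 4 can be listed with a small helper. This is a sketch under assumptions: the function name find_repeated_directives is made up for illustration, and /data is the mount point assumed in step 2. Some keywords (for example HostKey or Port) may legitimately appear more than once, so the output is a list of lines to review by hand, not a list of errors.

```shell
# List sshd_config keywords that occur more than once, with their line numbers.
# $1: path to the file, e.g. /data/etc/ssh/sshd_config (assumed mount point)
find_repeated_directives() {
    awk '
        /^[ \t]*(#|$)/ { next }            # skip comments and blank lines
        {
            line = $0
            sub(/^[ \t]+/, "", line)       # drop leading whitespace
            split(line, parts, /[ \t=]+/)  # keyword ends at whitespace or "="
            key = tolower(parts[1])        # keywords are case-insensitive
            count[key]++
            lines[key] = lines[key] " " NR
        }
        END {
            for (key in count)
                if (count[key] > 1)
                    print key ":" lines[key]
        }
    ' "$1" | sort
}

# Example:
#   find_repeated_directives /data/etc/ssh/sshd_config
```

If OpenSSH is installed on the rescue instance, running sudo sshd -t -f /data/etc/ssh/sshd_config asks sshd itself to parse the edited file without starting the daemon, and grep -n sshd_config /data/etc/rc.local shows which rc.local lines touch the configuration (step 5).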

Author: SteadH

Updated on September 18, 2022

Comments

  • SteadH
    SteadH over 1 year

    I've replicated this two or three times, so I'm guessing there's something wrong with what I'm doing.

    Here are my steps:

    1. Launch new instance via EC2 Management console using: Ubuntu Server 13.10 - ami-ace67f9c (64-bit)
    2. Launch with defaults (using my existing key pair)
    3. The instance starts. I can SSH to it using Putty or the Mac terminal. Success!
    4. I reboot the instance
    5. 10 minutes later, when the instance should be back up and running, my terminal connection shows:

      stead:~ stead$ ssh -v -i Dropbox/SteadCloud3.pem [email protected]
      OpenSSH_5.6p1, OpenSSL 0.9.8y 5 Feb 2013
      debug1: Reading configuration data /etc/ssh_config
      debug1: Applying options for *
      debug1: Connecting to 54.201.200.208 [54.201.200.208] port 22.
      debug1: connect to address 54.201.200.208 port 22: Connection refused
      ssh: connect to host 54.201.200.208 port 22: Connection refused
      stead:~ stead$
      

    Fine, I understand that the public IP address can change, so checking the EC2 management console, I verify that it is the same. Weird. Just for fun, I try connecting with public DNS hostname: ec2-54-201-200-208.us-west-2.compute.amazonaws.com. No dice, same result.

    Even using the Connect via Java SSH client built into the EC2 console, I get Connection Refused.

    I checked the security groups. This instance is in group launch-wizard-4. Looking at the inbound configuration for this group, Port 22 is allowed in from 0.0.0.0/0, so that should be anywhere. I know that I'm hitting my instance and this is the right security group, because I can't ping the instance. If I enable ICMP for this security group, all of a sudden my pings go through.
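
A note on reading the output above: "Connection refused" usually means the packet reached the host and nothing was listening on port 22, whereas a security group that blocks the port more typically shows up as a timeout. A minimal sketch for telling the two apart from the client follows. The function name probe_port is hypothetical, and the sketch assumes bash (for /dev/tcp) and the GNU coreutils timeout command are available.

```shell
# Classify a TCP port as open, timed out, or refused/unreachable.
# $1: host, $2: port, $3: seconds to wait before giving up (default 5)
probe_port() {
    host=$1 port=$2 wait_secs=${3:-5}
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT;
    # timeout exits with status 124 if the attempt is still pending.
    timeout "$wait_secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
    case $? in
        0)   echo "open" ;;
        124) echo "timed out" ;;
        *)   echo "refused or unreachable" ;;
    esac
}

# Example, using the address from the question:
#   probe_port 54.201.200.208 22
```

A "timed out" result points toward filtering (security group, network ACL, host firewall), while "refused or unreachable" with passing status checks points toward sshd not running on the instance.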

    I've found a few other posts around the internet with similar error messages, but most seem to be easily resolved by tweaking the firewall settings. I've tried a few of these, with no luck.

    I'm guessing there's a simple EC2 step I'm missing. Thanks for any help you can give, and I'm happy to provide more information or test further!

    Update - Here are my system logs from the Amazon EC2 console: http://pastebin.com/4M5pwGRt

    • APZ
      APZ over 10 years
      I would suggest looking into the system logs in the AWS console to see whether anything went wrong during the reboot. You might also want to make sure that both accessibility checks pass once the system has rebooted and while you are trying to SSH (on console only).
    • typositoire
      typositoire over 10 years
      You did nothing after the first connection? No messing with iptables or sshd config files? Because it looks like you are dropping the connection, not that port 22 is unavailable.
    • David Levesque
      David Levesque over 10 years
      Did you mess with /etc/fstab before rebooting?
    • SteadH
      SteadH over 10 years
      No changing of iptables or fstab before rebooting. The first command I ran was "reboot now". I'll update the question above with my AWS system logs.
    • SteadH
      SteadH over 10 years
      Also, status checks are both good - 2/2! I was hoping I had something simple wrong with my set up... maybe not!
    • APZ
      APZ over 10 years
      I don't think you are missing any steps in provisioning the server. You might want to change the AMI and see how it goes. I don't have a very concrete reason for changing the AMI, but since we have no real idea of what is happening, it might be worth a shot.
    • SteadH
      SteadH about 10 years
      Changing the AMI did it! That version of Ubuntu Server was just funky.
    • Akki
      Akki about 4 years
      How are you rebooting: via the reboot command over SSH, or via the AWS web console? I was facing the same issue after rebooting over SSH from PuTTY; using the AWS web console to stop and then start the instance solved it.
    • ha9u63ar
      ha9u63ar over 3 years
      There is another thing which I experienced this morning, and totally overlooked it through my Googling. Usually, it's a good practice to declare which source you're setting your SSH for. However, you have to remember that your IP is also dynamic when WFH. Your Security Group should therefore have Source as 0.0.0.0/0 - but that's okay because you have the right PK to authenticate yourself. I also started detaching my EBS volume, but then paid attention to the above to fix the real problem.
  • Jeromy French
    Jeromy French almost 10 years
    This question might also be relevant: serverfault.com/q/325140/153062
  • Jeromy French
    Jeromy French almost 10 years
    Same issue and similar proposed fix at stackoverflow.com/a/21563478/1430996 The comment is particularly helpful.
  • SteadH
    SteadH almost 10 years
    Thanks for this! I suspect this would have fixed the issue, and that's a good way to get at that SSH log. Thanks!
  • SteadH
    SteadH almost 10 years
    Awesome! I tried this on my instance today and it worked. Thank you!
  • SteadH
    SteadH almost 9 years
    No editing fstab, just a reboot command.
  • cucu8
    cucu8 about 6 years
    This worked, thanks. Although my problem (same symptom: "connection refused") was due to wrong ownership of the dir /var/empty/sshd. It should have been root:root. Why did it change? No idea, we were never even close to it. Oh well.
  • Vaibhav Kumar
    Vaibhav Kumar about 6 years
    @oromoiluig How can one reboot if unable to SSH into the machine?
  • Vaibhav Kumar
    Vaibhav Kumar about 6 years
    @JeromyFrench I have the same issue. I followed the procedure but I didn't find "PermitRootLogin without-password". My file has "PermitRootLogin=prohibit-password" instead. What should I do?
  • oromoiluig
    oromoiluig about 6 years
    @VaibhavKumar from the AWS console: turn the instance off and back on again.
  • Umair Malhi
    Umair Malhi over 2 years
    This worked for me. After a reboot with the command sudo reboot, the instance stopped working, and even rebooting from the AWS console did not help. So I stopped the instance and then started it again. Now it is connecting fine. Not sure why this has been downvoted.
  • Falco
    Falco over 2 years
    @oromoiluig And this is important! It only seems to work if you STOP and START the instance via EC2-Webinterface. If you only click reboot the instance remains stuck.
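
The stop-then-start cycle described in the comments above can also be scripted with the AWS CLI instead of the web console. This is a sketch, not a tested recipe for this particular problem: the function name stop_start_instance and the AWS_CMD variable are assumptions introduced here (AWS_CMD exists only so the commands can be previewed with echo), and the instance ID shown is a placeholder.

```shell
# Command used to reach AWS; set AWS_CMD=echo to preview without calling AWS.
AWS_CMD=${AWS_CMD:-aws}

# Fully stop an instance, wait, then start it again (not a reboot).
# $1: instance ID (placeholder example: i-0123456789abcdef0)
stop_start_instance() {
    instance_id=$1
    $AWS_CMD ec2 stop-instances --instance-ids "$instance_id" &&
    $AWS_CMD ec2 wait instance-stopped --instance-ids "$instance_id" &&
    $AWS_CMD ec2 start-instances --instance-ids "$instance_id" &&
    $AWS_CMD ec2 wait instance-running --instance-ids "$instance_id"
}

# Example:
#   stop_start_instance i-0123456789abcdef0
```

Unlike a reboot, a stop and start usually assigns the instance a new public IP address unless an Elastic IP is attached, so the address used for SSH may need to be looked up again afterwards.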