What to do with a suddenly unreachable, non-logging EC2 instance?

Solution 1

I ran into the very same problem recently. I'm quite new to EC2 in general, but with some help from Eric's blog I managed to troubleshoot and resolve the issue, although I'm still not sure what the cause REALLY was. I suspect a missing kernel AKI for this particular AMI and its newly updated kernel image (BTW, I'm running the same AMI).

  • I stopped my instance and attached its volume to a new instance (running the same AMI). I had to play a bit with e2label and fstab.
  • Mounted the old filesystem (including dev and proc) and chrooted into it.
  • Upgraded the kernel only to the version one before the latest, as I couldn't find an AKI corresponding to the latest. I had to change the AKI manually using the EC2 API tools.
  • Removed the new EBS volume (fixing the partition labels first) and booted back into the old volume.
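For anyone repeating the mount-and-chroot step, here is a minimal sketch of the commands involved, written as a small Python helper that only builds the command lines and does not run them. The device name `/dev/xvdf1` and the mount point `/mnt/rescue` are assumptions for illustration; the real device depends on where the old volume was attached. Two volumes created from the same AMI may carry the same filesystem label, which is probably why the e2label and fstab adjustments were needed.

```python
import shlex


def chroot_rescue_commands(device="/dev/xvdf1", mountpoint="/mnt/rescue"):
    """Build (but do not run) the commands to mount an old root volume and chroot into it.

    `device` and `mountpoint` are hypothetical defaults, not values from the thread.
    """
    return [
        ["mkdir", "-p", mountpoint],
        # Mount the old root filesystem by device path rather than by label.
        ["mount", device, mountpoint],
        # Expose the rescue host's /dev and a proc filesystem inside the old root.
        ["mount", "--bind", "/dev", f"{mountpoint}/dev"],
        ["mount", "-t", "proc", "proc", f"{mountpoint}/proc"],
        # Enter the old system so its package tools act on the old volume.
        ["chroot", mountpoint, "/bin/bash"],
    ]


def as_shell_lines(commands):
    """Render the argument lists as copy-pasteable shell lines."""
    return [shlex.join(argv) for argv in commands]
```

Printing `as_shell_lines(chroot_rescue_commands())` gives lines that can be reviewed and then run as root on the rescue instance.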

I'm now running 2.6.32-318-ec2.

Can someone correct me if I'm wrong in pointing to the missing AKI as the source of the problem? Anyway, it worked, and I'll be sure to test all upgrades on a test host first before applying them to the production system.

Solution 2

My solution/recovery was:

  • Instantiated a fresh instance with the Ubuntu 10.04 AMI ami-c00e3cb4 (promptly updated, upgraded, and rebooted into linux-image-2.6.32-319-ec2 with no problem).
  • Reinstalled all the packages of importance.
  • Mounted a snapshot of the old non-booting instance (made after it became non-booting) as a volume.
  • Rsynced over the handful of important directories under /etc, /var and /home.

and it's back as it was before (with the advantage of being a little less crufty).
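The rsync step above can be sketched roughly as follows. This is only an illustration that builds the command lines; the mount point `/mnt/old` and the example directories are hypothetical, and `--dry-run` is left on by default so nothing is copied until the output has been reviewed.

```python
def rsync_restore_commands(old_root, paths, dry_run=True):
    """Build rsync commands that copy selected directories from a mounted old volume.

    `old_root` is where the old volume (created from the snapshot) is mounted.
    `paths` are directories as they appear on the live system, e.g. "/etc/apache2".
    """
    commands = []
    for path in paths:
        rel = path.strip("/")
        # Trailing slashes make rsync copy the directory contents into the target.
        src = f"{old_root.rstrip('/')}/{rel}/"
        dst = f"/{rel}/"
        argv = ["rsync", "-a"]  # -a preserves permissions, ownership and timestamps
        if dry_run:
            argv.append("--dry-run")
        argv += [src, dst]
        commands.append(argv)
    return commands
```

For example, `rsync_restore_commands("/mnt/old", ["/etc/apache2", "/home"])` yields one command per directory; pass `dry_run=False` once the dry run looks right.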

I didn't bother trying to boot a fresh instance with the problem image because... well, surely all the "state" lives in the disk image (which I can only guess suffered some boot-related corruption) so I wouldn't expect any different result.

Just "one of those things", I guess?

In future I think I'll be snapshotting more regularly, and before any kernel updates.
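A minimal sketch of taking such a snapshot before an upgrade, assuming the boto3 library is used (the thread itself does not mention any tooling). The helper takes an already-constructed EC2 client as a parameter; the function name and description text are hypothetical.

```python
def snapshot_before_upgrade(ec2_client, volume_id, note="pre kernel upgrade"):
    """Request an EBS snapshot of `volume_id` and return the new snapshot's id.

    `ec2_client` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    response = ec2_client.create_snapshot(
        VolumeId=volume_id,
        Description=f"{note}: {volume_id}",
    )
    # The snapshot is created asynchronously; this id can be used to check its state.
    return response["SnapshotId"]
```

Called just before `aptitude full-upgrade`, this leaves a known-good copy of the root volume to restore or mount if the next reboot fails.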


Author by

timday


Updated on September 18, 2022

Comments

  • timday
    timday over 1 year

    I have an EC2 "micro instance" running Canonical's Ubuntu 10.04 LTS. Has been running for 6-9 months now, infrequently rebooted (once every few weeks at the most).

    I just did what I thought was a routine aptitude update, aptitude full-upgrade. On noticing that some new -ec2 linux images seemed to have been installed, I rebooted the system. While it seemed to reboot and go back to "running" status on the console, it didn't come back with its usual ssh and http services. I've tried stopping and starting it, re-associating its elastic IP... no joy.

    The strange thing is, "Get System Log" (AWS console) returns a completely blank log. Empty. Nothing. Not one character. (At least it's empty after the first start-stop; before the stop it just contained a final line about restarting).

    I've tried a few stop-start cycles but no improvement.

    Any advice on what to try next to bring my instance back to life?

    • Eric Hammond
      Eric Hammond over 12 years
      Is this an EBS boot instance or instance-store? What is the AMI id?
    • Eric Hammond
      Eric Hammond over 12 years
      I've edited your question to clarify that the Ubuntu 10.04 AMIs you are running were created by Canonical, not by Alestic (me). I list Canonical's Ubuntu AMI ids at the top of Alestic.com
    • timday
      timday over 12 years
      The misbehaving instance was created with ami-311f2b45 back around February '11; I've just used ami-c00e3cb4 to bring up a new instance with no problem (see answer below). Both are EBS backed.
  • timday
    timday over 12 years
    Thanks; nice to know it's not just me, and that there is some rational explanation.