How can I track the cause of random reboots?

Solution 1

Check /proc/sys/kernel/panic. A positive value N means the kernel reboots N seconds after a panic, a negative value means it reboots immediately, and 0 means it does not reboot automatically. Buggy drivers can cause a kernel panic.
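
As a rough illustration, the value could be interpreted with a small shell function like the one below. This is a minimal sketch based on the kernel's documented meaning of kernel.panic (zero, positive, negative); the function name describe_panic_timeout is hypothetical.

```shell
# Hypothetical helper: explain what a kernel.panic value means.
describe_panic_timeout() {
    case "$1" in
        0)  echo "no automatic reboot after a panic" ;;
        -*) echo "immediate reboot after a panic" ;;
        *)  echo "reboot $1 seconds after a panic" ;;
    esac
}

# Report the setting on the running system, if the file is readable.
if [ -r /proc/sys/kernel/panic ]; then
    describe_panic_timeout "$(cat /proc/sys/kernel/panic)"
fi
```

A non-zero value only explains why a panic turned into a reboot; it does not show what caused the panic.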

If it's not a panic, check the reboot history with the command below; overheating may also be the cause.

last reboot

Solution 2

Commands

  1. dmesg - Shows kernel messages for the current boot only, so nothing from before the last boot, but very useful if the system is still up

Files

  1. /var/log/syslog - System wide logger, use tail /var/log/syslog or less /var/log/syslog
  2. /var/log/kern.log - Kernel log, same as above
  3. /var/log/*
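
One way to go through these files is to filter them for words that tend to show up around a crash or forced shutdown. The sketch below assumes a keyword list that is only a guess and far from exhaustive; scan_log is a hypothetical helper name.

```shell
# Hypothetical helper: print log lines that often accompany a crash or
# forced shutdown. The keyword list is an assumption, not a complete set.
scan_log() {
    grep -iE 'panic|oops|thermal|temperature|watchdog|out of memory' "$@"
}

# Scan the kernel log and the syslog if they exist and are readable.
for f in /var/log/kern.log /var/log/syslog; do
    if [ -r "$f" ]; then
        scan_log "$f" | tail -n 20 || true
    fi
done
```

Rotated logs such as /var/log/syslog.1 can be passed to scan_log the same way if the reboot happened before the last rotation.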

Solution 3

TL;DR: @insider's answer, along with the comments by @Antonios Hadjigeorgalis led me to find that I had

Unattended-Upgrade::Automatic-Reboot "true"

in

/etc/apt/apt.conf.d/99custom-unattended-upgrades

I was experiencing sudden reboots, mostly soon after turning my laptop on in the morning. I'm running Ubuntu 18.04. Running last reboot showed that the kernel version was usually newer after the sudden reboots:

reboot   system boot  4.15.0-112-gener Wed Jul 22 10:07   still running
reboot   system boot  4.15.0-111-gener Wed Jul 22 10:01 - 10:06  (00:04)
...
reboot   system boot  4.15.0-111-gener Wed Jul 15 09:49 - 23:43  (13:53)
reboot   system boot  4.15.0-109-gener Wed Jul 15 09:45 - 09:48  (00:03)
...
reboot   system boot  4.15.0-109-gener Fri Jul  3 09:14 - 17:37  (08:23)
reboot   system boot  4.15.0-108-gener Fri Jul  3 09:08 - 09:13  (00:05)
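
To spot this pattern without reading the listing by eye, the kernel column can be compared between consecutive entries. The following is a minimal sketch that assumes the column layout shown above (kernel version in the fourth field, newest boot first); kernel_changes is a hypothetical name.

```shell
# Hypothetical helper: read `last reboot` output on stdin (newest first)
# and print "older -> newer" whenever the kernel version changed between
# two consecutive boots.
kernel_changes() {
    awk '$1 == "reboot" && $2 == "system" && $3 == "boot" {
             if (newer != "" && newer != $4) print $4 " -> " newer
             newer = $4
         }'
}

# Usage:
#   last reboot | kernel_changes
```

Note that last truncates the kernel field (hence "-gener"), so the comparison only works as long as the truncated strings still differ.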

Looking into /etc/apt/apt.conf.d/50unattended-upgrades, I saw that "Unattended-Upgrade::Automatic-Reboot" was commented out, and its default is supposedly false. I then ran the following:

grep Reboot /etc/apt/apt.conf.d/*
/etc/apt/apt.conf.d/50unattended-upgrades:Unattended-Upgrade::Automatic-Reboot "false";
/etc/apt/apt.conf.d/50unattended-upgrades://Unattended-Upgrade::Automatic-Reboot-Time "02:00";
/etc/apt/apt.conf.d/99custom-unattended-upgrades:// Reboot automatically if necessary (e.g. on kernel upgrade), should be
/etc/apt/apt.conf.d/99custom-unattended-upgrades:Unattended-Upgrade::Automatic-Reboot "true";

And there was my problem - Unattended-Upgrade::Automatic-Reboot "true"; in /etc/apt/apt.conf.d/99custom-unattended-upgrades.
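
A narrower search that skips commented-out lines might look like the sketch below. The helper name find_auto_reboot is hypothetical, and the pattern assumes the setting is written on one line in the form shown above.

```shell
# Hypothetical helper: list uncommented Automatic-Reboot "true" settings
# in an APT configuration directory (default: /etc/apt/apt.conf.d).
find_auto_reboot() {
    dir="${1:-/etc/apt/apt.conf.d}"
    grep -rE '^[[:space:]]*Unattended-Upgrade::Automatic-Reboot[[:space:]]+"true"' "$dir" 2>/dev/null
}

# Report any active setting on this machine.
if [ -d /etc/apt/apt.conf.d ]; then
    find_auto_reboot || echo "no active Automatic-Reboot \"true\" found"
fi
```

Since later files in apt.conf.d override earlier ones, running apt-config dump and filtering for Automatic-Reboot should show the value APT actually ends up with.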

Solution 4

Application crashes leave crash files in /var/crash/, but the normal system logs are your best bet. If the hardware shut the machine down, you won't see anything about it in the systemd journal or message logs, and that absence is itself a HUGE clue. If Ubuntu was aware of the shutdown, the logs will show it along with the reason. (If no details are found, check the machine-level logs: the host OS if this is a VM, or the hardware logs if it runs on bare metal.)

To look at app crashes on this box:

guiverc@d960-ubu2:/de2900/ubuntu_64$   ls -la /var/crash
total 113484
drwxrwsrwt  2 root    whoopsie      4096 Feb 27 12:00 .
drwxr-xr-x 16 root    root          4096 Nov 29  2018 ..
-rw-------  1 root    whoopsie   1214905 Feb 26 08:28 irssi-scripts.0.crash
-rw-------  1 root    whoopsie   1193193 Feb 25 15:04 lvm2.0.crash
-rw-r-----  1 guiverc whoopsie 101162337 Feb 19 13:00 _usr_bin_clementine.1000.crash
-rw-r-----  1 guiverc whoopsie   5962296 Feb 26 23:31 _usr_bin_gnome-control-center.1000.crash
-rw-r-----  1 guiverc whoopsie   1519149 Feb 20 08:28 _usr_bin_light-locker.1000.crash
-rw-r-----  1 guiverc whoopsie   1327084 Feb 27 12:00 _usr_bin_totem-video-thumbnailer.1000.crash
-rw-r-----  1 guiverc whoopsie     96196 Feb 22 13:55 _usr_games_sgt-launcher.1000.crash
-rw-r-----  1 guiverc whoopsie   3685288 Feb 22 00:34 _usr_lib_ibus_ibus-ui-gtk3.1000.crash
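
To limit the listing to crash files written around the time of the reboot, something like the following could be used. It is a minimal sketch; recent_crashes is a hypothetical helper, and the one-day default is an arbitrary choice.

```shell
# Hypothetical helper: list *.crash files in a directory (default:
# /var/crash) that were modified within the last N days (default: 1).
recent_crashes() {
    dir="${1:-/var/crash}"
    days="${2:-1}"
    find "$dir" -maxdepth 1 -name '*.crash' -mtime "-$days" 2>/dev/null
}

# Usage:
#   recent_crashes             # crash files from the last day
#   recent_crashes /var/crash 7
```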

Starting with application crashes is easy, so I'd look there first. However, I can't think of why an application crash would cause a reboot or shutdown, so I wouldn't expect to see anything meaningful there (if it is useful, it will be after looking at the system logs).

To view system messages for the current session, use dmesg. Because it shows only the current session, you won't see a reason for the last shutdown (that was the previous session), but after an unclean shutdown I'd expect to see the results of an fsck.

The best clues, however, are in the systemd journal, read with journalctl. This is where I'd look for clues about the last shutdown: if the normal shutdown messages are missing, that points to a hardware shutdown. For example, when the CPU shuts off at an extreme heat threshold, a pin gets grounded and the OS has no clue, so the messages simply stop and the next entry is the normal boot of the next session. Records of such events are found in hardware logs on an enterprise server; consumer-grade machines usually don't keep hardware logs.
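
Checking the end of the previous boot's journal for shutdown messages could be sketched as follows. This assumes the journal is persistent (for example, /var/log/journal exists); the helper name saw_clean_shutdown and the message patterns are assumptions and may differ between systemd versions.

```shell
# Hypothetical helper: read log text on stdin and succeed if it contains
# messages typical of an orderly shutdown. The patterns are assumptions.
saw_clean_shutdown() {
    grep -qE 'Journal stopped|Reached target .*Shutdown|Stopping .*Journal Service'
}

# Usage (needs a persistent journal):
#   journalctl --list-boots      # enumerate the boots the journal knows
#   if journalctl -b -1 -n 200 | saw_clean_shutdown; then
#       echo "previous boot ended with an orderly shutdown"
#   else
#       echo "no shutdown messages: power loss or hard reset is likely"
#   fi
```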

Sometimes you can see signs of overheating in the logs anyway. But if the PSU has issues (PWR_GOOD drops), nothing will be found, as the CPU wasn't even aware of the shutdown. I suspect hardware logs may miss this type of shutdown too, but the lack of entries is still a clue!
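
If overheating is suspected, the current temperatures can also be read directly while the machine is up. The sketch below assumes the kernel exposes thermal zones under /sys/class/thermal, where temperatures are reported in millidegrees Celsius; millideg_to_celsius is a hypothetical helper.

```shell
# Hypothetical helper: convert millidegrees Celsius to degrees Celsius.
millideg_to_celsius() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

# Print the current reading of every thermal zone that is readable.
for zone in /sys/class/thermal/thermal_zone*/temp; do
    if [ -r "$zone" ]; then
        echo "$zone: $(millideg_to_celsius "$(cat "$zone")") C"
    fi
done
```

This only shows the temperature now, so it is most useful when run under the same load that preceded the reboot.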

Narrowing down where to look further depends on the type of server, what is running on it, and other details that haven't been provided.

E. Arslan
Author by

E. Arslan

Updated on September 18, 2022

Comments

  • E. Arslan
    E. Arslan almost 2 years

    A Thinkpad X220 (Core-i5, SandyBridge, Intel GMA) running Precise 64-bit has rebooted hard twice in the last four days. I was doing nothing more than writing an email. No warning. It just went black, and the next thing I saw was the Lenovo boot screen.

    Where should I look to find the cause? I fear that this immediate reboot does not leave time for logs to be written...

    Thanks!

  • E. Arslan
    E. Arslan over 11 years
    Thanks so much for your answer. The problem is - as stated in my question - that the system is not up anymore. I believe that I had some kind of hardware issue.
  • Huckle
    Huckle over 11 years
    I went on a quick Google hunt to check whether the above files persist across reboots (since I'm not in front of a Linux computer at the moment) and found this site, which I found extremely informative: softpanorama.info/Commercial_linuxes/troubleshooting.shtml
  • Antonios Hadjigeorgalis
    Antonios Hadjigeorgalis about 10 years
    This led me to check /etc/apt/apt.conf.d/50unattended-upgrades, where I had set Unattended-Upgrade::Automatic-Reboot "true". I changed it back to the default setting of false.
  • Aquarius Power
    Aquarius Power over 9 years
    @Antonios does this auto reboot happen without warning?
  • Aquarius Power
    Aquarius Power over 9 years
    @insider I got a strange log, as if the previous session were linked to the current one: 18:30 - 18:38 (00:07), and before the sudden reboot 13:27 - 18:38 (05:11)
  • Aquarius Power
    Aquarius Power over 9 years
    last |head -n 50 wow, I just found this: 13:55 - crash (04:35). Now I'm going to track down how that happened, thx!
  • tchakravarty
    tchakravarty almost 8 years
    Can you elaborate on how the output of dmesg is supposed to help? My machine has just randomly rebooted, and if I understand your suggestion right, then dmesg cannot help here, right?
  • Aquarius Power
    Aquarius Power over 5 years
    I would like to add this: grep "Jan 11 01" /var/log/* -rah 2>/dev/null |sort so you can target the day and the hour (Ubuntu 16.04, but some of the info is version independent)