How can I debug frequent unrecoverable freezes?

Solution 1

Restrict the Intel CPU C-states (power-saving idle states that reduce the CPU's power usage and waste heat) by editing /etc/default/grub:

sudo nano /etc/default/grub

Find the line containing GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"

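Add intel_idle.max_cstate=1 directly after splash, so that the line reads GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_idle.max_cstate=1". Save the file, run sudo update-grub, and reboot for the change to take effect.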
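The same edit can be scripted. The following is a minimal sketch, assuming the file contains a single GRUB_CMDLINE_LINUX_DEFAULT="..." line as shown above; add_kernel_param is a hypothetical helper name, and the example only touches a scratch file.

```shell
#!/bin/sh
# Sketch only: add_kernel_param is a hypothetical helper, not part of GRUB.
add_kernel_param() {
    file=$1
    param=$2
    # Skip the edit if the parameter is already on the line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing quote; keep a .bak backup.
    sed -i.bak "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\".*\)\"/\1 $param\"/" "$file"
}

# Dry run on a scratch file that mimics the default line.
scratch=$(mktemp)
printf 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n' > "$scratch"
add_kernel_param "$scratch" intel_idle.max_cstate=1
cat "$scratch"
```

To apply it for real, call the function from a root shell with /etc/default/grub as the file, then run update-grub and reboot.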

Alternatively, if your BIOS supports doing so, disable said C-states from there.

Note: this is not a long-term fix. Disabling C-states substantially increases power draw and waste heat. Only try this if there are no other solutions and updating your kernel does not help!
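After rebooting, it may be worth confirming that the parameter actually reached the kernel. The sketch below reads the running kernel's command line from /proc/cmdline; cmdline_value is a hypothetical helper name, and the approach assumes the option was passed on the boot command line rather than set in the BIOS.

```shell
#!/bin/sh
# Sketch only: cmdline_value is a hypothetical helper.
# Prints the value of KEY from a kernel command line string, or fails if absent.
cmdline_value() {
    key=$1
    cmdline=$2
    for word in $cmdline; do
        case $word in
            "$key"=*) printf '%s\n' "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}

if [ -r /proc/cmdline ]; then
    cmdline_value intel_idle.max_cstate "$(cat /proc/cmdline)" \
        || echo "intel_idle.max_cstate is not set on the kernel command line"
fi
# When the intel_idle driver is active, the limit in effect can also be read from:
#   cat /sys/module/intel_idle/parameters/max_cstate
```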

Solution 2

How to install Kernel 4.8.5

Although 4.8.7 is the latest kernel, it is reported not to work in this year-long bug log of more than 500 posts (Bug 109051 - intel_idle.max_cstate=1 required on baytrail to prevent crashes). Just yesterday someone posted that they tried 4.8.7, it crashed, and they went back to 4.8.6.

Although the bug log title mentions "Bay Trail", users report that the solutions presented there apply to other Intel platforms as well. Because there are 582 posts spanning almost one year, I recommend pressing the End key after opening the link and scrolling up from there.

I've been running 4.8.5 on and off alongside 4.4.0-47 for a couple of weeks and feel comfortable using either one. These are the instructions for installing kernel version 4.8.5:

cd /tmp
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8.5/linux-headers-4.8.5-040805_4.8.5-040805.201610280434_all.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8.5/linux-headers-4.8.5-040805-generic_4.8.5-040805.201610280434_amd64.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8.5/linux-image-4.8.5-040805-generic_4.8.5-040805.201610280434_amd64.deb
sudo dpkg -i linux-headers-4.8.5-*.deb linux-image-4.8.5-*.deb
sudo reboot

You can install any other kernel version by visiting http://kernel.ubuntu.com/~kernel-ppa/mainline/ and substituting the links found there into the instructions above.
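As an illustration of adapting the links, here is a minimal sketch that prints the three download URLs for a given version. It assumes other builds follow the same file naming as the 4.8.5 links above, which may not hold for every version, so check the directory listing first. mainline_urls is a hypothetical helper name, and the build timestamp has to be copied from the listing for that version.

```shell
#!/bin/sh
# Sketch only: mainline_urls is a hypothetical helper that mirrors the
# naming pattern of the 4.8.5 links; other builds may be laid out differently.
mainline_urls() {
    version=$1          # e.g. 4.8.5
    stamp=$2            # build timestamp from the directory listing
    arch=${3:-amd64}
    base="http://kernel.ubuntu.com/~kernel-ppa/mainline/v$version"

    # Split the version into components without touching IFS.
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    case $rest in
        *.*) patch=${rest#*.} ;;
        *)   patch=0 ;;
    esac

    # 4.8.5 -> 040805: each component zero-padded to two digits.
    padded=$(printf '%02d%02d%02d' "$major" "$minor" "$patch")
    rel="$version-$padded"
    pkgver="$rel.$stamp"

    printf '%s\n' \
        "$base/linux-headers-${rel}_${pkgver}_all.deb" \
        "$base/linux-headers-${rel}-generic_${pkgver}_${arch}.deb" \
        "$base/linux-image-${rel}-generic_${pkgver}_${arch}.deb"
}

# Reproduces the three links used in the instructions above.
mainline_urls 4.8.5 201610280434
```

The output can be piped to wget -i - from an empty directory to download the packages.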

justfortherec
Author by

justfortherec

Using Ubuntu on my laptop for professional and private use.

Updated on September 18, 2022

Comments

  • justfortherec
    justfortherec over 1 year

    My new XPS 13 (9360 with KabyLake processor) with Ubuntu 16.04 pre-installed frequently freezes completely without any apparent reason.

    The freezes occur seemingly at random. Sometimes the mouse pointer can still be moved for a few seconds, but eventually the system locks up completely. I am unable to switch to virtual terminals, and not even SysRq codes seem to have any effect. All I can do is long-press the power button for a hard power-off, after which the system boots normally.

    User processes running are mostly several Chrome tabs and a terminal.

    Things I have checked and tried include:

    After doing so the freezes still occur. Now I am at a loss. My question thus is:

    What are ways to find the cause of the issue?

    • Admin
      Admin over 7 years
      This sounds like a C-state bug that existed with Bay Trail CPUs. I guess it's worth a shot to try the fix as well. Reboot your machine until you see the GRUB boot options. Press e to pull up the command-line options. Then add intel_idle.max_cstate=1 right after the words quiet splash, and boot. See if this works. You may need to file a bug in Launchpad. What kernel are you using?
    • Admin
      Admin over 7 years
      @PatrickNegus This is with kernel 4.4.0-47-generic. There is a BIOS option to disable C states. Does that essentially do the same thing? I'll try editing the commandline. With "a cstate bug that existed with Bay-Trail CPU" do you mean a kernel bug or a hardware bug? Is it worth trying newer mainline kernels?
    • Admin
      Admin over 7 years
      Kernel 4.8 has much, much better support for Kaby Lake than 4.4. So yes, please upgrade. Regarding the Bay Trail bug, it was and still is a kernel bug that prevents Ubuntu from effectively managing CPU sleep states (which save power and give the CPU much better idle efficiency).
    • Admin
      Admin over 7 years
      Thanks for your help, @PatrickNegus. Just to verify that we mean the same thing: do you mean upgrading to the kernel of 16.10 as described in askubuntu.com/a/840184/63018?
    • Admin
      Admin over 7 years
      Yes, that's what I meant.
    • Admin
      Admin over 7 years
      You might want to check our newer kernel 4.8.7 changes rather than 4.8.0 that Ubuntu 16.04 ships with.
    • Admin
      Admin over 7 years
      @WinEunuuchs2Unix Who is "we" in "our" and where can I get that kernel?
    • Admin
      Admin over 7 years
      Before trying the newer kernel, I am trying to verify that disabling cstates helps (not a long term solution, but good to know anyway). Meanwhile, I have also found some Machine Check Exceptions (MCE) in the logs and wonder whether they potentially cause the issue. I have posted a separate question regarding the MCEs: unix.stackexchange.com/questions/324237/…
    • Admin
      Admin over 7 years
      @justfortherec after posting 4.8.7 I read bug report that it didn't work and user went back to 4.8.6: bugzilla.kernel.org/show_bug.cgi?id=109051
    • Admin
      Admin over 7 years
      I haven't experienced any freezes since disabling cstates. However, in contrast to Patrick's comment, I have disabled them in BIOS, not by editing the boot commandline. @PatrickNegus, do you want to turn your comment(s) into an answer that I can accept?
    • Admin
      Admin over 7 years
      @justfortherec Sure.
  • justfortherec
    justfortherec over 7 years
    Thanks for the info. However, I won't accept it as the answer to this topic, because it addresses a question I raised in the comment, not the original question of this thread.
  • WinEunuuchs2Unix
    WinEunuuchs2Unix over 7 years
    Indeed. Let me know if you try a new kernel and if it works though.
  • justfortherec
    justfortherec over 7 years
    My system stopped freezing randomly after updating to any 4.8 kernel that I have tried (LTS 4.8.0-25.27~16.04.1 and mainline 4.8.12-040812.201612020431).
  • Korijn
    Korijn almost 7 years
    why exactly would this help?
  • negusp
    negusp almost 7 years
    @Korijn ... it's been a while, but there is (or was; it may have been resolved since) a bug with Intel Bay Trail CPUs that causes significant system instability when power-saving states are enabled. Disabling C-states works around the bug.