CPU hardware errors in Ubuntu 17.04

The CPU is overheating and raising MCEs (machine check events), i.e. it's crashing. If you don't see other temperature-related events in syslog, it's probably because your CPU cooler/fan/heat pipe/thermal paste isn't doing the job.

  • Check syslog with this terminal command...

    grep -i -e temp -e therm /var/log/syslog*
    
  • If the machine is very dirty/dusty, that could play a major role in the machine overheating. Clean it out.

  • If your machine has Intel processors, make sure that intel-microcode is installed.

    sudo apt-get update
    sudo apt-get install intel-microcode
    sudo reboot
    
  • Install thermald to try and control the temperature.

    sudo apt-get update
    sudo apt-get install thermald
    sudo reboot
    
  • Check your BIOS version. Enter the BIOS setup at power-on and note the version number. Then go to the manufacturer's web site, look up the make/model of your computer in the support/downloads section, and see if there's a newer BIOS.

  • Lastly, and very likely if this is an older machine, the thermal compound that sits between the processor and its heat pipe/fan cooler needs to be re-applied. This requires some technical experience.
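
To watch whether any of the steps above actually help, you can read the temperatures and throttle counters straight from sysfs. This is only a rough sketch: it assumes the usual Linux layout (`/sys/class/thermal/thermal_zone*/temp` in millidegrees Celsius, and `thermal_throttle/core_throttle_count` under each CPU, which Intel systems normally expose). The function names and the optional directory argument are made up for illustration.

```shell
#!/bin/sh
# Print "<zone type> <temperature>C" for every readable thermal zone.
# $1 (optional, hypothetical parameter): directory holding thermal_zone* entries.
show_temps() {
    root="${1:-/sys/class/thermal}"
    for zone in "$root"/thermal_zone*; do
        [ -r "$zone/temp" ] || continue
        ztype=$(cat "$zone/type" 2>/dev/null || echo unknown)
        # Some zones refuse reads; skip those instead of failing.
        millic=$(cat "$zone/temp" 2>/dev/null) || continue
        echo "$ztype $((millic / 1000))C"
    done
}

# Print "<cpuN> <core throttle count>" for every CPU that exposes the counter.
# $1 (optional, hypothetical parameter): directory holding cpuN entries.
show_throttle_counts() {
    root="${1:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/thermal_throttle/core_throttle_count; do
        [ -r "$f" ] || continue
        cpu=${f#"$root"/}   # strip the leading directory
        cpu=${cpu%%/*}      # keep only "cpuN"
        echo "$cpu $(cat "$f")"
    done
}
```

Run `show_temps; show_throttle_counts` before and after cleaning or re-pasting. If the throttle counts keep climbing under light load, the cooling is still not doing its job.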

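The syslog check in the first step can be taken a little further by counting how many "temperature above threshold" lines each CPU logged. A minimal sketch, assuming kernel messages in the format shown in the question; `count_throttle_lines` is a hypothetical helper name. It counts logged lines only, which as far as I know is lower than the kernel's own "total events" figure because the kernel does not print every event.

```shell
#!/bin/sh
# Read kernel log text on stdin and print "<lines> <CPU>: <Core|Package>"
# for every CPU that logged a temperature-above-threshold message.
count_throttle_lines() {
    grep -o 'CPU[0-9]*: [A-Za-z]* temperature above threshold' |
        LC_ALL=C sort |
        uniq -c |
        awk '{ print $1, $2, $3 }'   # drop uniq's padding and the fixed tail
}
```

Usage: `dmesg | count_throttle_lines`. If every core shows up, the whole package is running hot (cooler, paste, dust) rather than a single core misbehaving.
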
Author: M.Voyles

Updated on September 18, 2022

Comments

  • M.Voyles
    M.Voyles over 1 year

    Can anyone explain these error messages I got when I looked at dmesg? I am new to Ubuntu and to the Linux world.

    [ 7.802351] CPU4: Core temperature above threshold, cpu clock throttled (total events = 1)
    [ 7.802352] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
    [ 7.802353] CPU5: Package temperature above threshold, cpu clock throttled (total events = 1)
    [ 7.802354] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
    [ 7.802354] CPU4: Package temperature above threshold, cpu clock throttled (total events = 1)
    [ 7.802356] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
    [ 7.802356] mce: [Hardware Error]: Machine check events logged
    [ 7.802362] mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 128: 00000000880a0003
    [ 7.802363] mce: [Hardware Error]: TSC 99561677c
    [ 7.802385] mce: [Hardware Error]: PROCESSOR 0:506e3 TIME 1501537538 SOCKET 0 APIC 1 microcode ba
    [ 7.802387] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 128: 00000000880a0003
    [ 7.802387] mce: [Hardware Error]: TSC 995616be4
    [ 7.802388] mce: [Hardware Error]: PROCESSOR 0:506e3 TIME 1501537538 SOCKET 0 APIC 0 microcode ba
    [ 7.802389] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
    [ 7.802390] CPU6: Package temperature above threshold, cpu clock throttled (total events = 1)
    [ 7.802391] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
    [ 7.802392] CPU7: Package temperature above threshold, cpu clock throttled (total events = 1)
    [ 7.826359] CPU4: Core temperature/speed normal
    [ 7.826359] CPU0: Core temperature/speed normal
    [ 7.826360] CPU2: Package temperature/speed normal
    [ 7.826361] CPU6: Package temperature/speed normal
    [ 7.826361] CPU0: Package temperature/speed normal
    [ 7.826362] CPU4: Package temperature/speed normal
    [ 7.826363] mce: [Hardware Error]: Machine check events logged
    [ 7.826367] mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 128: 00000000880b0002
    [ 7.826368] mce: [Hardware Error]: TSC 99916f004
    [ 7.826369] mce: [Hardware Error]: PROCESSOR 0:506e3 TIME 1501537538 SOCKET 0 APIC 1 microcode ba
    [ 7.826369] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 128: 00000000880b0002
    [ 7.826370] mce: [Hardware Error]: TSC 99916f2ca
    [ 7.826370] mce: [Hardware Error]: PROCESSOR 0:506e3 TIME 1501537538 SOCKET 0 APIC 0 microcode ba
    [ 7.826400] CPU1: Package temperature/speed normal
    [ 7.826401] CPU5: Package temperature/speed normal
    [ 7.826402] CPU3: Package temperature/speed normal
    [ 7.826402] CPU7: Package temperature/speed normal
    [ 467.922330] CPU4: Core temperature above threshold, cpu clock throttled (total events = 73)
    [ 467.922331] CPU0: Core temperature above threshold, cpu clock throttled (total events = 73)
    [ 467.922332] CPU7: Package temperature above threshold, cpu clock throttled (total events = 86)
    [ 467.922333] CPU3: Package temperature above threshold, cpu clock throttled 
    

    I am running Ubuntu 17.04 with the 4.10.0-29-generic kernel.

  • gene_wood
    gene_wood almost 6 years
    What impact does intel-microcode have on the temperature?
  • Boris Hamanov
    Boris Hamanov almost 6 years
    @gene_wood As I mentioned, it's probably a dust, fan, or thermal compound problem, and checking the microcode is just another step in helping to diagnose the problem remotely. CPUs running old microcode can have various problems, and it takes seconds to check.
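
For reference, the microcode check can be sketched as below. It assumes an x86 `/proc/cpuinfo` with a `microcode` field per core; `microcode_revision` is a hypothetical helper name, and the optional file argument only exists so the function can be tried on a saved copy.

```shell
#!/bin/sh
# Print the distinct microcode revision(s) reported for the CPUs.
# $1 (optional, hypothetical parameter): a cpuinfo-style file to read.
microcode_revision() {
    cpuinfo="${1:-/proc/cpuinfo}"
    awk -F': *' '/^microcode/ { print $2 }' "$cpuinfo" | sort -u
}
```

Run `microcode_revision` before and after installing intel-microcode and rebooting. The dmesg output in the question shows revision `ba`, and if the number doesn't change, the package had nothing newer for that CPU. `dpkg -s intel-microcode` tells you whether the package is installed at all.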