Machine check events logged


Solution 1

For more information, check the mcelog logfile, which should contain a detailed description of the problem found. Whether this file exists depends on the configuration in /etc/mcelog/mcelog.conf:

/var/log/mcelog

or simply run the command

mcelog
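Because the logfile may or may not exist, a script that inspects it should handle both cases. The following is a minimal Python sketch of that check; the helper name read_mcelog_log is hypothetical, and the default path is the one given above.

```python
from pathlib import Path


def read_mcelog_log(path="/var/log/mcelog"):
    """Return the non-empty lines of the mcelog logfile, or [] if it is absent.

    The default path is the one mentioned in this answer; whether the file
    exists depends on /etc/mcelog/mcelog.conf.
    """
    log = Path(path)
    if not log.is_file():
        # No logfile configured (or mcelog not installed): nothing to report.
        return []
    # errors="replace" keeps the script from failing on stray non-UTF-8 bytes.
    text = log.read_text(errors="replace")
    return [line for line in text.splitlines() if line.strip()]
```

Calling read_mcelog_log() with no arguments reads the default path, which may require root privileges. An empty result means there is no logfile to read, in which case running mcelog directly is the fallback.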


mcelog decodes the kernel machine check log on x86 machines. From man mcelog:

X86 CPUs report errors detected by the CPU as machine check events (MCEs). These
can be data corruption detected in the CPU caches, in main memory by an integrated
memory controller, data transfer errors on the front side bus or CPU interconnect or
other internal errors. Possible causes can be cosmic radiation, instable power
supplies, cooling problems, broken hardware, or bad luck.
Most errors can be corrected by the CPU by internal error correction mechanisms.
Uncorrected errors cause machine check exceptions which may panic the machine.
When a corrected error happens the x86 kernel writes a record describing the MCE into
a internal ring buffer available through the /dev/mcelog device. mcelog retrieves
errors from /dev/mcelog, decodes them into a human readable format and prints them on
the standard output or optionally into the system log.


You can find more information about mcelog, its configuration, the errors it reports, and its triggers on the mcelog project webpage.

Solution 2

mcelog was removed in Debian 10 (Buster) and later, and in Ubuntu 18.04 and later.

The functionality has been replaced by rasdaemon.
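Since the available tool depends on the distribution release, a script can check which one is installed before trying to read machine check data. This is a rough Python sketch under some assumptions: the function name pick_mce_tool is hypothetical, ras-mc-ctl is the reporting utility shipped with rasdaemon, and its --summary option prints a summary of logged errors.

```python
import shutil


def pick_mce_tool(which=shutil.which):
    """Return an argv list for whichever MCE reporting tool is installed.

    `which` is injectable so the lookup can be faked in tests; by default it
    searches PATH the same way a shell would.
    """
    if which("ras-mc-ctl"):
        # rasdaemon systems (assumed: Debian 10+, Ubuntu 18.04+).
        return ["ras-mc-ctl", "--summary"]
    if which("mcelog"):
        # Older systems: plain `mcelog`, as in Solution 1.
        return ["mcelog"]
    return None  # Neither tool is installed.
```

The returned list can be passed to subprocess.run; both tools generally need root privileges to read the hardware error records.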

Solution 3

The log entries were written by mcelog. Its logfile can be found in /var/log/mcelog, or depending on the system, additionally in syslog or the systemd journal.
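To see how often the message appeared in syslog, the kernel lines can be filtered with a short script. The sketch below is only an illustration: it assumes the traditional syslog line layout shown in the question (timestamp, host, "kernel:", uptime in brackets), and the names KERNEL_LINE and find_mce_events are made up for this example. Other syslog or journal formats would need a different pattern.

```python
import re

# Matches lines such as:
#   Sep 19 13:18:15 wdc kernel: [2772302.630416] Machine check events logged
KERNEL_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]+)\s+(?P<host>\S+)\s+kernel:\s+"
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+(?P<msg>.*)$"
)


def find_mce_events(lines):
    """Return (timestamp, host, uptime_seconds) for each MCE notice in `lines`."""
    events = []
    for line in lines:
        match = KERNEL_LINE.match(line.strip())
        # A substring test also catches the newer
        # "mce: [Hardware Error]: Machine check events logged" wording.
        if match and "Machine check events logged" in match.group("msg"):
            events.append(
                (match.group("stamp"), match.group("host"), float(match.group("uptime")))
            )
    return events
```

For example, passing the lines of /var/log/messages to find_mce_events shows whether the notice was a one-off or keeps recurring, which helps when deciding whether hardware needs replacing.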

X86 CPUs can detect and sometimes correct hardware errors (memory, I/O, and CPU hardware errors). mcelog retrieves these errors from /dev/mcelog, where the Linux kernel writes them.

Since your system crashed, the hardware error probably could not be corrected. If a system keeps running after such a message, automatic correction appears to be working.

For more background about the implications of seeing such messages, refer to “mce: [Hardware Error]: Machine check events logged” appears in syslog. What should I do?

Author by

GoldenNewby

Updated on September 18, 2022

Comments

  • GoldenNewby
    GoldenNewby over 1 year

    In /var/log/messages, this error occurred:

    Sep 19 13:18:15 wdc kernel: [2772302.630416] Machine check events logged
    

    Shortly thereafter, the entire server became unresponsive. This is from the log of the Dom0 of a Xen server (running the latest version on Debian Squeeze).

    Can anyone shed some light on what this error means? Should I be ordering new hardware?

    Edit: Also, the message seems to imply that something was logged. Where can I find it?

  • serverAdmin123
    serverAdmin123 almost 5 years
    This question was asked 6 years ago.
  • Firefishy
    Firefishy almost 5 years
    And search results live forever ;-) Good to shortcut technical mazes.