Understanding Machine Check Exceptions (MCE)
Solution 1
First, I fear that I cannot really give good answers to your questions. I also own a Dell XPS 13 (9360) and see the same MCE messages. I'm in contact with Dell Support because of these. They replaced the mainboard but it did not help. Same messages in the logs. At some point they concluded that it is probably a false positive. They had no idea what is causing it, though (mcelog/kernel/Intel problem?). The correspondence with Support is still ongoing.
<rant>
Btw, talking to Dell Support is a very unpleasant experience. They seem to suggest only the "standard" solutions like resetting the firmware, running self-tests and so on. I didn't have the impression that I was talking to someone with technical insight.
</rant>
To add more details, I see the same issue on Fedora 24 so it seems not to be related to Ubuntu.
Regarding your questions:
What do these errors mean and should I worry about them?
I don't know. Dell Support thinks those are false positives.
Could these hardware errors be the cause of the freezes of the entire system?
Besides the messages my system works fine. I'd guess the freeze is a different issue.
Should I have the laptop (or parts) replaced by the manufacturer?
Replacing the mainboard did not fix the MCE issue. It might solve the freezing issue, although it seems that this was fixed by a kernel update.
Are there any other actions I should take?
If you are not already in contact with Support, contact them. Maybe they will come up with a real solution once they see that it affects more customers.
Solution 2
I got the same MCE errors. They started popping up on boot after one of the last few kernel updates (Fedora 25), but I lost track of which exact update introduced them. The notebook is a Dell Inspiron 5567 (Intel i5 7200U). However, the system works perfectly fine after boot, so I'm 100% sure these are false positives appearing for some reason.
justfortherec
Using Linux on laptop and servers for professional and private use.
Updated on September 18, 2022
Comments
-
justfortherec almost 2 years
While trying to debug frequent freezes of my new laptop (KabyLake architecture) running Ubuntu 16.04 I've stumbled upon these entries in kern.log:
kernel: [ 0.041634] mce: [Hardware Error]: Machine check events logged
Since then I have installed mcelog but do not know what to make of the logs. Content of /var/log/mcelog
is:
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0 CPU 0 BANK 6 MISC 3880018086 ADDR fef1cf00
TIME 1479298799 Wed Nov 16 13:19:59 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 1 CPU 0 BANK 7 MISC 43880018086 ADDR fef1ff00
TIME 1479298799 Wed Nov 16 13:19:59 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0 CPU 0 BANK 6 MISC 3880018086 ADDR fef1cf00
TIME 1479321645 Wed Nov 16 19:40:45 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 1 CPU 0 BANK 7 MISC 43880018086 ADDR fef1ff00
TIME 1479321645 Wed Nov 16 19:40:45 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0 CPU 0 BANK 6 MISC 43880000086 ADDR fef1db80
TIME 1479328438 Wed Nov 16 21:33:58 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 1 CPU 0 BANK 7 MISC 13880000086 ADDR fef1dc00
TIME 1479328438 Wed Nov 16 21:33:58 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0 CPU 0 BANK 6 MISC 43880000086 ADDR fef1db80
TIME 1479333991 Wed Nov 16 23:06:31 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 1 CPU 0 BANK 7 MISC 13880000086 ADDR fef1dc00
TIME 1479333991 Wed Nov 16 23:06:31 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0 CPU 0 BANK 6 MISC 43880000086 ADDR fef1db80
TIME 1479373350 Thu Nov 17 10:02:30 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 1 CPU 0 BANK 7 MISC 13880000086 ADDR fef1dc00
TIME 1479373350 Thu Nov 17 10:02:30 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0 CPU 0 BANK 6 MISC 3880018086 ADDR fef1cf00
TIME 1479373810 Thu Nov 17 10:10:10 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee0000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 1 CPU 0 BANK 7 MISC 43880018086 ADDR fef1ff00
TIME 1479373810 Thu Nov 17 10:10:10 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee0000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0 CPU 0 BANK 6 MISC 3880018086 ADDR fef1cf00
TIME 1479375712 Thu Nov 17 10:41:52 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 1 CPU 0 BANK 7 MISC 43880018086 ADDR fef1ff00
TIME 1479375712 Thu Nov 17 10:41:52 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0 CPU 0 BANK 6 MISC 3880018086 ADDR fef1cf00
TIME 1479385932 Thu Nov 17 13:32:12 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 1 CPU 0 BANK 7 MISC 43880018086 ADDR fef1ff00
TIME 1479385932 Thu Nov 17 13:32:12 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0 CPU 0 BANK 6 MISC 3880018086 ADDR fef1cf00
TIME 1479387666 Thu Nov 17 14:01:06 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 1 CPU 0 BANK 7 MISC 43880018086 ADDR fef1ff00
TIME 1479387666 Thu Nov 17 14:01:06 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0 CPU 0 BANK 6 MISC 43880000086 ADDR fef1db80
TIME 1479456710 Fri Nov 18 09:11:50 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 1 CPU 0 BANK 7 MISC 13880000086 ADDR fef1dc00
TIME 1479456710 Fri Nov 18 09:11:50 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0 CPU 0 BANK 6 MISC 43880000086 ADDR fef1db80
TIME 1479459374 Fri Nov 18 09:56:14 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
mcelog: Family 6 Model 8e CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 1 CPU 0 BANK 7 MISC 13880000086 ADDR fef1dc00
TIME 1479459374 Fri Nov 18 09:56:14 2016
MCG status:
MCi status: Error overflow Uncorrected error MCi_MISC register valid MCi_ADDR register valid Processor context corrupt
MCA: corrected filtering (some unreported errors in same region) Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0 MCGCAP c08 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 142
Some observations (please correct me if any of them are wrong):
- Almost all errors seem to occur on the same page (ADDR fef1xxx)
- Only banks 6 and 7 seem to be affected.
- All entries contain "Error overflow" and "Uncorrected error".
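To check observations like these without reading the log by eye, the entries can be tallied with a short script. The following is only a sketch: the function name `summarize`, the regular expression, and the `shift` parameter are my own illustration (not part of mcelog), and the sample text is copied from four of the records above. With `shift=16` the addresses are grouped by their `fef1` prefix; with `shift=12` they are grouped by 4 KiB page, which is an assumption about the page size.

```python
import re
from collections import Counter

# Matches the "BANK n MISC hex ADDR hex" part of a record. \s+ also matches
# newlines, so it works whether the log is flattened or split across lines.
RECORD = re.compile(r"BANK (\d+)\s+MISC ([0-9a-f]+)\s+ADDR ([0-9a-f]+)")

def summarize(log_text, shift=12):
    """Count records per bank and per address region (ADDR >> shift)."""
    banks = Counter()
    regions = Counter()
    for bank, _misc, addr in RECORD.findall(log_text):
        banks[int(bank)] += 1
        regions[int(addr, 16) >> shift] += 1
    return banks, regions

# Four distinct BANK/MISC/ADDR combinations taken from the log above.
sample = """
MCE 0 CPU 0 BANK 6 MISC 3880018086 ADDR fef1cf00
MCE 1 CPU 0 BANK 7 MISC 43880018086 ADDR fef1ff00
MCE 0 CPU 0 BANK 6 MISC 43880000086 ADDR fef1db80
MCE 1 CPU 0 BANK 7 MISC 13880000086 ADDR fef1dc00
"""

banks, prefixes = summarize(sample, shift=16)  # group by the fef1 prefix
_, pages = summarize(sample, shift=12)         # group by assumed 4 KiB page

print("banks:", dict(banks))
print("prefixes:", {hex(k): v for k, v in prefixes.items()})
print("pages:", {hex(k): v for k, v in pages.items()})
```

On the sample, all four addresses share the `fef1` prefix, but with 4 KiB pages they fall on three different pages (`0xfef1c`, `0xfef1d`, `0xfef1f`), so "same page" holds only in the looser sense of "same 64 KiB region".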
The mcelog FAQ mentions that a "low rate of corrected memory errors is expected and does not require replacing hardware or other action". The log entries contain the phrase "Uncorrected error" which suggests I actually should take some action.
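For reference, the phrases mcelog prints can be traced back to individual bits of the STATUS value. The sketch below decodes the high flag bits of the 64-bit status word as I understand the architectural IA32_MCi_STATUS layout from Intel's Software Developer's Manual (bit 63 VAL, 62 OVER, 61 UC, 60 EN, 59 MISCV, 58 ADDRV, 57 PCC, bits 15:0 the MCA error code). The function name `decode_status` and the dictionary keys are made up for this example, and the layout is worth double-checking against the manual before relying on it.

```python
# Flag bits of IA32_MCi_STATUS (architectural layout, see lead-in above).
STATUS_FLAGS = {
    63: "VAL",    # register contents are valid
    62: "OVER",   # error overflow: an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled for this error
    59: "MISCV",  # MCi_MISC register valid
    58: "ADDRV",  # MCi_ADDR register valid
    57: "PCC",    # processor context corrupt
}

def decode_status(status):
    """Split a 64-bit MCi_STATUS value into flag names and error codes."""
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    return {
        "flags": flags,
        "mca_error_code": status & 0xFFFF,              # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }

# STATUS value that appears in most of the records above.
decoded = decode_status(0xEE2000000040110A)
print(decoded["flags"])
print(hex(decoded["mca_error_code"]), hex(decoded["model_specific_code"]))
```

For `ee2000000040110a` this gives VAL, OVER, UC, MISCV, ADDRV and PCC set and EN clear, which lines up with the "Error overflow", "Uncorrected error", "register valid" and "Processor context corrupt" lines in the log. The other value in the log, `ee0000000040110a`, has the same top byte and therefore the same flags.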
My questions are:
- What do these errors mean and should I worry about them?
- Could these hardware errors be the cause of the freezes of the entire system?
- Should I have the laptop (or parts) replaced by the manufacturer?
- Are there any other actions I should take?
-
justfortherec over 7 years
Thanks a lot for your insights. May I ask what Linux you are running to not experience the freezes? Indeed, updating to a 4.8 kernel fixed the issue for me. Are you running on stock Ubuntu 16.04? I will follow your advice and contact Dell.
-
Josef Eisl over 7 years
I'm currently on an up-to-date Fedora 24 which comes with a 4.8.10 kernel. I did not use the stock Ubuntu 16.04 long enough to tell if there are problems. Good luck with support!
-
Josef Eisl over 7 years
Another update: Support was able to reproduce it on their test machine. This needs to be fixed upstream. They forwarded the issue internally to some department that will look into it (whatever that means). In addition they suggested sending error reports, e.g. to Ubuntu.
-
radesix over 7 years
Not that you need another "me too" but I have a new XPS 9360 and just installed Fedora 25 and get the same MCE errors. They always seem to happen a couple of minutes after boot, then I'm fine (and nothing is broken, just annoying Oops messages).
-
Kan-Ru Chen over 7 years
Same hardware (XPS 9360) and same MCE errors. I'm running Debian sid.
-
Scott about 7 years
I too have this issue. Dell Precision 5520. Fedora 25, Kernel 4.10.8
-
Josef Eisl about 7 years
@Scott is that also a KabyLake?
-
Scott about 7 years
@JosefEisl yes. CPU family: 6 Model: 158 Model name: Intel(R) Core(TM) i5-7440HQ CPU @ 2.80GHz
-
NikhilWanpal about 7 years
Hate to say me too, but before I realised what MCE meant, I asked the same question on AskUbuntu, raised a Dell support request, and ran all hardware check tests (DellSupportCenterl and pre-boot test), all of which passed. Dell told me that it was a 'driver' issue that occurred only when you dual-boot, and that apparently they have already raised it and Ubuntu devs/Intel are working on it (I couldn't get a link to the issue report). So, for now, their suggestion was that I either remove Windows completely or live with it.
-
Josef Eisl over 6 years
@NikhilWanpal I don't have a dual-boot setup.
-
NikhilWanpal over 6 years
@JosefEisl ha! This was quite an old comment. In my case the issue was resolved by a subsequent BIOS update Dell released for the laptop. I installed it while battling a different issue related to the sound card, but at least this is no longer a concern.
-
Josef Eisl over 6 years
@NikhilWanpal glad to hear that!