e1000e Reset adapter unexpectedly / Detected Hardware Unit Hang
Solution 1
OK, so after posting this question last night I continued to do some research, and the only real solution I came across seems to have taken care of the problem.
Disabling TSO, GSO and GRO using ethtool:
ethtool -K eth0 gso off gro off tso off
According to a post found here: http://ehc.ac/p/e1000/bugs/378/
From what I understand, this can reduce performance.
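To see whether the change actually took effect, the current feature state can be read back with the lowercase `ethtool -k`. The following is a minimal sketch, not part of the original answer; the function name `offload_state` is made up for illustration, and it assumes the usual `ethtool -k` output format of one `feature-name: on|off` pair per line.

```shell
#!/bin/sh
# offload_state (hypothetical helper): reads `ethtool -k` output on stdin
# and prints only the TSO, GSO and GRO lines, with leading whitespace removed.
offload_state() {
    grep -E '^[[:space:]]*(tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload):' \
        | sed 's/^[[:space:]]*//'
}

# Usage on a real machine (eth0 is the interface name used in this answer):
#   ethtool -k eth0 | offload_state
```

Running it before and after the `ethtool -K` command should show the three features switching from on to off. Settings changed with `ethtool -K` do not survive a reboot, so they need to be reapplied at boot if the fix works.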
I also noticed that another solution was to disable Active-State Power Management with the kernel boot parameter:
pcie_aspm=off
According to this post on serverfault: Linux e1000e (Intel networking driver) problems galore, where do I start?
I haven’t tried this solution yet. I will try it and see if that makes a difference and post back my findings.
EDIT:
OK, so I tried turning off Active-State Power Management (pcie_aspm=off), and it didn't have any effect. I continued to see the errors in my log file.
This may still work for some, as some Intel NICs have problems under certain kernels where they fall asleep when power management is enabled.
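For anyone who wants to try the ASPM route, it helps to confirm that the parameter really reached the kernel after the reboot. This is a small sketch under that assumption; the function name `aspm_disabled_on_cmdline` is hypothetical, and it only checks the kernel command line, not what the hardware is doing.

```shell
#!/bin/sh
# aspm_disabled_on_cmdline (hypothetical helper): succeeds if pcie_aspm=off
# appears as a whole word in the given kernel command line file.
# $1: file to inspect, defaulting to /proc/cmdline
aspm_disabled_on_cmdline() {
    grep -qw 'pcie_aspm=off' "${1:-/proc/cmdline}"
}

# Usage on a real machine:
#   aspm_disabled_on_cmdline && echo "ASPM disabled at boot"
#   cat /sys/module/pcie_aspm/parameters/policy   # active policy is shown in brackets
```

On Debian and Ubuntu systems that boot with GRUB, the parameter is usually added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by `update-grub` and a reboot.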
Solution 2
Disabling Enhanced C1 (C1E) in the BIOS fixed it for me.
Not sure whether the lower power state of C1E is messing with the driver, or whether there's an oops in the driver when the processor is in this state.
Anyway, problem solved.
Solution 3
Disabling only TCP Segmentation Offload (TSO) does the trick for me.
ethtool -K eth0 tso off
Note: It does not seem to be necessary to also disable Generic Receive Offload (GRO) and Generic Segmentation Offload (GSO), as various sources recommend. As far as I have learned, these are implemented purely in software and should be safe. Don't sacrifice more performance than necessary.
Solution 4
I had the issue (triggering the same kernel error as you, plus userspace SSH errors like "Corrupted MAC on input").
Solution
What worked for me was to disable TCP checksum offloading:
# ethtool -K eth0 tx off rx off
Clean & long-term integration of this with debian-ish /etc/network/interfaces:
#!/bin/bash
#
# Disables TCP offloading on all ifaces
#
# Inspired by: @Michelunik https://serverfault.com/a/422554/62953

RUN=true
case "${IF_NO_TOE,,}" in
    no|off|false|disable|disabled)
        RUN=false
        ;;
esac

# Other offloading options that could be disabled (not TCP related):
# sg tso ufo gso gro lro rxvlan txvlan rxhash
# see man ethtool
if [ "$MODE" = start ] && [ "$RUN" = true ]; then
    TOE_OPTIONS="rx tx"
    for TOE_OPTION in $TOE_OPTIONS; do
        /sbin/ethtool --offload "$IFACE" "$TOE_OPTION" off &>/dev/null || true
    done
fi
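The script above relies on variables that ifupdown sets when it runs hook scripts: IFACE, MODE, and, as far as I know, one IF_<OPTION> variable per option in the interface stanza (upper-cased, with hyphens turned into underscores). So a stanza line such as `no-toe off` would arrive as IF_NO_TOE=off and skip the hook for that interface. The sketch below isolates just that opt-out decision so it can be tried without touching a NIC; the function name `toe_hook_should_run` is hypothetical, and it uses `tr` instead of bash's `${VAR,,}` so it also runs under plain sh.

```shell
#!/bin/sh
# toe_hook_should_run (hypothetical helper): same opt-out test as the hook's
# case statement. Returns 0 if offloading should be disabled for the
# interface, 1 if the stanza opted out.
# $1: the value of IF_NO_TOE (may be empty)
toe_hook_should_run() {
    value=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$value" in
        no|off|false|disable|disabled) return 1 ;;
    esac
    return 0
}

# Manual dry run of the real hook, assuming it was saved as an executable
# file under /etc/network/if-up.d/ (the file name no-toe is an assumption):
#   IFACE=eth0 MODE=start /etc/network/if-up.d/no-toe
```

Note the slightly inverted naming: leaving the option unset means the hook runs and disables rx/tx checksum offloading, while setting it to one of the "off" spellings leaves the interface untouched.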
Context
- Debian Jessie
- Kernel 4.7.0-0.bpo.1-amd64
- lspci
00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I218-V (rev 04)
Kyle Coots
Updated on September 18, 2022
Comments
-
Kyle Coots over 1 year
I have a Dell 1U Server with Intel(R) Xeon(R) CPU L5420 @ 2.50GHz, 8 cores running Ubuntu Server Kernel Version 3.13.0-32-generic on x86_64. It has dual 1000baseT networking cards. I have it set up to forward packets from eth0 to eth1.
I have noticed in my kern.log file that the adapter keeps hanging and then resetting. This happens often: every few seconds, then it may be OK for a few minutes, then it goes back to every few seconds.
Here is the log file dump:
[118943.768245] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
[118943.768245]   TDH <45>
[118943.768245]   TDT <50>
[118943.768245]   next_to_use <50>
[118943.768245]   next_to_clean <43>
[118943.768245] buffer_info[next_to_clean]:
[118943.768245]   time_stamp <101c48d04>
[118943.768245]   next_to_watch <45>
[118943.768245]   jiffies <101c4970f>
[118943.768245]   next_to_watch.status <0>
[118943.768245] MAC Status <80283>
[118943.768245] PHY Status <792d>
[118943.768245] PHY 1000BASE-T Status <7800>
[118943.768245] PHY Extended Status <3000>
[118943.768245] PCI Status <10>
[118944.780015] e1000e 0000:00:19.0 eth0: Reset adapter unexpectedly
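To put a number on "every few seconds", and to compare before and after trying a fix, the two driver messages can simply be counted in the log. This is a rough sketch, not something from the original post; the function name `e1000e_hang_summary` is made up, and the default path is the kern.log file mentioned above.

```shell
#!/bin/sh
# e1000e_hang_summary (hypothetical helper): counts e1000e hang and reset
# messages in a kernel log file and prints them on one line.
# $1: log file path, defaulting to /var/log/kern.log
e1000e_hang_summary() {
    log="${1:-/var/log/kern.log}"
    # grep -c prints 0 when nothing matches, so the output is always numeric
    hangs=$(grep -c 'e1000e.*Detected Hardware Unit Hang' "$log")
    resets=$(grep -c 'e1000e.*Reset adapter unexpectedly' "$log")
    echo "hangs=$hangs resets=$resets"
}

# Usage on a real machine:
#   e1000e_hang_summary /var/log/kern.log
```

On systems without a kern.log file, the output of `dmesg` or `journalctl -k` can be saved to a file first and passed in as the argument.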
Here is the info from ethtool:
Settings:
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supported pause frame use: No
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: off (auto)
        Supports Wake-on: pumbg
        Wake-on: g
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes
Driver info:
ethtool -i eth0
driver: e1000e
version: 2.3.2-k
firmware-version: 1.4-0
bus-info: 0000:00:19.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
What could be causing this? Is this just a bug in the software or an actual hardware issue? I have seen many others having similar issues but no real solution, which also leads me to believe that it's a software issue.
Maybe someone can shed some light on this for me?
-
Admin almost 8 years
This seems to be a known problem: bugzilla.kernel.org/show_bug.cgi?id=47331
-
Peter about 9 years
Thanks! I tried the ethtool fix, and it solved my issue. (I also stuck it in an init script.)
-
Tails almost 8 years
This was exactly the fix that worked for me. Running Ubuntu 16.04 LTS on an ASRock H170M-ITX/DL motherboard. Thanks SteveG. =)
-
godzillante over 7 years
Hi, do you know if running
ethtool -K eth0 gso off gro off tso off
will drop the connection, even for a short time?
-
Oleg Gryb over 6 years
Indeed, disabling the options with ethtool helped; disabling the power management options didn't.
-
Mike McCabe almost 6 years
The link in 'According to a post found here: ehc.ac/p/e1000/bugs/378' above now goes to a domain squatter. The original content can be found here: web.archive.org/web/20160205153351/http://ehc.ac:80/p/e1000/…
-
Flatron over 5 years
Mind that this may increase the server's power consumption a lot!
-
Anuj Shah over 3 years
Worked for me on CentOS 7, Kernel 3.10.0-1160.11.1.el7.x86_64, Device: 00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-LM (rev 31)
-
Luc H about 3 years
@godzillante For future reference: it can drop the connection for a couple of seconds. However, clients will not be disconnected unless they time out, which depends on your application.
-
laimison over 2 years
No downtime noticed here either.
-
user249654 about 2 years
Intel NUC BOXNUC8i7BEH2
sudo ethtool -K eno1 tso off gso off