eth0 NIC Link is Down repeating message in kernel log

Solution 1

  1. Check for errors on the wire: look at the "errors" fields in the output of ifconfig. Non-zero values point to a hardware problem (cable, NIC, or hub/switch). An unreliable Ethernet cable will show up as errors in these fields too.
  2. Replace the Ethernet cable, regardless of the result of step 1. This is quick, cheap, and easy, and is worth doing whenever your link goes up and down at random intervals.
  3. Use ethtool to make sure the network settings (duplex, etc.) match those on the switch. If you are not the admin of the switch, ask the network admin to provide the settings.
  4. If the switch has flow control enabled, make sure it is enabled on your Linux box as well. If the switch has it disabled, disable it on the Linux box too.

As a side note, you should assess whether you need flow control at all. According to HP, it is only necessary for high-performance applications: see the HP article "When to Use Flow Control".
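
Steps 1, 3 and 4 above can be sketched roughly as follows. This is a minimal sketch, not a definitive procedure: it reads the kernel's per-interface counters from sysfs instead of parsing ifconfig output, the function name iface_errors and the SYSFS_NET override are hypothetical (the override exists only so the function can be pointed at a test directory), and eth0 is assumed to be the affected interface.

```shell
#!/bin/sh
# Print the error counters for one interface (step 1).
# SYSFS_NET is a hypothetical override of the sysfs root, used for testing.
iface_errors() {
    dir="${SYSFS_NET:-/sys/class/net}/$1/statistics"
    [ -d "$dir" ] || { echo "no such interface: $1" >&2; return 1; }
    for counter in rx_errors tx_errors rx_crc_errors; do
        # Skip counters that this kernel or driver does not expose.
        [ -r "$dir/$counter" ] && printf '%s=%s\n' "$counter" "$(cat "$dir/$counter")"
    done
    return 0
}

# Steps 3 and 4 use ethtool directly (changing settings needs root):
#   ethtool eth0                  # negotiated speed, duplex, auto-negotiation
#   ethtool -a eth0               # current pause (flow control) parameters
#   ethtool -A eth0 rx on tx on   # enable flow control in both directions
#   ethtool -A eth0 rx off tx off # disable it again
```

Calling iface_errors eth0 a few minutes apart and comparing the values shows whether the counters are still climbing, which is more telling than a single non-zero reading.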

Solution 2

Here's my fix. This problem happens on specific hardware (on one machine, only 1 of the 2 ports on the NIC), always with the e1000e driver, since kernel 3.9 or so. The script below is for CentOS 7; it goes in /etc/init.d/ and has to be enabled with chkconfig --add <name>. The interface name is hardcoded, so be sure to set it.

#!/bin/sh

### BEGIN INIT INFO
# Provides:          pm-e1000e-fix
# Required-Start:    $network
# Required-Stop:     $network
# Default-Start:     2 3 4 5
# Default-Stop:      0 6
# Short-Description: workaround for e1000e issue
# Description:       e1000e fix
### END INIT INFO

################################################################################
# Give Usage Information                                                       #
################################################################################
usage() {
    echo "Usage: $0 start|restart" >&2
    exit 1
}

################################################################################
# E X E C U T I O N    B E G I N S   H E R E                                   #
################################################################################
command="$1"

# Set this to the affected interface.
interface="eth0"

case "$command" in
    start|restart)
        # Disable generic segmentation offload, generic receive offload
        # and TCP segmentation offload on the interface.
        ethtool -K "$interface" gso off gro off tso off
        ;;
    *)
        usage
        ;;
esac
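
To confirm that the workaround above actually took effect, the offload state can be read back with ethtool -k. The following is a minimal sketch under stated assumptions: the function name offloads_disabled is hypothetical, and it assumes the usual "feature-name: on|off" line format that ethtool -k prints.

```shell
#!/bin/sh
# Read `ethtool -k IFACE` output on stdin and succeed only if
# TSO, GSO and GRO are all reported as "off".
offloads_disabled() {
    awk '
        /^[ \t]*(tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload):/ {
            seen++
            if ($2 != "off") bad = 1
        }
        # Fail if any feature is still on, or if a feature line is missing.
        END { exit ((bad || seen < 3) ? 1 : 0) }
    '
}

# Example (assumes eth0 is the interface set in the init script):
#   ethtool -k eth0 | offloads_disabled && echo "offloads are off"
```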

Author by

Miloš Đakonović

Updated on September 18, 2022

Comments

  • Miloš Đakonović
    Miloš Đakonović over 1 year

    Over the past few days I've noticed the same kind of message repeating in the kernel log, and I can say for certain that nothing was intentionally changed (installed or uninstalled) in that period.

    Here's a sample of the messages in /var/log/kern.log:

    Mar 30 06:32:45 aurora kernel: [566322.867110] e1000e: eth0 NIC Link is Down
    Mar 30 06:32:47 aurora kernel: [566325.313634] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
    Mar 30 06:32:59 aurora kernel: [566337.632930] e1000e: eth0 NIC Link is Down
    Mar 30 06:33:18 aurora kernel: [566356.543664] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
    Mar 30 11:05:47 aurora kernel: [582689.779752] e1000e: eth0 NIC Link is Down
    Mar 30 11:05:50 aurora kernel: [582692.174337] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx

    From the complete log file, counting all messages of this kind, I can conclude:

    • eth0 goes down every few hours
    • in the first case above eth0 is down for 2 seconds, in the second for 19 seconds

    This is a production server.

    How can I solve this problem? It is a production mail server, and I cannot tolerate network failures lasting 19 seconds.
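
The outage durations quoted above can be recomputed from the log rather than read off by eye. The following is a minimal sketch, assuming the kern.log line format shown in the sample; the function name link_downtimes is hypothetical. It pairs each "Link is Down" line with the next "Link is Up" line and subtracts the kernel uptime stamps in square brackets.

```shell
#!/bin/sh
# Usage: link_downtimes IFACE [FILE...]   (reads stdin if no FILE is given)
link_downtimes() {
    iface="$1"
    shift
    awk -v iface="$iface" '
        # Extract the kernel uptime stamp, e.g. "[566322.867110]" -> 566322.867110
        function stamp(line,    s) {
            if (!match(line, /\[ *[0-9]+\.[0-9]+\]/)) return -1
            s = substr(line, RSTART + 1, RLENGTH - 2)
            return s + 0
        }
        index($0, iface " NIC Link is Down") {
            down = stamp($0)
            when = $1 " " $2 " " $3    # syslog date and time of the drop
            have_down = 1
            next
        }
        index($0, iface " NIC Link is Up") && have_down {
            printf "%s down for %.1f s\n", when, stamp($0) - down
            have_down = 0
        }
    ' "$@"
}

# Example: link_downtimes eth0 /var/log/kern.log
```

For the six sample lines above this prints outages of 2.4 s, 18.9 s and 2.4 s, which matches the two-second and 19-second figures.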

    • Håkan Lindqvist
      Håkan Lindqvist about 10 years
      What have you checked so far? Is the cable properly attached and in unharmed condition? Does the switch on the other end also observe the link going down? Worth noting is that the detected link is different at different times (flow control differs in your log). Maybe the autonegotiation fails? Does the problem go away if you force 1000Mbps FD Rx/Tx?
    • Miloš Đakonović
      Miloš Đakonović about 10 years
      @HåkanLindqvist I don't have the option to check the cable, since the server is not physically near me. Is that something I should ask the server farm tech staff to check? How do I force 1000Mbps FD Rx/Tx? And is the flow control being different at different times an issue?
    • Håkan Lindqvist
      Håkan Lindqvist about 10 years
      The link "type" changing over time suggests to me that something is not quite right but finding the actual cause is of course a separate question entirely. Asking the tech staff may be a good idea.
    • Paul Haldane
      Paul Haldane about 10 years
      You can use ethtool or mii-tool to check auto-negotiation status etc. at the server end. You need to make sure that the switch and your server are set up to match. This sounds like a hardware problem: it could be the server adapter, the cable, or the switch. I suggest looking at the status of the switch to see what it thinks is happening.
  • Miloš Đakonović
    Miloš Đakonović about 9 years
    It was wire errors. The server farm tech staff fixed it after I reported the errors.
  • Michael Martinez
    Michael Martinez about 9 years
    'ifconfig' was showing errors?