"RTNETLINK answers: Input/Output error" given when trying to bring interface up


I found this series of bug reports related to the same kind of problem: bug.launchpad, bugzilla.kernel.org 1, bugzilla.kernel.org 2, bugzilla.kernel.org 3.

Quoting Emmanuel Grumbach (egrumbach) in the first link above:

This is an electrical problem. I can't do anything about it

so it seems to be a problem arising from bugs/faults in the physical card.

Author by

glS

Updated on September 18, 2022

Comments

  • glS
    glS over 1 year

    This issue seems to have affected many people in recent years, and I found it discussed in several forums and questions. However, most of those discussions either died without a clear solution or were not clearly stated, hence my raising it again here.

     The problem

     Connection dies silently

    I'm trying to connect to a public wifi network (a university network to be precise) using my laptop (a Dell Precision M3800). The connection is initially successful, but after some time (I couldn't figure out how much: sometimes it's just a few minutes, sometimes hours) it just stops working.

    By stops working here I mean that while apparently the connection is still up, when I try to actually go to some website, or ping some address, nothing initially happens. Notably, at this point everything still shows the connection as up. Both the Network Manager icon, and the outputs of nmcli dev, nmcli g and nmcli dev wifi say that we are successfully connected.

    After some time, while nmcli dev and nmcli g still say everything is fine, nmcli dev wifi now only detects the network we are supposedly connected to (even though I know there are other APs available).

     Trying to reset the connection

    If I do nothing, the situation stays as above. If I now try to reset the connection, we get to the error in the title. To reset it, I use sudo service network-manager restart. Here is the state reported by various tools at this point:

    1. ip link still reports the interface as up, with a line of the form ... wlp6s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> ... (note the UP portion).
    2. On the other hand, iw reports the interface as down: the output of iw dev wlp6s0 link is Not connected.
    3. nmcli reports the interface as down. In particular, nmcli dev reports the state as disconnected, and same for nmcli g. Interestingly, nmcli g still reports the WIFI as enabled. This is confirmed by the output of nmcli radio, which reports everything as enabled.
    4. The Network Manager icon just says disconnected with no visible option to reconnect.
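
The tools in the list above are not necessarily contradicting each other: in ip link output, UP is only the administrative state of the interface, while NO-CARRIER means there is no working link underneath it. A minimal sketch for pulling those flags out of an ip link line follows. The helper names link_flags and has_flag are made up for illustration and are not part of iproute2.

```shell
#!/bin/sh
# Hypothetical helpers (not part of iproute2): parse the <...> flag list
# from one line of `ip link` output.

# Print the comma-separated flags found between '<' and '>'.
link_flags() {
    printf '%s\n' "$1" | sed -n 's/^[^<]*<\([^>]*\)>.*$/\1/p'
}

# Succeed if the flag named in $2 is present in the line given as $1.
has_flag() {
    case ",$(link_flags "$1")," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example line, in the form quoted in the list above.
line='... wlp6s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> ...'

if has_flag "$line" UP && has_flag "$line" NO-CARRIER; then
    echo "admin state is up, but there is no carrier"
fi
```

On a live system the line would come from something like ip -o link show wlp6s0 instead of the hard-coded example.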

    Try to bring the interface up again, get RTNETLINK error

    At this point I'm kind of short of ideas, so I just try to reset the connection via ip, which still doesn't register the interface as down. I use sudo ip link set wlp6s0 down and then sudo ip link set wlp6s0 up. The first command succeeds, as confirmed by the output of ip link. The second command, however, fails with

    RTNETLINK answers: Input/Output error

    Additional interesting information is given by dmesg. When the error occurs, a whole lot of errors are given by iwlwifi. I uploaded the whole dmesg dump on this gist. The error is most likely to be traced to what is going on at around L1049. After the connection is dead, until I force a reset of the network manager, we get the loop of iwlwifi errors that starts at L1100, and only ends because that is the point at which I run dmesg.

    When I then try to run ip link set wlp6s0 up, and get our beloved error, the following lines are printed in the dmesg:

    [  +9.727062] iwlwifi 0000:06:00.0: Failed to wake NIC for hcmd
    [  +0.000047] iwlwifi 0000:06:00.0: Error sending MAC_CONTEXT_CMD: enqueue_hcmd failed: -5
    [  +0.000006] iwlwifi 0000:06:00.0: Failed to remove MAC context: -5
    [ +13.220958] iwlwifi 0000:06:00.0: Could not load the [0] uCode section
    [  +0.000007] iwlwifi 0000:06:00.0: Failed to start INIT ucode: -5
    [  +0.000002] iwlwifi 0000:06:00.0: Failed to run INIT ucode: -5
    [  +0.000001] iwlwifi 0000:06:00.0: Failed to start RT ucode: -5
    [Feb27 12:59] iwlwifi 0000:06:00.0: Could not load the [0] uCode section
    [  +0.000007] iwlwifi 0000:06:00.0: Failed to start INIT ucode: -5
    [  +0.000002] iwlwifi 0000:06:00.0: Failed to run INIT ucode: -5
    [  +0.000002] iwlwifi 0000:06:00.0: Failed to start RT ucode: -5
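
The -5 at the end of these lines is a negated errno value: 5 is EIO, whose message is "Input/output error". So the RTNETLINK message is most likely just the driver's failure being passed back up to ip. A small sketch for counting such lines in a log follows. The name count_eio is invented here, and the function expects dmesg output on standard input.

```shell
#!/bin/sh
# Hypothetical helper: count iwlwifi log lines that end in the error code -5
# (that is, -EIO). Reads the log from standard input.
count_eio() {
    # grep -c exits non-zero when the count is 0; keep the function's status clean.
    grep -c 'iwlwifi .*: -5$' || true
}

# Three lines copied from the dmesg output above; two of them end in -5.
sample='[  +0.000047] iwlwifi 0000:06:00.0: Error sending MAC_CONTEXT_CMD: enqueue_hcmd failed: -5
[  +0.000006] iwlwifi 0000:06:00.0: Failed to remove MAC context: -5
[ +13.220958] iwlwifi 0000:06:00.0: Could not load the [0] uCode section'

printf '%s\n' "$sample" | count_eio
```

On the affected machine this would be run as dmesg | count_eio (possibly with sudo, depending on how dmesg access is restricted).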
    

    Further googling brought me to these two posts on the archlinux forums (Wireless AP stops broadcasting on disconnect and RTNETLINK answers: input/output error, rt3290), which further suggest a problem with the network adapter drivers. They mention in particular reloading the intel wireless drivers with

    sudo modprobe -r iwlwifi
    sudo modprobe iwlwifi
    

    The first command succeeds, as confirmed by ip link no longer showing the wlp6s0 interface. Unfortunately, once the driver has been unloaded, the interface doesn't come back: the module does show up again in lsmod | grep iwl, but the interface still doesn't reappear in ip link. The second command fails with no terminal output, and the following dmesg log:

    [Feb27 16:45] Intel(R) Wireless WiFi driver for Linux
    [  +0.000002] Copyright(c) 2003- 2015 Intel Corporation
    [  +0.000830] iwlwifi 0000:06:00.0: loaded firmware version 17.608620.0 op_mode iwlmvm
    [  +0.016423] iwlwifi 0000:06:00.0: Detected Intel(R) Dual Band Wireless AC 7260, REV=0xFFFFFFFF
    [  +0.024695] Timeout waiting for hardware access (CSR_GP_CNTRL 0xffffffff)
    [  +0.000012] ------------[ cut here ]------------
    [  +0.000009] WARNING: CPU: 6 PID: 4523 at /build/linux-hwe-4GXcua/linux-hwe-4.13.0/drivers/net/wireless/intel/iwlwifi/pcie/trans.c:1873 iwl_trans_pcie_grab_nic_access+0xe7/0xf0 [iwlwifi]
    [  +0.000001] Modules linked in: iwlmvm(+) mac80211 iwlwifi cfg80211 ccm rfcomm bnep snd_hda_codec_hdmi arc4 uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev media joydev pn544_mei mei_phy pn544 hci nfc intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel hid_multitouch kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc snd_hda_codec_realtek snd_hda_codec_generic btusb btrtl dell_laptop snd_hda_intel btbcm aesni_intel snd_hda_codec dell_smm_hwmon aes_x86_64 btintel bluetooth snd_hda_core snd_hwdep crypto_simd ecdh_generic glue_helper cryptd intel_cstate snd_pcm intel_rapl_perf dell_wmi dell_smbios input_leds dcdbas snd_seq_midi serio_raw wmi_bmof sparse_keymap snd_seq_midi_event snd_rawmidi intel_pch_thermal snd_seq rtsx_pci_ms memstick
    [  +0.000033]  snd_seq_device snd_timer snd acpi_als kfifo_buf mei_me lpc_ich mei shpchp soundcore ie31200_edac industrialio dptf_power int3403_thermal int3406_thermal int3402_thermal dell_smo8800 dell_rbtn processor_thermal_device int3400_thermal int340x_thermal_zone mac_hid acpi_thermal_rel intel_soc_dts_iosf parport_pc ppdev lp parport autofs4 hid_logitech_hidpp hid_logitech_dj usbhid hid i915 nouveau rtsx_pci_sdmmc mxm_wmi ttm i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt psmouse fb_sys_fops ahci libahci drm rtsx_pci wmi video [last unloaded: cfg80211]
    [  +0.000026] CPU: 6 PID: 4523 Comm: modprobe Tainted: G        W       4.13.0-36-generic #40~16.04.1-Ubuntu
    [  +0.000000] Hardware name: Dell Inc. Dell Precision M3800/Dell Precision M3800, BIOS A10 08/17/2015
    [  +0.000001] task: ffff9037d6261740 task.stack: ffffb44d43ef0000
    [  +0.000006] RIP: 0010:iwl_trans_pcie_grab_nic_access+0xe7/0xf0 [iwlwifi]
    [  +0.000001] RSP: 0018:ffffb44d43ef3ae0 EFLAGS: 00010082
    [  +0.000001] RAX: 000000000000003d RBX: ffff9037d77e0018 RCX: 0000000000000000
    [  +0.000001] RDX: 0000000000000000 RSI: ffff9037efb96578 RDI: ffff9037efb96578
    [  +0.000001] RBP: ffffb44d43ef3b00 R08: 0000000000000001 R09: 00000000000005a4
    [  +0.000000] R10: 0000000000000000 R11: 00000000000005a4 R12: 0000000000000000
    [  +0.000001] R13: ffff9037d77e8f20 R14: ffffb44d43ef3b10 R15: ffff9037d77e0230
    [  +0.000001] FS:  00007f82e7a0a700(0000) GS:ffff9037efb80000(0000) knlGS:0000000000000000
    [  +0.000001] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  +0.000001] CR2: 00007ffc8c365b08 CR3: 000000038bc2e004 CR4: 00000000001606e0
    [  +0.000001] Call Trace:
    [  +0.000006]  iwl_read_prph+0x38/0x90 [iwlwifi]
    [  +0.000004]  iwl_pcie_apm_init+0x1c0/0x230 [iwlwifi]
    [  +0.000005]  iwl_trans_pcie_start_hw+0x76/0x1f0 [iwlwifi]
    [  +0.000009]  iwl_op_mode_mvm_start+0x6e4/0xb10 [iwlmvm]
    [  +0.000005]  _iwl_op_mode_start.isra.10+0x4c/0xa0 [iwlwifi]
    [  +0.000004]  iwl_opmode_register+0x6c/0xd0 [iwlwifi]
    [  +0.000002]  ? 0xffffffffc0742000
    [  +0.000007]  iwl_mvm_init+0x35/0x1000 [iwlmvm]
    [  +0.000003]  do_one_initcall+0x55/0x1b0
    [  +0.000003]  ? __vunmap+0x81/0xb0
    [  +0.000002]  ? kmem_cache_alloc_trace+0x154/0x1b0
    [  +0.000001]  ? kfree+0x165/0x170
    [  +0.000003]  do_init_module+0x5f/0x209
    [  +0.000002]  load_module+0x196a/0x1d70
    [  +0.000002]  ? ima_post_read_file+0x7d/0xa0
    [  +0.000003]  SYSC_finit_module+0xe5/0x120
    [  +0.000001]  ? SYSC_finit_module+0xe5/0x120
    [  +0.000002]  SyS_finit_module+0xe/0x10
    [  +0.000003]  entry_SYSCALL_64_fastpath+0x24/0xab
    [  +0.000001] RIP: 0033:0x7f82e75384d9
    [  +0.000001] RSP: 002b:00007ffc8c368c08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
    [  +0.000001] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f82e75384d9
    [  +0.000001] RDX: 0000000000000000 RSI: 000055cbf8d6626b RDI: 0000000000000001
    [  +0.000000] RBP: 00007ffc8c367c10 R08: 0000000000000000 R09: 0000000000000000
    [  +0.000001] R10: 0000000000000001 R11: 0000000000000246 R12: 000055cbf914aad0
    [  +0.000001] R13: 00007ffc8c367bf0 R14: 0000000000000005 R15: 0000000000040000
    [  +0.000001] Code: 00 00 e8 9d b9 25 dd eb ab 48 89 df be 24 00 00 00 c6 05 69 f1 01 00 01 e8 67 eb fe ff 48 c7 c7 c8 ce 8c c0 89 c6 e8 5a 47 a3 dc <0f> ff eb c1 0f 1f 44 00 00 0f 1f 44 00 00 55 49 c7 c0 08 cf 8c 
    [  +0.000024] ---[ end trace a5e22ad3df2362ea ]---
    [  +1.001198] iwlwifi 0000:06:00.0: Could not load the [0] uCode section
    [  +0.000007] iwlwifi 0000:06:00.0: Failed to start INIT ucode: -5
    [  +1.965039] iwlwifi 0000:06:00.0: Failed to run INIT ucode: -5
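
Two values in this log stand out: REV=0xFFFFFFFF and CSR_GP_CNTRL 0xffffffff. Register reads that come back as all ones usually mean the PCI device is not answering at all, which would point at the device or the bus rather than at the firmware image. A sketch that picks such lines out of a log follows. The name all_ones_lines is invented here.

```shell
#!/bin/sh
# Hypothetical helper: print log lines that contain a 32-bit all-ones hex
# value. The trailing group rejects longer values such as kernel addresses
# (e.g. 0xffffffffc0742000 in the call trace).
all_ones_lines() {
    grep -Ei '0xffffffff([^0-9a-f]|$)' || true
}

# Lines copied from the dmesg output above.
sample='[  +0.000830] iwlwifi 0000:06:00.0: loaded firmware version 17.608620.0 op_mode iwlmvm
[  +0.016423] iwlwifi 0000:06:00.0: Detected Intel(R) Dual Band Wireless AC 7260, REV=0xFFFFFFFF
[  +0.024695] Timeout waiting for hardware access (CSR_GP_CNTRL 0xffffffff)
[  +0.000002]  ? 0xffffffffc0742000'

printf '%s\n' "$sample" | all_ones_lines
```

Only the REV= and CSR_GP_CNTRL lines are printed for this sample; the firmware line and the kernel address are skipped.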
    

    Conclusion/Question

    I'm guessing a probable cause of all this is a driver bug. Still I don't understand why it only happens with some networks and not with others.

    Also, given that rebooting the laptop does solve the issue, at least temporarily, shouldn't there be a way to reproduce what happens during a reboot, driver-wise, without actually rebooting?
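
One thing that comes closer to a reboot than reloading the module is removing the device from the PCI bus and asking the kernel to rescan, through the sysfs remove and rescan files. This is only a sketch under assumptions: the address 0000:06:00.0 is taken from the dmesg lines above, pci_reenumerate, SYSFS and RESCAN_DELAY are names invented here, and nothing guarantees that this revives a card that has stopped responding electrically.

```shell
#!/bin/sh
# Hypothetical helper: drop a PCI device and ask the kernel to rediscover it.
# SYSFS defaults to /sys; it is a variable only so that the function can be
# tried against a scratch directory first.
SYSFS="${SYSFS:-/sys}"

pci_reenumerate() {
    bdf="$1"    # full PCI address, e.g. 0000:06:00.0
    dev="$SYSFS/bus/pci/devices/$bdf"

    if [ ! -e "$dev/remove" ]; then
        echo "no such PCI device: $bdf" >&2
        return 1
    fi

    echo 1 > "$dev/remove"              # detach the device and its driver
    sleep "${RESCAN_DELAY:-1}"          # give the removal a moment to finish
    echo 1 > "$SYSFS/bus/pci/rescan"    # re-enumerate the PCI buses
}

# Usage, as root:
#   pci_reenumerate 0000:06:00.0
```

If the interface reappears in ip link afterwards, the driver was re-probed; if the device does not come back after the rescan, that is further evidence for a hardware-level fault.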

    Further hardware details

    Network adapter model details, as given by lspci -k | grep -A3 Network:

    06:00.0 Network controller: Intel Corporation Wireless 7260 (rev 6b)
        Subsystem: Intel Corporation Dual Band Wireless-AC 7260
        Kernel driver in use: iwlwifi
        Kernel modules: iwlwifi
    

    The device is a Dell Precision M3800 laptop, running Ubuntu 16.04 64-bit.

     Additional relevant information

    1. No other device that I know of has the same problem with this connection. Also, the same device mostly doesn't show this problem on any other wifi network: it does happen sometimes, but it is rare and usually easily solved by resetting the connection. This seems to imply that the problem lies in the combination of this particular device and this particular network, which is what makes it so nasty.

    2. I used to have Windows 10 on this same device, and it gave the same problem. It was arguably even worse: if I tried to reboot while the wifi was not working, the laptop would often hang during the reboot and end up in a BSOD. After I installed different network adapter drivers this became somewhat less common, but to bring the interface up again I still had to go through a weird procedure of disabling and re-enabling the network adapter via the control panel multiple times.

     Other posts where this issue was brought up

    • Unable to set up wifi via systemd-networkd (archlinux forums). Not completely relevant to my case, as the OP there is using systemd-networkd. The message is given when trying to bring the interface up via ip link set wlp2s0f0 up. OP claims to have solved the problem by correcting a faulty wpa_supplicant configuration file; I don't know whether that applies to the present case.

    • [Solved] Problem getting wireless to connect (archlinux forums). The error message is again given trying to bring the interface up with ip link set wlp2s0 up. The cause seemed to be a conflict between NetworkManager and dhcpcd that were both running. OP claims to have solved the problem disabling dhcpcd via systemctl. Probably doesn't apply here.

    • Wifi not working RTNETLINK answers: Input/output error (archlinux forums). Possibly same problem as above, but OP abandoned the post.

    • [Solved] RTNETLINK answers: Input/output error (archlinux forums). Error given by ip link set wlo1 up. No solutions reached.

    • wifi doesn't work with various error messages [closed] (askubuntu). Error given by ip link set wlo1 up. Post closed as unclear.

    • Wireless disconnects without a reason and stops working (ubuntuforums). The suggested solution was to use sudo sed -i 's/wifi.powersave = 3/wifi.powersave = 2/' /etc/NetworkManager/conf.d/default-wifi-powersave-on.conf to stop Network Manager from enabling wireless power management. OP never came back to the post to say whether it worked for him, but this didn't work for me.

    • Several other posts can be found, especially on the arch linux forums and on askubuntu.
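
For reference, the power-save change from the ubuntuforums item in the list above edits a single key in a NetworkManager drop-in file. As far as I know, wifi.powersave takes 0 (use the default), 1 (leave the setting untouched), 2 (disable) or 3 (enable). Here is a sketch that applies the same substitution to a file given as an argument. The name disable_wifi_powersave is invented here, and sed -i assumes GNU sed, as shipped with Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper wrapping the sed command quoted in the list above.
# $1 is the config file to edit; it defaults to the path from that command.
disable_wifi_powersave() {
    conf="${1:-/etc/NetworkManager/conf.d/default-wifi-powersave-on.conf}"
    # 3 = enable power saving, 2 = disable it
    sed -i 's/wifi.powersave = 3/wifi.powersave = 2/' "$conf"
}

# Usage, as root, followed by a restart so the change is picked up:
#   disable_wifi_powersave
#   service network-manager restart
```

Trying it on a copy of the file first makes it easy to confirm that the key is actually present and set to 3 before touching the real configuration.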

    • glS
      glS about 6 years
      @dsstorefile actually yes! In fact, I'm still writing the post. I have some relevant dmesg output and other technical info to add, but I needed to do that from the affected device (which is not easy due to the aforementioned connectivity problem :P). The easiest way was to post the above draft from another computer and finish it by copy-pasting the relevant info from here. I'm also going to add those details.
    • glS
      glS about 6 years
      @dsstorefile done. How did you guess the network controller model? Did you encounter this before?
    • Michael Martinez
      Michael Martinez over 4 years
      Thanks for the write up, it gave me some things to look at on troubleshooting some of our hosts. We have a bunch of hosts with two bonded interfaces, some of them the link is down in one so what I ended up doing is just running an ip link set dev down; ip link set dev up to see if it would get the carrier back. This worked in a couple cases. For the ones that were still down just turned it over to SiteOps to swap card/cable or whatever it is they do to fix the NIC.
  • chili555
    chili555 about 6 years
    "so this seems to be the end of the road." Have you tried reseating the card in its slot? Have you considered replacing the card? They are inexpensive at Amazon and Ebay.
  • glS
    glS about 6 years
    @chili555 yea by end of the road I mean from a software perspective. I will try to adjust the card and/or replace if needed.
  • Giorgio Vitanza
    Giorgio Vitanza over 3 years
    It is not a problem related to the hardware in my case as: 1. Every time I reboot the wifi is back (just for few minutes), and 2. It is not having the issue with windows, so it's certainly related to the driver. And I also opened up the laptop and the hardware was looking untouched, so I guess this is not the real issue here...
  • glS
    glS over 3 years
    @GiorgioVitanza I'm pretty sure it was behaving the same for me. I agree it cannot be only the hardware, but it seemed to be some interaction between hardware and linux drivers. At least that's what the driver maintainers seemed to think, see comments in the bug reports. Anyway, I bought an external usb dongle and that "fixed" the problem for me
  • Giorgio Vitanza
    Giorgio Vitanza over 3 years
    I just switched back to Ubuntu 18.04 as it's more stable i think in many features. Also I noticed that I don't have this wifi problem if I turn secure boot off from BIOS.. I don't know if the driver i was using is proprietary or not, but secure boot has some impact on the drivers, so maybe try doing this and let me know, if you want to give it a try, but also you should keep in mind that you won't be able to use any other proprietary driver when turning secure boot off. To do it just go to the bios, go to the boot menu, and disable secure boot
  • Giorgio Vitanza
    Giorgio Vitanza over 3 years
    @glS I just wanted to add a last comment on the matter, because I think there is a piece of information that can be useful to everyone. I actually had the same problem with Windows, and it was related to that particular connection, because it does not happen with new connections. But here is the main thing to look at: when it happened on Windows, the wifi was somehow restarted automatically, and after a few seconds the connection was available again. The Ubuntu driver may be missing a condition that would trigger the same kind of recovery.