Networking doesn't initialize properly when pxebooting Linux Mint (live CD) using cifs, but works with nfs

This problem has been solved by Serva (disclosure: I'm involved in Serva development).

The complete kernel and append lines plus the additional initrd.gz required for PXE booting current Ubuntu/Mint live versions with CIFS can be found here

Basically, the problem is a Casper bug (AFAIK never reported or fixed before): in the case of a CIFS netmount, Casper forgets to export a kernel parameter. That omission later affects the networking configuration scripts, which end up recreating the file /etc/network/interfaces with delays and errors.

If we look at Serva's Ubuntu/Mint "append" line

append   = showmounts toram root=/dev/cifs initrd=NWA_PXE/$HEAD_DIR$/casper/initrd.lz,NWA_PXE/$HEAD_DIR$/casper/INITRD_N11.GZ boot=casper netboot=cifs nfsroot=//$IP_BSRV$/NWA_PXE_SHARE/$HEAD_DIR$ NFSOPTS=-ouser=serva,pass=avres,ro ip=dhcp ro

we find that the embedded "initrd" variable is made up of two consecutively loaded initrd files (initrd.lz and INITRD_N11.GZ):

initrd=NWA_PXE/$HEAD_DIR$/casper/initrd.lz,NWA_PXE/$HEAD_DIR$/casper/INITRD_N11.GZ 

The first one (initrd.lz) is the one that ships with Ubuntu/Mint, while the second one (INITRD_N11.GZ) is a tiny (8 KB) custom initrd, originally developed by Serva, that contains the patched components. This approach avoids the need to recreate the big original initrd.lz (20 MB). INITRD_N11.GZ can be freely downloaded from Serva's site (please do not post direct links here).

If we continue analyzing the "append" line, we see the need to add the CIFS mounting options (the OP forgot this step), which in this case are carried by the somewhat misleadingly named variable "NFSOPTS":

NFSOPTS=-ouser=serva,pass=avres,ro

In this example the SMB share has a user "serva" with password "avres" and is mounted read-only; of course, the user/pass parameters must be edited to match your own share.

The TFTP paths and the CIFS locator shown are the ones required by Serva's repository structure; when the PXE server is not Serva, those parameters must be edited accordingly.
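As a rough sketch only, this is how the same idea might look when adapted to the pxelinux entry from the question rather than to Serva. The kernel path, share name and server address are taken from the question. The location /linux-mint-17/INITRD_N11.GZ is a hypothetical path (the ISO mount itself is read-only, so the extra initrd has to live somewhere else under the TFTP root), and the NFSOPTS value assumes a guest-accessible share, following the default discussed in the comments.

```
# Sketch only: adapt every path to your own TFTP root and share.
# /linux-mint-17/INITRD_N11.GZ is an assumed location for the extra initrd;
# put the file wherever your TFTP server can serve it and adjust the path.
# NFSOPTS assumes a share without authentication; change user/password if needed.
# Everything after APPEND must stay on a single line.
LABEL linuxmint17smb
    MENU LABEL Linux Mint 17 (SMB)
    KERNEL linux-mint-17/image/casper/vmlinuz
    APPEND root=/dev/cifs boot=casper netboot=cifs nfsroot=//192.168.26.1/tftpshare/linux-mint-17/image NFSOPTS=-ouser=root,password=,ro ip=dhcp initrd=/linux-mint-17/image/casper/initrd.lz,/linux-mint-17/INITRD_N11.GZ
```

The two comma-separated files in initrd= are loaded one after the other, so the small patched initrd overlays the stock one without it having to be rebuilt.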

If you PXE boot Ubuntu/Mint live versions from a CIFS share this way, there will be no network-related delays, and Internet/networking will work right away after boot.

Edit:

The bug has already been reported on Ubuntu Launchpad and confirmed.

Author: dialer

Updated on September 18, 2022

Comments

  • dialer
    dialer almost 2 years

    I have a TFTP/DHCP/NFS/SMB server (Ubuntu server 12.04 LTS) on 192.168.26.1. I use pxelinux to display a menu containing startup and installation options for Windows, an Ubuntu network installer, and the Linux Mint 17 MATE live CD. Getting it running like this was already nasty and I'm running out of steam...

    For Linux Mint, I have provided 2 netboot options: NFS and CIFS. I got it fully working with NFS: the user can select it in the boot menu and, a short while later, lands on the Linux Mint live CD desktop. But with CIFS, networking doesn't initialize properly. When Linux Mint starts, the networking hangs for 120 seconds. Then it continues to boot to the desktop, but network-manager isn't started (and doesn't start). I suspected that it might be a problem with the DHCP server not responding; however, in the DHCP server log I can see the DHCP request and a successful response.

    Once in the Linux Mint desktop, ifconfig reports an IP address that is assigned by the DHCP, and pinging the server works.

    My pxelinux configuration is (everything after APPEND is in one line, I just split it up for readability on this site):

    NFS:

    LABEL linuxmint17
        MENU LABEL Linux Mint 17
        KERNEL linux-mint-17/image/casper/vmlinuz
        APPEND 
            root=/dev/nfs boot=casper netboot=nfs
            nfsroot=192.168.26.1:/var/lib/tftpboot/linux-mint-17/image
            initrd=/linux-mint-17/image/casper/initrd.lz
    

    CIFS:

    LABEL linuxmint17smb
        MENU LABEL Linux Mint 17 (SMB)
        KERNEL linux-mint-17/image/casper/vmlinuz
        APPEND
            root=/dev/cifs boot=casper netboot=cifs
            nfsroot=//192.168.26.1/tftpshare/linux-mint-17/image
            ip=dhcp
            initrd=/linux-mint-17/image/casper/initrd.lz
    

    Note that I had to insert the ip=dhcp option to the CIFS menu. If I don't do that, the boot process hangs for 120 seconds when initializing Networking, but then it doesn't continue. If I add that line, it still hangs, but after 120 seconds it continues to boot.

    The setup:

    The client and server virtual machines are only connected to each other (internal network). There are no other machines in the network at all.

    The server has all the pxe boot files under /var/lib/tftpboot/. The Linux Mint ISO (unmodified) is mounted under /var/lib/tftpboot/linux-mint-17/image. vmlinuz and initrd are in /var/lib/tftpboot/linux-mint-17/image/casper. /var/lib/tftpboot/ is an NFS export. There is a samba share called tftpshare that maps to /var/lib/tftpboot/ (read-only, allows access to everyone).

    smb.conf

    [tftpshare]
       comment = TFTP Root
       path = /var/lib/tftpboot
       browsable = yes
       guest ok = yes
       read only = no
       create mask = 0644
    

    dhcpd.conf

    authoritative;
    subnet 192.168.26.0 netmask 255.255.255.0 {
      range 192.168.26.10 192.168.26.40;
      next-server 192.168.26.1;
      filename "pxelinux.0";
    }
    

    This is a strange 2 minute gap in the syslog of the client machine after a successful boot to the live desktop environment:

    Jun 14 13:13:18 mint kernel: [   23.388873] intel_rapl: domain core energy ctr 0:0 not working, skip
    Jun 14 13:13:18 mint kernel: [   23.528409] intel_rapl: domain uncore energy ctr 0:0 not working, skip
    Jun 14 13:13:18 mint kernel: [   23.528453] intel_rapl: no valid rapl domains found in package 0
    Jun 14 13:13:20 mint ntpdate[1198]: Can't find host ntp.ubuntu.com: Name or service not known (-2)
    Jun 14 13:13:20 mint ntpdate[1198]: no servers can be used, exiting
    

    (2 Minute gap without any entries, roughly at the time when the 120 second boot delay occurs)

    Jun 14 13:15:19 mint dbus[864]: [system] Activating service name='org.freedesktop.ConsoleKit' (using servicehelper)
    Jun 14 13:15:19 mint dbus[864]: [system] Activating service name='org.freedesktop.PolicyKit1' (using servicehelper)
    Jun 14 13:15:19 mint acpid: starting up with netlink and the input layer
    Jun 14 13:15:19 mint acpid: 9 rules loaded
    Jun 14 13:15:19 mint acpid: waiting for events: event logging is off
    

    This is what happens in both cases when using CIFS:

    [Screenshot: Hangs]

    On the server:

    ...
    Jun 14 13:12:52 ubuntu-netboot in.tftpd[2722]: RRQ from 192.168.26.13 filename /linux-mint-17/image/casper/initrd.lz
    Jun 14 13:13:14 ubuntu-netboot dhcpd: DHCPDISCOVER from 08:00:27:1c:c5:43 via eth1
    Jun 14 13:13:14 ubuntu-netboot dhcpd: DHCPOFFER on 192.168.26.14 to 08:00:27:1c:c5:43 via eth1
    Jun 14 13:13:14 ubuntu-netboot dhcpd: DHCPREQUEST for 192.168.26.14 (192.168.26.1) from 08:00:27:1c:c5:43 via eth1
    Jun 14 13:13:14 ubuntu-netboot dhcpd: DHCPACK on 192.168.26.14 to 08:00:27:1c:c5:43 via eth1
    

    The IP that is assigned to the client in case of a successful boot to the desktop, according to ifconfig, is indeed ...14.

    This is what happens without the ip=dhcp:

    [Screenshots: nodhcp1, nodhcp2]

    This is what happens with the ip=dhcp, immediately before the Desktop shows:

    [Screenshot: success]

    I'm thankful for any ideas. If any other logs (which?) would help, I can provide them.

    • warren
      warren about 10 years
      this is how a question should be written :)
    • lacasitos
      lacasitos about 10 years
      Did you try to tcpdump on the server to see if you get anything from the client?
    • hookenz
      hookenz about 10 years
      Have a look at the casper boot scripts. I think the issue is there. Did you regenerate your initramfs? I assume you have BOOT=casper set?
    • Pat
      Pat about 10 years
      Matt, you can clearly see the OP has boot=casper set. Regenerate initramfs what for??
    • warren
      warren about 10 years
      Does @Pat 's answer work for you, dialer?
    • user.dz
      user.dz about 10 years
      eth1? Do you have two net interfaces?
    • dialer
      dialer about 10 years
      @warren I just tried it and it worked. @Sneetsher I connect eth0 to my gateway if I need internet access.
  • dialer
    dialer about 10 years
    Appending the INITRD_N11.GZ from Serva's site as you mentioned did it. I haven't included any NFSOPTS because my samba server doesn't use authentication.
  • Pat
    Pat about 10 years
    Good. Just to mention: when you boot using CIFS without a specific NFSOPTS on the command line, Casper defaults to CIFSOPTS="-ouser=root,password=", which does not specify "ro"; that could have some side effects later. In your case I'd specify e.g. NFSOPTS="-ouser=root,password=,ro"
  • dialer
    dialer about 10 years
    Did the modified initrd image originate from Serva's development team? Or has anyone else actually repaired this bug before?
  • Pat
    Pat about 10 years
    INITRD_N11.GZ is a Serva development. See the edited answer, where you will find the link to the bug report I filed. If you have an Ubuntu Launchpad account you could "verify" the bug; that will help to get this fixed in future releases.