UDP traffic not forwarded from Docker containers -> Docker host

Solution 1

I figured it out.

We had a Trend Micro (anti-virus) agent running in the SOE which I didn't know about.

Fixing it was as simple as:

# systemctl stop ds_agent.service
# pkill ds_agent

I'm not sure at this point why it blocks UDP from containers, or how to stop it doing so.
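
If stopping the agent turns out to be the fix, something along these lines keeps it from coming back on reboot. This is only a sketch, assuming systemd and that your security policy allows disabling the agent; ds_agent.service is the unit name from the commands above:

```shell
# sketch: stop a service and disable it persistently (run as root)
disable_agent() {
  systemctl stop "$1" && systemctl disable "$1"
}
# usage: disable_agent ds_agent.service
```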

Solution 2

It seems you have a modprobe install directive that cannot work. Possibly it's the result of an incomplete update to RHEL 7.2, or of some manual fixes.

Try grep -r bridge /etc/modprobe.d /lib/modprobe.d for starters, or otherwise dig around /etc/modprobe.d and /lib/modprobe.d to find where the install rule is defined that calls sysctl -q -w net.bridge.bridge-nf-call-arptables=0 net.bridge.bridge-nf-call-iptables=0 net.bridge.bridge-nf-call-ip6tables=0
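
That grep can be wrapped as a small helper; a sketch only, with the two standard modprobe config directories as arguments:

```shell
# sketch: list any 'install bridge' rules in the given modprobe config dirs
find_bridge_install() {
  grep -rn '^install[[:space:]]\{1,\}bridge' "$@" 2>/dev/null
}
# usage: find_bridge_install /etc/modprobe.d /lib/modprobe.d
```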

This sysctl call is clearly in the wrong place. It is either superfluous or should run after br_netfilter is loaded, not before. Why? The /proc/sys/net/bridge handling was recently moved from the bridge module to the br_netfilter module. This happened with some version of kernel*.rpm, while the contents of the modprobe.d directories are distributed with other, individual packages. I've verified this on my RHEL 7.2:

# modprobe bridge
# sysctl -q -w net.bridge.bridge-nf-call-iptables=0
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory
# modprobe br_netfilter
# sysctl -q -w net.bridge.bridge-nf-call-iptables=0    # ok now
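
If such a rule must exist at all, a hypothetical corrected version would load br_netfilter (which now owns /proc/sys/net/bridge) before touching the sysctls:

```
# hypothetical corrected rule for a modprobe.d conf file:
# load br_netfilter before setting the bridge-nf sysctls
install bridge /sbin/modprobe --ignore-install bridge && /sbin/modprobe br_netfilter && /sbin/sysctl -q -w net.bridge.bridge-nf-call-arptables=0 net.bridge.bridge-nf-call-iptables=0 net.bridge.bridge-nf-call-ip6tables=0
```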

I don't see these "broken" rules on my vanilla RHEL 7.1, and their origin is a mystery to me. I've tried:

# modprobe -n -vvv bridge
modprobe: INFO: custom logging function 0x40a130 registered
insmod /lib/modules/3.10.0-229.11.1.el7.x86_64/kernel/net/llc/llc.ko
insmod /lib/modules/3.10.0-229.11.1.el7.x86_64/kernel/net/802/stp.ko
insmod /lib/modules/3.10.0-229.11.1.el7.x86_64/kernel/net/bridge/bridge.ko
modprobe: INFO: context 0xf1c270 released
# echo "install bridge echo example_of_a_modprobe_rule" > /etc/modprobe.d/zzz5.conf
# modprobe -n -vvv bridge
modprobe: INFO: custom logging function 0x40a130 registered
insmod /lib/modules/3.10.0-229.11.1.el7.x86_64/kernel/net/llc/llc.ko
insmod /lib/modules/3.10.0-229.11.1.el7.x86_64/kernel/net/802/stp.ko
install echo example_of_a_modprobe_rule
modprobe: INFO: context 0xeaa270 released
# rm /etc/modprobe.d/zzz5.conf

Update: it looks like xenserver uses a similar modprobe hack. Globally changing kernel module behavior for everyone, whether they actually run xenserver or not, is a nasty bug; and it has now come back to bite us.

Update 2: you've now found that /etc/modprobe.d/dist.conf causes this problem, not docker. Whether you have docker or not, modprobe bridge will always return 1 and print an error. Normally dist.conf is part of the module-init-tools package on RHEL 6. The file is not supposed to be used on RHEL 7; it's not on any of my RHEL 7 systems, and they run just fine. In RHEL 7 the package is kmod, and it doesn't contain dist.conf. I would:

rpm -qf /etc/modprobe.d/dist.conf  # what package owns this file?

If dist.conf is not owned by a package, back it up and delete any lines that don't give you an obvious benefit (maybe even delete the file altogether).

If dist.conf is owned by a package, consider removing or updating that package, since it is clearly buggy in terms of RHEL 7.2 compatibility.
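
The two cases above can be sketched as a single helper. This is only an illustration (the file path is the one from the question), so back everything up before touching modprobe config:

```shell
# sketch: if no package owns the file, back it up and strip the offending
# 'install bridge' rule; otherwise report the owning package
clean_dist_conf() {
  local f=$1
  if rpm -qf "$f" >/dev/null 2>&1; then
    echo "owned by $(rpm -qf "$f"); remove or update that package instead"
  else
    cp -a "$f" "$f.bak" && sed -i '/^install bridge/d' "$f"
  fi
}
# usage (as root): clean_dist_conf /etc/modprobe.d/dist.conf
```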

Updated on September 18, 2022

Comments

  • Alex Harvey
    Alex Harvey almost 2 years

    I have a Docker container and am unable to run DNS lookups from inside containers, although lookups work fine from the Docker host.

    The configuration management code that builds the Docker host is known to work on a standard RHEL 7 image from the marketplace, therefore the problem is known to be something inside the SOE RHEL 7 image.

    RHEL 7.2 / Docker version 1.12.6, build 88a4867/1.12.6. Container is RHEL 7.3. SELinux in enabled/permissive mode. The Docker host is an Amazon EC2 instance.

    Some config:

    # /etc/sysconfig/docker
    OPTIONS='--dns=10.0.0.10 --dns=10.0.0.11 --dns-search=example.com'
    DOCKER_CERT_PATH=/etc/docker
    ADD_REGISTRY='--add-registry registry.example.com'
    no_proxy=169.254.169.254,localhost,127.0.0.1,registory.example.com
    http_proxy=http://proxy.example.com:8080
    https_proxy=http://proxy.example.com:8080
    ftp_proxy=http://proxy.example.com:8080
    

    Resolver config in the container and host is the same:

    # /etc/resolv.conf
    search example.com
    nameserver 10.0.0.10
    nameserver 10.0.0.11
    

    If I restart the docker daemon with --debug I see the following in journalctl -u docker.service:

    Aug 08 11:44:23 myhost.example.com dockerd-current[17341]: time="2017-08-08T11:44:23.430769581+10:00" level=debug msg="Name To resolve: http://proxy.example.com."
    Aug 08 11:44:23 myhost.example.com dockerd-current[17341]: time="2017-08-08T11:44:23.431488213+10:00" level=debug msg="Query http://proxy.example.com.[1] from 172.18.0.6:38189, forwarding to udp:10.162.182.101"
    Aug 08 11:44:27 myhost.example.com dockerd-current[17341]: time="2017-08-08T11:44:27.431772666+10:00" level=debug msg="Read from DNS server failed, read udp 172.18.0.6:38189->10.162.182.101:53: i/o timeout"
    

    Following that observation further, it turns out I can get some networking to work if I specify an IP address instead of the DNS name of the proxy; although that really is just a way of avoiding using DNS and not a real fix.

    Indeed, (update #3) it turns out I can avoid the issue completely by simply configuring DNS to use TCP instead of UDP, i.e.

    # head -1 /etc/sysconfig/docker
    OPTIONS="--dns=10.0.0.10 --dns=10.0.0.11 --dns-search=example.com --dns-opt=use-vc"
    

    (Adding the use-vc option tells the resolver to use TCP instead of UDP.)
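
    For reference, --dns-opt=use-vc works by adding an options use-vc line to the container's /etc/resolv.conf, which tells the glibc resolver to use TCP ("virtual circuit") for its queries. The resulting file would look like:

    ```
    # /etc/resolv.conf inside the container, with use-vc applied
    search example.com
    nameserver 10.0.0.10
    nameserver 10.0.0.11
    options use-vc
    ```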

    I did note some suspicious-looking rules in iptables, but these turned out to be normal:

    # iptables -n -L DOCKER-ISOLATION -v --line-numbers
    Chain DOCKER-ISOLATION (1 references)
    num   pkts bytes target     prot opt in     out     source               destination         
    1        0     0 DROP       all  --  br-1d6a05c10468 docker0  0.0.0.0/0            0.0.0.0/0           
    2        0     0 DROP       all  --  docker0 br-1d6a05c10468  0.0.0.0/0            0.0.0.0/0           
    3    34903   11M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           
    

    After deleting those two DROP rules, I continued to see the issue.

    Full iptables:

    # iptables -nL -v
    Chain INPUT (policy ACCEPT 2518 packets, 1158K bytes)
     pkts bytes target     prot opt in     out     source               destination         
    
    Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
     pkts bytes target     prot opt in     out     source               destination         
    23348 9674K DOCKER-ISOLATION  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
        0     0 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
        0     0 ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
        0     0 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
        0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0           
    23244 9667K DOCKER     all  --  *      br-1d6a05c10468  0.0.0.0/0            0.0.0.0/0           
    23232 9667K ACCEPT     all  --  *      br-1d6a05c10468  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
      104  6230 ACCEPT     all  --  br-1d6a05c10468 !br-1d6a05c10468  0.0.0.0/0            0.0.0.0/0           
       12   700 ACCEPT     all  --  br-1d6a05c10468 br-1d6a05c10468  0.0.0.0/0            0.0.0.0/0           
    
    Chain OUTPUT (policy ACCEPT 2531 packets, 414K bytes)
     pkts bytes target     prot opt in     out     source               destination         
    
    Chain DOCKER (2 references)
     pkts bytes target     prot opt in     out     source               destination         
        0     0 ACCEPT     tcp  --  !br-1d6a05c10468 br-1d6a05c10468  0.0.0.0/0            172.18.0.2           tcp dpt:443
        0     0 ACCEPT     tcp  --  !br-1d6a05c10468 br-1d6a05c10468  0.0.0.0/0            172.18.0.2           tcp dpt:80
        0     0 ACCEPT     tcp  --  !br-1d6a05c10468 br-1d6a05c10468  0.0.0.0/0            172.18.0.3           tcp dpt:389
    
    Chain DOCKER-ISOLATION (1 references)
     pkts bytes target     prot opt in     out     source               destination         
        0     0 DROP       all  --  br-1d6a05c10468 docker0  0.0.0.0/0            0.0.0.0/0           
        0     0 DROP       all  --  docker0 br-1d6a05c10468  0.0.0.0/0            0.0.0.0/0           
    23348 9674K RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           
    

    Bridge config

    # ip addr show docker0
    4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN 
        link/ether 02:42:a8:73:db:bb brd ff:ff:ff:ff:ff:ff
        inet 172.17.0.1/16 scope global docker0
           valid_lft forever preferred_lft forever
    # ip addr show br-1d6a05c10468
    3: br-1d6a05c10468: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
        link/ether 02:42:d5:b6:2d:f5 brd ff:ff:ff:ff:ff:ff
        inet 172.18.0.1/16 scope global br-1d6a05c10468
           valid_lft forever preferred_lft forever
    

    and

    # docker network inspect bridge 
    [
        {
            "Name": "bridge",
            "Id": "e159ddd37386cac91e0d011ade99a51f9fe887b8d32d212884beace67483af44",
            "Scope": "local",
            "Driver": "bridge",
            "EnableIPv6": false,
            "IPAM": {
                "Driver": "default",
                "Options": null,
                "Config": [
                    {
                        "Subnet": "172.17.0.0/16",
                        "Gateway": "172.17.0.1"
                    }
                ]
            },
            "Internal": false,
            "Containers": {},
            "Options": {
                "com.docker.network.bridge.default_bridge": "true",
                "com.docker.network.bridge.enable_icc": "true",
                "com.docker.network.bridge.enable_ip_masquerade": "true",
                "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
                "com.docker.network.bridge.name": "docker0",
                "com.docker.network.driver.mtu": "1500"
            },
            "Labels": {}
        }
    ]
    

    In the logs:

    Aug 04 17:33:32 myhost.example.com systemd[1]: Starting Docker Application Container Engine...
    Aug 04 17:33:33 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:33.056770003+10:00" level=info msg="libcontainerd: new containerd process, pid: 2140"
    Aug 04 17:33:34 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:34.740346421+10:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
    Aug 04 17:33:34 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:34.741164354+10:00" level=info msg="Loading containers: start."
    Aug 04 17:33:34 myhost.example.com dockerd-current[2131]: .........................time="2017-08-04T17:33:34.903371015+10:00" level=info msg="Firewalld running: true"
    Aug 04 17:33:35 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:35.325581993+10:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address" 
    Aug 04 17:33:36 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:36+10:00" level=info msg="Firewalld running: true"
    Aug 04 17:33:37 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:37+10:00" level=info msg="Firewalld running: true"
    Aug 04 17:33:37 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:37+10:00" level=info msg="Firewalld running: true"
    Aug 04 17:33:38 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:38+10:00" level=info msg="Firewalld running: true"
    Aug 04 17:33:39 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:39+10:00" level=info msg="Firewalld running: true"
    Aug 04 17:33:40 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:40+10:00" level=info msg="Firewalld running: true"
    Aug 04 17:33:40 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:40+10:00" level=info msg="Firewalld running: true"
    Aug 04 17:33:42 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:42+10:00" level=info msg="Firewalld running: true"
    Aug 04 17:33:42 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:42+10:00" level=info msg="Firewalld running: true"
    Aug 04 17:33:43 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:43.541905145+10:00" level=info msg="Loading containers: done."
    Aug 04 17:33:43 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:43.541975618+10:00" level=info msg="Daemon has completed initialization"
    Aug 04 17:33:43 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:43.541998095+10:00" level=info msg="Docker daemon" commit="88a4867/1.12.6" graphdriver=devicemapper version=1.12.6
    Aug 04 17:33:43 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:43.548508756+10:00" level=info msg="API listen on /var/run/docker.sock"
    Aug 04 17:33:43 myhost.example.com systemd[1]: Started Docker Application Container Engine.
    

    From the container, I can ping the default gateway but all name resolution fails.

    I noticed one weird thing in the log (Update #2 I now know that this is a red herring - see discussion below):

    # journalctl -u docker.service |grep insmod > /tmp/log # \n's replaced below
    Jul 26 23:59:02 myhost.example.com dockerd-current[3185]: time="2017-07-26T23:59:02.056295890+10:00" level=warning msg="Running modprobe bridge br_netfilter failed with message: insmod /lib/modules/3.10.0-514.26.2.el7.x86_64/kernel/net/bridge/bridge.ko 
    sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-arptables: No such file or directory
    sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory
    sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-ip6tables: No such file or directory
    modprobe: ERROR: Error running install command for bridge
    modprobe: ERROR: could not insert 'bridge': Unknown error 253
    insmod /lib/modules/3.10.0-514.26.2.el7.x86_64/kernel/net/llc/llc.ko 
    insmod /lib/modules/3.10.0-514.26.2.el7.x86_64/kernel/net/802/stp.ko 
    install /sbin/modprobe --ignore-install bridge && /sbin/sysctl -q -w net.bridge.bridge-nf-call-arptables=0 net.bridge.bridge-nf-call-iptables=0 net.bridge.bridge-nf-call-ip6tables=0 
    insmod /lib/modules/3.10.0-514.26.2.el7.x86_64/kernel/net/bridge/br_netfilter.ko 
    , error: exit status 1"
    

    Update #1: and this is coming from:

    # tail -2 /etc/modprobe.d/dist.conf
    # Disable netfilter on bridges when the bridge module is loaded
    install bridge /sbin/modprobe --ignore-install bridge && /sbin/sysctl -q -w net.bridge.bridge-nf-call-arptables=0 net.bridge.bridge-nf-call-iptables=0 net.bridge.bridge-nf-call-ip6tables=0
    

    Also:

    # cat /proc/sys/net/bridge/bridge-nf-call-{arp,ip,ip6}tables
    1
    1
    1
    

    However, even after I do this:

    # for i in /proc/sys/net/bridge/bridge-nf-call-{arp,ip,ip6}tables ; do echo 0 > $i ; done 
    

    Still no luck.

    I spent a whole day on this so pulling my hair out by now. Any thoughts on what else I could try or what else the problem might be much appreciated.

    Update #4

    I performed some experiments using netcat and have proved that no UDP packets sent from any container -> host are forwarded. I tried several ports, including 53, 2115 and 50000. TCP packets are fine, however. This remains true if I flush the iptables rules with iptables -F.

    Furthermore, I can send UDP packets from one container to another - only UDP traffic from container -> host is not forwarded.

    To set up the test:

    On the host, which has IP 10.1.1.10:

    # nc -u -l 50000
    

    On the container:

    # echo "foo" | nc -w1 -u 10.1.1.10 50000
    

    In a tcpdump capture I see:

    17:20:36.761214 IP (tos 0x0, ttl 64, id 48146, offset 0, flags [DF], proto UDP (17), length 32)
        172.17.0.2.41727 > 10.1.1.10.50000: [bad udp cksum 0x2afa -> 0x992f!] UDP, length 4
            0x0000:  4500 0020 bc12 4000 4011 53de ac11 0002  E.....@[email protected].....
            0x0010:  0aa5 7424 a2ff c350 000c 2afa 666f 6f0a  ..t$...P..*.foo.
            0x0020:  0000 0000 0000 0000 0000 0000 0000 0000  ................
    17:20:36.761214 IP (tos 0x0, ttl 64, id 48146, offset 0, flags [DF], proto UDP (17), length 32)
        172.17.0.2.41727 > 10.1.1.10.50000: [bad udp cksum 0x2afa -> 0x992f!] UDP, length 4
            0x0000:  4500 0020 bc12 4000 4011 53de ac11 0002  E.....@[email protected].....
            0x0010:  0aa5 7424 a2ff c350 000c 2afa 666f 6f0a  ..t$...P..*.foo.
            0x0020:  0000 0000 0000 0000 0000 0000 0000 0000  ................
    

    I tried unsuccessfully to fix the bad UDP checksums via this.

    I noted, however, that the bad UDP checksums are seen even during the successful transmission of UDP packets (host -> host) and container -> container.

    In summary, I now know:

    • routing is fine

    • iptables is flushed

    • SELinux is permissive

    • all TCP works in all directions

    • all UDP from container -> container is fine

    • all UDP from host -> host is fine

    • all UDP from host -> container is fine

    • BUT no UDP packets from container -> host are forwarded

    • Orphans
      Orphans almost 7 years
      Can your containers talk on port 53? telnet 8.8.8.8 53
    • Alex Harvey
      Alex Harvey almost 7 years
      I can't get to 8.8.8.8 (TCP) port 53 because 8.8.8.8 is blocked by a company firewall. I can, however, connect to my local DNS server on TCP port 53 -- see my update. In the morning I intend to figure out a way to use netcat to try to prove (what I currently believe) that the problem is really that these containers simply do not forward outbound UDP traffic.
    • Alex Harvey
      Alex Harvey almost 7 years
      @Orphans, updated with results from my netcat experiments. Basically, all UDP from container -> host is lost, but TCP is fine, and UDP is also fine from container -> container and from host -> container. And all true after iptables -F.
    • Orphans
      Orphans almost 7 years
    • Alex Harvey
      Alex Harvey almost 7 years
      Thanks, but no, that is about port forwarding -- forwarding a port in a container so that a client can connect to it at an address on the Docker host. In my case, the container is trying to send outbound UDP packets.
    • Alex Harvey
      Alex Harvey almost 7 years
      Your suggested iptables command has not had any effect that I can observe. My host is an Amazon EC2 instance, with one interface eth0. If I run tcpdump -i eth0 port 53, the packets are not seen, meaning they were dropped at the bridge. Intuitively, I can't believe the bad UDP checksums are the problem, considering they are present in all directions, but traffic is only dropped in the container -> host direction.
  • Alex Harvey
    Alex Harvey almost 7 years
    The file is indeed not owned by a package. However, I removed the file completely and rebuilt the docker host and I'm afraid I've proven that this isn't causing the DNS connectivity issue. :(
  • Naveed Abbas
    Naveed Abbas almost 7 years
    Bummer :( Sigh...
  • Ed Neville
    Ed Neville over 6 years
    Check that your company policy doesn't require ds_agent to be running. You may want to file a bug with Trend Micro so they can fix it at source.
  • Alex Harvey
    Alex Harvey over 6 years
    Good idea but it wasn't my team that managed Trend Micro and I've left that job now!
  • AnotherHowie
    AnotherHowie over 5 years
    I've had a similar issue with TCP traffic not even showing up in outbound tcpdump output. I'd forgotten that ds_agent was running on this box, and that seems to be the root cause. Good thing, because I had no other ideas left! :-)