iptables - forward inbound traffic to internal ip (docker interface)

There are two issues here, plus a third that was not asked about. To be thorough, I will address that one too, with a simple solution that is probably not the best one:

Locally initiated packets are not forwarded/routed

Locally initiated packets are not forwarded (routed), so they never traverse the nat/PREROUTING chain. Take a look at Packet flow in Netfilter and General Networking to get an idea of what happens during the life of a packet in the kernel: local packets come from "local process".
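As a rough reminder of that schematic, here is a small sketch listing the built-in chains a packet goes through depending on where it comes from. The function name chains_for and its labels are made up for illustration, and the table-by-table detail (raw, mangle, nat, filter) is deliberately left out.

```shell
# Simplified view of the Netfilter packet flow: chain names only, in order.
# chains_for and its argument labels are hypothetical, for illustration.
chains_for() {
    case "$1" in
        local-out) echo "OUTPUT POSTROUTING" ;;             # sent by a local process
        forwarded) echo "PREROUTING FORWARD POSTROUTING" ;; # received, then routed onward
        local-in)  echo "PREROUTING INPUT" ;;               # received, delivered to a local process
        *)         echo "unknown"; return 1 ;;
    esac
}

chains_for local-out
chains_for forwarded
```

The local-out line is the relevant one here: a packet sent by curl on the host starts at OUTPUT, so a DNAT rule placed only in PREROUTING never gets a chance to match it.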

So in addition to the nat/PREROUTING rule doing the DNAT for packets arriving from "outside", which should look like:

iptables -t nat -I PREROUTING -i eno1 -p tcp --dport 8443 -j DNAT --to-destination 172.17.0.2:8443

You also have to use the nat/OUTPUT chain. Since it is an output chain, its syntax only allows matching on an outgoing interface (-o), so the rule is altered like this:

iptables -t nat -I OUTPUT -o lo -p tcp --dport 8443 -j DNAT --to-destination 172.17.0.2:8443
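To keep the two rules in sync, they can be generated from the same parameters. The following is only a sketch: print_dnat_rules is a hypothetical helper, and it prints the commands instead of running them so they can be reviewed first (applying them requires root).

```shell
# Hypothetical helper: print both DNAT rules without applying them.
# $1 = external interface, $2 = container IP, $3 = TCP port
print_dnat_rules() {
    ext_if=$1
    target_ip=$2
    port=$3
    # Packets arriving from outside traverse nat/PREROUTING.
    echo "iptables -t nat -I PREROUTING -i $ext_if -p tcp --dport $port -j DNAT --to-destination $target_ip:$port"
    # Locally generated packets traverse nat/OUTPUT instead.
    echo "iptables -t nat -I OUTPUT -o lo -p tcp --dport $port -j DNAT --to-destination $target_ip:$port"
}

print_dnat_rules eno1 172.17.0.2 8443
```

Once the printed commands look right, they can be run as root, for example by piping the output to sh.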

The initial packet, and then the rest of the flow, will actually be rerouted to another interface (I suspect the "reroute check" in the previous link's schematic might not be placed correctly).

This will work with any IP belonging to the host (i.e. 172.16.214.45 and 172.17.0.1), except...

the IP range 127.0.0.0/8 is forbidden to be seen outside of the lo interface

The Linux kernel has specific settings that prevent any IP in the range 127.0.0.0/8 from being routed anywhere other than the lo interface. It drops any such packet as a martian source if it "attempts" to use another interface, and rightly so: the remote system (even if it's a container) would not accept an incoming packet with source 127.0.0.1 and destination 172.17.0.2, if only because it wouldn't know where to send the reply.

So in addition to the DNAT, a SNAT (or a simple MASQUERADE) must also be applied to the packet, this time in the nat/POSTROUTING chain, which is traversed (see the previous schematic):

iptables -t nat -I POSTROUTING -s 127.0.0.1 -d 172.17.0.2 -j MASQUERADE

This is still not enough: as the name implies, nat/POSTROUTING happens after routing (actually after the reroute check that follows the DNAT), and by then the packet has already been dropped as a martian source.

For special cases, like this one, it's possible to override the localnet restriction with the per-interface toggle route_localnet:

echo 1 > /proc/sys/net/ipv4/conf/docker0/route_localnet

Now the routing stack lets the packets with source 127.0.0.1 pass, and their source is corrected to 172.17.0.1 by the previous rule before going out on the virtual wire to the container: it works.
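The two extra steps for the loopback case can be summed up in a small sketch. print_loopback_steps is a hypothetical helper that only prints the commands; it uses sysctl -w, which writes the same kernel setting as the echo into /proc shown above.

```shell
# Hypothetical helper: print the extra steps needed for 127.0.0.1 sources.
# $1 = bridge interface, $2 = container IP
print_loopback_steps() {
    bridge_if=$1
    target_ip=$2
    # 1. Lift the localnet restriction on this interface only.
    echo "sysctl -w net.ipv4.conf.${bridge_if}.route_localnet=1"
    # 2. Rewrite the 127.0.0.1 source before the packet reaches the container.
    echo "iptables -t nat -I POSTROUTING -s 127.0.0.1 -d $target_ip -j MASQUERADE"
}

print_loopback_steps docker0 172.17.0.2
```

Both printed commands need root to run, and they only help together with the nat/OUTPUT DNAT rule from the first part.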

You really should avoid anything requiring this second case because it's unneeded complexity: using an IP belonging to the host rather than 127.0.0.1 should be enough for any test. Also, if the docker0 interface were to be deleted and recreated, the route_localnet setting would be lost, and it wouldn't be wise to enable it by default.

Hairpinning

Not asked, but if you add a second system (here a container) in the same LAN, there are issues with lan-to-host-to-same-lan redirections (unless Docker is already handling this at the network level).

The nat/PREROUTING rule I wrote at the start of the answer handles only the eno1 interface. There was a reason I added this -i eno1 restriction: without it, if another container in the 172.17.0.0/16 network attempts to connect, for example, to 172.16.214.45:8443 (or to 172.17.0.1:8443), the packet will be redirected to 172.17.0.2. 172.17.0.2 will then reply directly to the source, the other container, completely bypassing the host and its NAT rules. That container will see a reply packet coming from a source it doesn't know about and reject it (with a TCP RST). So it is better not to handle this case at all than to handle it badly. Docker probably provides specific ways to resolve a service directly to another container's IP/port without involving the host.

If needed anyway, there are several methods to overcome this, often with trade-offs, from simple NAT (which loses the source IP, or has to translate it to a fictitious network for logging purposes) to complex bridge and/or router settings able to intercept the LAN communication.

Here's a simple solution where the source is SNAT-ed, using NETMAP, to the fictitious network 10.17.0.0/16. A simple prerequisite: 10.17.0.0/16 must probably be routed on the host (even if not really used), either via the default route (probably the case), via a specific route, or by giving the host an IP in this fictitious network for this purpose. Packets with these IPs will only exist inside docker0's network.

After removing the -i eno1 from the PREROUTING rule above, add this new rule:

iptables -t nat -I POSTROUTING -s 172.17.0.0/16 -d 172.17.0.0/16 -j NETMAP --to 10.17.0.0/16

Now the redirection from LAN to same LAN will work, with destination container's logs showing source IPs in the 10.17.0.0/16 range.
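To see which address will show up in those logs, recall that NETMAP keeps the host part of the address and only replaces the network part. The sketch below reproduces that arithmetic in plain shell; ip_to_int, int_to_ip and netmap_addr are hypothetical helper names, and nothing here touches iptables.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Convert an integer back to a dotted-quad IPv4 address.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# netmap_addr ADDR NEW_NET PREFIX_LEN
# Keep the host bits of ADDR, take the network bits from NEW_NET.
netmap_addr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    host_mask=$(( ~mask & 0xFFFFFFFF ))
    int_to_ip $(( (net & mask) | (addr & host_mask) ))
}

netmap_addr 172.17.0.3 10.17.0.0 16   # prints 10.17.0.3
```

So a hairpinned connection from a container at 172.17.0.3 would be logged by the destination container as coming from 10.17.0.3, which still identifies the original source.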

Of course, hairpinning situations should also be avoided.

Author: Alexander. C

Updated on September 18, 2022

Comments

  • Alexander. C
    Alexander. C over 1 year

    I have an iptables specific question. I have the following network interfaces defined on my machine:

    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
        link/ether f8:ca:b8:5c:59:b5 brd ff:ff:ff:ff:ff:ff
        inet 172.16.214.45/24 brd 172.16.214.255 scope global dynamic eno1
           valid_lft 773635sec preferred_lft 773635sec
    3: wlp3s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
        link/ether 80:00:0b:d7:a8:c5 brd ff:ff:ff:ff:ff:ff
    4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
        link/ether 02:42:bf:b2:fa:86 brd ff:ff:ff:ff:ff:ff
        inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
           valid_lft forever preferred_lft forever
    

    I have one active docker container listening on the ip 172.17.0.2 (attached to the docker0 interface)

    I would like to do two things:

    1. Forward all incoming packets on my machine on port 8443 to the docker container ip 172.17.0.2 on its port 8443
    2. Forward all loopback packets on the lo interface to the docker container ip 172.17.0.2 on port 8443

    I have tried the following rule, but it does not work when testing on the loopback interface:

    iptables -t nat -I PREROUTING -i lo -d 127.0.0.1 -p tcp --dport 8443 -j DNAT --to-destination 172.17.0.2:8443
    
    $ curl https://localhost:8443
    curl: (7) Failed to connect to localhost port 8443: Connection refused
    
    $ curl -k https://172.17.0.2:8443
    {
      "paths": [
        "/api"
      ]
    }
    

    Can anyone experienced with iptables point out what I am doing wrong?