How to achieve 2-gigabit total throughput on Linux using the bonding driver?


Solution 1

You are unlikely to reach 2 gigabits without cooperation at the switch level, and even then it may be hard with only a single IP source/destination combination. Most teams are set up for IP hashing, which assigns a single NIC path to each source/destination pair, so any one pair gets at most 1 gigabit. Round-robin schemes exist, but they often cause out-of-order packet arrival, which makes them undesirable unless both the host and the destination support that scheme.
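To make the hashing point concrete, here is a deliberately simplified sketch of hash-based path selection. It is not the bonding driver's actual formula: the function name `pick_slave` is hypothetical, and it only XORs the last octets of two IPv4 addresses. The point is that the same source/destination pair always lands on the same NIC.

```shell
# Simplified illustration only, not the kernel's real transmit hash.
# Arguments: source IPv4, destination IPv4, number of NICs in the bond.
pick_slave() {
    src_last=${1##*.}   # last octet of the source address
    dst_last=${2##*.}   # last octet of the destination address
    echo $(( (src_last ^ dst_last) % $3 ))
}

# The same pair always maps to the same NIC index, however much traffic it carries.
pick_slave 192.168.0.211 203.0.113.10 2
pick_slave 192.168.0.211 203.0.113.10 2
# A different client may (or may not) land on the other NIC.
pick_slave 192.168.0.211 203.0.113.11 2
```

With thousands of distinct clients the pairs spread across both NICs, but a single pair never exceeds one link's capacity under this kind of scheme.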

Solution 2

You will need port aggregation on the switch: the two access-switch ports that are wired to the two gigabit ports on your machine need to be aggregated. Once that is done, you should get close to a 2Gbps path (limited by the machine's capabilities).

With port aggregation on the switch matching the logical 2Gbps port of the bonding driver, you would be using a multiplexed, redundant path with just one IP address on the machine.

I came across some interesting notes while looking this up:

There is a dark side to this wonderful feature of the Linux bonding driver: it only works with network interfaces that allow the MAC address to be changed when the interface is open. The balance-alb mode depends on swift ARP trickery to fool the kernel into thinking the two physical interfaces are one by rewriting the MAC address on the fly. So the driver for the interface must support this, and many of them don't.

But that's not all the bonding driver can do. The mode option gives you seven choices, and you don't have to worry about interface compatibility. However you do need to consider what your switches support. The balance-rr, balance-xor and broadcast modes need switch ports grouped together. This goes by all sorts of different names, so look for "trunk grouping", "etherchannel", "port aggregation", or some such. 802.3ad requires 802.3ad support in the switch.
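As a quick reference for the seven modes mentioned above, the following sketch maps each mode number to its name and to whether the switch has to cooperate. The helper name `bond_mode_info` is hypothetical, and `bond0` is an assumed interface name; the sysfs check is skipped if no such bond exists.

```shell
# Map a bonding mode number to its name and its switch requirement.
bond_mode_info() {
    case "$1" in
        0) echo "balance-rr: needs grouped switch ports" ;;
        1) echo "active-backup: no switch support needed" ;;
        2) echo "balance-xor: needs grouped switch ports" ;;
        3) echo "broadcast: needs grouped switch ports" ;;
        4) echo "802.3ad: needs 802.3ad (LACP) on the switch" ;;
        5) echo "balance-tlb: no switch support needed" ;;
        6) echo "balance-alb: no switch support needed" ;;
        *) echo "unknown mode"; return 1 ;;
    esac
}

# Report the mode of an existing bond; bond0 is an assumption.
mode_file=/sys/class/net/bond0/bonding/mode
if [ -r "$mode_file" ]; then
    # The file holds the name followed by the number, e.g. "balance-tlb 5".
    read -r mode_name mode_num < "$mode_file"
    bond_mode_info "$mode_num"
fi
```

Without access to the switches, only modes 1, 5 and 6 are candidates, and of those only 5 and 6 spread traffic over both links.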

Solution 3

First, you probably know that you're never actually going to hit 2Gb/s. The overhead of TCP/IP will limit you to probably 90% of the max.
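A rough back-of-the-envelope version of that overhead, as a sketch under assumptions that are not from the thread: a 1500-byte MTU, IPv4, and TCP timestamps enabled. It counts only per-frame header and framing bytes, so it gives an upper bound; ACK traffic, retransmissions and host limits push real numbers lower, which is consistent with the rough 90% estimate.

```shell
# Rough ceiling for TCP payload throughput over Ethernet, in Mb/s.
# Assumptions: 1500-byte MTU, IPv4, TCP timestamp option in every segment.
tcp_goodput_mbps() {
    link_mbps=$1
    mtu=1500
    payload=$(( mtu - 20 - 20 - 12 ))      # IPv4 header, TCP header, timestamp option
    on_wire=$(( mtu + 14 + 4 + 8 + 12 ))   # Ethernet header, FCS, preamble, inter-frame gap
    echo $(( link_mbps * payload / on_wire ))
}

tcp_goodput_mbps 1000   # one gigabit link
tcp_goodput_mbps 2000   # two links, perfectly balanced
```

Under these assumptions the ceiling works out to about 94% of line rate, so even a perfectly balanced pair of gigabit links tops out below 1.9Gb/s of payload.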

Second, even if you use a TCP offload engine, the stack above layer 3 definitely affects where the bottleneck is. In other words, how are you transmitting the data? I could have 10Gb/s NICs and a crossover cable between them, and I still wouldn't get above a few hundred Mb/s if I were using rsync over an ssh tunnel.

What else can you tell us about the topology? You said that the server is connected to a couple of switches, and that the remote clients are all over the world. Do you have > 500Mb/s (aggregate) of WAN connections?

Solution 4

We didn't truly resolve this issue. What we did is set up two servers, one bound to an IP on each interface, and then followed the directions here to force traffic to go out the port it came in:

http://kindlund.wordpress.com/2007/11/19/configuring-multiple-default-routes-in-linux/

Slightly modified for our situation. In this example, the gateway is 192.168.0.1 and the server's IPs are 192.168.0.211 and 192.168.0.212 on eth0 and eth1 respectively:

# Register two named routing tables (IDs 1 and 2).
printf "1\tuplink0\n" >> /etc/iproute2/rt_tables
printf "2\tuplink1\n" >> /etc/iproute2/rt_tables

# Traffic from or to 192.168.0.211 uses table uplink0 and leaves via eth0.
ip route add 192.168.0.211/32 dev eth0 src 192.168.0.211 table uplink0
ip route add default via 192.168.0.1 dev eth0 table uplink0
ip rule add from 192.168.0.211/32 table uplink0
ip rule add to 192.168.0.211/32 table uplink0

# Traffic from or to 192.168.0.212 uses table uplink1 and leaves via eth1.
ip route add 192.168.0.212/32 dev eth1 src 192.168.0.212 table uplink1
ip route add default via 192.168.0.1 dev eth1 table uplink1
ip rule add from 192.168.0.212/32 table uplink1
ip rule add to 192.168.0.212/32 table uplink1
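
The per-interface commands above follow one pattern, so they can be generated rather than typed by hand. The sketch below only prints the commands (it does not run them) and assumes the table names have already been added to /etc/iproute2/rt_tables; the helper name `uplink_commands` is hypothetical.

```shell
# Print, but do not execute, the policy-routing commands for one interface.
# Arguments: table name, interface, local address, gateway.
uplink_commands() {
    table=$1 dev=$2 addr=$3 gw=$4
    echo "ip route add $addr/32 dev $dev src $addr table $table"
    echo "ip route add default via $gw dev $dev table $table"
    echo "ip rule add from $addr/32 table $table"
    echo "ip rule add to $addr/32 table $table"
}

uplink_commands uplink0 eth0 192.168.0.211 192.168.0.1
uplink_commands uplink1 eth1 192.168.0.212 192.168.0.1
```

Once the real commands have been applied, `ip rule show` and `ip route show table uplink0` should list the new rules and routes.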

Author by

Antoine Benkemoun

IT Engineer. Have worked as network engineer for Orange Business Services and embedded network engineer for Dassault Aviation.

Updated on September 17, 2022

Comments

  • Antoine Benkemoun
    Antoine Benkemoun almost 2 years

    For this application I am less concerned with high availability than I am with total throughput. I have one IP address on the server end, and I want to be able to send more than 1-gigabit of traffic out from the server. The server has two 1-gigabit cards and is connected to a pair of switches. The application involves thousands of remote clients around the world connecting to the server (i.e. not a local network).

    Currently, bonding is set up using mode 5 (balance-tlb), but the result is that the throughput for each port won't go above 500Mbit/s. How can I get past this limit? Please assume that I have no access to the switches, so I cannot implement 802.3ad.

    (I was hoping to add the "bonding" tag, but I cannot add new tags, so "teaming" it is.)

  • Admin
    Admin about 15 years
    There are a couple handfuls of gigabit WAN connections. Some might be 10GE, not sure. The server easily does 1Gb/s with ~5% idle CPU, I expect it should be able to exceed that.
  • Admin
    Admin about 15 years
    Yeah, I'm afraid we will have to go the 802.3ad route. I'm not opposed to it, but it's one of those things that hasn't been deployed here before, so the time it takes to get it all tested and out the door is increased substantially, whereas if it's just on the host level whatever driver can be enabled and then disabled if it doesn't end up working.
  • Matt Simmons
    Matt Simmons about 15 years
    What changed between the server getting 1Gb/s and only getting 500Mb/s?
  • Admin
    Admin about 15 years
    The server in its original configuration pushes 1Gb/s out of a single port. With the bonding driver set up (mode 5) it pushes 500Mb/s max out of each port. The hope was that it would potentially max out both nics (or at least go above 500).
  • Matt Simmons
    Matt Simmons about 15 years
    Hrm...that's interesting. Have you tried other bonding modes? Mode=0 for example?
  • Matt Simmons
    Matt Simmons about 15 years
    Or mode 6, for that matter?
  • Admin
    Admin about 15 years
    Haven't, although I'm wishing we had. Maybe now that the fire is out, we can try something in more of a lab setting. I'm going to "answer" my question but it's not going to truly be the answer to the question.