Load Balancing long running TCP connections


Solution 1

Assuming that all your back-end machines are "active" and able to respond to requests, all you really need is a load balancer front-end.

A good load balancer will be able to keep track of the number of connections going to each host and dynamically distribute new connections to avoid swamping one of the back-end systems (by "good" I mean "expensive", like Cisco Content Switches/Content Switch Service Modules). Price goes hand in hand with features here: Content switches are pretty high up on the solutions tier.
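To make the "keep track of the number of connections" idea concrete, here is a minimal Python sketch of least-connections selection. It is only an illustration of the technique, not how any particular content switch implements it; the class name and the back-end names (`app1`, `app2`, `app3`) are hypothetical.

```python
class LeastConnBalancer:
    """Toy least-connections picker; all names here are hypothetical."""

    def __init__(self, backends):
        # Active connection count per back-end, all starting at zero.
        self.active = {backend: 0 for backend in backends}

    def connect(self):
        # Pick the back-end with the fewest active connections.
        # Ties go to the back-end that was listed first.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def disconnect(self, backend):
        # A long-lived connection closed, so free its slot.
        self.active[backend] -= 1


lb = LeastConnBalancer(["app1", "app2", "app3"])
first_three = [lb.connect() for _ in range(3)]  # one connection per back-end
lb.disconnect("app2")                           # app2 now has the fewest
next_pick = lb.connect()
```

With connections that stay up for many hours, counting active connections like this tends to matter more than the order in which new connections arrive, because a closed connection immediately makes its back-end the preferred target again.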

I've got no experience with HAProxy, but it sounds like it can do least-connection load balancing like content switches, so this would probably be a good choice (and at a much more attractive price point). I'm not sure if HAProxy can do source-tracking (send all connections from the same IP to the same back-end) though.
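As a rough illustration of source-based balancing (every connection from one IP goes to the same back-end), the Python sketch below hashes the client address and uses the result to index the server list. The function name, the SHA-256 choice, and the addresses are assumptions made for the example; it is not HAProxy's actual hash.

```python
import hashlib


def pick_by_source(client_ip, backends):
    """Map a client IP to a back-end; the destination port plays no part."""
    # Hash the source address so the mapping is stable across connections.
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["app1", "app2", "app3"]  # hypothetical back-end names
# One client opening three sockets (on three different ports) from one IP:
picks = [pick_by_source("203.0.113.7", backends) for _ in range(3)]
```

Because only the source IP is hashed, all three sockets of one client land on the same back-end. The flip side is that several clients sharing one public IP also share a back-end, and changing the number of back-ends changes the mapping for many clients.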

A few steps down, the pf firewall (or the pfSense customized distribution) can do load balancing (random or round-robin; I don't believe it can do "weighted least connections" as a balancing option like the content switches). Source tracking is implemented in pf, though you may have to play with how long that information is retained to avoid problems with connections getting moved from one server to another.
If you're already using pf/pfsense as your firewall this is a no-cost option: We use this in my current deployment with good results, but our connections aren't as long-lived as yours.
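The retention issue mentioned above can be sketched as a table of source IPs with an expiry time. This is a simplified Python model under my own assumptions (round-robin for new sources, the timer refreshed on every lookup, an injectable clock), not pf's actual implementation; `StickyTable` and its parameters are hypothetical names.

```python
import time


class StickyTable:
    """Toy source-tracking table with expiry; a simplified model only."""

    def __init__(self, backends, ttl_seconds, clock=time.monotonic):
        self.backends = list(backends)
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}    # source IP -> (backend, last_seen)
        self.next_index = 0  # round-robin cursor for new sources

    def lookup(self, client_ip):
        now = self.clock()
        entry = self.entries.get(client_ip)
        if entry is not None and now - entry[1] <= self.ttl:
            # Entry still valid: reuse the remembered back-end.
            backend = entry[0]
        else:
            # Unknown or expired source: assign the next back-end in turn.
            backend = self.backends[self.next_index % len(self.backends)]
            self.next_index += 1
        # Record (or refresh) the entry with the current time.
        self.entries[client_ip] = (backend, now)
        return backend


fake_now = [0.0]  # controllable clock so the example is deterministic
table = StickyTable(["app1", "app2"], ttl_seconds=60, clock=lambda: fake_now[0])
first = table.lookup("198.51.100.5")
fake_now[0] = 30.0
second = table.lookup("198.51.100.5")   # within the TTL
fake_now[0] = 200.0
third = table.lookup("198.51.100.5")    # entry expired in the meantime
```

In this model the third lookup is treated as a brand-new source and goes to a different back-end, which is the "connections getting moved" problem. A retention time that comfortably exceeds the gap between a client's connections avoids it.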

Solution 2

Others have successfully implemented HAProxy and it even helps run the StackExchange sites. Other popular Web front ends are Nginx and pound. Ultimately, most of these solutions will be quite effective for most Web traffic.

If your goal is high availability and load balancing, sticky or persistent sessions are ill-advised, as they reduce the effectiveness of both.

Without knowing more about your architecture or type of traffic, I would recommend LVS (Linux Virtual Server), which is my preferred solution. You refer to the network layer, which is where LVS is focused. It can be used with most protocols and is not limited to Web traffic.

Author: TJF

Updated on September 17, 2022

Comments

  • TJF
    TJF over 1 year

    I'm trying to research the best way to load balance long running TCP connections for the following scenario:

    We have multiple servers behind a redundant set of firewalls and clients establish long running (usually 10-15 hours) TCP connections to our backend servers.
    Right now, "load balancing" is handled via a client side round-robin approach to go through a list of IP addresses which are all homed at our firewalls and NAT'd accordingly to the backend servers.

    I'd like to get away from this approach and have only one public IP and have a separate load balancer that can check the health/load of the servers and distribute the incoming client connection requests accordingly.

    One problem here is that every client establishes 3 socket connections on 3 different ports, and I'd prefer those connections to be "sticky", so that all 3 connection requests are sent to the same backend server.

    I've been looking at e.g. HAProxy but I'm not really sure if it's really suited for my scenario. We have a relatively low connection count (~300 clients * 3 socket connections for each). Usually we see ~15KB/s continuous data transfer volume for each socket.

    Any input on this is greatly appreciated!

    Thanks,

    Tom

  • voretaq7
    voretaq7 over 13 years
    An additional advantage of content switches to consider: They can also handle SSL on the front-end -- Depending on your architecture and needs this can reduce the amount of work being done by your back-end servers and possibly save you some hardware.
  • TJF
    TJF over 13 years
    Thanks a lot for your reply! The firewalls we're using right now are actually Vyatta boxes, which is why I was looking at HAProxy, as I could easily host it on those machines as well and handle automatic failover with heartbeat. I think HAProxy can handle source-tracking, but we have a few clients that connect from the same office and public IP, so they would all end up at the same server?
  • TJF
    TJF over 13 years
    Warner, as outlined in my original question, my traffic is non-HTTP traffic. It's traffic over TCP with pretty consistent data transfer rates (traffic can be spiky though). Connections are made rarely but remain established for multiple hours.
  • Warner
    Warner over 13 years
    Ah, I was thinking that your consideration of HAProxy was indicative of your traffic being Web-based. I would encourage you to look at LVS first and everything else second for your solution, as all of the other solutions are largely intended for HTTP-based traffic.
  • Bob Aman
    Bob Aman over 8 years
    balance source is a thing in HAProxy. Hashes source IP and distributes load that way.
  • voretaq7
    voretaq7 over 8 years
    @BobAman Yup, 4+ years ago when I wrote this, HAProxy wasn't as big a thing, but if I were implementing this today I'd give preference to doing the load balancing in HAProxy (because of its other capabilities, like leastconn to assign new clients to the servers with the fewest connections) and make pf or similar my second choice.