HAProxy roundrobin balancing does not appear to be distributing evenly

With this configuration you should definitely see the distribution you're expecting. I suspect that some of your servers are regularly failing health checks, so they get pulled out of the farm and then reinserted once their load drops and they can respond again.

Also, how are you observing the imbalance? Your logs are set to the warning level, so they basically only record servers going down.

Could you also check the stats page? It shows how many sessions were sent to each server, how many times each server went up or down, and whether there has been any queuing, or at least whether the servers have reached their maxconn.
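
As a rough illustration of the stats check, here is a minimal Python sketch that reads per-server session totals and down counts from a CSV stats export. It assumes the CSV form of the stats page has been fetched already (typically by appending `;csv` to the stats URI). The sample data and the helper name `sessions_per_server` are made up for the example, and only a few of the real columns (`pxname`, `svname`, `stot`, `status`, `chkdown`) are shown.

```python
import csv
import io

# Made-up sample in the shape of HAProxy's CSV stats export.
# A real export has many more columns; only a handful are kept here.
SAMPLE = """# pxname,svname,stot,status,chkdown
dev.site.com,www1,120,UP,0
dev.site.com,www2,118,UP,3
dev.site.com,BACKEND,238,UP,0
"""

def sessions_per_server(csv_text, backend):
    """Map each server in `backend` to (total sessions, times marked down)."""
    # The header line starts with "# ", which is stripped before parsing.
    reader = csv.DictReader(io.StringIO(csv_text.lstrip("# ")))
    return {
        row["svname"]: (int(row["stot"]), int(row["chkdown"]))
        for row in reader
        # Skip the aggregate FRONTEND/BACKEND rows; keep real servers only.
        if row["pxname"] == backend and row["svname"] not in ("FRONTEND", "BACKEND")
    }

print(sessions_per_server(SAMPLE, "dev.site.com"))
```

Similar session totals with a non-zero down count on one server would point toward flapping health checks rather than a balancing problem.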

andrew

Updated on September 17, 2022

Comments

andrew over 1 year

I know that with loaded servers, roundrobin in HAProxy (1.4.4) does not distribute evenly, but my servers are currently getting NO traffic (test setup), and roundrobin balancing goes www1,www1,www1,www1,www1,...www2,www2,www2,...,www1...

I'm verifying this by having the script that runs on each server print the contents of /etc/HOSTNAME (Slackware). I need requests to alternate between the servers each time to test some session handling (stored in shared memcached), but I'm having trouble getting it to switch between my two web servers on each request.

global
    log     127.0.0.1 local0 warning
    maxconn     4096
    chroot      /usr/share/haproxy
    pidfile     /var/run/haproxy.pid
    uid     99
    gid     99
    daemon

defaults
    balance     roundrobin
    fullconn    100
    maxconn     4096
    mode        http
    option      dontlognull
    option      http-server-close
    option      forwardfor
    option      redispatch
    retries     3
    timeout     connect 5000
    timeout     client  20000
    timeout     server  60000
    timeout     queue   60000
    stats       enable
    stats       uri /haproxy
    stats       auth ***:***

frontend www *:80
    log     global
    acl     is_upload hdr_dom(host) -i uploads.site.com
    acl     is_api hdr_dom(host) -i api.site.com
    acl     is_dev hdr_dom(host) -i dev.site.com
    use_backend uploads.site.com if is_upload
    use_backend api.site.com if is_api
    use_backend dev.site.com if is_dev
    default_backend site.com

backend site.com
    option  httpchk HEAD /alive.php HTTP/1.1\r\nHost:site.com
    server  www1 1.1.1.1:8080 weight 10 minconn 5 maxconn 25 check inter 2000 rise 2 fall 2
    server  www2 1.1.1.2:8080 weight 10 minconn 5 maxconn 25 check inter 2000 rise 2 fall 2

backend api.site.com
    option  httpchk HEAD /alive.php HTTP/1.1\r\nHost:api.site.com
    server  www1 1.1.1.1:8080 weight 10 minconn 5 maxconn 25 check inter 2000 rise 2 fall 2
    server  www2 1.1.1.2:8080 weight 10 minconn 5 maxconn 25 check inter 2000 rise 2 fall 2

backend dev.site.com
    option  httpchk HEAD /alive.php HTTP/1.1\r\nHost:dev.site.com
    server  www1 1.1.1.1:8080 weight 10 minconn 5 maxconn 25 check inter 2000 rise 2 fall 2
    server  www2 1.1.1.2:8080 weight 10 minconn 5 maxconn 25 check inter 2000 rise 2 fall 2

backend uploads.site.com
    option  httpchk HEAD /alive.php HTTP/1.1\r\nHost:uploads.site.com
    server  www1 1.1.1.1:8080 weight 10 minconn 5 maxconn 25 check inter 2000 rise 2 fall 2
    server  www2 1.1.1.2:8080 backup weight 10 minconn 5 maxconn 25 check inter 2000 rise 2 fall 2

So basically, I have some different back-ends (I've verified the ACLs are working), with the default option "roundrobin" selected. I've tried removing weights, removing the minconn/maxconn/fullconn attributes for all servers (not just the backend I'm testing), tried removing the ACLs, etc. I've been testing on dev.site.com BTW.

Anyone see a reason why I can't get something like www1,www2,www1,www2,...? Also, this is one of my first questions on here, so please let me know if I left anything needed out of my post.

Thanks!

andrew almost 14 years

Willy, thanks for the response. I think I've narrowed this down. When I was running my tests, the servers in the farm (www1, www2) were always up. I monitored the stats page very closely. I just observed that when I click a link on the page that refreshes it, HAProxy does distribute the load correctly (www1,www2,www1,www2...) but when I hit the browser refresh button, it sticks to one server. The only difference is that when I hit refresh: "Cache-Control: max-age=0" is sent. Oddly enough, when the refresh is hit, HAProxy seems to send data to both servers (bytes in/out). Any ideas?

Willy Tarreau almost 14 years

It's just because the page you refresh contains multiple objects: half of the requests are fetched from one server, and the other half from the other server. If you enable cookie-based persistence, you'll see that once you go to a server, you stick to it as long as it remains up.

andrew almost 14 years

That explains it...can't believe I forgot about elements. I also tried to figure out why it was doing it on pages with no elements...turns out it was a request for favicon.ico =). Thanks a bunch.
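
To make the effect concrete, here is a small Python sketch of an idealised round robin (two idle, equal-weight servers, every request balanced independently). It is a simulation, not HAProxy's actual scheduler, and the function name `simulate_refreshes` and the path `/index.php` are made up for illustration. It shows that when each refresh issues two requests (the page plus favicon.ico), the page itself always lands on the same server.

```python
from itertools import cycle

def simulate_refreshes(servers, objects_per_page, refreshes):
    """Return, for each refresh, which server handled each requested object."""
    rotation = cycle(servers)  # idealised round robin over the server list
    return [
        # Objects are requested in order; each one takes the next server.
        {obj: next(rotation) for obj in objects_per_page}
        for _ in range(refreshes)
    ]

# Browser refresh: the page plus favicon.ico, i.e. two requests per refresh.
hits = simulate_refreshes(["www1", "www2"], ["/index.php", "/favicon.ico"], 4)
print([h["/index.php"] for h in hits])  # ['www1', 'www1', 'www1', 'www1']

# A single request per refresh alternates as expected.
hits = simulate_refreshes(["www1", "www2"], ["/index.php"], 4)
print([h["/index.php"] for h in hits])  # ['www1', 'www2', 'www1', 'www2']
```

With an even number of requests per refresh and two servers, the rotation returns to the same starting point every time, which matches the "sticks to one server" observation even though both servers receive traffic.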