HAProxy stops accepting connections

7,483

I just ran into what I think is the same issue or a similar issue. I was running haproxy and suddenly it would stop accepting new connections. I tried bumping up the max connections from 32k to 65k, and when I restarted it looked better, but then would stop accepting new connections way before the max connections limit in the haproxy.cfg file. I could see the change taking effect in the haproxy stats webpage.

I then looked in /proc//limits and saw this: ... Max open files 4026 4026 files ...

I then checked in /proc//fd and saw that when haproxy stopped I had about that number of files open. SO I think the problem is the underlying unix limits not haproxy. I increased the process limits and so far in my tests it looks better.

I hope that helps if your issue is being caused by the same underlying issue.

Share:
7,483

Related videos on Youtube

erikcw
Author by

erikcw

Updated on September 18, 2022

Comments

  • erikcw
    erikcw almost 2 years

    I've been using HAProxy to loadbalance my appservers for many months without problems. Recently some traffic spikes have lead me to setting the maxconn parameter to rate-limit connections to my backend servers.

    It works great for several hours, and then it seems to stop accepting any connections.

    I've looked at my system resource graphs, and both loadavg and RAM usage seem to be under control with no big spikes before the freeze. EDIT: Loadavg does seem to spike from about 0.13 to 1.0 right before the freeze. There are 4 cores on this system, I'm only running HAProxy as a single process -- when I view it in htop after the freeze, that CPU is pegged at 100%.

    # HA-Proxy version 1.4.16 2011/08/04
    global  
            log 127.0.0.1   local0
            log 127.0.0.1   local1 notice
            #log 127.0.0.1   local1 debug
            #log loghost    local0 info
            maxconn 4096
            #chroot /usr/share/haproxy
            user haproxy
            group haproxy
            daemon
            #debug
            #quiet
    
    defaults
            log     global
            mode    http
            option  httplog
            option  dontlognull
            retries 3
            option forwardfor
            option redispatch
            maxconn 2000
            contimeout      5000
            clitimeout      50000
            srvtimeout      50000
    
    listen  platform-cache 0.0.0.0:80
            mode http
            maxconn 10000
            option abortonclose
            #balance uri
            hash-type consistent
            balance hdr(Host)
            server varnish1 10.176.129.245 weight 20 maxconn 64 check
            server varnish2 10.176.129.29 weight 40  maxconn 64 check
            contimeout 60000
    
    # HTTP response : 'HTTP/1.0 200 OK'
    listen http_health_check 0.0.0.0:60001
            mode health
            option httpchk
    
    listen  stats_for_scout 127.0.0.1:8081
            mode http
            stats uri /stats
    
    listen  public_stats :8080
            mode http
            stats uri /stats
            stats realm   Haproxy\ Statistics
            stats auth    ****************
    

    Here are the logs from one restart to the next freeze and restart a few hours laters:

    Feb  5 08:57:15 localhost haproxy[31950]: Proxy platform-cache started.
    Feb  5 08:57:15 localhost haproxy[31950]: Proxy http_health_check started.Feb  5 08:57:15 localhost haproxy[31950]: Proxy stats_for_scout started.
    Feb  5 08:57:15 localhost haproxy[31950]: Proxy public_stats started.
    Feb  5 09:43:47 localhost haproxy[31951]: Pausing proxy platform-cache.
    Feb  5 09:43:47 localhost haproxy[31951]: Pausing proxy http_health_check.
    Feb  5 09:43:47 localhost haproxy[31951]: Pausing proxy stats_for_scout.
    Feb  5 09:43:47 localhost haproxy[31951]: Pausing proxy public_stats.
    Feb  5 09:43:47 localhost haproxy[32746]: Proxy platform-cache started.
    Feb  5 09:43:47 localhost haproxy[32746]: Proxy http_health_check started.
    Feb  5 09:43:47 localhost haproxy[32746]: Proxy stats_for_scout started.
    Feb  5 09:43:47 localhost haproxy[32746]: Proxy public_stats started.
    Feb  5 09:43:47 localhost haproxy[31951]: Stopping proxy platform-cache in 0 ms.
    Feb  5 09:43:47 localhost haproxy[31951]: Stopping proxy http_health_check in 0 ms.
    Feb  5 09:43:47 localhost haproxy[31951]: Stopping proxy stats_for_scout in 0 ms.
    Feb  5 09:43:47 localhost haproxy[31951]: Stopping proxy public_stats in 0 ms.
    Feb  5 09:43:47 localhost haproxy[31951]: Proxy platform-cache stopped (FE: 32540 conns, BE: 30334 conns).
    Feb  5 09:43:47 localhost haproxy[31951]: Proxy http_health_check stopped (FE: 0 conns, BE: 0 conns).
    Feb  5 09:43:47 localhost haproxy[31951]: Proxy stats_for_scout stopped (FE: 16 conns, BE: 0 conns).
    Feb  5 09:43:47 localhost haproxy[31951]: Proxy public_stats stopped (FE: 4 conns, BE: 2 conns).
    Feb  5 17:52:27 localhost haproxy[26610]: Proxy platform-cache started.
    Feb  5 17:52:27 localhost haproxy[26610]: Proxy http_health_check started.
    Feb  5 17:52:27 localhost haproxy[26610]: Proxy stats_for_scout started.
    Feb  5 17:52:27 localhost haproxy[26610]: Proxy public_stats started.
    ~                                                                            
    

    Original configuration that has worked flawlessly for over a year

    global 
            log 127.0.0.1   local0
            log 127.0.0.1   local1 notice
            #log 127.0.0.1   local1 debug
            #log loghost    local0 info
            maxconn 4096
            #chroot /usr/share/haproxy
            user haproxy
            group haproxy
            daemon
            #debug
            #quiet
    
    defaults
            log     global
            mode    http
            option  httplog
            option  dontlognull
            retries 3
            option forwardfor
            option redispatch
            maxconn 2000
            contimeout      5000
            clitimeout      50000
            srvtimeout      50000
    
    listen  platform-cache 0.0.0.0:80
            mode http
            #balance uri
            hash-type consistent
            balance hdr(Host)
            server varnish1 10.176.129.245 weight 20 check
            server varnish2 10.176.129.29 weight 40 check
    
    # HTTP response : 'HTTP/1.0 200 OK'
    listen http_health_check 0.0.0.0:60001
            mode health
            option httpchk
    
    listen  stats_for_scout 127.0.0.1:8081
            mode http
            stats uri /stats
    
    listen  public_stats :8080
            mode http
            stats uri /stats
            stats realm   Haproxy\ Statistics
            stats auth    *******
    

    Thanks for your help!

    • GioMac
      GioMac almost 11 years
      If you unset maxconn parameter, does it work? Maybe, problem isn't in HAproxy.
  • erikcw
    erikcw over 12 years
    According to my monitoring system, HAProxy completely stops accepting connections -- for hours if I leave it (I mean zero connections to the backend servers). It also looks like the number of connections is no where near the maxconn settings when it happens. The only thing that seems to get the system to deliver any requests is restart HAProxy.
  • Khaled
    Khaled over 12 years
    Do you get the same behaviour when you comment the maxconn settings?
  • erikcw
    erikcw over 12 years
    I was running without maxconn for about a year without any problems. So yes, it does seem tied to maxconn in someway. I'll add my original configuration to my question for reference.