Some Apache requests are slow, most complete instantly


I've been banging my head against the wall on this for a week now, and my boss has fixed it.

Once we looked at Apache's response times in the logs we saw that it was responding quickly - the delays were happening before the request even reached Apache. So he looked at the TCP stack settings, comparing them to another server running Red Hat 5.6.
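One plausible reading of the delays reported in the question (3 s, 9 s, 45 s, over 3 minutes) is that they are client SYN retransmissions: when a SYN is dropped before Apache ever sees it, the client retries with exponential backoff. The sketch below is only an illustration under assumptions that are not in the original post: an initial retransmission timeout of 3 seconds that doubles after each unanswered SYN, which was typical for kernels of that era. The function name is made up for this example.

```python
def cumulative_syn_delays(initial_rto=3.0, retries=6):
    """Return the total wait after each successive SYN retransmission.

    Assumes (not stated in the post) that the timeout starts at
    initial_rto seconds and doubles after every unanswered SYN.
    """
    delays = []
    elapsed = 0.0
    rto = initial_rto
    for _ in range(retries):
        elapsed += rto          # wait out the current timeout
        delays.append(elapsed)  # total delay if this retry succeeds
        rto *= 2                # exponential backoff
    return delays


if __name__ == "__main__":
    print(cumulative_syn_delays())
```

Under those assumptions the totals come out at 3, 9, 21, 45, 93 and 189 seconds, which is close to the observed pattern (the reported 25 s would sit nearest the 21 s step, and 189 s is just over 3 minutes).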

To cut a long story short, enabling TCP SYN cookies (net.ipv4.tcp_syncookies=1 in /etc/sysctl.conf) has fixed the problem. This setting is designed to protect against SYN floods, and in our case it apparently also allows faster responses. It's possible we're getting flooded accidentally (or deliberately).
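To confirm the setting is actually present in the file, something like the following could be used. It is a minimal sketch that only parses sysctl.conf-style text; the function names are hypothetical, and it does not tell you the running value (that lives in /proc/sys/net/ipv4/tcp_syncookies, and the file is only applied at boot or when sysctl -p is run).

```python
def parse_sysctl_conf(text):
    """Parse sysctl.conf-style text into a {key: value} dict of strings."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments (sysctl.conf allows '#' and ';').
        if not line or line[0] in "#;":
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def syncookies_enabled(text):
    """True if the text sets net.ipv4.tcp_syncookies to 1."""
    return parse_sysctl_conf(text).get("net.ipv4.tcp_syncookies") == "1"


if __name__ == "__main__":
    with open("/etc/sysctl.conf") as fh:
        print(syncookies_enabled(fh.read()))
```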

More info is in this link, the symptoms described are exactly what we were seeing: http://baheyeldin.com/technology/linux/detecting-and-preventing-syn-flood-attacks-web-servers-running-linux.html

I was looking at netstat -alnt and the vast majority of connections were in state TIME_WAIT, not SYN_RECV (maybe the -l option doesn't show half-open connections).
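For a quick tally of connection states, the netstat output can be summarised per state. This is a rough sketch, assuming the usual six-column netstat -ant layout with the state in the last column; the sample rows are invented for illustration and are not output from our servers.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count connections per TCP state in `netstat -ant` style output."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows look like: tcp 0 0 local:port remote:port STATE
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts


# Made-up sample rows for illustration only.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address   Foreign Address  State
tcp   0      0      0.0.0.0:80      0.0.0.0:*        LISTEN
tcp   0      0      10.0.0.1:80     10.0.0.9:51000   TIME_WAIT
tcp   0      0      10.0.0.1:80     10.0.0.9:51001   TIME_WAIT
tcp   0      0      10.0.0.1:80     10.0.0.8:40000   SYN_RECV
"""

if __name__ == "__main__":
    for state, n in count_tcp_states(SAMPLE).most_common():
        print(state, n)
```

To use it on a live box, the text could be piped in from netstat -ant instead of the sample string.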

However we are now seeing this in dmesg frequently:

possible SYN flooding on port 80. Sending cookies.

I shall do some more digging.


Author by

Alex Forbes

Updated on September 18, 2022

Comments

  • Alex Forbes
    Alex Forbes almost 2 years

    I have two Dell R410 web servers (2x quad core Xeon E5520 w/ 8 GB RAM) running Debian 5 stable. Their patching had been neglected for a while, so recently we did a patching run to bring everything up to date - necessitated by a new version of the application they run, which requires PHP 5.3.6. The kernel wasn't updated because it came from the Debian backports repository (the installed version is 2.6.30-bpo.1-amd64).

    Since the patching, users have complained that the web site is slow. The majority of requests are served instantly, but now and again it'll get "stuck" on a request. There doesn't seem to be any discernible pattern in the requests that get stuck.

    These servers are behind a load balancer, they were updated at the same time and both started exhibiting this issue at the time of the patching run. They were not rebooted at the time, but have been since with no effect.

    I set up a script on the servers themselves to run time curl localhost:80/alive in a loop; /alive has a simple index.html file in it containing only "OK". Strangely, these requests still get delayed with the same frequency and duration as requests for actual PHP content. The common times are 3 seconds, 9s, 25s and 45s, and some are over 3 minutes. 45 seconds is a common response time, but of course browsers give up well before this, so it's effectively no response.

    The apache worker config is as follows:

    <IfModule mpm_prefork_module>
        StartServers        50
        MinSpareServers     10
        MaxSpareServers     150
        ServerLimit         500
        MaxClients          500
        MaxRequestsPerChild   5000
    </IfModule>
    

    It seems sensible to me for a server with 8 GB of RAM. In practice the worker count seldom goes over 170, so we're not hitting that limit and there is plenty of free memory. Load averages are low; they hover around 0.5-1.5.

    The kernel is an old backport so I tried updating it to the latest backport for lenny (2.6.32-bpo.5-amd64), but it panicked on boot and I had to get our host to restart it with the old one, so I'd like to explore other options before we try updating their BIOSes and reinstalling them with Debian 6.

    Apache seems to be a likely culprit, so the next step is to update to the latest apache backport, but the version was a fairly minor bump from 2.2.9-10+lenny4 to 2.2.9-10+lenny9, so I wasn't expecting any significant changes.

    PHP is installed, version 5.3.6 from dotdeb. Previous version was 5.3.0 custom compiled. In addition, my boss has just informed me that requests over https do not get delayed but I have not confirmed this myself.

    # apache2 -V
    Server version: Apache/2.2.9 (Debian)
    Server built:   Dec 11 2010 21:34:00
    Server's Module Magic Number: 20051115:15
    Server loaded:  APR 1.2.12, APR-Util 1.2.12
    Compiled using: APR 1.2.12, APR-Util 1.2.12
    Architecture:   64-bit
    Server MPM:     Prefork
      threaded:     no
        forked:     yes (variable process count)
    Server compiled with....
     -D APACHE_MPM_DIR="server/mpm/prefork"
     -D APR_HAS_SENDFILE
     -D APR_HAS_MMAP
     -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
     -D APR_USE_SYSVSEM_SERIALIZE
     -D APR_USE_PTHREAD_SERIALIZE
     -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
     -D APR_HAS_OTHER_CHILD
     -D AP_HAVE_RELIABLE_PIPED_LOGS
     -D DYNAMIC_MODULE_LIMIT=128
     -D HTTPD_ROOT=""
     -D SUEXEC_BIN="/usr/lib/apache2/suexec"
     -D DEFAULT_PIDLOG="/var/run/apache2.pid"
     -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
     -D DEFAULT_LOCKFILE="/var/run/apache2/accept.lock"
     -D DEFAULT_ERRORLOG="logs/error_log"
     -D AP_TYPES_CONFIG_FILE="/etc/apache2/mime.types"
     -D SERVER_CONFIG_FILE="/etc/apache2/apache2.conf"
    
    # apache2ctl -t -D DUMP_MODULES
    Loaded Modules:
     core_module (static)
     log_config_module (static)
     logio_module (static)
     mpm_prefork_module (static)
     http_module (static)
     so_module (static)
     alias_module (shared)
     auth_basic_module (shared)
     authn_file_module (shared)
     authz_default_module (shared)
     authz_groupfile_module (shared)
     authz_host_module (shared)
     authz_user_module (shared)
     autoindex_module (shared)
     cgi_module (shared)
     deflate_module (shared)
     dir_module (shared)
     env_module (shared)
     geoip_module (shared)
     mime_module (shared)
     negotiation_module (shared)
     php5_module (shared)
     rewrite_module (shared)
     setenvif_module (shared)
     ssl_module (shared)
     status_module (shared)
    Syntax OK
    

    Any assistance greatly appreciated!

  • Vinko Vrsalovic
    Vinko Vrsalovic almost 13 years
    That message means your SYN queue gets repeatedly full. That may be due to legitimate requests or not. If the requests are legitimate, you can increase the size of the queue with tcp_max_syn_backlog. See linuxinsight.com/proc_sys_net_ipv4_tcp_syncookies.html and redhat.com/archives/rhl-devel-list/2005-January/msg00447.html
  • Alex Forbes
    Alex Forbes almost 13 years
    Thanks! Yes, it could well be legitimate traffic so I've set net.ipv4.tcp_max_syn_backlog = 4096. Still seeing possible SYN flood messages though, one would think quadrupling it would be plenty for legitimate traffic.
  • Vinko Vrsalovic
    Vinko Vrsalovic almost 13 years
    I would personally try a big number, like 16K or 64K, to make sure it's not an attack (in an attack virtually every limit would get surpassed), and if it's not an attack, I would gradually reduce the queue size until a good small value is found.
  • Alex Forbes
    Alex Forbes almost 13 years
    I've raised it to 65536 but am still getting possible SYN flooding on port 80. I've been watching the number of connections in SYN_RECV state (on a terminal running watch --interval=5 'netstat -tuna |grep "SYN_RECV"|wc -l') and it never goes higher than about 240. Yet I have a Red Hat server which hovers around 512 (the limit on that server is the default of 1024). Do you know of any other settings which might impact the maximum size of the backlog?
  • Vinko Vrsalovic
    Vinko Vrsalovic almost 13 years
    Not sure, you should probably now open a followup question on how to tune / debug SYN queue issues
  • Tony
    Tony almost 10 years
    I love you!! I really love you!