Some Apache requests are slow, most complete instantly
Been banging my head against the wall on this for a week now, and my boss has fixed it.
Once we looked at Apache's response times in the logs we saw that it was responding quickly - the delays were happening before the request even reached Apache. Thus he looked at the tcp stack settings, comparing them to another server running Red Hat 5.6.
To cut a long story short, enabling TCP SYN cookies (net.ipv4.tcp_syncookies=1 in /etc/sysctl.conf) has fixed the problem. This setting is designed to protect against SYN floods and apparently also allows faster responses. It's possible we're getting flooded accidentally (or deliberately).
More info is in this link; the symptoms described are exactly what we were seeing: http://baheyeldin.com/technology/linux/detecting-and-preventing-syn-flood-attacks-web-servers-running-linux.html
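As a hedged illustration of why dropped SYNs could produce delays in fixed steps: a client whose SYN gets no reply retransmits it after a timeout, and that timeout usually doubles on each attempt. The sketch below assumes an initial timeout of 3 seconds and 5 retransmissions. Both values are assumptions chosen for illustration, not something confirmed in this post, and they vary by client OS and kernel version. `syn_retry_schedule` is a hypothetical helper name.

```python
def syn_retry_schedule(initial_rto=3.0, retries=5):
    """Return the cumulative seconds at which each SYN retransmission is sent.

    Assumes (hypothetically) that the retransmission timeout starts at
    initial_rto seconds and doubles after every attempt.
    """
    times = []
    elapsed = 0.0
    rto = initial_rto
    for _ in range(retries):
        elapsed += rto          # wait out the current timeout, then resend
        times.append(elapsed)
        rto *= 2                # exponential backoff
    return times


print(syn_retry_schedule())  # [3.0, 9.0, 21.0, 45.0, 93.0]
```

Under those assumptions the retries land at 3, 9, 21, 45 and 93 seconds, which is close to several of the delays reported in the question (3 seconds, 9s, 45s). That would be consistent with connection attempts being dropped before Apache ever sees them, though it is not proof.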
I was looking at netstat -alnt and the vast majority of connections were in state TIME_WAIT, not SYN_RECV (maybe the -l option doesn't show half-open connections).
However we are now seeing this in dmesg frequently:
possible SYN flooding on port 80. Sending cookies.
I shall do some more digging.
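The connection-state check mentioned above can also be sketched in Python by tallying the state column of /proc/net/tcp, the Linux file that lists IPv4 TCP sockets. This is a minimal sketch, not what was run on these servers: `count_tcp_states` and `read_tcp_states` are hypothetical helper names, and whether half-open (SYN_RECV) entries appear in that file may depend on the kernel version.

```python
from collections import Counter

# Hex state codes used in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        code = fields[3].upper()  # fourth column is the state code
        counts[TCP_STATES.get(code, code)] += 1
    return counts


def read_tcp_states(path="/proc/net/tcp"):
    """Read the given file (Linux only) and return the per-state counts."""
    with open(path) as handle:
        return count_tcp_states(handle.read())
```

Calling `read_tcp_states()` on a Linux host returns a Counter, so `read_tcp_states()["SYN_RECV"]` gives the same kind of number as piping netstat through grep and wc.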
Alex Forbes
Updated on September 18, 2022
Comments
-
Alex Forbes almost 2 years
I have two Dell R410 web servers (2x quad core Xeon E5520 w/ 8 GB RAM) running Debian 5 stable. Their patching had been neglected for a while, so recently we did a patching run to bring everything up to date, necessitated by a new version of the application they run, which requires PHP 5.3.6. The kernel wasn't updated because it came from the Debian backports repository (the installed version is 2.6.30-bpo.1-amd64).
Since the patching, users have complained that the web site is slow. The majority of requests are served instantly, but now and again it'll get "stuck" on a request. There doesn't seem to be any discernible pattern in the requests that get stuck.
These servers are behind a load balancer. They were updated at the same time, and both started exhibiting this issue at the time of the patching run. They were not rebooted at the time, but have been since, with no effect.
I set up a script on the servers themselves to loop over time curl localhost:80/alive, which has a simple index.html file in it containing only "OK". Strangely, these requests still get delayed with the same frequency and duration as requests for actual PHP content. The common times are 3 seconds, 9s, 25s, 45s and some are over 3 minutes. 45 seconds is a common response time, but of course browsers give up well before this, so it's effectively no response.
The Apache worker config is as follows:
<IfModule mpm_prefork_module>
    StartServers          50
    MinSpareServers       10
    MaxSpareServers      150
    ServerLimit          500
    MaxClients           500
    MaxRequestsPerChild 5000
</IfModule>
It seems sensible to me for a server with 8 GB of RAM. In practice the worker count seldom goes over 170, so we're not hitting that limit, and there is plenty of free memory. Load averages are low; they hover around 0.5-1.5.
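The timing loop described above (repeatedly running time curl localhost:80/alive) could also be sketched in Python. This is only an illustrative sketch, not the script that was actually used: `fetch_url`, `time_requests` and `slow_requests` are hypothetical names, and the 1-second threshold for calling a request "stuck" is an assumption.

```python
import time
import urllib.request


def fetch_url(url, timeout=300.0):
    """Fetch the URL and discard the body; raises on connection errors."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()


def time_requests(url, count, fetch=fetch_url, pause=0.0):
    """Request the URL `count` times and return each duration in seconds."""
    durations = []
    for _ in range(count):
        start = time.monotonic()
        fetch(url)
        durations.append(time.monotonic() - start)
        if pause:
            time.sleep(pause)  # optional gap between requests
    return durations


def slow_requests(durations, threshold=1.0):
    """Return the durations at or above the threshold, rounded to 0.1 s."""
    return [round(d, 1) for d in durations if d >= threshold]
```

For example, `slow_requests(time_requests("http://localhost:80/alive", 500))` would list only the requests that got stuck, which makes recurring values such as 3 or 45 seconds easy to spot.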
The kernel is an old backport so I tried updating it to the latest backport for lenny (2.6.32-bpo.5-amd64), but it panicked on boot and I had to get our host to restart it with the old one, so I'd like to explore other options before we try updating their bioses and formatting them with Debian 6.
Apache seems to be a likely culprit, so the next step is to update to the latest apache backport, but the version was a fairly minor bump from 2.2.9-10+lenny4 to 2.2.9-10+lenny9, so I wasn't expecting any significant changes.
PHP is installed, version 5.3.6 from dotdeb. Previous version was 5.3.0 custom compiled. In addition, my boss has just informed me that requests over https do not get delayed but I have not confirmed this myself.
# apache2 -V
Server version: Apache/2.2.9 (Debian)
Server built: Dec 11 2010 21:34:00
Server's Module Magic Number: 20051115:15
Server loaded: APR 1.2.12, APR-Util 1.2.12
Compiled using: APR 1.2.12, APR-Util 1.2.12
Architecture: 64-bit
Server MPM: Prefork
 threaded: no
 forked: yes (variable process count)
Server compiled with....
 -D APACHE_MPM_DIR="server/mpm/prefork"
 -D APR_HAS_SENDFILE
 -D APR_HAS_MMAP
 -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
 -D APR_USE_SYSVSEM_SERIALIZE
 -D APR_USE_PTHREAD_SERIALIZE
 -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
 -D APR_HAS_OTHER_CHILD
 -D AP_HAVE_RELIABLE_PIPED_LOGS
 -D DYNAMIC_MODULE_LIMIT=128
 -D HTTPD_ROOT=""
 -D SUEXEC_BIN="/usr/lib/apache2/suexec"
 -D DEFAULT_PIDLOG="/var/run/apache2.pid"
 -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
 -D DEFAULT_LOCKFILE="/var/run/apache2/accept.lock"
 -D DEFAULT_ERRORLOG="logs/error_log"
 -D AP_TYPES_CONFIG_FILE="/etc/apache2/mime.types"
 -D SERVER_CONFIG_FILE="/etc/apache2/apache2.conf"
# apache2ctl -t -D DUMP_MODULES
Loaded Modules:
 core_module (static)
 log_config_module (static)
 logio_module (static)
 mpm_prefork_module (static)
 http_module (static)
 so_module (static)
 alias_module (shared)
 auth_basic_module (shared)
 authn_file_module (shared)
 authz_default_module (shared)
 authz_groupfile_module (shared)
 authz_host_module (shared)
 authz_user_module (shared)
 autoindex_module (shared)
 cgi_module (shared)
 deflate_module (shared)
 dir_module (shared)
 env_module (shared)
 geoip_module (shared)
 mime_module (shared)
 negotiation_module (shared)
 php5_module (shared)
 rewrite_module (shared)
 setenvif_module (shared)
 ssl_module (shared)
 status_module (shared)
Syntax OK
Any assistance greatly appreciated!
-
Vinko Vrsalovic almost 13 years
That message means your SYN queue gets repeatedly full. That may be due to legitimate requests or not. If the requests are legitimate, you can increase the size of the queue with tcp_max_syn_backlog. See linuxinsight.com/proc_sys_net_ipv4_tcp_syncookies.html and redhat.com/archives/rhl-devel-list/2005-January/msg00447.html
-
Alex Forbes almost 13 years
Thanks! Yes, it could well be legitimate traffic, so I've set net.ipv4.tcp_max_syn_backlog = 4096. Still seeing possible SYN flood messages though; one would think quadrupling it would be plenty for legitimate traffic.
-
Vinko Vrsalovic almost 13 years
I would personally try a big number, like 16K or 64K, to make sure it's not an attack (in an attack virtually every limit would get surpassed), and if it's not an attack, I would gradually reduce the queue size until a good small value is found.
-
Alex Forbes almost 13 years
I've raised it to 65536 but am still getting possible SYN flooding on port 80. I've been watching the number of connections in SYN_RECV state (on a terminal running watch --interval=5 'netstat -tuna |grep "SYN_RECV"|wc -l') and it never goes higher than about 240. Yet I have a Red Hat server which hovers around 512 (limit on this server is the default of 1024). Do you know of any other settings which might impact the maximum size of the backlog?
-
Vinko Vrsalovic almost 13 years
Not sure; you should probably now open a followup question on how to tune / debug SYN queue issues.
-
Tony almost 10 years
I love you!! I really love you!