Apache problems in performance test with mod_reqtimeout

firewall apache-2.4 jboss ajp mod-reqtimeout

6,211

It is unusual to use the "prefork" MPM in what seems to be essentially a reverse proxy; the hybrid "worker" MPM scales better. That is not the problem here, though.

DoS mitigation is usually best handled (if not by your ISP) on the front-end device that terminates the client requests. From your description that is the load balancer, not Apache. Any competent load balancer will be HTTP-aware (i.e. "Layer 7"), which will likely result in HTTP requests being buffered before they reach Apache. This also applies if the load balancer terminates HTTPS, but less so if it simply relays the HTTPS connections (it cannot see the HTTP requests to buffer them). Nor does it apply if your load balancer is a simple "Layer 3/4" NAT-type load balancer.

As to the possible cause of the timeouts:

- Your SSL instance's KeepAliveTimeout is the same as the initial RequestReadTimeout header timeout (both 10 seconds), so there may be a race where the keep-alive timeout is reached midway through the incoming client request line or headers. When I try to reproduce this, in addition to AH01382 errors I also get AH01991 (SSL input filter read failed) and AH00567 (request failed: error reading the headers). This may not explain all the problems, though.
- Malformed client requests. These were not uncommon in the past (e.g. an extra CR/LF after a POST, or incomplete requests when retrying after an error). I don't know of any current cases, though; it depends on your client base and, more importantly, their connectivity.
- There could be a bug similar to this recent one, which caused spurious timeouts with the "event" MPM.
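To see which of these causes dominates, it can help to tally the message IDs in the error log and check whether AH01382 shows up together with AH01991 and AH00567. The following is a minimal sketch that assumes the log lines look like the AH01382 example quoted in the question; the function name `count_ah_codes` and the log path in the usage comment are illustrative assumptions, not part of any Apache tooling.

```python
import re
from collections import Counter

# Apache 2.4 message IDs look like "AH" followed by five digits, e.g. AH01382.
AH_CODE = re.compile(r"\bAH\d{5}\b")


def count_ah_codes(lines):
    """Return a Counter mapping each AH message ID to its number of occurrences."""
    counts = Counter()
    for line in lines:
        counts.update(AH_CODE.findall(line))
    return counts


# Example usage (the path is an assumption; adjust to your ErrorLog location):
#
#   with open("/var/log/httpd/ssl_error_log", errors="replace") as log:
#       for code, n in count_ah_codes(log).most_common():
#           print(code, n)
```

If AH01382 appears almost entirely on its own, the keep-alive race is less likely to be the whole story and the other two causes deserve a closer look.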

To reproduce timeouts:

 $ openssl s_client -connect myhost:443
 GET / HTTP/1.1
 Host: myhost.whatever.com 

 [server reply goes here]
 GET / HTTP/1.1
 Host:

You can script this to make it easier. Otherwise you must type or paste the first request and its headers within your configured 10s, then start but not complete the second request within the next 10s. At least one full line (the request line) of the second request must be submitted; then just wait.
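One way to script the manual steps is sketched below in Python, using only the standard library. It sends one complete request, pauses, sends a second request that stops after `Host:`, and then measures how long the server takes to close the connection. The names `build_requests` and `reproduce`, the default `idle` and `wait` values, and the choice to skip certificate verification (convenient for test hosts with self-signed certificates) are all assumptions for illustration.

```python
import socket
import ssl
import time


def build_requests(host):
    """Return (complete_request, partial_request) as bytes."""
    complete = f"GET / HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
    # Full request line, then a header that is started but never finished.
    partial = b"GET / HTTP/1.1\r\nHost:"
    return complete, partial


def reproduce(host, port=443, idle=5.0, wait=60.0):
    """Send a full request, then a partial one; return seconds until the server closes."""
    complete, partial = build_requests(host)
    ctx = ssl.create_default_context()
    # Assumption: a test server, possibly self-signed, so verification is disabled.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=wait) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(complete)
            tls.recv(65536)          # start of the first reply
            time.sleep(idle)         # keep this below KeepAliveTimeout
            tls.sendall(partial)
            started = time.monotonic()
            try:
                while tls.recv(65536):
                    pass             # drain until the server closes the connection
            except OSError:
                pass                 # timeout or reset also ends the wait
            return time.monotonic() - started


# Example (hostname taken from the transcript above):
#   print(reproduce("myhost"))
```

Running it while tailing the error log should show whether the AH01382 entry appears at roughly the elapsed time the script reports.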

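Decreasing the KeepAliveTimeout (default is 5 seconds) may help. Note that KeepAliveTimeout is how long httpd waits for the next request to start arriving on an idle persistent connection, not the time allowed to receive a complete request; once a request has begun, RequestReadTimeout and Timeout apply. I think the next step may be mod_log_forensic.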
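As a quick sanity check on the configuration, the comparison between KeepAliveTimeout and the initial header timeout can be automated. This is only a sketch: `header_timeouts` and `keepalive_collides` are hypothetical helper names, and the parser handles just the `header=timeout[-maxtimeout]` part of a RequestReadTimeout line, with values in seconds.

```python
import re

# Matches "header=10" or "header=10-20" inside a RequestReadTimeout directive.
HEADER_RE = re.compile(r"header=(\d+)(?:-(\d+))?", re.IGNORECASE)


def header_timeouts(directive):
    """Return (initial, maximum) header timeouts in seconds, or None if absent."""
    match = HEADER_RE.search(directive)
    if match is None:
        return None
    initial = int(match.group(1))
    maximum = int(match.group(2)) if match.group(2) else initial
    return initial, maximum


def keepalive_collides(keepalive_timeout, directive):
    """True when KeepAliveTimeout equals the initial header timeout."""
    timeouts = header_timeouts(directive)
    return timeouts is not None and keepalive_timeout == timeouts[0]
```

With the values from the question, `keepalive_collides(10, "RequestReadTimeout header=10-20,minrate=500")` is true for ssl.conf (KeepAliveTimeout 10) and false for the main server (KeepAliveTimeout 15).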

Regarding the connections to the back end via AJP: are you using "ping" in the Apache balancer configuration? If I understand your system correctly, the Tomcat configuration you have given won't apply to connections from Apache httpd to Tomcat. See the options here.

                          /-> apache httpd + ajp -\            /-> tomcat/jboss
client -> load-balancer  <                         > firewall <
                          \-> apache httpd + ajp -/            \-> tomcat/jboss
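To check whether the firewall in this picture drops idle AJP connections, a small probe run from one of the Apache hosts can send an AJP13 CPing, stay idle for a while, and send a second CPing on the same socket. The sketch below assumes the back-end connector answers CPing with CPong; `cping` and `probe_idle` are hypothetical names, and the idle period is something you would vary around the suspected firewall timeout.

```python
import socket
import time

# AJP13 framing: 0x1234 + 2-byte length + payload (web server -> container),
# and "AB" + 2-byte length + payload (container -> web server).
CPING = bytes([0x12, 0x34, 0x00, 0x01, 0x0A])
CPONG = bytes([0x41, 0x42, 0x00, 0x01, 0x09])


def recv_exact(sock, n):
    """Read exactly n bytes, or fewer if the peer closes the connection."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            break
        data += chunk
    return data


def cping(sock):
    """Send one CPing and report whether a CPong came back."""
    sock.sendall(CPING)
    return recv_exact(sock, len(CPONG)) == CPONG


def probe_idle(host, port, idle_seconds, timeout=10.0):
    """Return (ok_before, ok_after) for CPings sent before and after an idle period."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        before = cping(sock)
        time.sleep(idle_seconds)
        try:
            after = cping(sock)
        except OSError:
            after = False   # reset or timeout: the connection did not survive
        return before, after


# Example, using the host and port from the question's workers.properties:
#   print(probe_idle("server", 15869, idle_seconds=120))
```

A result of `(True, False)` for idle periods above some threshold would suggest the firewall is silently discarding idle connections, in which case the ping and pool timeout settings on the Apache side need to be shorter than that threshold.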


Michael Niemand

Updated on September 18, 2022

Comments

Michael Niemand almost 2 years
We have 2 Apache web servers behind a load balancer, which are connected to 2 (JBoss) application servers via mod ajp.

To those webservers, mobile devices connect via a REST API.

In our performance test we rather quickly ran into a lot of NonHttpResponse: errors which we identified to be coming from mod_reqtimeout:
```
[Mon Mar 16 14:42:49.324705 2015] [reqtimeout:info] [pid 27914:tid 140628428449536] [client 1.2.3.4:48280] AH01382: Request header read timeout
```
... which is configured as follows:
```
<IfModule reqtimeout_module>
    RequestReadTimeout header=10-20,minrate=500
    RequestReadTimeout body=10,minrate=500
</IfModule>
```
I was able to get rid of those errors by increasing those values to
```
RequestReadTimeout header=20-60,minrate=100
```
But this can't be the solution, since with more simultaneous users the problem occurred again. (There is a requirement to be able to serve 300 concurrent users: 100 worked quite OK, but with 300 we had over 10,000 of those Request header read timeout errors.) I suspect it's the interaction of Apache's KeepAlive, our mod_ajp configuration and mod_reqtimeout that leads mod_reqtimeout to conclude that a slowloris attack is ongoing (too many open connections that do nothing), and I kindly ask for your help in tweaking those parameters.

An additional problem is a firewall between the web servers and the application servers, which I suspect kills idle open connections. I read about deactivating KeepAlive completely to solve this, but as I said, all our clients are mobile devices, so that's probably not an option (?).

Here are the other configs (parts of):

workers.properties:
```
worker.list=server
worker.maintain=60

worker.server.type=ajp13
worker.server.host=server
worker.server.port=15869
worker.server.socket_keepalive=True
worker.server.connection_pool_timeout=600
worker.server.ping_mode=A
worker.server.connection_ping_interval=60
```
mod_prefork:
```
<IfModule prefork.c>
    StartServers         5
    MinSpareServers      5
    MaxSpareServers     10
    #MaxClients         256
    MaxClients         300
    MaxRequestsPerChild  0
</IfModule>
```
mainserver.conf:
```
Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 15
```
ssl.conf (mobile devices connect via ssl):
```
Timeout 1200
KeepAlive On
MaxKeepAliveRequests 0
KeepAliveTimeout 10
```
Michael Niemand about 9 years

Wow - thank you very much for your detailed answer! I haven't been able to implement your suggestions yet. I'll be back as soon as I do. Thanks again!!