How can uWSGI be configured in emperor mode with the CGI plugin and heartbeat, in combination with nginx?

nginx cgi uwsgi

Use 2.0.1 from GitHub; it has better heartbeat code. Your problem is the small margin between the emperor's tolerance (30 seconds) and the heartbeat frequency (25 seconds). Because of a bug in 2.0, the worker timeout got reset at every request. In addition, in 2.0.1 the heartbeat is triggered as soon as the first worker is spawned, which leads to better results (broken apps are detected immediately). Regarding worker reload mercy, the right option is --worker-reload-mercy (--reload-mercy is for non-worker processes).
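
A minimal sketch of a vassal fragment along these lines, assuming uWSGI 2.0.1 or later. The heartbeat interval of 10 seconds is an illustrative value, not one taken from the question; it is chosen only to leave a wider margin below the emperor's default tolerance of 30 seconds.

```ini
[uwsgi]
# vassal fragment: only the options discussed here are shown
master = true
# announce health to the emperor every 10 seconds (illustrative value)
heartbeat = 10
# maximum time a worker may take for a graceful reload/shutdown
worker-reload-mercy = 10
```

The remaining options from the question's vassal file (plugins, cgi, socket and so on) would stay as they are.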

antiplex

Updated on September 18, 2022

Comments

antiplex over 1 year

After having the migration from fcgiwrap to uWSGI (which also brings plenty of other benefits that I'm planning to use) on my to-do list for some time, I finally managed to set up a Debian wheezy test system featuring uWSGI v2.0 and nginx v1.4.4.

In the first step I would like to run .cgi-scripts through uWSGI's cgi-plugin reliably and with minimal overhead (rather weak hardware below) while having the option to easily enhance my configuration to also deploy apps through frameworks such as bottle/flask/django as a second step.

Therefore I chose to make use of uWSGI's emperor mode, which is currently set up to control only one vassal which is configured to run the uWSGI-cgi-plugin on 2 workers with 2 threads each.

After checking out various features the setup is more or less running ok now with 2 weird behaviours that I think are somehow wrong:

1. As soon as the vassal is configured to send heartbeats to the emperor (e.g. by adding heartbeat = 20 to the .ini), the emperor repeatedly kills and respawns the vassal's master if the .cgi has not been run within the defined heartbeat timespan. The number of workers configured doesn't seem to matter here.
2. The vassal ignores the option reload-mercy = 10: it still logs "your mercy for graceful operations on workers is 60 seconds" (60 being the default value). This is just a minor issue and of no great relevance to me.

The reason for using the heartbeat option is that I would like to ensure the availability of my CGIs/apps by using the built-in uWSGI mechanisms as far as possible.

Any hints as to what I might be misunderstanding or doing wrong? I can't see any apparent reason why it should not be possible to use the heartbeat option in combination with the cgi module in my configuration, but I would be grateful for any further insight! I assume nginx has nothing to do with the issues mentioned, and I also double-checked file and directory permissions. uWSGI is started through an init.d script, but the behaviour is the same when starting it manually.

My configuration is as follows:

section in nginx.conf:

location ~ ^/cgi-bin/.*\..+$ {
    root            /usr/local/nginx/vhosts/testdomain.com/cgi-bin;
    gzip            off;
    include         uwsgi_params;
    uwsgi_modifier1 9;
    uwsgi_pass      unix:///var/run/nginx/testdomain_cgi-bin_uwsgi.sock;
}

emperor.ini:

[uwsgi]
uid = www-data
gid = www-data
emperor = /etc/uwsgi/vassals
emperor-pidfile = /var/run/uwsgi/emperor.pid
daemonize = /var/log/uwsgi_emperor.log

testdomain_cgi-bin.ini:

[uwsgi]
uid = www-data
gid = www-data
chdir = /usr/local/nginx/vhosts/testdomain/cgi-bin
plugins = cgi
cgi = /cgi-bin=/usr/local/nginx/vhosts/testdomain/cgi-bin
cgi-allowed-ext = .cgi
socket = /var/run/nginx/testdomain_cgi-bin_uwsgi.sock
master = true
#heartbeat = 25
processes = 2
threads = 2
reload-mercy = 10
no-orphans = true
post-buffering = 4096
max-requests = 2048
vacuum = true
logto = /usr/local/nginx/logs/uwsgi_testdomain_cgi-bin.log

logs (when option heartbeat is enabled):

/var/log/uwsgi_emperor.log:

*** Starting uWSGI 2.0 (32bit) on [Wed Feb  5 11:35:36 2014] ***
compiled with version: 4.7.2 on 31 January 2014 08:46:00
os: Linux-3.2.0-4-686-pae #1 SMP Debian 3.2.51-1
nodename: testnode
machine: i686
clock source: unix
pcre jit disabled
detected number of CPU cores: 1
current working directory: /
*** running under screen session 1111.myscrn ***
detected binary path: /usr/local/bin/uwsgi
setgid() to 33
setuid() to 33
*** WARNING: you are running uWSGI without its master process manager ***
your processes number limit is 3940
your memory page size is 4096 bytes
detected max file descriptor number: 1024
writing pidfile to /var/run/uwsgi/emperor.pid
*** starting uWSGI Emperor ***
*** has_emperor mode detected (fd: 6) ***
[uWSGI] getting INI configuration from testdomain_cgi-bin.ini
Wed Feb  5 11:35:36 2014 - [emperor] vassal testdomain_cgi-bin.ini has been spawned
Wed Feb  5 11:35:36 2014 - [emperor] vassal testdomain_cgi-bin.ini is ready to accept requests
Wed Feb  5 11:35:43 2014 - [emperor] vassal testdomain_cgi-bin.ini is now loyal
[emperor] vassal testdomain_cgi-bin.ini sent no heartbeat in last 30 seconds, brutally respawning it...
Wed Feb  5 11:38:56 2014 - [emperor] removed uwsgi instance testdomain_cgi-bin.ini
[emperor] unrecognized vassal event on fd 5
[emperor] unrecognized vassal event on fd 5
    ... above lines repeated about another 50 times ...
*** has_emperor mode detected (fd: 6) ***
[uWSGI] getting INI configuration from testdomain_cgi-bin.ini
Wed Feb  5 11:38:56 2014 - [emperor] vassal testdomain_cgi-bin.ini has been spawned
Wed Feb  5 11:38:56 2014 - [emperor] vassal testdomain_cgi-bin.ini is ready to accept requests

/usr/local/nginx/logs/uwsgi_testdomain_cgi-bin.log:

*** Starting uWSGI 2.0 (32bit) on [Wed Feb  5 11:35:36 2014] ***
compiled with version: 4.7.2 on 31 January 2014 08:46:00
os: Linux-3.2.0-4-686-pae #1 SMP Debian 3.2.51-1
nodename: testnode
machine: i686
clock source: unix
pcre jit disabled
detected number of CPU cores: 1
current working directory: /etc/uwsgi/vassals
*** running under screen session 1111.myscrn ***
detected binary path: /usr/local/bin/uwsgi
your processes number limit is 3940
your memory page size is 4096 bytes
detected max file descriptor number: 1024
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to UNIX address /var/run/nginx/testdomain_cgi-bin_uwsgi.sock fd 3
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 175536 bytes (171 KB) for 2 cores
*** Operational MODE: threaded ***
initialized CGI mountpoint: /cgi-bin = /usr/local/nginx/vhosts/testdomain.com/cgi-bin
*** no app loaded. going in full dynamic mode ***
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 20825)
spawned uWSGI worker 1 (pid: 20826, cores: 2)
[pid: 20826|app: -1|req: -1/1] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:35:43 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 13 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 0)
announcing my loyalty to the Emperor...
[pid: 20826|app: -1|req: -1/2] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:35:54 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 2 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 1)
[pid: 20826|app: -1|req: -1/3] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:36:04 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 5 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 0)
[pid: 20826|app: -1|req: -1/4] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:36:16 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 3 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 1)
[pid: 20826|app: -1|req: -1/5] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:36:28 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 3 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 0)
[pid: 20826|app: -1|req: -1/6] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:36:39 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 3 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 1)
[pid: 20826|app: -1|req: -1/7] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:36:51 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 5 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 0)
[pid: 20826|app: -1|req: -1/8] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:37:03 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 2 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 1)
[pid: 20826|app: -1|req: -1/9] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:37:15 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 3 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 0)
[pid: 20826|app: -1|req: -1/10] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:37:27 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 6 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 1)
[pid: 20826|app: -1|req: -1/11] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:37:39 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 12 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 0)
[pid: 20826|app: -1|req: -1/12] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:37:51 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 4 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 1)
[pid: 20826|app: -1|req: -1/13] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:38:03 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 3 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 0)
Wed Feb  5 11:38:56 2014 - uWSGI worker 1 screams: UAAAAAAH my master disconnected: i will kill myself !!!
*** Starting uWSGI 2.0 (32bit) on [Wed Feb  5 11:38:56 2014] ***
compiled with version: 4.7.2 on 31 January 2014 08:46:00
os: Linux-3.2.0-4-686-pae #1 SMP Debian 3.2.51-1
nodename: testnode
machine: i686
clock source: unix
pcre jit disabled
detected number of CPU cores: 1
current working directory: /etc/uwsgi/vassals
*** running under screen session 1111.myscrn ***
detected binary path: /usr/local/bin/uwsgi
your processes number limit is 3940
your memory page size is 4096 bytes
detected max file descriptor number: 1024
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to UNIX address /var/run/nginx/testdomain_cgi-bin_uwsgi.sock fd 3
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 175536 bytes (171 KB) for 2 cores
*** Operational MODE: threaded ***
initialized CGI mountpoint: /cgi-bin = /usr/local/nginx/vhosts/testdomain.com/cgi-bin
*** no app loaded. going in full dynamic mode ***
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 20881)
spawned uWSGI worker 1 (pid: 20882, cores: 2)

uWSGI doc for the vassal option heartbeat:

    Argument: number

    (Vassal option) Announce vassal health to the emperor every N seconds.

uWSGI doc for the emperor option emperor-required-heartbeat:

    Argument: number Default: 30

    Set the Emperor tolerance about heartbeats.

    When a vassal asks for ‘heartbeat mode’ the emperor will 
    also expect a ‘heartbeat’ at least every <secs> seconds.
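
On the emperor side, the tolerance described in the doc excerpt above could be set explicitly so that it sits well above the vassal's heartbeat interval. This is only a sketch: the value 60 is an assumption for illustration (the documented default is 30), and the other options from emperor.ini are left out.

```ini
[uwsgi]
emperor = /etc/uwsgi/vassals
# respawn a vassal in heartbeat mode only after 60 seconds without a heartbeat
# (illustrative value; the documented default is 30)
emperor-required-heartbeat = 60
```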

antiplex about 10 years

Thanks for the just-in-time fix =) It now works smoothly as expected! The worker-reload-mercy option seems to be undocumented so far, but it does the right thing. So would reload-mercy therefore only affect masters (or what other non-worker processes are there)?

antiplex about 10 years

Oh, and the option doc for reload-mercy needs an overhaul, as it explicitly says "Set the maximum time (in seconds) a worker can take to reload/shutdown ...", which might be ... rather misleading? ;)