How can uWSGI be configured in Emperor mode with the CGI plugin and heartbeat, in combination with nginx?


Use 2.0.1 from GitHub; it has better heartbeat code. Your problem is the small margin between the Emperor's tolerance (30 seconds) and your heartbeat frequency (25 seconds). Due to a bug in 2.0, the worker timeout was reset at every request. In addition, in 2.0.1 the heartbeat is triggered as soon as the first worker is spawned, which gives better results (broken apps are detected immediately). Regarding the worker reload mercy, the right option is --worker-reload-mercy (--reload-mercy is for non-worker processes).
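
A minimal sketch of the vassal settings this points to, assuming uWSGI 2.0.1 or later. The values are illustrative and not taken from the original setup; the idea is only to send several heartbeats within the Emperor's default 30-second tolerance and to set the worker grace period with worker-reload-mercy:

```ini
[uwsgi]
# illustrative values, assuming uWSGI >= 2.0.1
master = true
# announce health every 10 seconds, well below the Emperor's
# default tolerance of 30 seconds (emperor-required-heartbeat)
heartbeat = 10
# grace period for workers on reload/shutdown
# (reload-mercy applies to non-worker processes)
worker-reload-mercy = 10
```

The remaining options of the vassal file (plugins, cgi, socket and so on) would stay as they are.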

Author by

antiplex

Updated on September 18, 2022

Comments

  • antiplex
    antiplex over 1 year

    After having the migration from fcgiwrap to uWSGI (which also brings plenty of other benefits that I plan to use) on my to-do list for some time, I finally managed to set up a Debian wheezy test system featuring uWSGI v2.0 and nginx v1.4.4.

    As a first step, I would like to run .cgi scripts through uWSGI's CGI plugin reliably and with minimal overhead (rather weak hardware below), while keeping the option to easily extend my configuration to also deploy apps through frameworks such as bottle/flask/django as a second step.

    Therefore I chose to use uWSGI's Emperor mode, which is currently set up to control only one vassal, configured to run the uWSGI CGI plugin on 2 workers with 2 threads each.

    After checking out various features, the setup is now running more or less OK, with two odd behaviours that I think are somehow wrong:

    • As soon as the vassal is configured to send heartbeats to the emperor (e.g. by adding heartbeat = 20 to the .ini), the emperor repeatedly kills/respawns the vassal's master if the .cgi has not been run within the defined heartbeat timespan. The number of workers configured doesn't seem to matter here.
    • The vassal ignores the option reload-mercy = 10, as it still logs "your mercy for graceful operations on workers is 60 seconds" (which is the default value). OK, this is just a minor issue and of no great relevance to me.

    The reason for using the heartbeat option is that I would like to ensure the availability of my CGIs/apps by using the built-in uWSGI mechanisms as far as possible.

    Any hints as to what I might be misunderstanding or doing wrong? I can't see any apparent reason why it should not be possible to use the heartbeat option in combination with the CGI module in my configuration, but I would be grateful for any further insight! I assume nginx has nothing to do with the mentioned issues, and I also double-checked file and directory permissions. uWSGI is started through an init.d script, but the behaviour is the same when it is started manually.

    My configuration is as follows:

    • section in nginx.conf:

      location ~ ^/cgi-bin/.*\..+$ {
         root            /usr/local/nginx/vhosts/testdomain.com/cgi-bin;
         gzip            off;
         include         uwsgi_params;
         uwsgi_modifier1 9;
         uwsgi_pass      unix:///var/run/nginx/testdomain_cgi-bin_uwsgi.sock;
       }
      
    • emperor.ini:

      [uwsgi]
      uid = www-data
      gid = www-data
      emperor = /etc/uwsgi/vassals
      emperor-pidfile = /var/run/uwsgi/emperor.pid
      daemonize = /var/log/uwsgi_emperor.log 
      
    • testdomain_cgi-bin.ini:

      [uwsgi]
      uid = www-data
      gid = www-data
      chdir = /usr/local/nginx/vhosts/testdomain.com/cgi-bin
      plugins = cgi
      cgi = /cgi-bin=/usr/local/nginx/vhosts/testdomain.com/cgi-bin
      cgi-allowed-ext = .cgi
      socket = /var/run/nginx/testdomain_cgi-bin_uwsgi.sock
      master = true
      #heartbeat = 25
      processes = 2
      threads = 2
      reload-mercy = 10
      no-orphans = true
      post-buffering = 4096
      max-requests = 2048
      vacuum = true
      logto = /usr/local/nginx/logs/uwsgi_testdomain_cgi-bin.log
      

    Logs (with the heartbeat option enabled):

    • /var/log/uwsgi_emperor.log:

      *** Starting uWSGI 2.0 (32bit) on [Wed Feb  5 11:35:36 2014] ***
      compiled with version: 4.7.2 on 31 January 2014 08:46:00
      os: Linux-3.2.0-4-686-pae #1 SMP Debian 3.2.51-1
      nodename: testnode
      machine: i686
      clock source: unix
      pcre jit disabled
      detected number of CPU cores: 1
      current working directory: /
      *** running under screen session 1111.myscrn ***
      detected binary path: /usr/local/bin/uwsgi
      setgid() to 33
      setuid() to 33
      *** WARNING: you are running uWSGI without its master process manager ***
      your processes number limit is 3940
      your memory page size is 4096 bytes
      detected max file descriptor number: 1024
      writing pidfile to /var/run/uwsgi/emperor.pid
      *** starting uWSGI Emperor ***
      *** has_emperor mode detected (fd: 6) ***
      [uWSGI] getting INI configuration from testdomain_cgi-bin.ini
      Wed Feb  5 11:35:36 2014 - [emperor] vassal testdomain_cgi-bin.ini has been spawned
      Wed Feb  5 11:35:36 2014 - [emperor] vassal testdomain_cgi-bin.ini is ready to accept requests
      Wed Feb  5 11:35:43 2014 - [emperor] vassal testdomain_cgi-bin.ini is now loyal
      [emperor] vassal testdomain_cgi-bin.ini sent no heartbeat in last 30 seconds, brutally respawning it...
      Wed Feb  5 11:38:56 2014 - [emperor] removed uwsgi instance testdomain_cgi-bin.ini
      [emperor] unrecognized vassal event on fd 5
      [emperor] unrecognized vassal event on fd 5
          ... above lines repeated about another 50 times ... 
      *** has_emperor mode detected (fd: 6) ***
      [uWSGI] getting INI configuration from testdomain_cgi-bin.ini
      Wed Feb  5 11:38:56 2014 - [emperor] vassal testdomain_cgi-bin.ini has been spawned
      Wed Feb  5 11:38:56 2014 - [emperor] vassal testdomain_cgi-bin.ini is ready to accept requests
      
    • /usr/local/nginx/logs/uwsgi_testdomain_cgi-bin.log:

      *** Starting uWSGI 2.0 (32bit) on [Wed Feb  5 11:35:36 2014] ***
      compiled with version: 4.7.2 on 31 January 2014 08:46:00
      os: Linux-3.2.0-4-686-pae #1 SMP Debian 3.2.51-1
      nodename: testnode
      machine: i686
      clock source: unix
      pcre jit disabled
      detected number of CPU cores: 1
      current working directory: /etc/uwsgi/vassals
      *** running under screen session 1111.myscrn ***
      detected binary path: /usr/local/bin/uwsgi
      your processes number limit is 3940
      your memory page size is 4096 bytes
      detected max file descriptor number: 1024
      lock engine: pthread robust mutexes
      thunder lock: disabled (you can enable it with --thunder-lock)
      uwsgi socket 0 bound to UNIX address /var/run/nginx/testdomain_cgi-bin_uwsgi.sock fd 3
      your server socket listen backlog is limited to 100 connections
      your mercy for graceful operations on workers is 60 seconds
      mapped 175536 bytes (171 KB) for 2 cores
      *** Operational MODE: threaded ***
      initialized CGI mountpoint: /cgi-bin = /usr/local/nginx/vhosts/testdomain.com/cgi-bin
      *** no app loaded. going in full dynamic mode ***
      *** uWSGI is running in multiple interpreter mode ***
      spawned uWSGI master process (pid: 20825)
      spawned uWSGI worker 1 (pid: 20826, cores: 2)
      [pid: 20826|app: -1|req: -1/1] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:35:43 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 13 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 0)
      announcing my loyalty to the Emperor...
      [pid: 20826|app: -1|req: -1/2] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:35:54 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 2 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 1)
      [pid: 20826|app: -1|req: -1/3] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:36:04 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 5 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 0)
      [pid: 20826|app: -1|req: -1/4] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:36:16 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 3 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 1)
      [pid: 20826|app: -1|req: -1/5] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:36:28 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 3 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 0)
      [pid: 20826|app: -1|req: -1/6] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:36:39 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 3 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 1)
      [pid: 20826|app: -1|req: -1/7] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:36:51 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 5 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 0)
      [pid: 20826|app: -1|req: -1/8] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:37:03 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 2 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 1)
      [pid: 20826|app: -1|req: -1/9] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:37:15 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 3 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 0)
      [pid: 20826|app: -1|req: -1/10] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:37:27 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 6 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 1)
      [pid: 20826|app: -1|req: -1/11] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:37:39 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 12 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 0)
      [pid: 20826|app: -1|req: -1/12] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:37:51 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 4 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 1)
      [pid: 20826|app: -1|req: -1/13] XX.XX.XX.XXX () {42 vars in 678 bytes} [Wed Feb  5 11:38:03 2014] GET /cgi-bin/hellow.cgi => generated 282 bytes in 3 msecs (HTTP/1.1 200) 1 headers in 44 bytes (0 switches on core 0)
      Wed Feb  5 11:38:56 2014 - uWSGI worker 1 screams: UAAAAAAH my master disconnected: i will kill myself !!!
      *** Starting uWSGI 2.0 (32bit) on [Wed Feb  5 11:38:56 2014] ***
      compiled with version: 4.7.2 on 31 January 2014 08:46:00
      os: Linux-3.2.0-4-686-pae #1 SMP Debian 3.2.51-1
      nodename: testnode
      machine: i686
      clock source: unix
      pcre jit disabled
      detected number of CPU cores: 1
      current working directory: /etc/uwsgi/vassals
      *** running under screen session 1111.myscrn ***
      detected binary path: /usr/local/bin/uwsgi
      your processes number limit is 3940
      your memory page size is 4096 bytes
      detected max file descriptor number: 1024
      lock engine: pthread robust mutexes
      thunder lock: disabled (you can enable it with --thunder-lock)
      uwsgi socket 0 bound to UNIX address /var/run/nginx/testdomain_cgi-bin_uwsgi.sock fd 3
      your server socket listen backlog is limited to 100 connections
      your mercy for graceful operations on workers is 60 seconds
      mapped 175536 bytes (171 KB) for 2 cores
      *** Operational MODE: threaded ***
      initialized CGI mountpoint: /cgi-bin = /usr/local/nginx/vhosts/testdomain.com/cgi-bin
      *** no app loaded. going in full dynamic mode ***
      *** uWSGI is running in multiple interpreter mode ***
      spawned uWSGI master process (pid: 20881)
      spawned uWSGI worker 1 (pid: 20882, cores: 2)
      

    uWSGI doc for the vassal option heartbeat:

        Argument: number
    
        (Vassal option) Announce vassal health to the emperor every N seconds.
    

    uWSGI doc for the emperor option emperor-required-heartbeat:

        Argument: number Default: 30
    
        Set the Emperor tolerance about heartbeats.
    
        When a vassal asks for ‘heartbeat mode’ the emperor will 
        also expect a ‘heartbeat’ at least every <secs> seconds.
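
    As a sketch of how the two options relate: the tolerance is set on the emperor side and the frequency on the vassal side, and the tolerance should comfortably exceed the frequency. The value below is an illustrative assumption, not something the docs recommend:

    ```ini
    # emperor.ini (emperor side), illustrative value
    [uwsgi]
    emperor = /etc/uwsgi/vassals
    # expect a heartbeat from heartbeat-enabled vassals at least every 60 seconds
    emperor-required-heartbeat = 60
    ```

    With a tolerance of 60 seconds, a vassal using heartbeat = 25 could miss one beat before the emperor respawns it.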
    
  • antiplex
    antiplex about 10 years
    Thanks for the just-in-time fix =) It now works smoothly, as expected! The worker-reload-mercy option seems to be undocumented so far, but it does the right thing. So would reload-mercy only affect masters (or what other non-worker processes are there)?
  • antiplex
    antiplex about 10 years
    Oh, and the option doc for reload-mercy needs an overhaul, as it explicitly says "Set the maximum time (in seconds) a worker can take to reload/shutdown ...", which might be ... rather misleading? ;)