How to gracefully restart django running fcgi behind nginx?

Solution 1

I would start a new fcgi process on a new port, change the nginx configuration to use the new port, have nginx reload configuration (which in itself is graceful), then eventually stop the old process (you can use netstat to find out when the last connection to the old port is closed).
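
One way to tell when the old port has drained, as a rough sketch. It assumes the Linux net-tools layout of netstat -tn output (local address in column 4, state in column 6); count_established and wait_for_drain are hypothetical helper names, not part of any tool.

```shell
#!/bin/bash

# Count ESTABLISHED TCP connections whose local address ends in the given port.
# Reads `netstat -tn`-style text on stdin, so it can be tried on saved output.
# Assumption: column 4 is the local address and column 6 is the state.
count_established() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $6 == "ESTABLISHED" && $4 ~ (":" port "$") { n++ }
        END { print n + 0 }
    '
}

# Block until no established connection is left on the old port.
# If netstat is not installed, the count is 0 and this returns immediately.
wait_for_drain() {
    local port="$1"
    while [ "$(netstat -tn 2>/dev/null | count_established "$port")" -gt 0 ]; do
        sleep 1
    done
}
```

After nginx has reloaded its configuration, something like `wait_for_drain 8000 && kill "$old_pid"` would then stop the old process only once its last connection is gone.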

Alternatively, you can change the fcgi implementation to restart itself gracefully: fork a new process, close all sockets in the child except for the fcgi server socket, close the fcgi server socket in the parent, exec a new django process in the child (so that it takes over the fcgi server socket), and terminate the parent process once all of its fcgi connections are closed. In other words, implement graceful restart for runfcgi.

Solution 2

So I went ahead and implemented Martin's suggestion. Here is the bash script I came up with.

pid_file=/path/to/pidfile
port_file=/path/to/port_file
old_pid=`cat $pid_file`

if [[ -f $port_file ]]; then
    last_port=`cat $port_file`
    port_to_use=$(($last_port + 1))
else
    port_to_use=8000
fi

# Reset so we don't go up forever
if [[ $port_to_use -gt 8999 ]]; then
    port_to_use=8000
fi

# Point nginx at the new port ($last_port is the port read from $port_file above)
sed -i "s/$last_port/$port_to_use/g" /path/to/nginx.conf

python manage.py runfcgi host=127.0.0.1 port=$port_to_use maxchildren=5 maxspare=5 minspare=2 method=prefork pidfile=$pid_file

echo $port_to_use > $port_file

kill -HUP `cat /var/run/nginx.pid`

echo "Sleeping for 5 seconds"
sleep 5s

echo "Killing old processes on $last_port, pid $old_pid"
kill $old_pid

Solution 3

I came across this page while looking for a solution to this problem. Everything else failed, so I looked into the source code :)

The solution seems to be much simpler. Django's fcgi server uses flup, which handles the HUP signal the proper way: it shuts down gracefully. So all you have to do is:

  1. send the HUP signal to the fcgi server (the pidfile= argument of runfcgi will come in handy)

  2. wait a bit (flup allows child processes 10 seconds, so wait a couple more; 15 looks like a good number)

  3. send the KILL signal to the fcgi server, just in case something blocked it

  4. start the server again

That's it.
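
The four steps above could be scripted roughly as follows. This is only a sketch: graceful_stop and restart_fcgi are hypothetical helper names, the pidfile path is a placeholder, and the runfcgi arguments are borrowed from the script in Solution 2. Instead of sleeping for a fixed 15 seconds, it polls once a second and falls back to KILL only if the process is still there when the grace period runs out.

```shell
#!/bin/bash

# Send HUP to a process, wait up to `grace` seconds for it to exit,
# then send KILL if it is still running.
# (graceful_stop is a hypothetical helper name.)
graceful_stop() {
    local pid="$1"
    local grace="${2:-15}"   # flup gives children 10 seconds, so allow a few more
    local waited=0

    # Step 1: ask the server to shut down. If the signal cannot be delivered
    # (for example because the pid is already gone), there is nothing to do.
    kill -HUP "$pid" 2>/dev/null || return 0

    # Step 2: poll once a second instead of sleeping for the whole period.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$grace" ]; then
            # Step 3: something blocked the shutdown, so force it.
            kill -KILL "$pid" 2>/dev/null || true
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Step 4: stop the old server (if a pidfile exists) and start a new one.
# The path and the runfcgi arguments are placeholders.
restart_fcgi() {
    local pid_file=/path/to/pidfile
    if [ -f "$pid_file" ]; then
        graceful_stop "$(cat "$pid_file")" 15
    fi
    python manage.py runfcgi host=127.0.0.1 port=8000 method=prefork pidfile="$pid_file"
}
```

Calling restart_fcgi from a deploy script would then run the whole sequence. Note that the site is unavailable between the shutdown and the new server coming up, so this shortens the interruption rather than removing it.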

Solution 4

You can use Spawning instead of FastCGI:

http://www.eflorenzano.com/blog/post/spawning-django/

Solution 5

We finally found the proper solution to this!

http://rambleon.usebox.net/post/3279121000/how-to-gracefully-restart-django-running-fastcgi

First send flup a HUP signal to signal a restart. Flup will then do this to all of its children:

  1. closes the socket, which stops inactive children
  2. sends an INT signal
  3. waits 10 seconds
  4. sends a KILL signal

When all the children are gone it will start new ones.

This works almost all of the time, except that if a child is handling a request when flup executes step 2 then your server will die with KeyboardInterrupt, giving the user a 500 error.

The solution is to install a SIGINT handler - see the page above for details. Even just ignoring SIGINT gives your process 10 seconds to exit which is enough for most requests.

Author: evo

Updated on August 12, 2020

Comments

  • evo almost 4 years
    I'm running a django instance behind nginx, connected using fcgi (via the manage.py runfcgi command). Since the code is loaded into memory, I can't load new code without killing and restarting the django fcgi processes, thus interrupting the live website. The restart itself is very fast, but killing the fcgi processes first interrupts some users' actions, which is not good. How can I reload new code without causing any interruption? Any advice would be highly appreciated!

  • nullException over 15 years
    if you put a new fcgi on a new port, wouldn't nginx forward already-logged users to the new process? it would be the same as cold-restarting the fcgi process
  • Martin v. Löwis over 15 years
    It would indeed forward all users to the new process; that does no harm. The problem with cold-restarting is that the running process is killed, so in-progress HTTP requests fail. This is the case the OP worries about (IIUC)
  • David Eyk over 13 years
    This works really well if you set up your django fcgi server under upstart. initctl reload <job> will send the HUP, and the respawn directive in your job definition will handle the restart. No muss, no fuss.
  • raylu over 12 years
    As noted in the comments on that post, this only works for flup 1.0.3. Also, I couldn't get this working with prefork, only threaded.
  • gingerlime over 12 years
    Is there a problem or particular reason not to use flup 1.0.3? I'm using it with prefork mode and it works fine.
  • srchulo almost 11 years
    wow, simple yet incredibly useful. Thanks so much for sharing this!
  • Elvorfirilmathredia about 4 years
    I think there is a typo in the sed part of the script, $old_port should be $last_port I guess.