Reducing Django Memory Usage. Low hanging fruit?

Solution 1

Make sure you are not keeping global references to data. That prevents the python garbage collector from releasing the memory.

Don't use mod_python. It loads a Python interpreter inside Apache. If you need to use Apache, use mod_wsgi instead. Switching is not tricky; it is very easy. mod_wsgi is way easier to configure for Django than brain-dead mod_python.

If you can remove Apache from your requirements, that would be even better for your memory usage. Spawning seems to be the new fast, scalable way to run Python web applications.

EDIT: I don't see how switching to mod_wsgi could be "tricky". It should be a very easy task. Please elaborate on the problem you are having with the switch.

Solution 2

If you are running under mod_wsgi, and presumably spawning since it is WSGI compliant, you can use Dozer to look at your memory usage.

Under mod_wsgi just add this at the bottom of your WSGI script:

# Wrap the existing WSGI callable so Dozer can inspect the running process.
from dozer import Dozer
application = Dozer(application)

Then point your browser at http://domain/_dozer/index to see a list of all your memory allocations.

I'll also just add my voice of support for mod_wsgi. It makes a world of difference in terms of performance and memory usage over mod_python. Graham Dumpleton's support for mod_wsgi is outstanding, both in terms of active development and in helping people on the mailing list to optimize their installations. David Cramer at curse.com has posted some charts (which I can't seem to find now, unfortunately) showing the drastic reduction in CPU and memory usage after they switched to mod_wsgi on that high-traffic site. Several of the Django devs have switched. Seriously, it's a no-brainer :)

Solution 3

These are the Python memory profiler solutions I'm aware of (not Django related):

Disclaimer: I have a stake in the latter.

The individual project's documentation should give you an idea of how to use these tools to analyze memory behavior of Python applications.
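
As a rough illustration of what such a measurement looks like, here is a sketch using `tracemalloc`, which ships with Python's standard library (3.4 and later). It is not one of the tools this answer refers to, and `build_rows` is a hypothetical workload standing in for the code you want to measure.

```python
import tracemalloc


def build_rows(n):
    """Hypothetical workload standing in for a view that allocates memory."""
    return [{"id": i, "name": "row %d" % i} for i in range(n)]


tracemalloc.start()
before = tracemalloc.take_snapshot()

rows = build_rows(50_000)

after = tracemalloc.take_snapshot()

# Rank source lines by how much more memory they hold after the workload.
top_stats = after.compare_to(before, "lineno")
tracemalloc.stop()

for stat in top_stats[:3]:
    print(stat)

# Net growth in traced memory between the two snapshots, in bytes.
grown = sum(stat.size_diff for stat in top_stats)
print("net growth (bytes):", grown)
```

Taking a snapshot before and after a suspect code path and comparing the two points at the source lines responsible for the growth, which is usually a better starting point than watching the total process size.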

The following is a nice "war story" that also gives some helpful pointers:

Solution 4

Additionally, check that you are not using any known leakers. MySQLdb is known to leak enormous amounts of memory with Django due to a bug in its unicode handling. Other than that, Django Debug Toolbar might help you track down the hogs.

Solution 5

In addition to not keeping around global references to large data objects, try to avoid loading large datasets into memory at all wherever possible.
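
A small sketch of the difference, in plain Python so it runs without a database. `fetch_rows` is a hypothetical data source. In Django itself, `QuerySet.iterator()` plays a similar role by not caching results on the queryset, though how rows are actually fetched depends on the database backend.

```python
import sys


def fetch_rows(n):
    """Hypothetical data source that yields one row at a time."""
    for i in range(n):
        yield {"id": i, "amount": i % 100}


def total_loaded(n):
    rows = list(fetch_rows(n))  # materializes every row at once
    return sum(row["amount"] for row in rows), sys.getsizeof(rows)


def total_streamed(n):
    rows = fetch_rows(n)  # generator: only one row is alive at a time
    size = sys.getsizeof(rows)
    return sum(row["amount"] for row in rows), size


loaded_total, list_size = total_loaded(100_000)
streamed_total, gen_size = total_streamed(100_000)

print("same result:", loaded_total == streamed_total)
print("list container bytes:", list_size)
print("generator bytes:", gen_size)
```

Both versions compute the same total, but the list version keeps every row in memory until the function returns. Note that `sys.getsizeof` only measures the container itself, not the rows it points to, so the real gap is larger than the printed numbers suggest.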

Switch to mod_wsgi in daemon mode, and use Apache's worker MPM instead of prefork. This latter step can allow you to serve many more concurrent users with much less memory overhead.

Author: Andy Baker

I'm a web developer and designer with an interest in user experience and a fondness for Django and JQuery. Twitter: @andybak

Updated on May 28, 2020

Comments

  • Andy Baker
    Andy Baker about 4 years

    My memory usage increases over time and restarting Django is not kind to users.

    I am unsure how to go about profiling the memory usage but some tips on how to start measuring would be useful.

    I have a feeling that there are some simple steps that could produce big gains. Ensuring 'debug' is set to 'False' is an obvious biggie.

    Can anyone suggest others? How much improvement would caching bring on low-traffic sites?

    In this case I'm running under Apache 2.x with mod_python. I've heard mod_wsgi is a bit leaner but it would be tricky to switch at this stage unless I know the gains would be significant.

    Edit: Thanks for the tips so far. Any suggestions how to discover what's using up the memory? Are there any guides to Python memory profiling?

    Also, as mentioned, there are a few things that will make it tricky to switch to mod_wsgi, so I'd like to have some idea of the gains I could expect before ploughing forwards in that direction.

    Edit: Carl posted a slightly more detailed reply here that is worth reading: Django Deployment: Cutting Apache's Overhead

    Edit: Graham Dumpleton's article is the best I've found on the MPM and mod_wsgi related stuff. I am rather disappointed that no-one could provide any info on debugging the memory usage in the app itself though.

    Final Edit: Well I have been discussing this with Webfaction to see if they could assist with recompiling Apache and this is their word on the matter:

    "I really don't think that you will get much of a benefit by switching to an MPM Worker + mod_wsgi setup. I estimate that you might be able to save around 20MB, but probably not much more than that."

    So! This brings me back to my original question (which I am still none the wiser about). How does one go about identifying where the problem lies? It's a well-known maxim that you don't optimize without testing to see where you need to optimize, but there is very little in the way of tutorials on measuring Python memory usage and none at all specific to Django.

    Thanks for everyone's assistance but I think this question is still open!

    Another final edit ;-)

    I asked this on the django-users list and got some very helpful replies

    Honestly the last update ever!

    This was just released. Could be the best solution yet: Profiling Django object size and memory usage with Pympler

  • Josh Smeaton
    Josh Smeaton over 15 years
    Django developers advocate using mod_python don't they? What's wrong with using Apache?
  • Carl Meyer
    Carl Meyer over 15 years
    Django still endorses mod_python because mod_wsgi is still fairly new, and they want to be conservative. But if you follow the Django community you'll see people switching to mod_wsgi en masse. It won't take long before it's the recommended option.
  • Tiago
    Tiago over 15 years
    @nosklo: What would be an Apache-only feature, for example? I'm finishing my app, and I'm starting to worry about deploying... and, as I have never deployed Django before, I'm studying all the possibilities.
  • nosklo
    nosklo over 15 years
    @Tiago: apache is good when you have a lot of apache virtual hosts already in place, using SSL with apache already, etc. In this case, use mod_wsgi. If you are starting afresh, use spawning. NEVER use mod_python.
  • Tiago
    Tiago over 15 years
    Thanks, nosklo. I'm taking a look at spawning. It seems to have little to no documentation. I'll try to follow some instructions I found in blog posts and see where I can get.
  • nosklo
    nosklo over 15 years
    @Tiago: maybe ask a new top-level question about it. It is not hard to set up.
  • nosklo
    nosklo over 15 years
    @andybak: if your auth system is written as django middleware, or called inside a django urls.py I don't see why it wouldn't work with mod_wsgi.
  • Andy Baker
    Andy Baker over 15 years
    @nosklo It's static files I want to control, so requests won't even see Django unless I tie in with the web server's own auth mechanism (and it has to be cookie based, as an HTTP auth dialog isn't very pretty for my users)
  • Andy Baker
    Andy Baker over 15 years
    Also see Carl's answer here: stackoverflow.com/questions/488864/…
  • Powerlord
    Powerlord over 15 years
    Hmm, as someone just starting to use Django, I'll keep in mind that I should use mod_wsgi.
  • Chris J.
    Chris J. over 15 years
    Don't use mod_python or mod_wsgi. Use standalone django servers with a lean front-end like nginx or lighttpd and FastCGI proxying :)
  • Andy Baker
    Andy Baker over 15 years
    Thanks, I had already read those. It's numbers 3 and 6 I was hoping for a bit more detail on! ;-)
  • Carl Meyer
    Carl Meyer over 15 years
    @nezroy: that's a highly debatable recommendation. I would say that WSGI, an actively-developed standard with a booming ecosystem of middleware etc, has much more of a future in Python web application deployment than FastCGI.
  • Andy Baker
    Andy Baker over 15 years
    In which case I will soon be posting a question asking how one gets cookie based authentication for django users accessing static files...
  • nosklo
    nosklo over 15 years
    @andybak: then use mod_python for your apache auth extension module, and mod_wsgi to run your django application.
  • Andy Baker
    Andy Baker over 15 years
    Also - in a few posts I've read it seems that the real gain is in switching to worker MPM rather than the use of mod_wsgi...
  • msanders
    msanders about 15 years
    amix.dk/blog/viewEntry/19420 shows dozer being used to show that MySQLdb was leaking memory. MySQLdb 1.2.3c1 and later fixes this.
  • Tomas Andrle
    Tomas Andrle over 14 years
    Something similar is mentioned here: mail-archive.com/[email protected]/msg84698.html only they used inactivity-timeout instead of maximum-requests.
  • Wtower
    Wtower over 8 years
    How could django-debug-toolbar help?