Python garbage collection

51,082

Solution 1

You haven't provided enough information - this depends on the specifics of the object you are creating and what else you're doing with it in the loop. If the object does not create circular references, it should be deallocated on the next iteration. For example, the code

for x in range(100000):
  obj = " " * 10000000

will not result in ever-increasing memory allocation.
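By contrast, objects that do form reference cycles are not freed by reference counting alone. A minimal sketch (the Node class here is hypothetical, standing in for the question's object) shows that such objects linger until the cycle detector runs:

```python
import gc

class Node:
    """Hypothetical stand-in for the question's object; refers to itself."""
    def __init__(self):
        self.ref = self   # circular reference: refcount never reaches zero

gc.disable()              # simulate the collector not having run yet
gc.collect()              # start from a clean slate

for _ in range(1000):
    obj = Node()          # each discarded Node lingers as cyclic garbage

del obj
freed = gc.collect()      # the cycle detector reclaims them all at once
print(freed)              # at least 1000 objects reclaimed in one pass
gc.enable()
```

With the cycle removed (e.g. no `self.ref = self`), each object would be freed immediately when `obj` is rebound on the next iteration.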

Solution 2

This is likely a circular-reference problem (though the question isn't explicit about it).

One way to solve this is to invoke garbage collection manually. When you run the garbage collector yourself, it will sweep up circularly referenced objects too.

import gc

for i in xrange(10000):
    j = myObj()
    processObj(j)
    # The reference count may be nonzero here, but the object
    # is no longer usable after this iteration.

    if i % 100 == 0:
        gc.collect()

Don't run the garbage collector too often, though, because it has its own overhead; if you invoke it on every iteration, the program will become extremely slow.
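Instead of calling gc.collect() on a fixed schedule, an alternative sketch is to tune the collector's own thresholds so it runs generation-0 collections more aggressively on its own (the specific numbers below are illustrative, not recommendations):

```python
import gc

# The collector already runs automatically whenever per-generation
# allocation counters exceed thresholds; lowering threshold 0 makes
# young-generation collections happen more often.
print(gc.get_threshold())        # the CPython default, commonly (700, 10, 10)
gc.set_threshold(100, 10, 10)    # illustrative: collect gen 0 more aggressively
```

This keeps the collection logic out of your loop body entirely, at the cost of more frequent (but cheap) young-generation passes.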

Solution 3

This is an old problem that was corrected for some types in Python 2.5. Python was not good at collecting things like empty lists, dictionaries, tuples, floats, and ints. In Python 2.5 this was mostly fixed. However, floats and ints are kept on internal free lists, so once one of those is created its memory stays around as long as the interpreter is alive. I've been bitten by this worst when dealing with large amounts of floats, since they have a nasty habit of being unique. This was characterized for Python 2.4, with later updates about the fix being folded into Python 2.5.

The best way I've found around it is to upgrade to Python 2.5 or newer to take care of the lists/dictionaries/tuples issue. For numbers, the only solution is to not let large amounts of numbers get into Python. I've done it with my own wrapper around a C++ object, but I have the impression that numpy.array will give similar results.
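The idea of keeping numbers out of Python-level objects can be sketched with the standard-library array module, which stores values in one contiguous C buffer instead of allocating a boxed object per float (numpy.array works the same way in this respect):

```python
import sys
from array import array

n = 100000
boxed = [float(i) for i in range(n)]   # n separate Python float objects
packed = array('d', range(n))          # one contiguous buffer of C doubles

# Total footprint of the boxed version: the list plus every float object.
boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(f) for f in boxed)
packed_bytes = sys.getsizeof(packed)
print(boxed_bytes, packed_bytes)       # the packed buffer is far smaller
```

Since the packed buffer holds raw doubles rather than interpreter objects, none of those values ever touch Python's per-object allocator or free lists.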

As a postscript, I have no idea what has happened to this in Python 3, but I'm suspicious that numbers are still cached this way. So the memory leak is actually a feature of the language.

Solution 4

If you're creating circular references, your objects won't be deallocated immediately, but have to wait for a GC cycle to run.

You could use the weakref module to address this problem, or explicitly del your objects after use.
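A minimal sketch of the weakref approach (the Parent/Child classes are hypothetical): the back-reference is stored as a weak reference, so there is no strong cycle and plain reference counting frees the objects immediately (CPython behavior):

```python
import weakref

class Parent:
    def __init__(self):
        self.child = Child(self)

class Child:
    def __init__(self, parent):
        # A weak back-reference instead of a strong one: no cycle.
        self._parent = weakref.ref(parent)

    @property
    def parent(self):
        return self._parent()   # None once the parent has been freed

p = Parent()
c = p.child
assert c.parent is p
del p                # refcount hits zero immediately; no GC cycle needed
assert c.parent is None
```

With a strong back-reference (`self._parent = parent`), `del p` alone would not free anything; the pair would sit in memory until the cycle detector ran.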

Solution 5

I found that in my case (with Python 2.5.1), with circular references involving classes that have __del__() methods, not only was garbage collection not happening in a timely manner, but the __del__() methods of my objects were never getting called, even when the script exited. So I used weakref to break the circular references and all was well.
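A sketch of that fix (Resource/Owner and the 'db' name are hypothetical): because the back-reference is weak, reference counting alone tears both objects down, so __del__ fires promptly instead of the pair getting stuck as an uncollectable cycle (which is what happened on Python 2.5; Python 3.4+ can collect such cycles, but weakref still makes finalization prompt):

```python
import weakref

finalized = []

class Resource:
    def __init__(self, name):
        self.name = name
        self.owner = None            # will hold a weakref, not a strong ref

    def __del__(self):
        finalized.append(self.name)  # record that finalization ran

class Owner:
    def __init__(self):
        self.resource = Resource('db')
        # Weak back-reference: breaks the would-be __del__-bearing cycle.
        self.resource.owner = weakref.ref(self)

o = Owner()
del o                 # both objects freed by refcounting; __del__ runs now
print(finalized)
```

Had `owner` been a strong reference, the Owner/Resource pair would have formed a cycle containing a __del__ method, exactly the situation described above.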

Kudos to Miles who provided all the information in his comments for me to put this together.

Author: utdiscant

Updated on July 09, 2022

Comments

  • utdiscant
    utdiscant almost 2 years

    I have created some Python code which creates an object in a loop, and in every iteration overwrites this object with a new one of the same type. This is done 10,000 times, and Python takes up 7 MB of memory every second until my 3 GB of RAM is used. Does anyone know of a way to remove the objects from memory?

  • utdiscant
    utdiscant almost 15 years
    I am creating circular references in my object. Can't it be deleted manually?
  • Miles
    Miles almost 15 years
    This shouldn't exactly be responsible for the problem as described, though; even Python 2.4 should reuse freed memory (it just didn't return it to the operating system).
  • Miles
    Miles almost 15 years
    Python will automatically collect objects with circular references, unless any of the objects in a reference cycle have __del__ methods. If that's the case, garbage objects are moved to the gc.garbage list, and you will have to manually break the reference cycles. It's better to try to avoid having both __del__ methods and reference cycles.
  • Miles
    Miles almost 15 years
    One solution to avoiding reference cycles is to use weakrefs: docs.python.org/library/weakref.html
  • Tyler
    Tyler almost 15 years
    In a typical program, it's easier and clearer to just let your variables go out of scope than explicitly deleting them. Of course, if your whole program is one single long function then nothing will go out of scope until it ends. Which is one reason that a single long function is not a recommended way of writing programs.
  • Erik Kaplun
    Erik Kaplun over 11 years
    Not sure if my experiment is correct, but temporarily creating millions of floats definitely had constant memory usage. Putting the same floats in a list increased the memory usage at 100 MB/s. That was on 2.7... so I guess at least in 2.7 the problem doesn't exist? Am I missing something?
  • 0 _
    0 _ over 6 years
    If any object in a cycle has a __del__ method, then the cycle isn't garbage collected by gc. See stackoverflow.com/a/15974956/1959808