Is there a way to avoid this memory error?

13,978

Solution 1

As Kevin mention in the comments, your algorithm might be wrong, but anyway your code is not optimized.

When using very big lists, it is common to use generators, there is a famous, great Stackoverflow answer explaining the concepts of yield and generator - What does the "yield" keyword do in Python?

The problem is that your pairs = combinations(anums, 2) doesn't generate a generator, but generates a large object which is much more memory consuming.

I changed your code to have this function, since you iterating over the collection only once, you can lazy evaluate the values:

def generator_sol(anums1, s):
      for comb in itertools.combinations(anums1, s):
        yield comb

Now instead of pairs = combinations(anums, 2) which generates a large object. Use:

pairs = generator_sol(anums, 2)

Then, instead of using the lambda I would use another generator:

sums_sol = (x[0]+x[1] for x in pairs)

Another tip, you better look at xrange which is more suitable, it doens't generate a list but an xrange object which is more efficient in many cases (such as here).

Solution 2

Let me suggest you to use generators. Try changing this:

sums = map(lambda x: x[0]+x[1], pairs)

to

sums = (a+b for (a,b) in pairs)

Ofiris solution is also ok but implies that itertools.combinations return a list when it's wrong. If you are going to keep solving project euler problems have a look at the itertools documentation.

Solution 3

The issue is that anums is big - about 28000 elements long. so pairs must be 28000*28000*8 bytes = 6GB. If you used numpy you could cast anums as a numpy.int16 array, in which case the result pairs would be 1.5GB - more manageable:

import numpy as np
#cast
anums = np.array([anums],dtype=np.int16)
#compute the sum of all the pairs via outer product
pairs = (anums + anums.T).ravel()
Share:
13,978
Jesse Neff
Author by

Jesse Neff

Updated on October 24, 2022

Comments

  • Jesse Neff
    Jesse Neff over 1 year

    I'm currently working through the problems on Project Euler, and so far I've come up with this code for a problem.

    from itertools import combinations
    import time
    
    def findanums(n):
        l = []
        for i in range(1, n + 1):
            s = []
            for j in range(1, i):
                if i % j == 0:
                    s.append(j)
            if sum(s) > i:
                l.append(i)
        return l
    
    start = time.time() #start time
    
    limit = 28123
    
    anums = findanums(limit + 1) #abundant numbers (1..limit)
    print "done finding abundants", time.time() - start
    
    pairs = combinations(anums, 2)
    print "done finding combinations", time.time() - start
    
    sums = map(lambda x: x[0]+x[1], pairs)
    print "done finding all possible sums", time.time() - start
    
    print "start main loop"
    answer = 0
    for i in range(1,limit+1):
        if i not in sums:
            answer += i
    print "ANSWER:",answer
    

    When I run this I run into a MemoryError.

    The traceback:

    File "test.py", line 20, in <module>
        sums = map(lambda x: x[0]+x[1], pairs)
    

    I've tried to prevent it by disabling garbage collection from what I've been able to get from Google but to no avail. Am I approaching this the wrong way? In my head this feels like the most natural way to do it and I'm at a loss at this point.

    SIDE NOTE: I'm using PyPy 2.0 Beta2(Python 2.7.4) because it is so much faster than any other python implementation I've used, and Scipy/Numpy are over my head as I'm still just beginning to program and I don't have an engineering or strong math background.

  • Jesse Neff
    Jesse Neff about 11 years
    Now it's just spitting out a wrong answer. You have given me a lot to read and learn so for that I thank you. The generator did indeed solve my memory issue!
  • Jesse Neff
    Jesse Neff about 11 years
    What do you mean by combinations returns a list when it's wrong?
  • Ofiris
    Ofiris about 11 years
    Good luck with the Project Euler.
  • Ofiris
    Ofiris about 11 years
    I said combination returns a list and its not correct, fixed it anyhow.
  • Jesse Neff
    Jesse Neff about 11 years
    As I said in my post I'm still quite green and numpy's magic is still out of my grasp at the moment as I am still learning the normal Python libraries. But I appreciate this answer nonetheless as it gives me a taste of what I'd be able to do once I learn enough to use numpy/scipy!
  • Alex
    Alex about 11 years
    combinations is a generator thus you don't need to worry about pairs = combinations(anums, 2), it's not memory hungry. And is very good practice :)
  • Jesse Neff
    Jesse Neff about 11 years
    @Alex, I figured as much, it was the mapping of the sums of the tuples that gave me a memory error.
  • fijal
    fijal about 11 years
    range does not generate a list in pypy either