Is there a way to avoid this memory error?
Solution 1
As Kevin mention in the comments, your algorithm might be wrong, but anyway your code is not optimized.
When using very big lists, it is common to use generators
, there is a famous, great Stackoverflow answer explaining the concepts of yield
and generator
- What does the "yield" keyword do in Python?
The problem is that your pairs = combinations(anums, 2)
doesn't generate a generator
, but generates a large object which is much more memory consuming.
I changed your code to have this function, since you iterating over the collection only once, you can lazy evaluate the values:
def generator_sol(anums1, s):
for comb in itertools.combinations(anums1, s):
yield comb
Now instead of pairs = combinations(anums, 2)
which generates a large object.
Use:
pairs = generator_sol(anums, 2)
Then, instead of using the lambda
I would use another generator
:
sums_sol = (x[0]+x[1] for x in pairs)
Another tip, you better look at xrange which is more suitable, it doens't generate a list but an xrange object
which is more efficient in many cases (such as here).
Solution 2
Let me suggest you to use generators. Try changing this:
sums = map(lambda x: x[0]+x[1], pairs)
to
sums = (a+b for (a,b) in pairs)
Ofiris solution is also ok but implies that itertools.combinations
return a list when it's wrong. If you are going to keep solving project euler problems have a look at the itertools documentation.
Solution 3
The issue is that anums is big - about 28000 elements long. so pairs must be 28000*28000*8 bytes = 6GB. If you used numpy you could cast anums as a numpy.int16 array, in which case the result pairs would be 1.5GB - more manageable:
import numpy as np
#cast
anums = np.array([anums],dtype=np.int16)
#compute the sum of all the pairs via outer product
pairs = (anums + anums.T).ravel()
Jesse Neff
Updated on October 24, 2022Comments
-
Jesse Neff over 1 year
I'm currently working through the problems on Project Euler, and so far I've come up with this code for a problem.
from itertools import combinations import time def findanums(n): l = [] for i in range(1, n + 1): s = [] for j in range(1, i): if i % j == 0: s.append(j) if sum(s) > i: l.append(i) return l start = time.time() #start time limit = 28123 anums = findanums(limit + 1) #abundant numbers (1..limit) print "done finding abundants", time.time() - start pairs = combinations(anums, 2) print "done finding combinations", time.time() - start sums = map(lambda x: x[0]+x[1], pairs) print "done finding all possible sums", time.time() - start print "start main loop" answer = 0 for i in range(1,limit+1): if i not in sums: answer += i print "ANSWER:",answer
When I run this I run into a
MemoryError
.The traceback:
File "test.py", line 20, in <module> sums = map(lambda x: x[0]+x[1], pairs)
I've tried to prevent it by disabling garbage collection from what I've been able to get from Google but to no avail. Am I approaching this the wrong way? In my head this feels like the most natural way to do it and I'm at a loss at this point.
SIDE NOTE: I'm using PyPy 2.0 Beta2(Python 2.7.4) because it is so much faster than any other python implementation I've used, and Scipy/Numpy are over my head as I'm still just beginning to program and I don't have an engineering or strong math background.
-
Jesse Neff about 11 yearsNow it's just spitting out a wrong answer. You have given me a lot to read and learn so for that I thank you. The generator did indeed solve my memory issue!
-
Jesse Neff about 11 yearsWhat do you mean by
combinations
returns a list when it's wrong? -
Ofiris about 11 yearsGood luck with the Project Euler.
-
Ofiris about 11 yearsI said combination returns a
list
and its not correct, fixed it anyhow. -
Jesse Neff about 11 yearsAs I said in my post I'm still quite green and numpy's magic is still out of my grasp at the moment as I am still learning the normal Python libraries. But I appreciate this answer nonetheless as it gives me a taste of what I'd be able to do once I learn enough to use numpy/scipy!
-
Alex about 11 yearscombinations is a generator thus you don't need to worry about
pairs = combinations(anums, 2)
, it's not memory hungry. And is very good practice :) -
Jesse Neff about 11 years@Alex, I figured as much, it was the mapping of the sums of the tuples that gave me a memory error.
-
fijal about 11 yearsrange does not generate a list in pypy either