Fastest Way to generate 1,000,000+ random numbers in python


Solution 1

You can speed things up a bit from what mtrw posted above just by doing what you initially described (generating a bunch of random numbers and multiplying and dividing accordingly)...

Also, you probably already know this, but be sure to do the operations in-place (*=, /=, +=, etc) when working with large-ish numpy arrays. It makes a huge difference in memory usage with large arrays, and will give a considerable speed increase, too.
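As a minimal sketch of the in-place point (array sizes here are just for illustration):

```python
import numpy as np

x = np.random.random((1000000, 7))
limits = np.arange(7) + 1.0

# Out-of-place multiply allocates a whole second 1,000,000 x 7 array...
y = x * limits

# ...while the in-place version reuses x's own buffer.
x *= limits

# Same values either way; only the allocation differs.
assert np.allclose(x, y)
```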

In [53]: def rand_row_doubles(row_limits, num):
   ....:     ncols = len(row_limits)
   ....:     x = np.random.random((num, ncols))
   ....:     x *= row_limits                  
   ....:     return x                          
   ....:                                       
In [59]: %timeit rand_row_doubles(np.arange(7) + 1, 1000000)
10 loops, best of 3: 187 ms per loop

As compared to:

In [66]: %timeit ManyRandDoubles(np.arange(7) + 1, 1000000)
1 loops, best of 3: 222 ms per loop

It's not a huge difference, but if you're really worried about speed, it's something.

Just to show that it's correct:

In [67]: x = rand_row_doubles(np.arange(7) + 1, 1000000)

In [68]: x.max(0)
Out[68]:
array([ 0.99999991,  1.99999971,  2.99999737,  3.99999569,  4.99999836,
        5.99999114,  6.99999738])

In [69]: x.min(0)
Out[69]:
array([  4.02099599e-07,   4.41729377e-07,   4.33480302e-08,
         7.43497138e-06,   1.28446819e-05,   4.27614385e-07,
         1.34106753e-05])

Likewise, for your "rows sum to one" part...

In [70]: def rand_rows_sum_to_one(nrows, ncols):
   ....:     x = np.random.random((ncols, nrows))
   ....:     y = x.sum(axis=0)
   ....:     x /= y
   ....:     return x.T
   ....:

In [71]: %timeit rand_rows_sum_to_one(1000000, 13)
1 loops, best of 3: 455 ms per loop

In [72]: x = rand_rows_sum_to_one(1000000, 13)

In [73]: x.sum(axis=1)
Out[73]: array([ 1.,  1.,  1., ...,  1.,  1.,  1.])

Honestly, even if you re-implement things in C, I'm not sure you'll be able to beat numpy by much on this one... I could be very wrong, though!
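One more option worth mentioning: numpy has since grown a newer Generator API (np.random.default_rng, backed by PCG64), which is typically faster than the legacy np.random functions used above. A sketch of both tasks with it:

```python
import numpy as np

rng = np.random.default_rng()  # PCG64 by default

# 7 columns with different upper bounds, via broadcasting.
limits = np.arange(7) + 1.0
x = rng.random((1000000, 7))
x *= limits

# 13 numbers per row that sum to 1.
y = rng.random((1000000, 13))
y /= y.sum(axis=1, keepdims=True)
```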

Solution 2

EDIT: Created functions that return the full set of numbers, not just one row at a time. EDIT 2: Made the functions more pythonic (and faster), and added a solution for the second question.

For the first set of numbers, you might consider numpy.random.randint or numpy.random.uniform, which take low and high parameters. Generating an array of 7 x 1,000,000 numbers in a specified range seems to take less than 0.7 seconds on my 2 GHz machine:

def LimitedRandInts(XLim, N):
    rowlen = (1,N)
    return [np.random.randint(low=0,high=lim,size=rowlen) for lim in XLim]

def LimitedRandDoubles(XLim, N):
    rowlen = (1,N)
    return [np.random.uniform(low=0,high=lim,size=rowlen) for lim in XLim]

>>> import numpy as np
>>> N = 1000000 #number of randoms in each range
>>> xLim = [x*500 for x in range(1,8)] #convenient limit generation
>>> fLim = [x/7.0 for x in range(1,8)]
>>> aa = LimitedRandInts(xLim, N)
>>> ff = LimitedRandDoubles(fLim, N)

This returns integers in [0, xLim-1] or floats in [0, fLim). The integer version took ~0.3 seconds, and the double version ~0.66 seconds, on my 2 GHz single-core machine.
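To double-check the half-open ranges, here is a small sanity check reusing LimitedRandInts from above (with a smaller N so it runs instantly):

```python
import numpy as np

def LimitedRandInts(XLim, N):
    rowlen = (1, N)
    return [np.random.randint(low=0, high=lim, size=rowlen) for lim in XLim]

xLim = [x * 500 for x in range(1, 8)]
aa = LimitedRandInts(xLim, 1000)

# randint's high bound is exclusive, so each row stays within [0, lim-1].
for row, lim in zip(aa, xLim):
    assert row.min() >= 0
    assert row.max() <= lim - 1
```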

For the second set, I used @Joe Kington's suggestion.

def SumToOneRands(NumToSum, N):
    aa = np.random.uniform(low=0,high=1.0,size=(NumToSum,N)) #13 rows by 1000000 columns, for instance
    s = np.reciprocal(aa.sum(0))
    aa *= s
    return aa.T #get back to column major order, so aa[k] is the kth set of 13 numbers

>>> ll = SumToOneRands(13, N)

This takes ~1.6 seconds.

In all cases, result[k] gives you the kth set of data.
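As with Solution 1, it's easy to confirm the sum-to-one property; the check below reuses SumToOneRands with a smaller N:

```python
import numpy as np

def SumToOneRands(NumToSum, N):
    aa = np.random.uniform(low=0, high=1.0, size=(NumToSum, N))
    s = np.reciprocal(aa.sum(0))
    aa *= s
    return aa.T

ll = SumToOneRands(13, 1000)

# Each of the 1000 rows holds 13 numbers summing to 1.
assert ll.shape == (1000, 13)
assert np.allclose(ll.sum(axis=1), 1.0)
```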

Solution 3

Try r = 1664525*r + 1013904223, from "an even quicker generator" in "Numerical Recipes in C" 2nd edition, Press et al., ISBN 0521431085, p. 284.
np.random is certainly "more random"; see Linear congruential generator.

In python, use np.uint32 like this:

python -mtimeit -s '
import numpy as np
r = 1
r = np.array([r], np.uint32)[0]  # 316 py -> 16 us np 
    # python longs can be arbitrarily long, so slow
' '
r = r*1664525 + 1013904223  # NR2 p. 284
'

To generate big blocks at a time:

# initialize --
np.random.seed(...)
R = np.random.randint(0, np.iinfo(np.uint32).max, size, dtype=np.uint32)
...
R *= 1664525
R += 1013904223
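Because uint32 arithmetic wraps modulo 2**32, the in-place update above is exactly the Numerical Recipes congruential step. A self-contained sketch (seed values here are arbitrary) that also maps the raw 32-bit words onto floats in [0, 1):

```python
import numpy as np

# Vectorized LCG with the Numerical Recipes constants; uint32
# arithmetic wraps modulo 2**32, which is what the generator needs.
R = np.arange(1, 1000001, dtype=np.uint32)  # arbitrary nonzero state
R *= np.uint32(1664525)
R += np.uint32(1013904223)

# Map the raw 32-bit words to floats in [0, 1).
f = R / np.float64(2**32)
```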

Solution 4

Making your code run in parallel certainly couldn't hurt. Try adapting it for SMP with Parallel Python.
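If Parallel Python isn't an option, a hedged sketch of the same idea with the standard library's multiprocessing (note each worker needs its own seed, or every process would produce identical numbers):

```python
import numpy as np
from multiprocessing import Pool

def make_chunk(args):
    """Generate one worker's share, with an independent per-worker seed."""
    seed, n = args
    rng = np.random.RandomState(seed)
    return rng.random_sample(n)

if __name__ == "__main__":
    n_workers, n_total = 4, 1000000
    chunk = n_total // n_workers
    with Pool(n_workers) as pool:
        parts = pool.map(make_chunk, [(s, chunk) for s in range(n_workers)])
    x = np.concatenate(parts)  # full 1,000,000-element result
```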

Solution 5

As others have already pointed out, numpy is a very good start, fast and easy to use.

If you need random numbers on a massive scale, consider AES-ECB or RC4. Both can be parallelised; you should reach a throughput of several GB/s.

achievable numbers posted here

Author: Sandro
Updated on December 22, 2021

Comments

  • Sandro
    Sandro over 2 years

    I am currently writing an app in python that needs to generate a large amount of random numbers, FAST. Currently I have a scheme going that uses numpy to generate all of the numbers in a giant batch (about ~500,000 at a time). While this seems to be faster than python's built-in implementation, I still need it to go faster. Any ideas? I'm open to writing it in C and embedding it in the program, or doing whatever it takes.

    Constraints on the random numbers:

    • A Set of 7 numbers that can all have different bounds:
      • eg: [0-X1, 0-X2, 0-X3, 0-X4, 0-X5, 0-X6, 0-X7]
      • Currently I am generating a list of 7 numbers with random values from [0-1) then multiplying by [X1..X7]
    • A Set of 13 numbers that all add up to 1
      • Currently just generating 13 numbers then dividing by their sum

    Any ideas? Would pre calculating these numbers and storing them in a file make this faster?

    Thanks!

    • John Feminella
      John Feminella about 14 years
      It's pretty much a guarantee that going to disk I/O will not make it faster, so the file storage approach is probably not what you want.
    • Anycorn
      Anycorn about 14 years
      what is the range of numbers? I assume they are floating-point? How much faster do you have to go? How much randomness do you need? Can you get by generating N/7 numbers and rotating them?
    • poke
      poke about 14 years
      How important is it that those numbers are randomly generated when you need them? Would it be an option to store maybe 5 times as much previously generated random numbers and just select a random set of those?
    • Sandro
      Sandro about 14 years
      OK, so I/O is out of the question (seems obvious enough). Right now they are floating point, but I can swing to ints. Currently I'm at ~5 seconds per generation, and I need sub 1 second. The randomness is important, so I can't get too fake.
    • jfs
      jfs about 14 years
      Could you provide your slow implementation? It takes 12 milliseconds to generate 1,000,000 random integers. How much faster do you need? python -mtimeit -s'import numpy as np' 'np.random.randint(low=0, high=500, size=(1000000,1))' -> 100 loops, best of 3: 11.9 msec per loop
    • msw
      msw about 14 years
      Amplifying aaa: without knowing the application you are putting them to, it is hard to answer. There are much faster pseudo-random number generators than the default in module random, which have smaller periods and which, depending on the application, may be suitable.
    • President James K. Polk
      President James K. Polk about 14 years
      [4] * 1000000 should be fastest. However, you may find the xkcd random number generator does not meet your needs as far as randomness goes. Can you say what those needs are?
    • Anycorn
      Anycorn about 14 years
      can you use floats rather than doubles? I think numpy defaults to doubles. This should cut your memory in half (and bandwidth).
  • Anycorn
    Anycorn about 14 years
    you may win a few cycles if you multiply by the inverse instead of dividing, given floating-point performance.
  • Sandro
    Sandro about 14 years
    I'll have to give that a whack. Do you know the performance of stacking arrays horizontally (not sure how to word this) to combine the arrays?
  • mtrw
    mtrw about 14 years
    @aaa - Thanks, I pulled your suggestion into the answer. @Sandro - I think stack is not great. You might be able to preallocate the array. I'll see if I can make that work and update the answer.
  • Anycorn
    Anycorn about 14 years
    another thing you can do is generate N/13 random numbers and rotate them clockwise or counterclockwise. This will generate random sets (but not random members in general). We really need to know where the bottleneck is.
  • Sandro
    Sandro about 14 years
    Actually, due to the large memory required, copying the memory or sending it over a pipe is quite expensive and so far has actually been slowing me down.
  • mtrw
    mtrw about 14 years
    @J.F. Sebastian - whoops. Thanks for catching that. @aaa - I get the impression the OP wants each set to add to 1 exactly. I don't see that in your second suggestion. Am I missing something?
  • mtrw
    mtrw about 14 years
    @Joe - I tried your method for the limited numbers and found it to be slower on my machine. I'm curious if you could try mine and compare? I also stole your method for the sum-to-1 numbers; it was way faster than what I was trying before. Thanks!
  • Joe Kington
    Joe Kington about 14 years
    @mtrw- Your updated functions are faster than mine by a fair bit, now. (166ms vs 184ms) Yours don't require that the entire chunk of memory be contiguous, just the memory for each column, which I think is what's mostly causing the difference. The downside is in accessing the data after it's created. You'll have to use list comprehensions (or similar) for yours, whereas mine returns a single 2D numpy array (slightly faster and much more flexible indexing). It doesn't matter much if you just need to access one row at a time, though. Cheers!
  • Sandro
    Sandro about 14 years
    Woah that's a lot of work you put into it! I'm sorry about that. So it looks like you guys are good, and sure enough you're making numpy fly for me. The only problem that I'm having is trying to get the dimensions to be correct. For the first case the shape isn't quite right (I'm trying to get it myself). But basically for Xlim, N I need N arrays with len(Xlim) elements where element i of the array is no larger than Xlim[i]. It looks like it goes a little bit backwards in your code. And yes for the second problem, I need the sum of all of the elements in a given set to be equal to 1
  • Sandro
    Sandro about 14 years
    OK I used numpy.column_stack to get what I needed. Seems to be doing the right thing. Now after I generate these values I sadly have to convert them into tuples (The arrays from the first case) and store them into a dict. Any tips on that?
  • mtrw
    mtrw about 14 years
    @Sandro - I think Joe Kington's solution is more what you're looking for in terms of getting the output arrays in the right shape. The slight speed advantage the list comprehensions have probably get lost in reformatting for shape, so I'd go with his answer.
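On the tuples-into-a-dict question raised in the comments above, one way to sketch the conversion (the stored value is a placeholder here, since the post doesn't say what gets associated with each key):

```python
import numpy as np

rows = np.random.random((1000, 7))

# numpy arrays aren't hashable, so each row becomes a tuple before being
# used as a dict key. 0.0 is a placeholder value for illustration.
d = {tuple(row): 0.0 for row in rows}
```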