Why does C++ rand() seem to generate only numbers of the same order of magnitude?

c++ c math random

10,015

Solution 1

There are only 3% of numbers between 1 and 2³⁰ which are NOT between 2²⁵ and 2³⁰. So, this sounds pretty normal :)

Because 2²⁵ / 2³⁰ = 2^-5 = 1/32 = 0.03125 = 3.125%

Solution 2

The lighter green is the region between 0 and 2²⁵; the darker green is the region between 2²⁵ and 2³⁰. The ticks are powers of 2.

distribution

Solution 3

You need to be more precise: you want different base 2 logarithm values but what distribution do you want for this? The standard rand() functions generate a uniform distribution, you will need to transform this output using the quantile function associated with the distribution that you want.

If you tell us the distribution then we can tell you the quantile function you need.

Solution 4

If you want different orders of magnitude, why not simply try pow(2, rand())? Or perhaps choose the order directly as rand(), as Harold suggested?

Solution 5

@C4stor made a great point. But, for a more general case and easier to understand for human (base 10): for the range from 1 to 10^n, ~90% of the numbers are from 10^(n-1) to 10^n, therefore, ~99% of the numbers go from 10^(n-2) to 10^n. Keep adding as many decimals as you want.

Funny mathematics, if you keep doing this for n, you can see that from 1 to 10^n, 99.9999...% = 100% of the numbers are from 10^0 to 10^n with this method.

Now about code, if you want a random number with random orders of magnitude, from 0 to 10^n, you could do:

Generate a small random number from 0 to n
If you know the range that n has, generate a big random number of order 10^k where k > max{n}.
Cut the longer random number to get the n digits of this big random number.

View more solutions

10,015

Author by

Tallaron Mathias

Updated on June 14, 2022

Comments

Tallaron Mathias almost 2 years

In a small application written in C/C++, I am facing a problem with the rand function and maybe the seed :

I want to produce a sequence of random numbers that are of different orders, i.e. with different logarithm values (base 2). But it seems that all the numbers produced are of the same order, fluctuating just between 2^25 and 2^30.

Is it because rand() is seeded with Unix time which is by now a relatively big number? What am I forgetting ? I am seeding rand() only once at the beginning of the main().
Tallaron Mathias almost 11 years

Aye, good point ! There are 31 times more numbers between 2^25 and 2^30 than between 1 and 2^25 :) thanks for the quick answer. I need to rethink the program then. Question answered.
kriss almost 11 years

good idea, but you should fix your answer using pow instead of ^ (which is the logical xor operator, not power, in C language).
Ask About Monica almost 11 years

You're completely correct, but for a REALLY easy to understand answer, the OP should ask himself why 90% of the random numbers between 1 and 100 are two digits.
Sean Allred almost 11 years

@TallaronMathias Consider truncating the number through >> bitshifting -- this will give you smaller numbers. (Or taking a modulus with %.)
BlueRaja - Danny Pflughoeft almost 11 years

I would expect this to be obvious to most programmers: Any unsigned integer less than 2^25 must have its first 7 bits equal to 0 - and if every bit is random...
Brett Hale almost 11 years

@BlueRaja-DannyPflughoeft - if probabilities were obvious, casinos would be out of business.
leftaroundabout almost 11 years

+1, distribution is the crucial term. It doesn't really make sense to talk about random numbers when nothing is known about the distribution. Uniform is just a special case, albeit an important one. Might be a good place to point out various distributions from the C++11 standard library.
EkoostikMartin almost 11 years

@BrettHale - I don't think programmers are a casino's target demographic though.
Ant almost 11 years

same answer as the problem : "What is the probability of getting 5 tails in 5 toss of a fair coin? " :-)
Floris almost 11 years

Since rand() can go up to RAND_MAX, you really need to scale your random number so the result doesn't overflow...
Mooing Duck almost 11 years

nitpick: this encourages very small numbers more than one might expect. The probability of getting a zero is significantly higher than a 10.
André Caron almost 11 years

@Floris: but if you scale a small countable range on a very large range, you will have LOTS of holes, which is probably not what OP is expecting.
gkimsey almost 11 years

Side note: To get a random 32-bit integer with a fairly even distribution of base 2 orders of magnitude, you could try: rand() & ((1 << (rand() % 33)) - 1). There may be a somewhat cleaner way to do it but that's the simplest I could come up with quickly. Actually kind of a neat problem.
Floris almost 11 years

Well - the whole point of this is to encourage small numbers, so I'm glad it's working! I ran a Monte Carlo simulation, and this is giving me a factor 2 drop in probability as numbers double - not unlike a log distribution. Updated answer with a picture.
Mooing Duck almost 11 years

no, I mean, with rand()>>(rand()&31);, one would intuitively expect 1/32nd of the numbers to have 32 bits, and 1/32nd of the numbers to have 31 bits, and 1/32nd of the numbers to have 30 bits, etc. But that's not the results you're getting, only about than 1/64th of the numbers would result in 32 bits, while almost half should be 0. Since my mental math disagrees with your measurements, I'll have to do my own measurements to figure this out.
Mooing Duck almost 11 years

Yeah, my measurements match my theory. Since your x-axis isn't logarithmic, you're overlooking the problem.
Mooing Duck almost 11 years

I don't mean to say your code is wrong. It's probably what I would do. It just deserves a warning that the results aren't quite distributed as one might expect.
Mooing Duck almost 11 years

Also note theres no results with 32 bits
Mooing Duck almost 11 years

This variant is slower but gives 1-32 bits with an even distribution.
Floris almost 11 years

Hmmm that's strange. I am missing something obvious - how is your code getting away with block[log2(0)]++;? NVM - I just noticed how you defined log2. I will have to think more about this. Thanks for the feedback and code samples!
Mooing Duck almost 11 years

It occurs to my that my "evenly distributed" version will never come up with a zero. Oh well.
Floris almost 11 years

I think the problem comes from thinking of 0 as a 1 bit number... that's the kind of conundrum you run into when you mix integers and logarithms. It's been a good exercise though and you gave me something to think about. "Test the limits of your algorithm" - it never gets old.
peterh over 10 years

Yes. Make a random float between 0 and 1, multiple with 32 (as float), take its logarithm (as float), divide with the logarithm of 2 (as float), take the exponential of the result (as float), and finally convert this to integer. You will get floating numbers, but in a logarithmical distribution. I think to this: int(exp(log(rand()*32)/log(2))). That means, that the numbers with lower exponents will come with bigger occurrence, for example 2^10<x<2^11 will have the same probability as 2^11<x<2^12 .
Shai over 10 years

Welcome to SO. please refrain from posting links as answers. You can provide a detailed sketch of an answer leaving the details to be read via links.