Why does C++ rand() seem to generate only numbers of the same order of magnitude?

10,015

Solution 1

There are only 3% of numbers between 1 and 230 which are NOT between 225 and 230. So, this sounds pretty normal :)

Because 225 / 230 = 2-5 = 1/32 = 0.03125 = 3.125%

Solution 2

The lighter green is the region between 0 and 225; the darker green is the region between 225 and 230. The ticks are powers of 2.

distribution

Solution 3

You need to be more precise: you want different base 2 logarithm values but what distribution do you want for this? The standard rand() functions generate a uniform distribution, you will need to transform this output using the quantile function associated with the distribution that you want.

If you tell us the distribution then we can tell you the quantile function you need.

Solution 4

If you want different orders of magnitude, why not simply try pow(2, rand())? Or perhaps choose the order directly as rand(), as Harold suggested?

Solution 5

@C4stor made a great point. But, for a more general case and easier to understand for human (base 10): for the range from 1 to 10^n, ~90% of the numbers are from 10^(n-1) to 10^n, therefore, ~99% of the numbers go from 10^(n-2) to 10^n. Keep adding as many decimals as you want.

Funny mathematics, if you keep doing this for n, you can see that from 1 to 10^n, 99.9999...% = 100% of the numbers are from 10^0 to 10^n with this method.

Now about code, if you want a random number with random orders of magnitude, from 0 to 10^n, you could do:

  1. Generate a small random number from 0 to n

  2. If you know the range that n has, generate a big random number of order 10^k where k > max{n}.

  3. Cut the longer random number to get the n digits of this big random number.

Share:
10,015
Tallaron Mathias
Author by

Tallaron Mathias

Updated on June 14, 2022

Comments

  • Tallaron Mathias
    Tallaron Mathias almost 2 years

    In a small application written in C/C++, I am facing a problem with the rand function and maybe the seed :

    I want to produce a sequence of random numbers that are of different orders, i.e. with different logarithm values (base 2). But it seems that all the numbers produced are of the same order, fluctuating just between 2^25 and 2^30.

    Is it because rand() is seeded with Unix time which is by now a relatively big number? What am I forgetting ? I am seeding rand() only once at the beginning of the main().

  • Tallaron Mathias
    Tallaron Mathias almost 11 years
    Aye, good point ! There are 31 times more numbers between 2^25 and 2^30 than between 1 and 2^25 :) thanks for the quick answer. I need to rethink the program then. Question answered.
  • kriss
    kriss almost 11 years
    good idea, but you should fix your answer using pow instead of ^ (which is the logical xor operator, not power, in C language).
  • Ask About Monica
    Ask About Monica almost 11 years
    You're completely correct, but for a REALLY easy to understand answer, the OP should ask himself why 90% of the random numbers between 1 and 100 are two digits.
  • Sean Allred
    Sean Allred almost 11 years
    @TallaronMathias Consider truncating the number through >> bitshifting -- this will give you smaller numbers. (Or taking a modulus with %.)
  • BlueRaja - Danny Pflughoeft
    BlueRaja - Danny Pflughoeft almost 11 years
    I would expect this to be obvious to most programmers: Any unsigned integer less than 2^25 must have its first 7 bits equal to 0 - and if every bit is random...
  • Brett Hale
    Brett Hale almost 11 years
    @BlueRaja-DannyPflughoeft - if probabilities were obvious, casinos would be out of business.
  • leftaroundabout
    leftaroundabout almost 11 years
    +1, distribution is the crucial term. It doesn't really make sense to talk about random numbers when nothing is known about the distribution. Uniform is just a special case, albeit an important one. Might be a good place to point out various distributions from the C++11 standard library.
  • EkoostikMartin
    EkoostikMartin almost 11 years
    @BrettHale - I don't think programmers are a casino's target demographic though.
  • Ant
    Ant almost 11 years
    same answer as the problem : "What is the probability of getting 5 tails in 5 toss of a fair coin? " :-)
  • Floris
    Floris almost 11 years
    Since rand() can go up to RAND_MAX, you really need to scale your random number so the result doesn't overflow...
  • Mooing Duck
    Mooing Duck almost 11 years
    nitpick: this encourages very small numbers more than one might expect. The probability of getting a zero is significantly higher than a 10.
  • André Caron
    André Caron almost 11 years
    @Floris: but if you scale a small countable range on a very large range, you will have LOTS of holes, which is probably not what OP is expecting.
  • gkimsey
    gkimsey almost 11 years
    Side note: To get a random 32-bit integer with a fairly even distribution of base 2 orders of magnitude, you could try: rand() & ((1 << (rand() % 33)) - 1). There may be a somewhat cleaner way to do it but that's the simplest I could come up with quickly. Actually kind of a neat problem.
  • Floris
    Floris almost 11 years
    Well - the whole point of this is to encourage small numbers, so I'm glad it's working! I ran a Monte Carlo simulation, and this is giving me a factor 2 drop in probability as numbers double - not unlike a log distribution. Updated answer with a picture.
  • Mooing Duck
    Mooing Duck almost 11 years
    no, I mean, with rand()>>(rand()&31);, one would intuitively expect 1/32nd of the numbers to have 32 bits, and 1/32nd of the numbers to have 31 bits, and 1/32nd of the numbers to have 30 bits, etc. But that's not the results you're getting, only about than 1/64th of the numbers would result in 32 bits, while almost half should be 0. Since my mental math disagrees with your measurements, I'll have to do my own measurements to figure this out.
  • Mooing Duck
    Mooing Duck almost 11 years
    Yeah, my measurements match my theory. Since your x-axis isn't logarithmic, you're overlooking the problem.
  • Mooing Duck
    Mooing Duck almost 11 years
    I don't mean to say your code is wrong. It's probably what I would do. It just deserves a warning that the results aren't quite distributed as one might expect.
  • Mooing Duck
    Mooing Duck almost 11 years
    Also note theres no results with 32 bits
  • Mooing Duck
    Mooing Duck almost 11 years
  • Floris
    Floris almost 11 years
    Hmmm that's strange. I am missing something obvious - how is your code getting away with block[log2(0)]++;? NVM - I just noticed how you defined log2. I will have to think more about this. Thanks for the feedback and code samples!
  • Mooing Duck
    Mooing Duck almost 11 years
    It occurs to my that my "evenly distributed" version will never come up with a zero. Oh well.
  • Floris
    Floris almost 11 years
    I think the problem comes from thinking of 0 as a 1 bit number... that's the kind of conundrum you run into when you mix integers and logarithms. It's been a good exercise though and you gave me something to think about. "Test the limits of your algorithm" - it never gets old.
  • peterh
    peterh over 10 years
    Yes. Make a random float between 0 and 1, multiple with 32 (as float), take its logarithm (as float), divide with the logarithm of 2 (as float), take the exponential of the result (as float), and finally convert this to integer. You will get floating numbers, but in a logarithmical distribution. I think to this: int(exp(log(rand()*32)/log(2))). That means, that the numbers with lower exponents will come with bigger occurrence, for example 2^10<x<2^11 will have the same probability as 2^11<x<2^12 .
  • Shai
    Shai over 10 years
    Welcome to SO. please refrain from posting links as answers. You can provide a detailed sketch of an answer leaving the details to be read via links.