Is java.util.Random really that random? How can I generate 52! (factorial) possible sequences?

12,795

Solution 1

Selecting a random permutation requires simultaneously more and less randomness than what your question implies. Let me explain.

The bad news: need more randomness.

The fundamental flaw in your approach is that it's trying to choose between ~2226 possibilities using 64 bits of entropy (the random seed). To fairly choose between ~2226 possibilities you're going to have to find a way to generate 226 bits of entropy instead of 64.

There are several ways to generate random bits: dedicated hardware, CPU instructions, OS interfaces, online services. There is already an implicit assumption in your question that you can somehow generate 64 bits, so just do whatever you were going to do, only four times, and donate the excess bits to charity. :)

The good news: need less randomness.

Once you have those 226 random bits, the rest can be done deterministically and so the properties of java.util.Random can be made irrelevant. Here is how.

Let's say we generate all 52! permutations (bear with me) and sort them lexicographically.

To choose one of the permutations all we need is a single random integer between 0 and 52!-1. That integer is our 226 bits of entropy. We'll use it as an index into our sorted list of permutations. If the random index is uniformly distributed, not only are you guaranteed that all permutations can be chosen, they will be chosen equiprobably (which is a stronger guarantee than what the question is asking).

Now, you don't actually need to generate all those permutations. You can produce one directly, given its randomly chosen position in our hypothetical sorted list. This can be done in O(n2) time using the Lehmer[1] code (also see numbering permutations and factoriadic number system). The n here is the size of your deck, i.e. 52.

There is a C implementation in this StackOverflow answer. There are several integer variables there that would overflow for n=52, but luckily in Java you can use java.math.BigInteger. The rest of the computations can be transcribed almost as-is:

public static int[] shuffle(int n, BigInteger random_index) {
    int[] perm = new int[n];
    BigInteger[] fact = new BigInteger[n];
    fact[0] = BigInteger.ONE;
    for (int k = 1; k < n; ++k) {
        fact[k] = fact[k - 1].multiply(BigInteger.valueOf(k));
    }

    // compute factorial code
    for (int k = 0; k < n; ++k) {
        BigInteger[] divmod = random_index.divideAndRemainder(fact[n - 1 - k]);
        perm[k] = divmod[0].intValue();
        random_index = divmod[1];
    }

    // readjust values to obtain the permutation
    // start from the end and check if preceding values are lower
    for (int k = n - 1; k > 0; --k) {
        for (int j = k - 1; j >= 0; --j) {
            if (perm[j] <= perm[k]) {
                perm[k]++;
            }
        }
    }

    return perm;
}

public static void main (String[] args) {
    System.out.printf("%s\n", Arrays.toString(
        shuffle(52, new BigInteger(
            "7890123456789012345678901234567890123456789012345678901234567890"))));
}

[1] Not to be confused with Lehrer. :)

Solution 2

Your analysis is correct: seeding a pseudo-random number generator with any specific seed must yield the same sequence after a shuffle, limiting the number of permutations that you could obtain to 264. This assertion is easy to verify experimentally by calling Collection.shuffle twice, passing a Random object initialized with the same seed, and observing that the two random shuffles are identical.

A solution to this, then, is to use a random number generator that allows for a larger seed. Java provides SecureRandom class that could be initialized with byte[] array of virtually unlimited size. You could then pass an instance of SecureRandom to Collections.shuffle to complete the task:

byte seed[] = new byte[...];
Random rnd = new SecureRandom(seed);
Collections.shuffle(deck, rnd);

Solution 3

In general, a pseudorandom number generator (PRNG) can't choose from among all permutations of a 52-item list if its maximum cycle length is less than 226 bits.

java.util.Random implements an algorithm with a modulus of 248 and a maximum cycle length of only 48 bits, so much less than the 226 bits I referred to. You will need to use another PRNG with a bigger cycle length, specifically one with a maximum cycle length of 52 factorial or greater.

See also "Shuffling" in my article on random number generators.

This consideration is independent of the nature of the PRNG; it applies equally to cryptographic and noncryptographic PRNGs (of course, noncryptographic PRNGs are inappropriate whenever information security is involved).


Although java.security.SecureRandom allows seeds of unlimited length to be passed in, the SecureRandom implementation could use an underlying PRNG (e.g., "SHA1PRNG" or "DRBG"). And it depends on that PRNG's maximum cycle length whether it's capable of choosing from among 52 factorial permutations.

Solution 4

Let me apologize in advance, because this is a little tough to understand...

First of all, you already know that java.util.Random is not completely random at all. It generates sequences in a perfectly predictable way from the seed. You are completely correct that, since the seed is only 64 bits long, it can only generate 2^64 different sequences. If you were to somehow generate 64 real random bits and use them to select a seed, you could not use that seed to randomly choose between all of the 52! possible sequences with equal probability.

However, this fact is of no consequence as long as you're not actually going to generate more than 2^64 sequences, as long as there is nothing 'special' or 'noticeably special' about the 2^64 sequences that it can generate.

Lets say you had a much better PRNG that used 1000-bit seeds. Imagine you had two ways to initialize it -- one way would initialize it using the whole seed, and one way would hash the seed down to 64 bits before initializing it.

If you didn't know which initializer was which, could you write any kind of test to distinguish them? Unless you were (un)lucky enough to end up initializing the bad one with the same 64 bits twice, then the answer is no. You could not distinguish between the two initializers without some detailed knowledge of some weakness in the specific PRNG implementation.

Alternatively, imagine that the Random class had an array of 2^64 sequences that were selected completely and random at some time in the distant past, and that the seed was just an index into this array.

So the fact that Random uses only 64 bits for its seed is actually not necessarily a problem statistically, as long as there is no significant chance that you will use the same seed twice.

Of course, for cryptographic purposes, a 64 bit seed is just not enough, because getting a system to use the same seed twice is computationally feasible.

EDIT:

I should add that, even though all of the above is correct, that the actual implementation of java.util.Random is not awesome. If you are writing a card game, maybe use the MessageDigest API to generate the SHA-256 hash of "MyGameName"+System.currentTimeMillis(), and use those bits to shuffle the deck. By the above argument, as long as your users are not really gambling, you don't have to worry that currentTimeMillis returns a long. If your users are really gambling, then use SecureRandom with no seed.

Solution 5

I'm going to take a bit of a different tack on this. You're right on your assumptions - your PRNG isn't going to be able to hit all 52! possibilities.

The question is: what's the scale of your card game?

If you're making a simple klondike-style game? Then you definitely don't need all 52! possibilities. Instead, look at it like this: a player will have 18 quintillion distinct games. Even accounting for the 'Birthday Problem', they'd have to play billions of hands before they'd run into the first duplicate game.

If you're making a monte-carlo simulation? Then you're probably okay. You might have to deal with artifacts due to the 'P' in PRNG, but you're probably not going to run into problems simply due to a low seed space (again, you're looking at quintillions of unique possibilities.) On the flip side, if you're working with large iteration count, then, yeah, your low seed space might be a deal-breaker.

If you're making a multiplayer card game, particularly if there's money on the line? Then you're going to need to do some googling on how the online poker sites handled the same problem you're asking about. Because while the low seed space issue isn't noticeable to the average player, it is exploitable if it's worth the time investment. (The poker sites all went through a phase where their PRNGs were 'hacked', letting someone see the hole cards of all the other players, simply by deducing the seed from exposed cards.) If this is the situation you're in, don't simply find a better PRNG - you'll need to treat it as seriously as a Crypto problem.

Share:
12,795

Related videos on Youtube

Sergei Emelianov
Author by

Sergei Emelianov

Android developer, happy father, watch collector...

Updated on July 21, 2022

Comments

  • Sergei Emelianov
    Sergei Emelianov almost 2 years

    I've been using Random (java.util.Random) to shuffle a deck of 52 cards. There are 52! (8.0658175e+67) possibilities. Yet, I've found out that the seed for java.util.Random is a long, which is much smaller at 2^64 (1.8446744e+19).

    From here, I'm suspicious whether java.util.Random is really that random; is it actually capable of generating all 52! possibilities?

    If not, how can I reliably generate a better random sequence that can produce all 52! possibilities?

    • T.J. Crowder
      T.J. Crowder almost 6 years
      "how can I surely generate a real random number over 52!" The numbers from Random are never real random numbers. It's a PRNG, where P stands for "pseudo." For real random numbers, you need a source of randomness (such as random.org).
    • Radiodef
      Radiodef almost 6 years
      How are you shuffling the deck of cards? (Collections.shuffle, or something else?)
    • Sergey Kalinichenko
      Sergey Kalinichenko almost 6 years
      @JimGarrison That is not what OP's after. He's talking about 10^68 possible sequences. Since each pseudo-random sequence is identified by its seed, OP says there could be at the most 2^64 different sequences.
    • Sergei Emelianov
      Sergei Emelianov almost 6 years
      @Radiodef - Yes, exactly that I was doing so far
    • chrylis -cautiouslyoptimistic-
      chrylis -cautiouslyoptimistic- almost 6 years
      OP: I've reworded your question substantially, but I believe it accurately reflects the concern you have and the problem you're trying to solve. If I misunderstood, feel free to re-edit or roll back.
    • Sergei Emelianov
      Sergei Emelianov almost 6 years
      @chrylis -Yes, that's exactly what I meant to say, thank you!
    • NPE
      NPE almost 6 years
      I think it's an interesting question, and is worth thinking about. But I can't help wondering about your problem context: what it is exactly that's leading to the requirement to be able to generate all 52! permutations? For example, in real-world bridge we can shuffle the deck and deal one card at a time, yet there are only ~6e11 different hands since many different permutations result in the same hand. Thinking in the other direction, do you need a solution specifically for 52!, or do you need one that generalizes to, say, two decks shuffled together (104!/(2**52) possibilities, or ~2e150)?
    • Sergei Emelianov
      Sergei Emelianov almost 6 years
      @NPE - Take Solitaire (Klondike) for instance, 52! is exactly the number of possible hands..
    • NPE
      NPE almost 6 years
      @SerjArdovic: Awesome, thank you for that example!
    • Shmoopy
      Shmoopy almost 6 years
      Can't you choose a random position for each card instead of generatiing a random sequence at one go? Meaning, pick a random position for the first card (1..52) and then pick a random position for the second card (1..52 minus the first card's position) and so on..
    • Radiodef
      Radiodef almost 6 years
      @Shmoopy That generates a sequence. (1/52, 1/51, 1/50, ..., 1/1.) It's also what the OP is already doing. See their comment replying to mine confirming that they used Collections.shuffle, which is a Fisher-Yates shuffle.
    • Giacomo Alzetta
      Giacomo Alzetta almost 6 years
      Note that factorial grows strictly faster than exponential. This means that even using SecureRandom with a byte[] will not work if you increase the number of cards enough... sure you can increase the size of the seed, but unfortunately you need to increase the seed size asymptotically more to be able to keep generating all possible permutations, and hence even for """relatively small N number""" you will not be able to generate all permutations of N cards even using all memory storage on earth.
    • Dennis_E
      Dennis_E almost 6 years
      I think this is an interesting read: superuser.com/a/712583
    • caitlin lopez
      caitlin lopez almost 6 years
      nope, the java random is not secure nor random. SecureRandom is a little more random but I will trust more to bouncy castle random generators or openssh
    • Hans-Peter Störr
      Hans-Peter Störr almost 6 years
      BTW, why would you want it to generate all 52! possibilities? I guess from the standpoint of a user all that matters is that it's not predictable, and that's something different, entirely. After all, the game could just enumerate all 52! possibilities, and be very predictable...
  • Sergei Emelianov
    Sergei Emelianov almost 6 years
    Am I right to think that byte[] should be another random seed which I should get from entropy measurement (function of temperature, location, time)?
  • NPE
    NPE almost 6 years
    Surely, a large seed isn't a guarantee that all 52! possibilities would be produced (which is what this question is specifically about)? As a thought experiment, consider a pathological PRNG that takes an arbitrarily large seed and generates an infinitely long series of zeroes. It seems pretty clear that the PRNG needs to satisfy more requirements than just taking a large enough seed.
  • Sergey Kalinichenko
    Sergey Kalinichenko almost 6 years
    @SerjArdovic Yes, any seed material passed to a SecureRandom object must be unpredictable, as per Java documentation.
  • Sergei Emelianov
    Sergei Emelianov almost 6 years
    @dasblinkenlight - Now I have to answer another question: how many bytes, i.e. what should be the length of the byte[] to make sure that I can eventually generate every single possibility of the 52!
  • Sergey Kalinichenko
    Sergey Kalinichenko almost 6 years
    @NPE You are right, although a too-small seed is a guarantee of the upper limit, a large enough seed is not a guaranteed on the lower limit. All this does is removing of a theoretical upper limit, making it possible for the RNG to generate all 52! combinations.
  • Sergey Kalinichenko
    Sergey Kalinichenko almost 6 years
    @SerjArdovic The smallest number of bytes required for that is 29 (you need 226 bits to represent 52! possible bit combinations, which is 28.25 bytes, so we must round it up). Note that using 29 bytes of seed material removes the theoretical upper limit on the number of shuffles you could get, without establishing the lower limit (see NPE's comment about a crappy RNG that takes a very large seed and generates a sequence of all zeros).
  • Sergei Emelianov
    Sergei Emelianov almost 6 years
    @dasblinkenlight - What do you mean by saying 'without establishing the lower limit'? Isn't it going to provide me with all 52! possibilities all the way from 0 to maximum?
  • Stephen S
    Stephen S almost 6 years
    @SerjArdovic that's not how RNGs work, they are formulas that generate random numbers from one number (the seed), they don't iterate through every possible sequence of numbers
  • Peter O.
    Peter O. almost 6 years
    The SecureRandom implementation will almost surely use an underlying PRNG. And it depends on that PRNG's period (and to a lesser extent, state length) whether it's capable of choosing from among 52 factorial permutations. (Note that the documentation says the SecureRandom implementation "minimally complies with" certain statistical tests and generates outputs that "must be cryptographically strong", but places no explicit lower limit on the underlying PRNG's state length or on its period.)
  • pjs
    pjs almost 6 years
    @SerjArdovic The number of permutations available to you is a function of the size of the PRNG's state space, not its seed value. The seed is just an entry or starting point into the state space. See the second paragraph of this answer to another post.
  • T.J. Crowder
    T.J. Crowder almost 6 years
    Heh, and I was sure the link at the end would be New Math. :-)
  • NPE
    NPE almost 6 years
    @T.J.Crowder: It very nearly was! It was the infinitely differentiable Riemannian manifolds that swung it. :-)
  • T.J. Crowder
    T.J. Crowder almost 6 years
    Good to see people appreciating the classics. :-)
  • vpalmu
    vpalmu almost 6 years
    A new enough SecureRandom should have at least 256 bits of internal state as the 128 bit hashes are all broken now.
  • Thorsten S.
    Thorsten S. almost 6 years
    Wrong, wrong, wrong. If you didn't know which initializer was which, could you write any kind of test to distinguish them? Yes, you can. In the 64 bit case there are possible card combinations which can never come up. I think you did not understand the magnitude of the problem. The OP wants a working system, not something which looks like random and is inherently flawed.
  • Thorsten S.
    Thorsten S. almost 6 years
    Where do you get the random 226 bits in Java? Sorry, your code does not answer that.
  • Matt Timmermans
    Matt Timmermans almost 6 years
    @ThorstenS, how could you write any kind of test that could determine that there are card combinations that can never come up?
  • Thorsten S.
    Thorsten S. almost 6 years
    There are several random number test suites like Diehard from George Marsaglia or TestU01 from Pierre L’Ecuyer/Richard Simard which easily find statistical anomalies in random output. For card checking you can use two squares. You determine the card order. The first square shows the position of the first two cards as xy pair: The first card as x and the difference(!) position (-26-25) of the second card as y. The second square shows 3rd and 4th card with (-25-25) relative to the 2nd/3rd. This will show immediately gaps and clusters in your distribution if you run it for a time.
  • Matt Timmermans
    Matt Timmermans almost 6 years
    Well, that's not the test you said you could write, but it also doesn't apply. Why do you assume that there are gaps and clusters in the distribution that such tests would uncover? That would imply a "specific weakness in the PRNG implementation" as I mentioned, and has nothing at all to do with the number of possible seeds. Such tests don't even require you to reseed the generator. I did warn at the start that this was hard to understand.
  • Stop harming Monica
    Stop harming Monica almost 6 years
    @ThorstenS. "There is already an implicit assumption in your question that you can somehow generate 64 bits, so just do whatever you were going to do, only four times"
  • Thorsten S.
    Thorsten S. almost 6 years
    @PeterO. See my answer.
  • Thorsten S.
    Thorsten S. almost 6 years
    @Goyo Does not work if you have not a "true" random number generator. The problem is that the next state is dependent on the former state, the output is determined. Some PRNGs like the Mersenne Twister have not this problem because they have 624 independent elements, but they also need a truly random seed. Java Random() has effectively always 48bits, it does not matter how many bits you are extracting. You can extract thousand of bits from the sequence 12345678... or 23456789..., but the effective information is only the bitlength of 1 or 2.
  • Stop harming Monica
    Stop harming Monica almost 6 years
    I don't understand what you mean, Java Random() won't provide 64 bits of entropy either. The OP implies an unspecified source that can produce 64 bits to seed the PRNG. It makes sense to assume that you can ask the same source for 226 bits.
  • Draco18s no longer trusts SE
    Draco18s no longer trusts SE almost 6 years
    Thought: what if you shuffled the deck using a new Random() and then shuffled the result again using a second new Random()? Would that contain a sufficient number of random bits?
  • Sneftel
    Sneftel almost 6 years
    @ThorstenS. Those test suites absolutely will not determine whether your source is a 64-bit-seeded cryptographically secure PRNG or a true RNG. (Testing PRNGs is what those suites are for, after all.) Even if you knew the algorithm in use, a good PRNG makes it infeasible to determine the state without a brute force search of the state space.
  • R.. GitHub STOP HELPING ICE
    R.. GitHub STOP HELPING ICE almost 6 years
    @ThorstenS.: In a real deck of cards, the vast majority of combinations will never come up. You just don't know which ones those are. For a half-decent PRNG it's the same - if you can test whether a given output sequence that long is in its image, that's a flaw in the PRNG. Ridiculously huge state/period like 52! is not needed; 128-bit should suffice.
  • Thorsten S.
    Thorsten S. almost 6 years
    @MattTimmermans I wrote a small program to demonstrate what I am talking about, the only problem is that Random() of Java (which implements an linear congruential generator) uses an algorithm of Knuth and it is astonishingly stubborn. My idea of running through all permutations to show that some are not reachable did not work so far for 52*51*50 = 132 000. That sounds not much, but it is a coupon collectior problem meaning 1.5 million iterations are needed. I need to rethink how to demonstrate the problem.
  • Sneftel
    Sneftel almost 6 years
    @ThorstenS. Consider what the difficulty you've experienced tells you about the tractability of what you've suggested is an easy test. Note that with three cards you've still only used up about 17 bits of pure entropy.
  • Matt Timmermans
    Matt Timmermans almost 6 years
    @ThorstenS. I did mention that that particular generator was not very good. Something good with a 64 bit seed would be: Each block of 256 bits returned from the generator is SHA-256(concat("MyRng", seed, N)), where N is the number of blocks returned since the generator was seeded.
  • Thorsten S.
    Thorsten S. almost 6 years
    @Sneftel It does say exactly nothing about that because there are broken RNGs which easily display such traits (see RANDU). Please wait until I can demonstrate the weakness.
  • Sneftel
    Sneftel almost 6 years
    @ThorstenS. If you're trying to demonstrate patterns in a LCG PRNG in generating permutations, I'd suggest that you go in the other direction, with a Fisher-Yates shuffle.
  • GotoFinal
    GotoFinal over 5 years
    Random in java actually only uses 48 bits