Choosing random numbers efficiently
Solution 1
You can try using an existing Java implementation (or this one) for a Mersenne Twister.
Keep in mind most MT's are not cryptographically secure.
Solution 2
It looks like you want to select a k-combination from a set S without replacement, with S having n distinct values, k = 5 and n = 52. You can shuffle()
the entire set and select k elements (as @Tesserex suggests), or pick()
k elements while avoiding duplicates (as you've shown). You'll want to profile both in your particular environment and for your chosen generator. I often, but not always, see a modest edge for pick()
.
private static final Random rnd = new Random();
private static final int N = 52;
private static final int K = 5;
private static final List<Integer> S = new ArrayList<Integer>(N);
static {
for (int i = 0; i < N; i++) {
S.add(i + 1);
}
}
private final List<Integer> combination = new ArrayList<Integer>(K);
...
private void shuffle() {
Collections.shuffle(S, rnd);
combination.addAll(S.subList(0, K));
}
private void pick() {
for (int i = 0; i < K; i++) {
int v = 0;
do {
v = rnd.nextInt(N) + 1;
} while (combination.contains(v));
combination.add(v);
}
}
Solution 3
You could use linear congruence as a random generator: http://en.wikipedia.org/wiki/Linear_congruential_generator [yet consider their statistical disadvantages]
You only need a calculation of (x + c) % m for each number. Yet, in my experience the creation of objects (like you might do with every call of new Set and add, depending on which implementation you use) might cost you more speed than a call to nextInt(). Maybe you should try a profiler like e.g. this one: http://www.eclipse.org/tptp/
Solution 4
A common technique is to start with a list of all the possible inputs, and randomly select from that, deleting ones as you go. That way there's no risk of selecting a duplicate and having to loop for an unknown amount of time. Of course this method only works with discrete data, but fortunately integers are. Also remember that your list (or other data structure) selection and deletion should be O(1) if possible, since you're focusing on speed.
Solution 5
I don't have any input on your actual problem, and I don't know too much Java (just poking around). However it seems to me that you are trying to build a hand evaluator for poker and this thread http://pokerai.org/pf3/viewtopic.php?f=3&t=16 contains some extremely fast java hand evaluators. Hopefully some of this code could be of help.
Comments
-
Frederik Wordenskjold about 2 years
I have a method, which uses random samples to approximate a calculation. This method is called millions of times, so its very important that the process of choosing the random numbers is efficient.
I'm not sure how fast javas
Random().nextInt
really are, but my program does not seem to benefit as much as I would like it too.When choosing the random numbers, I do the following (in semi pseudo-code):
// Repeat this 300000 times Set set = new Set(); while(set.length != 5) set.add(randomNumber(MIN,MAX));
Now, this obviously has a bad worst-case running time, as the random-function in theory can add duplicated numbers for an eternity, thus staying in the while-loop forever. However, the numbers are chosen from {0..45}, so a duplicated value is for the most part unlikely.
When I use the above method, its only 40% faster than my other method, which does not approximate, but yields the correct result. This is ran ~ 1 million times, so I was expecting this new method to be at least 50% faster.
Do you have any suggestions for a faster method? Or maybe you know of a more efficient way of generation a set of random numbers.
To clarify, here is the two methods:
// Run through all combinations (1 million). This takes 5 seconds for(int c1 = 0; c1 < deck.length; c1++){ for(int c2 = c1+1; c2 < deck.length; c2++){ for(int c3 = c2+1; c3 < deck.length; c3++){ for(int c4 = c3+1; c4 < deck.length; c4++){ for(int c5 = c4+1; c5 < deck.length; c5++){ enumeration(hands, cards, deck, c1, c2, c3, c4, c5); } } } } } // Approximate (300000 combinations). This takes 3 seconds Random rand = new Random(); HashSet<Integer> set = new HashSet<Integer>(); int[] numbers = new int[5]; while(enumerations < 300000){ set.clear(); while(set.size() != 5){ set.add(rand.nextInt(deck.length)); } Iterator<Integer> i = set.iterator(); int n = 0; while(i.hasNext()){ numbers[n] = i.next(); n++; }
After some testing and profiling, I found this method to be the most effective:
Random rand = new Random(); int[] numbers = new int[5]; ArrayList<Integer> list = new ArrayList<Integer>(); while(enumerations < 300000){ while(list.size() != 5) { int i = rand.nextInt(deck.length); if(!list.contains(i)) list.add(i); } int index = 0; for(int i : list){ numbers[index] = i; index++; } enumeration(hands, cards, deck,numbers); }
-
Jon Skeet over 14 years@Yuval: If you don't measure, you don't know when it's fast enough. Profiling is often invasive. You should measure and profile... although you certainly shouldn't measure a single call like this.
-
Frederik Wordenskjold over 14 yearsHow fast is Math.random() compared with Random().nextInt() ? I'm using the Random-class at the moment.
-
Frederik Wordenskjold over 14 yearsCan you clarify, what you mean by not cryptographically secure?
-
Tesserex over 14 yearsIt means you shouldn't use them for cryptography, because it's still possible to predict the next number given a certain amount of prior information.
-
Frederik Wordenskjold over 14 yearsI'm running os x, so I cannot use the eclipse tptp profiler! I really miss a profiler!
-
Searles over 14 yearsI've once used JProfiler on Mac OS X. Afaik they have a free trial period of 14 days.
-
Frederik Wordenskjold over 14 yearsI've actually been inspired by some algorithms in this thread. I'm implementing an omaha-evaluator though, so many of the stuff in this thread, like heads-up lookup tables, I cannot use.
-
Michael Borgwardt over 14 years@Frederik: Math.random() uses Random.nextDouble() internally, so it's slower, if anything.
-
Saurabh over 14 yearsIf the application is what it looks like (a poker odds calculator) then there are 52C5 == 2598960 possible inputs, so fewer than 1/6 of inputs will be used. This is very inefficient use of memory, since the input samples (in a typical poker odds calculator) do not need to be retained in memory after evaluation. This will be even worse in the likely event that the evaluation function is extended to 7-card hands (52C7 == 133784560 combinations)
-
Tesserex over 14 yearsyes, you are right - unfortunately that additional information with the actual methods was not in the question when i wrote my answer.
-
Frederik Wordenskjold over 14 yearsExactly what I thought - that the probability of a duplicate is so small, that the times it takes to prevent them would take a longer time than just checking for them. I've updated the OP with my result.