Take n random elements from a List<E>?

61,530

Solution 1

Two main ways.

  1. Use Random#nextInt(int):

    List<Foo> list = createItSomehow();
    Random random = new Random();
    Foo foo = list.get(random.nextInt(list.size()));
    

    It's however not guaranteed that successive n calls returns unique elements.

  2. Use Collections#shuffle():

    List<Foo> list = createItSomehow();
    Collections.shuffle(list);
    Foo foo = list.get(0);
    

    It enables you to get n unique elements by an incremented index (assuming that the list itself contains unique elements).


In case you're wondering if there's a Java 8 Stream approach; no, there isn't a built-in one. There's no such thing as Comparator#randomOrder() in standard API (yet?). You could try something like below while still satisfying the strict Comparator contract (although the distribution is pretty terrible):

List<Foo> list = createItSomehow();
int random = new Random().nextInt();
Foo foo = list.stream().sorted(Comparator.comparingInt(o -> System.identityHashCode(o) ^ random)).findFirst().get();

Better use Collections#shuffle() instead.

Solution 2

Most of the proposed solutions till now suggest either a full list shuffle or successive random picking by checking uniqueness and retry if required.

But, we can take advantage of the Durstenfeld's algorithm (the most popular Fisher-Yates variant in our days).

Durstenfeld's solution is to move the "struck" numbers to the end of the list by swapping them with the last unstruck number at each iteration.

Due to the above, we don't need to shuffle the whole list, but run the loop for as many steps as the number of elements required to return. The algorithm ensures that the last N elements at the end of the list are 100% random if we used a perfect random function.

Among the many real-world scenarios where we need to pick a predetermined (max) amount of random elements from arrays/lists, this optimized method is very useful for various card games, such as Texas Poker, where you a-priori know the number of cards to be used per game; only a limited number of cards is usually required from the deck.

public static <E> List<E> pickNRandomElements(List<E> list, int n, Random r) {
    int length = list.size();

    if (length < n) return null;

    //We don't need to shuffle the whole list
    for (int i = length - 1; i >= length - n; --i)
    {
        Collections.swap(list, i , r.nextInt(i + 1));
    }
    return list.subList(length - n, length);
}

public static <E> List<E> pickNRandomElements(List<E> list, int n) {
    return pickNRandomElements(list, n, ThreadLocalRandom.current());
}

Solution 3

If you want to successively pick n elements from the list and be able to do so without replacement over and over and over again, you are probably best of randomly permuting the elements, then taking chunks off in blocks of n. If you randomly permute the list you guarantee statistical randomness for each block you pick out. Perhaps the easiest way to do this would be to use Collections.shuffle.

Solution 4

Simple and clear

   // define ArrayList to hold Integer objects
    ArrayList<Integer> arrayList = new ArrayList<>();

    for (int i = 0; i < maxRange; i++) {
        arrayList.add(i + 1);
    }

    // shuffle list
    Collections.shuffle(arrayList);

    // adding defined amount of numbers to target list
    ArrayList<Integer> targetList = new ArrayList<>();
    for (int j = 0; j < amount; j++) {
        targetList.add(arrayList.get(j)); 
    }

    return targetList;

Solution 5

A fair way to do this is to go through the list, on the nth iteration calculating the probability of whether or not to pick the nth element, which is essentially the fraction of the number of items you still need to pick over the number of elements available in the rest of the list. For example:

public static <T> T[] pickSample(T[] population, int nSamplesNeeded, Random r) {
  T[] ret = (T[]) Array.newInstance(population.getClass().getComponentType(),
                                    nSamplesNeeded);
  int nPicked = 0, i = 0, nLeft = population.length;
  while (nSamplesNeeded > 0) {
    int rand = r.nextInt(nLeft);
    if (rand < nSamplesNeeded) {
      ret[nPicked++] = population[i];
      nSamplesNeeded--;
    }
    nLeft--;
    i++;
  }
  return ret;
}

(This code copied from a page I wrote a while ago on picking a random sample from a list.)

Share:
61,530

Related videos on Youtube

templatetypedef
Author by

templatetypedef

I love teaching, learning, and that really great feeling you get when you finally understand something.

Updated on November 27, 2020

Comments

  • templatetypedef
    templatetypedef over 3 years

    How can I take n random elements from an ArrayList<E>? Ideally, I'd like to be able to make successive calls to the take() method to get another x elements, without replacement.

    • Yanick Rochon
      Yanick Rochon over 13 years
      what have you got so far? If you get another x elements, can you pick elements from the previous set again, or must be all different all the time until ALL elements are picked? (Then, what next?)
    • Admin
      Admin over 13 years
      Without replacement. When you have no more left, you should get nothing back.
  • biziclop
    biziclop over 13 years
    And the easiest way to do this is to call java.util.Collections.shuffle()
  • Prateek
    Prateek about 11 years
    I have got a list of 4000 words and I have to get 5 words out of that list each time I press the refresh button, am using the 2nd option of your answer. How much does it guarantee that I will get unique values all the time i.e. what's the probability ?
  • BalusC
    BalusC about 11 years
    @Prateek: If you have a question, press "Ask Question" button. Do not press "Add comment" or "Post Answer" button.
  • Prateek
    Prateek about 11 years
    I know when to use which button, my comment is somewhat related to your already posted answer so I didn't want to create a new thread of if and was looking for a response inline, thanks anyways.
  • David
    David almost 10 years
    I did not see the correlation between arrayList and targetList.
  • Matunos
    Matunos about 9 years
    Keep in mind that Collections.shuffle() uses a version of the Fisher-Yates shuffle algorithm, with an internal instance of Random. The Random class uses a long for its seed value, meaning it can only offer you up to 2^32 possible permutations. This is insufficient for shuffling any more than 12 elements with uniform probability of all permutations (that is, some permutations will never come up). You'll want to use Collections.shuffle(list,random) instead, where random is either and instance of SecureRandom or your own custom Random extension, if you're up to that task.
  • nomad
    nomad about 9 years
    It should've been targetList.add(arrayList.get(j))
  • Neil Coffey
    Neil Coffey over 8 years
    Matunos - for what it's worth, the effective seed size of java.util.Random is 2^48, but as you say, it's still worth bearing in mind that you may need to select a better generator. I would still advocate the method I mention of simply picking the items with the relevant probability (you still need the same number of random numbers as a shuffle, but you don't have to swap all the pointers, potentially better memory locality, and there's a chance of terminating the loop "early" once you have selected all of the required elements).
  • Zavael
    Zavael over 7 years
    could you provide another sollution using Stream api and findAny()? it would make the answer up-to-date :)
  • BalusC
    BalusC over 7 years
    @Zavael: findAny() is not suitable here. Using Stream would only add extra code because it doesn't have a builtin random based comparator.
  • Zavael
    Zavael over 7 years
    @BalusC yep thats right, I realized it while trying to implement
  • Matt
    Matt over 6 years
    Thanks for pointing this out. I have a situation where I need to remove a small number of elements from a large list, and I was sure shuffling the whole list wasn't the best way to do it, but I was getting hung up on how to remove multiple elements from a list in one fell swoop. Swapping them to the end of the list, and then truncating it, is an elegant solution.
  • Drew O'Meara
    Drew O'Meara almost 4 years
    bravo -- this should be the awarded answer as it is most modular and portable
  • Justin Cranford
    Justin Cranford over 2 years
    It looks like each call to nextRandomN() removes elements from the original list. Any additional calls to nextRandomN() will return an empty list.