Best way to randomize an array with .NET

183,720

Solution 1

If you're on .NET 3.5, you can use the following IEnumerable coolness:

Random rnd=new Random();
string[] MyRandomArray = MyArray.OrderBy(x => rnd.Next()).ToArray();    

Edit: and here's the corresponding VB.NET code:

Dim rnd As New System.Random
Dim MyRandomArray = MyArray.OrderBy(Function() rnd.Next()).ToArray()

Second edit, in response to remarks that System.Random "isn't threadsafe" and "only suitable for toy apps" due to returning a time-based sequence: as used in my example, Random() is perfectly thread-safe, unless you're allowing the routine in which you randomize the array to be re-entered, in which case you'll need something like lock (MyRandomArray) anyway in order not to corrupt your data, which will protect rnd as well.

Also, it should be well-understood that System.Random as a source of entropy isn't very strong. As noted in the MSDN documentation, you should use something derived from System.Security.Cryptography.RandomNumberGenerator if you're doing anything security-related. For example:

using System.Security.Cryptography;

...

RNGCryptoServiceProvider rnd = new RNGCryptoServiceProvider();
string[] MyRandomArray = MyArray.OrderBy(x => GetNextInt32(rnd)).ToArray();

...

static int GetNextInt32(RNGCryptoServiceProvider rnd)
    {
        byte[] randomInt = new byte[4];
        rnd.GetBytes(randomInt);
        return Convert.ToInt32(randomInt[0]);
    }

Solution 2

The following implementation uses the Fisher-Yates algorithm AKA the Knuth Shuffle. It runs in O(n) time and shuffles in place, so is better performing than the 'sort by random' technique, although it is more lines of code. See here for some comparative performance measurements. I have used System.Random, which is fine for non-cryptographic purposes.*

static class RandomExtensions
{
    public static void Shuffle<T> (this Random rng, T[] array)
    {
        int n = array.Length;
        while (n > 1) 
        {
            int k = rng.Next(n--);
            T temp = array[n];
            array[n] = array[k];
            array[k] = temp;
        }
    }
}

Usage:

var array = new int[] {1, 2, 3, 4};
var rng = new Random();
rng.Shuffle(array);
rng.Shuffle(array); // different order from first call to Shuffle

* For longer arrays, in order to make the (extremely large) number of permutations equally probable it would be necessary to run a pseudo-random number generator (PRNG) through many iterations for each swap to produce enough entropy. For a 500-element array only a very small fraction of the possible 500! permutations will be possible to obtain using a PRNG. Nevertheless, the Fisher-Yates algorithm is unbiased and therefore the shuffle will be as good as the RNG you use.

Solution 3

You're looking for a shuffling algorithm, right?

Okay, there are two ways to do this: the clever-but-people-always-seem-to-misunderstand-it-and-get-it-wrong-so-maybe-its-not-that-clever-after-all way, and the dumb-as-rocks-but-who-cares-because-it-works way.

Dumb way

  • Create a duplicate of your first array, but tag each string should with a random number.
  • Sort the duplicate array with respect to the random number.

This algorithm works well, but make sure that your random number generator is unlikely to tag two strings with the same number. Because of the so-called Birthday Paradox, this happens more often than you might expect. Its time complexity is O(n log n).

Clever way

I'll describe this as a recursive algorithm:

To shuffle an array of size n (indices in the range [0..n-1]):

if n = 0
  • do nothing
if n > 0
  • (recursive step) shuffle the first n-1 elements of the array
  • choose a random index, x, in the range [0..n-1]
  • swap the element at index n-1 with the element at index x

The iterative equivalent is to walk an iterator through the array, swapping with random elements as you go along, but notice that you cannot swap with an element after the one that the iterator points to. This is a very common mistake, and leads to a biased shuffle.

Time complexity is O(n).

Solution 4

This algorithm is simple but not efficient, O(N2). All the "order by" algorithms are typically O(N log N). It probably doesn't make a difference below hundreds of thousands of elements but it would for large lists.

var stringlist = ... // add your values to stringlist

var r = new Random();

var res = new List<string>(stringlist.Count);

while (stringlist.Count >0)
{
   var i = r.Next(stringlist.Count);
   res.Add(stringlist[i]);
   stringlist.RemoveAt(i);
}

The reason why it's O(N2) is subtle: List.RemoveAt() is a O(N) operation unless you remove in order from the end.

Solution 5

You can also make an extention method out of Matt Howells. Example.

   namespace System
    {
        public static class MSSystemExtenstions
        {
            private static Random rng = new Random();
            public static void Shuffle<T>(this T[] array)
            {
                rng = new Random();
                int n = array.Length;
                while (n > 1)
                {
                    int k = rng.Next(n);
                    n--;
                    T temp = array[n];
                    array[n] = array[k];
                    array[k] = temp;
                }
            }
        }
    }

Then you can just use it like:

        string[] names = new string[] {
                "Aaron Moline1", 
                "Aaron Moline2", 
                "Aaron Moline3", 
                "Aaron Moline4", 
                "Aaron Moline5", 
                "Aaron Moline6", 
                "Aaron Moline7", 
                "Aaron Moline8", 
                "Aaron Moline9", 
            };
        names.Shuffle<string>();
Share:
183,720

Related videos on Youtube

Mats
Author by

Mats

Updated on February 04, 2022

Comments

  • Mats
    Mats over 2 years

    What is the best way to randomize an array of strings with .NET? My array contains about 500 strings and I'd like to create a new Array with the same strings but in a random order.

    Please include a C# example in your answer.

    • Ian Campbell
      Ian Campbell almost 10 years
      Here's an odd but simple solution for this -- stackoverflow.com/a/4262134/1298685 .
    • ChaseMedallion
      ChaseMedallion about 8 years
      Using the MedallionRandom NuGet package, this is just myArray.Shuffled().ToArray() (or myArray.Shuffle() if you want to mutate the current array)
    • Herohtar
      Herohtar almost 4 years
      Duplicate of Randomize a List<T>
  • Pitarou
    Pitarou over 15 years
    The shuffle algorithm is broken. You would have to make your arbitrary 5 very high indeed before your shuffle is unbiased.
  • Nick Johnson
    Nick Johnson over 15 years
    This has the same effect as a knuth shuffle, but it's not as efficient, since it involves depopulating one list and repopulating another. Swapping items in place would be a better solution.
  • Sklivvz
    Sklivvz over 15 years
    I find this elegant and easily understandable and on 500 strings it doesn't make a bit of a difference...
  • therealhoff
    therealhoff over 15 years
    two notes: 1) System.Random is not thread-safe (you've been warned) and 2) System.Random is time based, so if you use this code in a heavily concurrent system, it's possible for two requests to get the same value (i.e. in webapps)
  • therealhoff
    therealhoff over 15 years
    Just to clarify the above, System.Random will seed itself using the current time, so two instances created simultaneously will generate the same "random" sequence..System.Random should only be used in toy apps
  • Matt Howells
    Matt Howells over 15 years
    Also this algorithm is O(n log n) and biased by the Qsort algorithm. See my answer for an O(n) unbiased solution.
  • Sam Harwell
    Sam Harwell over 14 years
    Unless OrderBy caches the sort keys internally, this also has the problem of violating the transitive property of ordered comparisons. If there is ever a debug mode verification that OrderBy produced correct results, then in theory it could throw an exception.
  • Sam Harwell
    Sam Harwell over 14 years
    Edit: OrderBy does in fact cache the keys. However, I don't see this as part of the public specification so there's no guarantee that all implementations will do so.
  • Gregor Slavec
    Gregor Slavec over 13 years
  • T_D
    T_D about 8 years
    To me it feels like you could increase both efficiency and readablilty by instead of trying to shuffle an Array by declaring a second Array, you better try to convert to a List, Shuffle and back to an Array: sortedList = source.ToList().OrderBy(x => generator.Next()).ToArray();
  • usefulBee
    usefulBee about 8 years
    Error: Operator '.' cannot be applied to operand of type 'void'
  • Christopher
    Christopher over 6 years
    Make a Array of the (integer) indexes. Shuffle the indexes. Just use the indexes in that random order. No duplicates, no shuffling around of string references in memory (wich may each trigger interning and what not).
  • Wai Ha Lee
    Wai Ha Lee over 5 years
    Your links are still broken :/
  • Ken Kin
    Ken Kin about 5 years
    Wouldn't it be better to change the parameters and make the usage like array.Shuffle(new Random()); .. ?
  • dynamichael
    dynamichael almost 5 years
    You can simplify the swap using Tuples as of framework 4.0 -> (array[n], array[k]) = (array[k], array[n]);
  • Matt Howells
    Matt Howells almost 5 years
    @Ken Kin: No, this would be bad. The reason is that new Random() is initialized with a seed value based on the current system time, which only updates every ~16ms.
  • galamdring
    galamdring over 4 years
    In some quick tests of this vs the list removeAt solution, there is a small difference at 999 elements. The difference gets drastic at 99999 random ints, with this solution at 3ms and the other at 1810ms.
  • Yaron
    Yaron over 4 years
    why are you re-creating rng every call to the method... You declare it at the class level but use it as a local...