Fastest way to check if an array is sorted

33,541

Solution 1

You will have to visit each element of the array to see if anything is unsorted.

Your O(n) approach is about as fast as it gets, without any special knowledge about the likely state of the array.

Your code specifically tests if the array is sorted with smaller values at lower indices. If that is not what you intend, your if becomes slightly more complex. Your code comment does suggest that is what you're after.

If you were to have special knowledge of the probable state (say, you know it's generally sorted but new data might be added to the end), you can optimize the order in which you visit array elements to allow the test to fail faster when the array is unsorted.

You can leverage knowledge of the hardware architecture to check multiple parts of the array in parallel by partitioning the array, first comparing the boundaries of the partition (fail fast check) and then running one array partition per core on a separate thread (no more than 1 thread per CPU core). Note though that if a array partition is much smaller than the size of a cache line, the threads will tend to compete with each other for access to the memory containing the array. Multithreading will only be very efficient for fairly large arrays.

Solution 2

Faster approach, platform target: Any CPU, Prefer 32-bit.
A sorted array with 512 elements: ~25% faster.

static bool isSorted(int[] a)
{
    int j = a.Length - 1;
    if (j < 1) return true;
    int ai = a[0], i = 1;
    while (i <= j && ai <= (ai = a[i])) i++;
    return i > j;
}

Target: x64, same array: ~40% faster.

static bool isSorted(int[] a)
{
    int i = a.Length - 1;
    if (i <= 0) return true;
    if ((i & 1) > 0) { if (a[i] < a[i - 1]) return false; i--; }
    for (int ai = a[i]; i > 0; i -= 2)
        if (ai < (ai = a[i - 1]) || ai < (ai = a[i - 2])) return false;
    return a[0] <= a[1];
}

Forgot one, marginally slower than my first code block.

static bool isSorted(int[] a)
{
    int i = a.Length - 1; if (i < 1) return true;
    int ai = a[i--]; while (i >= 0 && ai >= (ai = a[i])) i--;
    return i < 0;
}

Measuring it (see greybeard's comment).

using System;                                  //  ????????? DEBUG ?????????
using sw = System.Diagnostics.Stopwatch;       //  static bool abc()    
class Program                                  //  {   // a <= b <= c ?  
{                                              //      int a=4,b=7,c=9;  
    static void Main()                         //      int i = 1;  
    {                                          //      if (a <= (a = b))  
        //abc();                               //      {  
        int i = 512;                           //          i++;  
        int[] a = new int[i--];                //          if (a <= (a = c))
        while (i > 0) a[i] = i--;              //          {    
        sw sw = sw.StartNew();                 //              i++;  
        for (i = 10000000; i > 0; i--)         //          }  
            isSorted(a);                       //      }  
        sw.Stop();                             //      return i > 2;  
        Console.Write(sw.ElapsedMilliseconds); //  }  
        Console.Read();                        //  static bool ABC();
    }                                          //  {
                                               //      int[]a={4,7,9};    
    static bool isSorted(int[] a) // OP Cannon //      int i=1,j=2,ai=a[0]; 
    {                                          //  L0: if(i<=j)    
        for (int i = 1; i < a.Length; i++)     //        if(ai<=(ai=a[i]))  
            if (a[i - 1] > a[i]) return false; //          {i++;goto L0;}  
        return true;                           //      return i > j;  
    }                                          //  }  
}

Target: x64. Four cores/threads. A sorted array with 100,000 elements: ~55%.

static readonly object _locker = new object();
static bool isSorted(int[] a)  // a.Length > 3
{
    bool b = true;
    Parallel.For(0, 4, k =>
    {
        int i = 0, j = a.Length, ai = 0;
        if (k == 0) { j /= 4; ai = a[0]; }                        // 0 1
        if (k == 1) { j /= 2; i = j / 2; ai = a[i]; }             // 1 2
        if (k == 2) { i = j - 1; ai = a[i]; j = j / 2 + j / 4; }  // 4 3
        if (k == 3) { i = j - j / 4; ai = a[i]; j = j / 2; }      // 3 2
        if (k < 2)
            while (b && i <= j)
            {
                if (ai <= (ai = a[i + 1]) && ai <= (ai = a[i + 2])) i += 2;
                else lock (_locker) b = false;
            }
        else
            while (b && i >= j)
            {
                if (ai >= (ai = a[i - 1]) && ai >= (ai = a[i - 2])) i -= 2;
                else lock (_locker) b = false;
            }
    });
    return b;
}

1,000,000 items?

if (k < 2)
    while (b && i < j)
        if (ai <= (ai = a[i + 1]) && ai <= (ai = a[i + 2]) &&
            ai <= (ai = a[i + 3]) && ai <= (ai = a[i + 4])) i += 4;
        else lock (_locker) b = false;
else
    while (b && i > j)
        if (ai >= (ai = a[i - 1]) && ai >= (ai = a[i - 2]) &&
            ai >= (ai = a[i - 3]) && ai >= (ai = a[i - 4])) i -= 4;
        else lock (_locker) b = false;

Let's forget percentages.
Original: 0.77 ns/item, now: 0.22 ns/item.
2,000,000 items? Four cores: 4 times faster.

Solution 3

Linq solution.

public static bool IsSorted<T>(IEnumerable<T> list) where T:IComparable<T>
{
    var y = list.First();
    return list.Skip(1).All(x =>
    {
        bool b = y.CompareTo(x) < 0;
        y = x;
        return b;
    });
}

Solution 4

Here is my version of the function IsSorted

public static bool IsSorted(int[] arr)
{               
    int last = arr.Length - 1;
    if (last < 1) return true;

    int i = 0;

    while(i < last && arr[i] <= arr[i + 1])
        i++;

    return i == last;
}

While this function is a bit faster than in the question, it will do fewer assignments and comparisons than anything has been posted so far. In the worst case, it does 2n+1 comparisons. It still can be improved if you can make a reasonable assumption about the nature of the data like minimum data size or array contains even number of elements.

Share:
33,541
Cannon
Author by

Cannon

Developer

Updated on July 10, 2020

Comments

  • Cannon
    Cannon almost 4 years

    Considering there is an array returned from a function which is of very large size.

    What will be the fastest approach to test if the array is sorted?

    A simplest approach will be:

    /// <summary>
    /// Determines if int array is sorted from 0 -> Max
    /// </summary>
    public static bool IsSorted(int[] arr)
    {
    for (int i = 1; i < arr.Length; i++)
    {
        if (arr[i - 1] > arr[i])
        {
        return false;
        }
    }
    return true;
    }
    
  • Senthil Babu
    Senthil Babu over 11 years
    How do you say that this algorithm can execute in half time? Effective you are doing the same amount of comparisons (l/2 * 2 i.e. l).
  • saul672
    saul672 over 11 years
    ok, i see your point, anyway if there's something wrong it might find it sooner, this is when the error is in the first or fourth quarter of the array
  • Eric J.
    Eric J. over 11 years
    Then again, you will find it later if the unsorted portion is in the middle. Adds complexity without adding value. Plus checking from both ends reduces the likelihood that the data you are accessing is in the CPU cache (depending on the size of the array and the underlying architecture).
  • Andrej Bauer
    Andrej Bauer over 11 years
    Not to mention that this will likely be slower because of cache behavior.
  • Eric J.
    Eric J. over 11 years
    Innately sorted data structures still incur a cost to sort them. It's just spread out between the various Add/Insert/Delete calls. That may or may not be a good thing, depending on use case.
  • user1277476
    user1277476 over 11 years
    I would say that other than initial population, you don't sort a treap. Updating a treap inside a loop is not sorting - it's preserving sortedness, which is less expensive than even Timsort.
  • Eric J.
    Eric J. over 8 years
    The OP asked What will be the fastest approach to test if the array is sorted?. This solution is at best as fast as the OP's solution, and is probably a little slower. Linq may be the new black, but it's not the best solution all of the time.
  • greybeard
    greybeard almost 8 years
    Please take a stab at the asymptotic order of growth of the number of comparisons astot grows large. Would there be any chance to capitalise on order relations being transitive?
  • atomCode
    atomCode almost 8 years
    I agree that the number of comparisons grows large when tot is large however, we can't assume the sequence to be transitive because we don't know what the sequence order is and if it was transitive (by definition) it is assumed to already be in some type order.
  • greybeard
    greybeard over 7 years
    Does this detect inversions? (I think 1) it does 2) it isn't obvious.) Please disclose your approach to measuring this.
  • P_P
    P_P over 7 years
    Does this detect inversions? Good question! Wikipedia: Let (A(1),...,A(n)) be a sequence of n distinct numbers. If i < j and A(i) > A(j), then the pair (i,j) is an inversion of A. - If the sequence has duplicates, then not all numbers are distinct. -If a pair (i,j), isSorted uses j = i + 1 , and A(i) > A(j), then the sequence is not sorted, and, YES, an inversion is detected in a sequence of not necessarily distinct numbers, we're done. So it does NOT detect inversionS, but if it detects ONE inversion, then we know the sequence is NOT sorted. I added a code block "Measuring it".
  • greybeard
    greybeard over 7 years
    (Do not comment comments asking for additional information or clarification: edit your post. The part that made me comment/I don't (yet) find idiomatic: ai <= (ai = a[i]) (and similar).)
  • Jim Mischel
    Jim Mischel over 7 years
    In fact, this could fail if the passed Enumerable is an iterate-once collection. For example, if somebody were to write x = IsSorted(File.ReadLines("filename.txt")); the method would fail because it would require multiple enumeration of the Enumerable.
  • Jim Mischel
    Jim Mischel over 7 years
    Are you sure those timings were taken with a Release build with the debugger NOT attached? If the debugger is attached (i.e. you pressed F5 in Visual Studio), then your timings do not reflect how the code will run in production. Select "Run without Debugging" from the Debug menu if you want real timings.
  • Jim Mischel
    Jim Mischel over 7 years
    If you can't assume that the sequence order is transitive, then it's not possible to sort in the first place. Sorting in the general sense implies transitivity.
  • P_P
    P_P over 7 years
    Yes I am sure. My timings were taken with a Release build (Optimize code "on"). I just checked it on a slower machine, a laptop. Results for a sorted array with 512 elements: Any CPU, prefer 32-bit: 24%, x64: 35%. And to put it simply: run it on your system, check whether my approach is faster (or not).
  • greybeard
    greybeard over 7 years
    This is what I […] find works better a) better than what? (Better than the code presented in the question?) b) better in which regard: how can I reproduce your finding?
  • P_P
    P_P over 7 years
    Sorry Michael, but 1)Did you test your solution, int[] arr = new int[]{0,0}, isSorted(arr,0) returns false (change "<" into "<="). 2)On my box it's (much) slower than OP's solution. 3)With a large array ("int[] arr = new int[1000000]"): stack overflow.
  • Michael Dera
    Michael Dera over 7 years
    Hi @greybeard, yes I was referring to the code presented in the question, this in regard to how execution time for larger arrays. I tried it with samples of random data. If you find this is otherwise please do let me know
  • Michael Dera
    Michael Dera over 7 years
    @P_P Thank you for that, I had somehow overlooked cases where neighbouring elements are equal. I had tested the solution and found it was faster for larger data sets, however I will do further tests on it
  • Steves
    Steves almost 7 years
    You don't need the lock, volatile or Thread.MemoryBarrier would suffice. In fact, reading true "after" some thread already set false is not going to make the algorithm give incorrect result and is maybe good trade-off for not requiring specific ordering.
  • Natalie Perret
    Natalie Perret about 6 years
    @AndrejBauer can you elaborate about the cache behaviour?
  • Andrej Bauer
    Andrej Bauer about 6 years
    Well, I am no expert on CPU cache's, but in general fetching memory in nice orderly fashion is better because the CPU has a lot of heuristics for guessing which pieces of memory should be prefetched. Fetching memory linearly is likely going to have better behavior. In any case, it can't be to difficult to make some experiments. (Not by me, I have no stake in this.)