Fastest way to check if an array is sorted
Solution 1
You will have to visit each element of the array to see if anything is unsorted.
Your O(n) approach is about as fast as it gets, without any special knowledge about the likely state of the array.
Your code specifically tests if the array is sorted with smaller values at lower indices. If that is not what you intend, your if becomes slightly more complex. Your code comment does suggest that is what you're after.
If you were to have special knowledge of the probable state (say, you know it's generally sorted but new data might be added to the end), you can optimize the order in which you visit array elements to allow the test to fail faster when the array is unsorted.
You can leverage knowledge of the hardware architecture to check multiple parts of the array in parallel by partitioning the array, first comparing the boundaries of the partition (fail fast check) and then running one array partition per core on a separate thread (no more than 1 thread per CPU core). Note though that if a array partition is much smaller than the size of a cache line, the threads will tend to compete with each other for access to the memory containing the array. Multithreading will only be very efficient for fairly large arrays.
Solution 2
Faster approach, platform target: Any CPU, Prefer 32-bit.
A sorted array with 512 elements: ~25% faster.
static bool isSorted(int[] a)
{
int j = a.Length - 1;
if (j < 1) return true;
int ai = a[0], i = 1;
while (i <= j && ai <= (ai = a[i])) i++;
return i > j;
}
Target: x64, same array: ~40% faster.
static bool isSorted(int[] a)
{
int i = a.Length - 1;
if (i <= 0) return true;
if ((i & 1) > 0) { if (a[i] < a[i - 1]) return false; i--; }
for (int ai = a[i]; i > 0; i -= 2)
if (ai < (ai = a[i - 1]) || ai < (ai = a[i - 2])) return false;
return a[0] <= a[1];
}
Forgot one, marginally slower than my first code block.
static bool isSorted(int[] a)
{
int i = a.Length - 1; if (i < 1) return true;
int ai = a[i--]; while (i >= 0 && ai >= (ai = a[i])) i--;
return i < 0;
}
Measuring it (see greybeard's comment).
using System; // ????????? DEBUG ?????????
using sw = System.Diagnostics.Stopwatch; // static bool abc()
class Program // { // a <= b <= c ?
{ // int a=4,b=7,c=9;
static void Main() // int i = 1;
{ // if (a <= (a = b))
//abc(); // {
int i = 512; // i++;
int[] a = new int[i--]; // if (a <= (a = c))
while (i > 0) a[i] = i--; // {
sw sw = sw.StartNew(); // i++;
for (i = 10000000; i > 0; i--) // }
isSorted(a); // }
sw.Stop(); // return i > 2;
Console.Write(sw.ElapsedMilliseconds); // }
Console.Read(); // static bool ABC();
} // {
// int[]a={4,7,9};
static bool isSorted(int[] a) // OP Cannon // int i=1,j=2,ai=a[0];
{ // L0: if(i<=j)
for (int i = 1; i < a.Length; i++) // if(ai<=(ai=a[i]))
if (a[i - 1] > a[i]) return false; // {i++;goto L0;}
return true; // return i > j;
} // }
}
Target: x64. Four cores/threads. A sorted array with 100,000 elements: ~55%.
static readonly object _locker = new object();
static bool isSorted(int[] a) // a.Length > 3
{
bool b = true;
Parallel.For(0, 4, k =>
{
int i = 0, j = a.Length, ai = 0;
if (k == 0) { j /= 4; ai = a[0]; } // 0 1
if (k == 1) { j /= 2; i = j / 2; ai = a[i]; } // 1 2
if (k == 2) { i = j - 1; ai = a[i]; j = j / 2 + j / 4; } // 4 3
if (k == 3) { i = j - j / 4; ai = a[i]; j = j / 2; } // 3 2
if (k < 2)
while (b && i <= j)
{
if (ai <= (ai = a[i + 1]) && ai <= (ai = a[i + 2])) i += 2;
else lock (_locker) b = false;
}
else
while (b && i >= j)
{
if (ai >= (ai = a[i - 1]) && ai >= (ai = a[i - 2])) i -= 2;
else lock (_locker) b = false;
}
});
return b;
}
1,000,000 items?
if (k < 2)
while (b && i < j)
if (ai <= (ai = a[i + 1]) && ai <= (ai = a[i + 2]) &&
ai <= (ai = a[i + 3]) && ai <= (ai = a[i + 4])) i += 4;
else lock (_locker) b = false;
else
while (b && i > j)
if (ai >= (ai = a[i - 1]) && ai >= (ai = a[i - 2]) &&
ai >= (ai = a[i - 3]) && ai >= (ai = a[i - 4])) i -= 4;
else lock (_locker) b = false;
Let's forget percentages.
Original: 0.77 ns/item, now: 0.22 ns/item.
2,000,000 items? Four cores: 4 times faster.
Solution 3
Linq solution.
public static bool IsSorted<T>(IEnumerable<T> list) where T:IComparable<T>
{
var y = list.First();
return list.Skip(1).All(x =>
{
bool b = y.CompareTo(x) < 0;
y = x;
return b;
});
}
Solution 4
Here is my version of the function IsSorted
public static bool IsSorted(int[] arr)
{
int last = arr.Length - 1;
if (last < 1) return true;
int i = 0;
while(i < last && arr[i] <= arr[i + 1])
i++;
return i == last;
}
While this function is a bit faster than in the question, it will do fewer assignments and comparisons than anything has been posted so far. In the worst case, it does 2n+1 comparisons. It still can be improved if you can make a reasonable assumption about the nature of the data like minimum data size or array contains even number of elements.
Comments
-
Cannon almost 4 years
Considering there is an array returned from a function which is of very large size.
What will be the
fastest
approach to test if the array is sorted?A simplest approach will be:
/// <summary> /// Determines if int array is sorted from 0 -> Max /// </summary> public static bool IsSorted(int[] arr) { for (int i = 1; i < arr.Length; i++) { if (arr[i - 1] > arr[i]) { return false; } } return true; }
-
Senthil Babu over 11 yearsHow do you say that this algorithm can execute in half time? Effective you are doing the same amount of comparisons (l/2 * 2 i.e. l).
-
saul672 over 11 yearsok, i see your point, anyway if there's something wrong it might find it sooner, this is when the error is in the first or fourth quarter of the array
-
Eric J. over 11 yearsThen again, you will find it later if the unsorted portion is in the middle. Adds complexity without adding value. Plus checking from both ends reduces the likelihood that the data you are accessing is in the CPU cache (depending on the size of the array and the underlying architecture).
-
Andrej Bauer over 11 yearsNot to mention that this will likely be slower because of cache behavior.
-
Eric J. over 11 yearsInnately sorted data structures still incur a cost to sort them. It's just spread out between the various Add/Insert/Delete calls. That may or may not be a good thing, depending on use case.
-
user1277476 over 11 yearsI would say that other than initial population, you don't sort a treap. Updating a treap inside a loop is not sorting - it's preserving sortedness, which is less expensive than even Timsort.
-
Eric J. over 8 yearsThe OP asked
What will be the fastest approach to test if the array is sorted?
. This solution is at best as fast as the OP's solution, and is probably a little slower. Linq may be the new black, but it's not the best solution all of the time. -
greybeard almost 8 yearsPlease take a stab at the asymptotic order of growth of the number of comparisons as
tot
grows large. Would there be any chance to capitalise on order relations being transitive? -
atomCode almost 8 yearsI agree that the number of comparisons grows large when tot is large however, we can't assume the sequence to be transitive because we don't know what the sequence order is and if it was transitive (by definition) it is assumed to already be in some type order.
-
greybeard over 7 yearsDoes this detect inversions? (I think 1) it does 2) it isn't obvious.) Please disclose your approach to measuring this.
-
P_P over 7 yearsDoes this detect inversions? Good question! Wikipedia: Let (A(1),...,A(n)) be a sequence of n distinct numbers. If i < j and A(i) > A(j), then the pair (i,j) is an inversion of A. - If the sequence has duplicates, then not all numbers are distinct. -If a pair (i,j), isSorted uses j = i + 1 , and A(i) > A(j), then the sequence is not sorted, and, YES, an inversion is detected in a sequence of not necessarily distinct numbers, we're done. So it does NOT detect inversionS, but if it detects ONE inversion, then we know the sequence is NOT sorted. I added a code block "Measuring it".
-
greybeard over 7 years(Do not comment comments asking for additional information or clarification: edit your post. The part that made me comment/I don't (yet) find idiomatic:
ai <= (ai = a[i])
(and similar).) -
Jim Mischel over 7 yearsIn fact, this could fail if the passed
Enumerable
is an iterate-once collection. For example, if somebody were to writex = IsSorted(File.ReadLines("filename.txt"));
the method would fail because it would require multiple enumeration of theEnumerable
. -
Jim Mischel over 7 yearsAre you sure those timings were taken with a Release build with the debugger NOT attached? If the debugger is attached (i.e. you pressed F5 in Visual Studio), then your timings do not reflect how the code will run in production. Select "Run without Debugging" from the Debug menu if you want real timings.
-
Jim Mischel over 7 yearsIf you can't assume that the sequence order is transitive, then it's not possible to sort in the first place. Sorting in the general sense implies transitivity.
-
P_P over 7 yearsYes I am sure. My timings were taken with a Release build (Optimize code "on"). I just checked it on a slower machine, a laptop. Results for a sorted array with 512 elements: Any CPU, prefer 32-bit: 24%, x64: 35%. And to put it simply: run it on your system, check whether my approach is faster (or not).
-
greybeard over 7 years
This is what I […] find works better
a) better than what? (Better than the code presented in the question?) b) better in which regard: how can I reproduce your finding? -
P_P over 7 yearsSorry Michael, but 1)Did you test your solution, int[] arr = new int[]{0,0}, isSorted(arr,0) returns false (change "<" into "<="). 2)On my box it's (much) slower than OP's solution. 3)With a large array ("int[] arr = new int[1000000]"): stack overflow.
-
Michael Dera over 7 yearsHi @greybeard, yes I was referring to the code presented in the question, this in regard to how execution time for larger arrays. I tried it with samples of random data. If you find this is otherwise please do let me know
-
Michael Dera over 7 years@P_P Thank you for that, I had somehow overlooked cases where neighbouring elements are equal. I had tested the solution and found it was faster for larger data sets, however I will do further tests on it
-
Steves almost 7 yearsYou don't need the lock,
volatile
orThread.MemoryBarrier
would suffice. In fact, readingtrue
"after" some thread already setfalse
is not going to make the algorithm give incorrect result and is maybe good trade-off for not requiring specific ordering. -
Natalie Perret about 6 years@AndrejBauer can you elaborate about the cache behaviour?
-
Andrej Bauer about 6 yearsWell, I am no expert on CPU cache's, but in general fetching memory in nice orderly fashion is better because the CPU has a lot of heuristics for guessing which pieces of memory should be prefetched. Fetching memory linearly is likely going to have better behavior. In any case, it can't be to difficult to make some experiments. (Not by me, I have no stake in this.)