Quicksort superiority over Heap Sort

algorithm sorting big-o quicksort heapsort

21,634

Solution 1

One of the major factors is that quicksort has better locality of reference -- the next thing to be accessed is usually close in memory to the thing you just looked at. By contrast, heapsort jumps around significantly more. Since things that are close together will likely be cached together, quicksort tends to be faster.

However, quicksort's worst-case performance is significantly worse than heapsort's is. Because some critical applications will require guarantees of speed performance, heapsort is the right way to go for such cases.

Solution 2

Heapsort is O(N log N) guaranted, what is much better than worst case in Quicksort. Heapsort don't need more memory for another array to putting ordered data as is needed by Mergesort. So why do comercial applications stick with Quicksort? What Quicksort has that is so special over others implementations?

I've tested the algorithms myself and I've seen that Quicksort has something special indeed. It runs fast, much faster than Heap and Merge algorithms.

The secret of Quicksort is: It almost doesn't do unnecessary element swaps. Swap is time consuming.

With Heapsort, even if all of your data is already ordered, you are going to swap 100% of elements to order the array.

With Mergesort, it's even worse. You are going to write 100% of elements in another array and write it back in the original one, even if data is already ordered.

With Quicksort you don't swap what is already ordered. If your data is completely ordered, you swap almost nothing! Although there is a lot of fussing about worst case, a little improvement on the choice of pivot, any other than getting the first or last element of array, can avoid it. If you get a pivot from the intermediate element between first, last and middle element, it is suficient to avoid worst case.

What is superior in Quicksort is not the worst case, but the best case! In best case you do the same number of comparisons, ok, but you swap almost nothing. In average case you swap part of the elements, but not all elements, as in Heapsort and Mergesort. That is what gives Quicksort the best time. Less swap, more speed.

The implementation below in C# on my computer, running on release mode, beats Array.Sort by 3 seconds with middle pivot and by 2 seconds with improved pivot (yes, there is an overhead to get a good pivot).

static void Main(string[] args)
{
    int[] arrToSort = new int[100000000];
    var r = new Random();
    for (int i = 0; i < arrToSort.Length; i++) arrToSort[i] = r.Next(1, arrToSort.Length);

    Console.WriteLine("Press q to quick sort, s to Array.Sort");
    while (true)
    {
        var k = Console.ReadKey(true);
        if (k.KeyChar == 'q')
        {
            // quick sort
            Console.WriteLine("Beg quick sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            QuickSort(arrToSort, 0, arrToSort.Length - 1);
            Console.WriteLine("End quick sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            for (int i = 0; i < arrToSort.Length; i++) arrToSort[i] = r.Next(1, arrToSort.Length);
        }
        else if (k.KeyChar == 's')
        {
            Console.WriteLine("Beg Array.Sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            Array.Sort(arrToSort);
            Console.WriteLine("End Array.Sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            for (int i = 0; i < arrToSort.Length; i++) arrToSort[i] = r.Next(1, arrToSort.Length);
        }
    }
}

static public void QuickSort(int[] arr, int left, int right)
{
    int begin = left
        , end = right
        , pivot
        // get middle element pivot
        //= arr[(left + right) / 2]
        ;

    //improved pivot
    int middle = (left + right) / 2;
    int
        LM = arr[left].CompareTo(arr[middle])
        , MR = arr[middle].CompareTo(arr[right])
        , LR = arr[left].CompareTo(arr[right])
        ;
    if (-1 * LM == LR)
        pivot = arr[left];
    else
        if (MR == -1 * LR)
            pivot = arr[right];
        else
            pivot = arr[middle];
    do
    {
        while (arr[left] < pivot) left++;
        while (arr[right] > pivot) right--;

        if(left <= right)
        {
            int temp = arr[right];
            arr[right] = arr[left];
            arr[left] = temp;

            left++;
            right--;
        }
    } while (left <= right);

    if (left < end) QuickSort(arr, left, end);
    if (begin < right) QuickSort(arr, begin, right);
}

Solution 3

Here's a couple explanations:

http://www.cs.auckland.ac.nz/software/AlgAnim/qsort3.html

http://users.aims.ac.za/~mackay/sorting/sorting.html

Essentially, even though the worst case for quick sort is O(n^2) it on average will perform better. :-)

Solution 4

The big-O notation means that the time required to sort n items is bounded above by the function c*n*log(n), where c is some unspecified constant factor. There is no reason why the constant c should be the same for quicksort and heapsort. So the real question is: why would you expect them to be equally fast?

Quicksort has always been somewhat faster than heapsort in practice, but the difference has become larger recently since, as mentioned before, locality of memory access has become so important to execution speed.

View more solutions

21,634

Nitish Upreti

Building and Scaling Microsoft Azure.

Updated on January 07, 2021

Comments

Nitish Upreti over 3 years

Heap Sort has a worst case complexity of O(nlogn) while Quicksort has O(n^2). But emperical evidences say quicksort is superior. Why is that?
- Paul over 14 years
  
  The worst case occurs when the elements are already sorted - a relative rare case - and one that can be easily avoided by doing a simple shuffle first if this use case could occur in your system. Locality of reference is the key for QR's fast runtime performance.
- Manohar Reddy Poreddy almost 4 years
  
  @Paul Simple shuffle won't solve the problem of duplicate values in array for Quicksort.
EvilTeach over 14 years

For small working sets the locality of reference issue is critical to avoiding unwanted page faults. It is a strong argument to end the function with a call to sort the left hand most partition, followed by a tail recursive optimization for the right hand partition.
Stephan Eggermont almost 14 years

But not strong enough to do it in practice. Always sort the smallest partition first to avoid blowing the stack
supercat over 9 years

@StephanEggermont: If the left partition holds millions of items and the right partition holds two, clearly the right partition should be sorted first. Would there by any problem, however, with sorting the left partition first unless it's e.g. more than three times as big as the right one? Worst-case stack depth would be increased, but only by a constant factor.
Stephan Eggermont over 9 years

@supercat that would just be slower. Locality of reference is not practically influenced by doing left or right partition first