Why does Java's Arrays.sort method use two different sorting algorithms for different types?


Solution 1

The most likely reason: quicksort is not stable, i.e. equal entries can change their relative position during the sort; among other things, this means that if you sort an already sorted array, it may not stay unchanged.

Since primitive types have no identity (there is no way to distinguish two ints with the same value), this does not matter for them. But for reference types, it could cause problems for some applications. Therefore, a stable merge sort is used for those.

OTOH, a reason not to use the (guaranteed n*log(n)) stable merge sort for primitive types might be that it requires making a clone of the array. For reference types, where the referred objects usually take up far more memory than the array of references, this generally does not matter. But for primitive types, cloning the array outright doubles the memory usage.
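To make the memory cost concrete, here is a minimal sketch of a textbook top-down merge sort on an int[]. It is not the JDK's code; the class and method names are made up for illustration. The point is the auxiliary buffer: it has the same length as the input, so for a primitive array the sort needs roughly twice the array's memory while it runs.

```java
import java.util.Arrays;

public class MergeSortAux {
    // Illustrative textbook merge sort; NOT the java.util.Arrays implementation.
    static void mergeSort(int[] a) {
        int[] aux = new int[a.length]; // extra buffer as large as the input
        sort(a, aux, 0, a.length - 1);
    }

    private static void sort(int[] a, int[] aux, int lo, int hi) {
        if (lo >= hi) return;              // zero or one element: already sorted
        int mid = lo + (hi - lo) / 2;
        sort(a, aux, lo, mid);
        sort(a, aux, mid + 1, hi);
        merge(a, aux, lo, mid, hi);
    }

    private static void merge(int[] a, int[] aux, int lo, int mid, int hi) {
        // Copy the range into the buffer, then merge back into a.
        System.arraycopy(a, lo, aux, lo, hi - lo + 1);
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if (i > mid) a[k] = aux[j++];              // left half exhausted
            else if (j > hi) a[k] = aux[i++];          // right half exhausted
            else if (aux[j] < aux[i]) a[k] = aux[j++]; // right wins only if strictly smaller
            else a[k] = aux[i++];                      // ties go to the left half (stable)
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 9, 2, 7, 1};
        mergeSort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

For an Object[] the same buffer only holds references, which is usually small next to the objects themselves.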

Solution 2

According to the Java 7 API docs cited in this answer, Arrays#sort() for object arrays now uses TimSort, a hybrid of merge sort and insertion sort. Arrays#sort() for primitive arrays, on the other hand, now uses Dual-Pivot Quicksort. Both changes took effect in Java SE 7.

Solution 3

One reason I can think of is that quicksort has a worst-case time complexity of O(n^2), while mergesort retains a worst case of O(n log n). For object arrays there is a fair expectation that there will be many duplicate object references, which is one case where a naive quicksort performs worst.
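As a rough illustration of that worst case, the sketch below counts comparisons in a textbook quicksort (first element as pivot, two-way partition). This is an assumed, deliberately naive variant, not the tuned quicksort in java.util.Arrays, and the class and method names are hypothetical. On an array whose elements are all equal, every partition is maximally lopsided, so the count is n(n-1)/2.

```java
import java.util.Random;

public class NaiveQuicksortCount {
    static long comparisons;

    // Textbook quicksort with the first element as pivot; NOT the JDK's algorithm.
    static void quicksort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int pivot = a[lo];
        int boundary = lo; // last index of the "less than pivot" region
        for (int i = lo + 1; i <= hi; i++) {
            comparisons++;
            if (a[i] < pivot) {
                boundary++;
                int tmp = a[i]; a[i] = a[boundary]; a[boundary] = tmp;
            }
        }
        // Move the pivot between the two regions.
        int tmp = a[lo]; a[lo] = a[boundary]; a[boundary] = tmp;
        quicksort(a, lo, boundary - 1);
        quicksort(a, boundary + 1, hi);
    }

    // Sorts a in place and returns the number of element comparisons made.
    static long countComparisons(int[] a) {
        comparisons = 0;
        quicksort(a, 0, a.length - 1);
        return comparisons;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] allEqual = new int[n]; // every element is 0

        // Distinct values in random order, for comparison.
        int[] shuffled = new int[n];
        for (int i = 0; i < n; i++) shuffled[i] = i;
        Random rnd = new Random(42);
        for (int i = n - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            int tmp = shuffled[i]; shuffled[i] = shuffled[j]; shuffled[j] = tmp;
        }

        System.out.println("all equal: " + countComparisons(allEqual)); // n(n-1)/2 = 499500
        System.out.println("shuffled:  " + countComparisons(shuffled));
    }
}
```

A three-way partition, or a better pivot choice, avoids this particular degradation, which is part of why production quicksorts differ from the textbook version.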

There is a decent visual comparison of various algorithms; pay particular attention to the right-most graph for each algorithm.

Solution 4

I was taking the Coursera class on Algorithms, and in one of the lectures Professor Bob Sedgewick gave this assessment of the Java system sort:

"If a programmer is using objects, maybe space is not a critically important consideration and the extra space used by a merge sort maybe not a problem. And if a programmer is using primitive types, maybe the performance is the most important thing so they use quick sort."

Solution 5

java.util.Arrays uses quicksort for primitive types such as int and mergesort for objects that implement Comparable or use a Comparator. The idea behind using two different methods is that if a programmer is using objects, space is probably not a critical consideration, so the extra space used by mergesort is probably not a problem; and if the programmer is using primitive types, performance is probably the most important thing, so quicksort is used.

For example, here is a case where sorting stability matters:

[Figure: an example where sorting stability matters]

That’s why stable sorts make sense for object types, especially mutable object types and object types with more data than just the sort key, and mergesort is such a sort. But for primitive types stability is not just irrelevant; it is meaningless.
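A small sketch of why stability matters for objects that carry more than the sort key. The Student class and its fields are hypothetical, chosen only for illustration. The array is sorted by name and then by section; because Arrays.sort on object arrays is documented as stable, students within the same section stay in name order after the second sort.

```java
import java.util.Arrays;
import java.util.Comparator;

public class StableSortDemo {
    // Hypothetical element type: a name plus a separate sort key (section).
    static final class Student {
        final String name;
        final int section;

        Student(String name, int section) {
            this.name = name;
            this.section = section;
        }

        @Override
        public String toString() {
            return name + ":" + section;
        }
    }

    // Sorts a copy by name, then by section. The second sort is stable,
    // so the name order from the first sort survives within each section.
    static Student[] sortByNameThenSection(Student[] input) {
        Student[] copy = input.clone();
        Arrays.sort(copy, Comparator.comparing((Student s) -> s.name));
        Arrays.sort(copy, Comparator.comparingInt((Student s) -> s.section));
        return copy;
    }

    public static void main(String[] args) {
        Student[] students = {
            new Student("Dana", 1), new Student("Alex", 2),
            new Student("Cory", 2), new Student("Blake", 1)
        };
        // Prints [Blake:1, Dana:1, Alex:2, Cory:2]
        System.out.println(Arrays.toString(sortByNameThenSection(students)));
    }
}
```

With an unstable sort, the second pass would be free to emit Dana before Blake or Cory before Alex. Two equal int values, by contrast, cannot be told apart, so there is nothing for stability to preserve.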


Author: zjffdu

Committer of Pig/Tez/Livy/Zeppelin at The Apache Software Foundation

Updated on September 19, 2020

Comments

  • zjffdu
    zjffdu over 3 years

    Java 6's Arrays.sort method uses Quicksort for arrays of primitives and merge sort for arrays of objects. I believe that most of the time Quicksort is faster than merge sort and uses less memory. My experiments support that, although both algorithms are O(n log(n)). So why are different algorithms used for different types?

  • sbridges
    sbridges over 12 years
    The Java quicksort is a modified quicksort that does not degrade to O(n^2). From the docs: "This algorithm offers n*log(n) performance on many data sets that cause other quicksorts to degrade to quadratic performance"
  • likern
    likern almost 9 years
    It is not the main reason. Right after that sentence there was a question embedded in the video: "Why is MergeSort used for reference types?" (because it's stable). I think Sedgewick left that out of the lecture so that it could be asked as a question.
  • Jingguo Yao
    Jingguo Yao about 8 years
    Another reason to use quicksort is that, in the average case, quicksort is faster than mergesort. Although quicksort does more compares than mergesort, it does far fewer array accesses. 3-way quicksort can also achieve linear time if the input contains many duplicate entries, which is not unusual in practical applications (my guess is that dual-pivot quicksort also has this property).
  • rogerdpack
    rogerdpack almost 6 years
    For primitive types it doesn't clone the array, it can sort them in place, so I think the only reason is the stability contract, basically...
  • Alexandr
    Alexandr over 5 years
    This does not answer why two different algorithms were chosen.