Fastest gap sequence for shell sort?

algorithm performance sorting shellsort

19,891

Solution 1

If your data set has a definite upper bound in size, then you can hardcode the step sequence. You should probably only worry about generality if your data set is likely to grow without an upper bound.

The sequence shown seems to grow roughly as an exponential series, albeit with quirks. There seems to be a majority of prime numbers, but with non-primes in the mix as well. I don't see an obvious generation formula.

A valid question, assuming you must deal with arbitrarily large sets, is whether you need to emphasise worst-case performance, average-case performance, or almost-sorted performance. If the latter, you may find that a plain insertion sort using a binary search for the insertion step might be better than a shellsort. If you need good worst-case performance, then Sedgewick's sequence appears to be favoured. The sequence you mention is optimised for average-case performance, where the number of comparisons outweighs the number of moves.

Solution 2

Ciura's paper generates the sequence empirically -- that is, he tried a bunch of combinations and this was the one that worked the best. Generating an optimal shellsort sequence has proven to be tricky, and the problem has so far been resistant to analysis.

The best known increment is Sedgewick's, which you can read about here (see p. 7).

Solution 3

I would not be ashamed to take the advice given in Wikipedia's Shellsort article,

With respect to the average number of comparisons, the best known gap sequences are 1, 4, 10, 23, 57, 132, 301, 701 and similar, with gaps found experimentally. Optimal gaps beyond 701 remain unknown, but good results can be obtained by extending the above sequence according to the recursive formula h_k = \lfloor 2.25 h_{k-1} \rfloor.

Tokuda's sequence [1, 4, 9, 20, 46, 103, ...], defined by the simple formula h_k = \lceil h'_k \rceil, where h'k = 2.25h'k − 1 + 1, h'1 = 1, can be recommended for practical applications.

guessing from the pseudonym, it seems Marcin Ciura edited the WP article himself.

19,891

Author by

Sawyer

Updated on June 17, 2022

Comments

Sawyer almost 2 years

According to Marcin Ciura's Optimal (best known) sequence of increments for shell sort algorithm, the best sequence for shellsort is 1, 4, 10, 23, 57, 132, 301, 701..., but how can I generate such a sequence? In Marcin Ciura's paper, he said:

Both Knuth’s and Hibbard’s sequences are relatively bad, because they are defined by simple linear recurrences.

but most algorithm books I found tend to use Knuth’s sequence: k = 3k + 1, because it's easy to generate. What's your way of generating a shellsort sequence?
Ivan almost 7 years

Isn't Sedgewick's stuff O(N^(4/3)) while giving O(nlog(n)) best case? I mean there are faster O(nlog^2(n)) worst case sequences but with worse best case...
Chromatix over 3 years

@Ivan The guaranteed worst-case O(n log^2 n) sequence (by Pratt) is considerably slower than any of the other common sequences, for any practical problem size. This is simply because it requires a large number of passes over the data. But you could reasonably take the Ciura sequence for good average-case performance, and use a Pratt-type sequence to extend it, eg. 23^i * 57^j for all positive integer i and j where the product exceeds 701. That would give a sequence that still has O(n log^2 n) asymptotic worst case.