Fastest safe sorting algorithm implementation

17,286

Solution 1

Does anyone know how to implement my algorithm even faster?

I was able to shave 10% off the execution time by converting your code to use pointers.

    public unsafe static void UnsafeQuickSort(int[] data)
    {
        fixed (int* pdata = data)
        {
            UnsafeQuickSortRecursive(pdata, 0, data.Length - 1);
        }
    }

    private unsafe static void UnsafeQuickSortRecursive(int* data, int left, int right)
    {
        int i = left - 1;
        int j = right;

        while (true)
        {
            int d = data[left];
            do i++; while (data[i] < d);
            do j--; while (data[j] > d);

            if (i < j)
            {
                int tmp = data[i];
                data[i] = data[j];
                data[j] = tmp;
            }
            else
            {
                if (left < j) UnsafeQuickSortRecursive(data, left, j);
                if (++j < right) UnsafeQuickSortRecursive(data, j, right);
                return;
            }
        }
    }

Solution 2

Binary insertion sort almost always wins for short runs (~10 items). It's often better than an ideal sorting network because of the simplified branching structure.

Dual pivot quicksort is faster than quicksort. The linked paper contains a Java implementation that you could presumably adapt.

If you're only sorting integers, a radix sort will likely be faster still on long arrays.

Solution 3

A faster sorting algorithm for arrays of random integers is LSD Radix Sort:

    public static int[] SortRadix(this int[] inputArray)
    {
        const int bitsPerDigit = 8;
        const uint numberOfBins = 1 << bitsPerDigit;
        uint numberOfDigits = (sizeof(uint) * 8 + bitsPerDigit - 1) / bitsPerDigit;
        int d;
        var outputArray = new int[inputArray.Length];

        int[][] startOfBin = new int[numberOfDigits][];
        for (int i = 0; i < numberOfDigits; i++)
            startOfBin[i] = new int[numberOfBins];
        bool outputArrayHasResult = false;

        const uint bitMask = numberOfBins - 1;
        const uint halfOfPowerOfTwoRadix = PowerOfTwoRadix / 2;
        int shiftRightAmount = 0;

        uint[][] count = HistogramByteComponents(inputArray, 0, inputArray.Length - 1);

        for (d = 0; d < numberOfDigits; d++)
        {
            startOfBin[d][0] = 0;
            for (uint i = 1; i < numberOfBins; i++)
                startOfBin[d][i] = startOfBin[d][i - 1] + (int)count[d][i - 1];
        }

        d = 0;
        while (d < numberOfDigits)
        {
            int[] startOfBinLoc = startOfBin[d];

            if (d != 3)
                for (uint current = 0; current < inputArray.Length; current++)
                    outputArray[startOfBinLoc[((uint)inputArray[current] >> shiftRightAmount) & bitMask]++] = inputArray[current];
            else
                for (uint current = 0; current < inputArray.Length; current++)
                    outputArray[startOfBinLoc[((uint)inputArray[current] >> shiftRightAmount) ^ halfOfPowerOfTwoRadix]++] = inputArray[current];

            shiftRightAmount += bitsPerDigit;
            outputArrayHasResult = !outputArrayHasResult;
            d++;

            int[] tmp = inputArray;       // swap input and output arrays
            inputArray = outputArray;
            outputArray = tmp;
        }
        return outputArrayHasResult ? outputArray : inputArray;
    }
    [StructLayout(LayoutKind.Explicit)]
    internal struct Int32ByteUnion
    {
        [FieldOffset(0)]
        public byte byte0;
        [FieldOffset(1)]
        public byte byte1;
        [FieldOffset(2)]
        public byte byte2;
        [FieldOffset(3)]
        public byte byte3;

        [FieldOffset(0)]
        public Int32 integer;
    }
    public static uint[][] HistogramByteComponents(int[] inArray, Int32 l, Int32 r)
    {
        const int numberOfBins = 256;
        const int numberOfDigits = sizeof(ulong);
        uint[][] count = new uint[numberOfDigits][];
        for (int i = 0; i < numberOfDigits; i++)
            count[i] = new uint[numberOfBins];

        var union = new Int32ByteUnion();
        for (int current = l; current <= r; current++)    // Scan the array and count the number of times each digit value appears - i.e. size of each bin
        {
            union.integer = inArray[current];
            count[0][union.byte0]++;
            count[1][union.byte1]++;
            count[2][union.byte2]++;
            count[3][((uint)inArray[current] >> 24) ^ 128]++;
        }
        return count;
    }

It runs at nearly 100 MegaInt32s/sec on a single core - about 7X faster than Array.Sort(), 25X faster than Linq.OrderBy() on a single core and 6X faster than Linq.AsParallel().OrderBy() on 6 cores.

This implementation is taken from the HPCsharp nuget package on nuget.org, which also has versions for sorting arrays of uint[], long[], and ulong[], as well as MSD Radix Sort, which adds float[] and double[] arrays and is in-place.

Share:
17,286
raisyn
Author by

raisyn

Master's Student at Vienna University of Technology Interested in various topics of computer science.

Updated on June 30, 2022

Comments

  • raisyn
    raisyn almost 2 years

    I spend some time implementing a quicksort algorithm in C#. After finishing I compared the speed of my implementation and C#'s Array.Sort-Method.

    I just compare speed working on random int arrays.

    Here's my implementation:

    static void QuickSort(int[] data, int left, int right)
    {
        int i = left - 1,
            j = right;
    
        while (true)
        {
            int d = data[left];
            do i++; while (data[i] < d);
            do j--; while (data[j] > d);
    
            if (i < j) 
            {
                int tmp = data[i];
                data[i] = data[j];
                data[j] = tmp;
            }
            else
            {
                if (left < j)    QuickSort(data, left, j);
                if (++j < right) QuickSort(data, j, right);
                return;
            }
        }
    }
    

    Performance (when sorting an random int[] with length of 100000000):
       - my algorithm: 14.21 seconds
       - .Net Array<int>.Sort: 14.84 seconds

    Does anyone know how to implement my algorithm even faster?
    Or can anyone provide a faster implementation (need not be a quicksort!) which my run faster?

    Note:
       - please no algorithms which use multiple cores/processors to improve perrformance
       - only valid C# source code

    I will test the performance of the provided algorithms within a few minutes if I'm online.

    EDIT:
    Do you think using a ideal sorting network for parts containing less than 8 value would improve performance?

  • raisyn
    raisyn over 13 years
    there is a little bug, but really nice upgrade (+1), 1.1 seconds faster on my hardware!
  • Gabe
    Gabe over 13 years
    Doesn't this clash with the "safe" requirement in the title?
  • raisyn
    raisyn over 13 years
    It's a bit difficult to say... this is simple enough too... I just wanted to be sure that I don't get any high optimized C/C++ codes.
  • Brian Gideon
    Brian Gideon over 13 years
    @Gabe: I thought about that too before I posted. I was not really sure if "safe" was synonomous with "stable". But, considering that the quick sort is not stable that is probably a moot point anyway. I decided to go ahead and post anyway.
  • Brian Gideon
    Brian Gideon over 13 years
    @youllknow: I fixed the bug. Thanks for bring that to my attention.
  • Dolphin
    Dolphin over 13 years
    I would probably have the top level function either not have the left/right args or do some bounds checking before the call.
  • morteza khosravi
    morteza khosravi about 8 years
    I tested the code with some Edge class that have a float Weight field. It doesn't sort correctly. Are you sure the algorithm is fine?
  • Brian Gideon
    Brian Gideon about 8 years
    @mortezakhosravi: No, definitely not. It's essentially copied from the OP and just converted to use unsafe code (which I actually recommend against by the way). It was long time ago, but I seriously doubt I critically analyzed it enough to verify that the algorithm is correct.
  • DragonSpit
    DragonSpit over 3 years
    Another algorithm that has recently been added to the HPCsharp nuget package (open source and free) is a Parallel Hybrid Merge Sort with Array.Sort() as the leaf-node of recursion. This algorithm runs over 40X faster than Array.Sort() on a 32-core AMD processor, and is still a generic sort algorithm