Merge Sort a Linked List

algorithm sorting linked-list mergesort

93,030

Solution 1

Wonder why it should be big challenge as it is stated here, here is a straightforward implementation in Java with out any "clever tricks".

//The main function
public static Node merge_sort(Node head) 
{
    if(head == null || head.next == null) 
        return head;
        
    Node middle = getMiddle(head);      //get the middle of the list
    Node left_head = head;
    Node right_head = middle.next; 
    middle.next = null;             //split the list into two halfs

    return merge(merge_sort(left_head), merge_sort(right_head));  //recurse on that
}

//Merge subroutine to merge two sorted lists
public static Node merge(Node a, Node b)
{
    Node dummyHead = new Node();
    for(Node current  = dummyHead; a != null && b != null; current = current.next;)
    {
        if(a.data <= b.data) 
        {
            current.next = a; 
            a = a.next; 
        }
        else
        { 
            current.next = b;
            b = b.next; 
        }
        
    }
    dummyHead.next = (a == null) ? b : a;
    return dummyHead.next;
}

//Finding the middle element of the list for splitting
public static Node getMiddle(Node head)
{
    if(head == null) 
        return head;
    
    Node slow = head, fast = head;
    
    while(fast.next != null && fast.next.next != null)
    {
        slow = slow.next;
        fast = fast.next.next;
    }
    return slow;
}

Solution 2

A simpler/clearer implementation might be the recursive implementation, from which the NLog(N) execution time is more clear.

typedef struct _aList {
    struct _aList* next;
    struct _aList* prev; // Optional.
    // some data
} aList;

aList* merge_sort_list_recursive(aList *list,int (*compare)(aList *one,aList *two))
{
    // Trivial case.
    if (!list || !list->next)
        return list;

    aList *right = list,
          *temp  = list,
          *last  = list,
          *result = 0,
          *next   = 0,
          *tail   = 0;

    // Find halfway through the list (by running two pointers, one at twice the speed of the other).
    while (temp && temp->next)
    {
        last = right;
        right = right->next;
        temp = temp->next->next;
    }

    // Break the list in two. (prev pointers are broken here, but we fix later)
    last->next = 0;

    // Recurse on the two smaller lists:
    list = merge_sort_list_recursive(list, compare);
    right = merge_sort_list_recursive(right, compare);

    // Merge:
    while (list || right)
    {
        // Take from empty lists, or compare:
        if (!right) {
            next = list;
            list = list->next;
        } else if (!list) {
            next = right;
            right = right->next;
        } else if (compare(list, right) < 0) {
            next = list;
            list = list->next;
        } else {
            next = right;
            right = right->next;
        }
        if (!result) {
            result=next;
        } else {
            tail->next=next;
        }
        next->prev = tail;  // Optional.
        tail = next;
    }
    return result;
}

NB: This has a Log(N) storage requirement for the recursion. Performance should be roughly comparable with the other strategy I posted. There is a potential optimisation here by running the merge loop while (list && right), and simple appending the remaining list (since we don't really care about the end of the lists; knowing that they're merged suffices).

Solution 3

Heavily based on the EXCELLENT code from: http://www.chiark.greenend.org.uk/~sgtatham/algorithms/listsort.html

Trimmed slightly, and tidied:


typedef struct _aList {
    struct _aList* next;
    struct _aList* prev; // Optional.
       // some data
} aList;

aList *merge_sort_list(aList *list,int (*compare)(aList *one,aList *two))
{
    int listSize=1,numMerges,leftSize,rightSize;
    aList *tail,*left,*right,*next;
    if (!list || !list->next) return list;  // Trivial case

    do { // For each power of two<=list length
        numMerges=0,left=list;tail=list=0; // Start at the start

        while (left) { // Do this list_len/listSize times:
            numMerges++,right=left,leftSize=0,rightSize=listSize;
            // Cut list into two halves (but don't overrun)
            while (right && leftSize<listSize) leftSize++,right=right->next;
            // Run through the lists appending onto what we have so far.
            while (leftSize>0 || (rightSize>0 && right)) {
                // Left empty, take right OR Right empty, take left, OR compare.
                if (!leftSize)                  {next=right;right=right->next;rightSize--;}
                else if (!rightSize || !right)  {next=left;left=left->next;leftSize--;}
                else if (compare(left,right)<0) {next=left;left=left->next;leftSize--;}
                else                            {next=right;right=right->next;rightSize--;}
                // Update pointers to keep track of where we are:
                if (tail) tail->next=next;  else list=next;
                // Sort prev pointer
                next->prev=tail; // Optional.
                tail=next;          
            }
            // Right is now AFTER the list we just sorted, so start the next sort there.
            left=right;
        }
        // Terminate the list, double the list-sort size.
        tail->next=0,listSize<<=1;
    } while (numMerges>1); // If we only did one merge, then we just sorted the whole list.
    return list;
}

NB: This is O(NLog(N)) guaranteed, and uses O(1) resources (no recursion, no stack, nothing).

Solution 4

One interesting way is to maintain a stack, and only merge if the list on the stack has the same number of elements, and otherwise push the list, until you run out of elements in the incoming list, and then merge up the stack.

Solution 5

Here's an alternative recursive version. This does not need to step along the list to split it: we supply a pointer to a head element (which is not part of the sort) and a length, and the recursive function returns a pointer to the end of the sorted list.

element* mergesort(element *head,long lengtho)
{ 
  long count1=(lengtho/2), count2=(lengtho-count1);
  element *next1,*next2,*tail1,*tail2,*tail;
  if (lengtho<=1) return head->next;  /* Trivial case. */

  tail1 = mergesort(head,count1);
  tail2 = mergesort(tail1,count2);
  tail=head;
  next1 = head->next;
  next2 = tail1->next;
  tail1->next = tail2->next; /* in case this ends up as the tail */
  while (1) {
    if(cmp(next1,next2)<=0) {
      tail->next=next1; tail=next1;
      if(--count1==0) { tail->next=next2; return tail2; }
      next1=next1->next;
    } else {
      tail->next=next2; tail=next2;
      if(--count2==0) { tail->next=next1; return tail1; }
      next2=next2->next;
    }
  }
}

View more solutions

93,030

Author by

Andrew Peters

Microsoft employee working on EF Code First & Migrations. Creator of EF Power Tools, LightSpeed, NHaml, Inflector.NET, PowerShell Gadget & VS File Explorer.

Updated on July 09, 2022

Comments

Andrew Peters almost 2 years

I was recently brushing up on some fundamentals and found merge sorting a linked list to be a pretty good challenge. If you have a good implementation then show it off here.
- Angus Johnson over 6 years
  
  stackoverflow.com/a/46319131/359538
- snehil suresh wakchaure about 2 years
  
  Cross-posting a recursive implementation in the solution here - stackoverflow.com/questions/71443824/…
hookenz almost 13 years

I thought I'd try this code on my own Linked List. For some reason it runs slower than the recursive version over an int list of 10 million items. It took around 6-7 seconds for the recursive version and around 10 for this one?
Dave Gamble about 12 years

No surprise. The recursive one uses extra storage to speed things up.
Ed Wynn almost 12 years

The overall strategy is that we hold two chains of sublists, extending from the two dummy elements head1 and head2. A sublist is known to be sorted. We make several passes, merging the first sublist from head1 with the first from head2, then the second with the second, and so on. (It is essential that there are equal numbers of sublists, or one extra in head1's chain.) The newly merged sublists are attached alternately to the first and second chain, in place, stably, and without recursion.
Ed Wynn almost 12 years

A significant quirk of this implementation is that it uses a second pointer, e->spare, with each element. Before the end of a sublist, e->next gives the next element. At the end, e->next is NULL. The start of the next sublist (if any) is given by e->spare. At the end of the sort, the entire list is linked via ->next. Knuth's pseudo-code used array indices instead of pointers, and a negative index announced the end of a sublist (and the absolute value gave the next sublist's start).
Ed Wynn almost 12 years

Step L1 arranges the input list into sublists. A "vanilla" version starts with all sublists of length 1. The "clever" version here keeps any ordered sequences in the input list. In particular, if the list is sorted on arrival, the sort terminates after (n-1) comparisons. The clever version therefore gives a massive saving on sorted input, and a lesser saving on input that has some bias towards being sorted. On random input, the clever version is usually very slightly faster (by a couple of percent) although it uses more comparisons.
Ed Wynn almost 12 years

As I said at the start, I'm not expecting anyone to like this as an algorithm (unless you often sort an almost-perfectly-sorted list). I've added this (to a fairly old post) to save you the trouble and disappointment I've just gone through ;-)
doynax over 10 years

I came up with essentially the same implementation, except with pointers-to-pointers instead of dummy nodes, thinking clearly my innovative code must be a quantum leap in computer science. Nothing new under the sun I suppose. Any suggestions for a clean way of speeding up the mostly pre-sorted case?
Marus Gradinaru almost 9 years

This is absolutely brilliant ! I converted it to Delphi and it works very nice. Thank you, sir !
ideasman42 almost 9 years

The comments look like they're not updated to match the code. They refer to variables which don't exist in the code p q & k which (I think) should be left right & block_size respectively.
ideasman42 almost 9 years

Made an improved version of this answer: gist.github.com/ideasman42/5921b0edfc6aa41a9ce0 1) Use a pointer to the tail (remove 2x conditional checks, reduces code-size). 2) Avoid re-assigning empty values the size doesn't change. 3) Corrected comments.
Shital Shah almost 9 years

Thanks @ideaman42. I've added one improvement in above code. For tail_p, there is no direct C# equivalent so it stays same :(.
ideasman42 almost 9 years

This is the best implementation I've found so far, made a portable implementation of it (which can be included directly, with addition of optiona thunk argument ~ like qsort_r). See gist.github.com/ideasman42/…
ideasman42 almost 9 years

While this is quite good, the version from Mono's eglib performs slightly faster in my tests (optimized C) ~10-20%, See: stackoverflow.com/a/18246901/432509
rcgldr about 8 years

The mono eglib method is similar to what I posted in my answer and both are essentially the same as HP / Microsoft STL std::list::sort(). In the mono eglib example, the "recursion" table is reversed, rank[i] points to a run of length 2 to the power i, except the last entry rank[MAX_RANKS-1] points to a list of unlimited size, and is added to by merging runs of length 2 to the power (MAX_RANK-2). A MAX_RANK of 26 to 32 is more than enough.
rcgldr about 8 years

A similar strategy is used in a floating point summation function where an array of partial sums indexed by the exponent of a floating point number is used to hold the partial sums, so that only values with identical exponents are added, until it's time to return the full sum by adding up the values in the array from smallest to largest. I'm not sure which was invented first, the summation function or the linked list merge sort.
Ravi almost 7 years

First of all, your answer is nowhere matching with OP question. Second, not sure, what are your commenting ?
greybeard almost 6 years

non-web.archive explanation in stead of stale link removed in revision 4.
Rick over 5 years

while(fast != null && fast.next != null), doesn't it cause a infinite recursion for a list that only has 2 elements?
Mark Ransom almost 4 years

@Rick the first line of mergeSortList checks the same condition and breaks the recursion in that case.
Chandra Shekhar over 3 years

Thanks for the modular code. I have one doubt regarding the condition for getting the middle element of the list . I am using while(slow ! = nullptr && fast->next != nullptr){ slow = slow->next; fast = fast->next->next; } It is not giving me the proper output. it is crashing. could you please explain the condition that you've use ie: while(fast->next != nullptr && fast->next->next != nullptr){ slow = slow->next; fast = fast->next->next; } Why are we checking for fast->next->next ? Thanks