sorting a doubly linked list with merge sort

13,158

Solution 1

Merge sort requires splitting the list quite often. Isn't iterating to the middle of a LinkedList pretty much the most expensive operation you can perform on it (well, short of sorting it)? I could see the merge step working pretty well (you're iterating forwards over two linked lists), but I'm not sure that this implementation is worth the trouble without an O(1) split operation.

Followup

As pointed out to me, the O(n) split operation doesn't really add much to complexity when you're already doing O(n) things during the merge phase. Nevertheless, you're still going to run into trouble doing iteration like you're doing (not using an Iterator but instead using get on a List with poor random-access characteristics).

I was bored while debugging some other issue so wrote you what I consider to be a decent Java implementation of this algorithm. I followed Wikipedia's pseudocode verbatim and sprinkled in some generics and print statements. If you have any questions or concerns, just ask.

import java.util.List;
import java.util.LinkedList;

/**
 * This class implements the mergesort operation, trying to stay
 * as close as possible to the implementation described on the
 * Wikipedia page for the algorithm. It is meant to work well
 * even on lists with non-constant random-access performance (i.e.
 * LinkedList), but assumes that {@code size()} and {@code get(0)}
 * are both constant-time.
 *
 * @author jasonmp85
 * @see <a href="http://en.wikipedia.org/wiki/Merge_sort">Merge sort</a>
 */
public class MergeSort {
    /**
     * Keeps track of the call depth for printing purposes
     */
    private static int depth = 0;

    /**
     * Creates a list of 10 random Longs and sorts it
     * using {@link #sort(List)}.
     *
     * Prints out the original list and the result.
     *
     */
    public static void main(String[] args) {
        LinkedList<Long> list = new LinkedList<Long>();

        for(int i = 0; i < 10; i++) {
            list.add((long)(Math.random() * 100));
        }

        System.out.println("ORIGINAL LIST\n" + 
                           "=================\n" +
                           list + "\n");

        List<Long> sorted = sort(list);

        System.out.println("\nFINAL LIST\n" +
                           "=================\n" +
                           sorted + "\n");
    }

    /**
     * Performs a merge sort of the items in {@code list} and returns a
     * new List.
     *
     * Does not make any calls to {@code List.get()} or {@code List.set()}.
     * 
     * Prints out the steps, indented based on call depth.
     *
     * @param list the list to sort
     */
    public static <T extends Comparable<T>> List<T> sort(List<T> list) {
        depth++;
        String tabs = getTabs();

        System.out.println(tabs + "Sorting: " + list);

        if(list.size() <= 1) {
            depth--;
            return list;
        }

        List<T> left   = new LinkedList<T>();
        List<T> right  = new LinkedList<T>();
        List<T> result = new LinkedList<T>();

        int middle = list.size() / 2;

        int added = 0;
        for(T item: list) {
            if(added++ < middle)
                left.add(item);
            else
                right.add(item);
        }

        left = sort(left);
        right = sort(right);

        result = merge(left, right);

        System.out.println(tabs + "Sorted to: " + result);

        depth--;
        return result;
    }

    /**
     * Performs the oh-so-important merge step. Merges {@code left}
     * and {@code right} into a new list, which is returned.
     *
     * @param left the left list
     * @param right the right list
     * @return a sorted version of the two lists' items
     */
    private static <T extends Comparable<T>> List<T> merge(List<T> left,
                                                           List<T> right) {
        String tabs = getTabs();
        System.out.println(tabs + "Merging: " + left + " & " + right);

        List<T> result = new LinkedList<T>();
        while(left.size() > 0 && right.size() > 0) {
            if(left.get(0).compareTo(right.get(0)) < 0)
                result.add(left.remove(0));
            else
                result.add(right.remove(0));
        }

        if(left.size() > 0)
            result.addAll(left);
        else
            result.addAll(right);

        return result;
    }

    /**
     * Returns a number of tabs based on the current call depth.
     *
     */
    private static String getTabs() {
        StringBuffer sb = new StringBuffer("");
        for(int i = 0; i < depth; i++)
            sb.append('\t');
        return sb.toString();
    }
}

To Run

  1. Save the code to a file named MergeSort.java
  2. Run javac MergeSort.java
  3. Run java MergeSort
  4. Marvel
  5. Optionally, run javadoc -private MergeSort.java to create the documentation. Open the index.html file it creates.

Solution 2

I came across this problem yesterday. Here are some thoughts.

Sorting a DoublyLinkedList is different from sorting an Array as you can not make index based references to any arbitrary item in the List. Instead you need to remember the items during each recursive step and then pass them on to the merge function. For each recursion step you only need to remember the first item of each list half. If you do not remember these items you will quickly end up with indexes, but this leads you to the problem that in your merge-function you need to traverse the whole list with for-loops to find the items to merge. That in turn means you get a complexity of O(n^2).

Another important point is the step of recursing into the list and dividing the list in two halves. You can do this step in the recursive part by using for-loops. Contrary to the merge-part at this step the for-loops will only yield a complexity of O(log(n) * n/2) and this is still below the overall O(n*log(n)) complexity. Here is why:

  1. You always need to find the first item of each half of list part.

  2. In the first recursion step you need to pass the first item and the item at position n/2. This takes n/2 steps to find.

  3. In each following step you need to find the middle item for each of the two halves of the list which gives us n/4 to find the item in the first halve and n/4 in the other halve. In total thats n/2.

  4. In each following recursive step the amount of list parts double and the lengths is divided by two:

    • 4 * n/8 in the 3rd recursion depth

    • 8 * n/16 in the 4th recursion depth, and so on...

  5. The recursion depth is log(n) and in each step we perform n/2 steps. This equals O(log(n)*n/2)

Finally here is some code:

public DoublyLinkedList mergesort(DoublyLinkedList in, int numOfElements) {
    in.first = mergesort(in.first, numOfElements);      
    return in;
}

mergeSort:

public ListElement mergesort(ListElement first, int length) {
    if(length > 1) {
        ListElement second = first;
        for(int i=0; i<length/2; i++) {
            second = second.next;
        }
        first = mergesort(first, length/2);
        second = mergesort(second, (length+1)/2);
        return merge(first, second, length);
    } else {
        return first;
    }
}

and merge:

public ListElement merge(ListElement first, ListElement second, int length) {
    ListElement result = first.prev; //remember the beginning of the new list will begin after its merged
    int right = 0;
    for(int i=0; i<length; i++) {
        if(first.getKey() <= second.getKey()) {
            if(first.next == second) break; //end of first list and all items in the second list are already sorted, thus break
            first = first.next;
        } else {
            if(right==(length+1)/2) 
                break; //we have merged all elements of the right list into the first list, thus break
            if(second == result) result = result.prev; //special case that we are mergin the last element then the result element moves one step back.
            ListElement nextSecond = second.next;
            //remove second
            second.prev.next = second.next;
            second.next.prev = second.prev;
            //insert second behind first.prev
            second.prev = first.prev;
            first.prev.next = second;
            //insert second before first
            second.next = first; 
            first.prev = second;
            //move on to the next item in the second list
            second = nextSecond;
            right++;
        }
    }
    return result.next; //return the beginning of the merged list
}

The maximum amount of memory used is also pretty low (not including the list itself). Correct me if I'm wrong but it should be less than 400 bytes (on 32bit). It would be 12 byte per call on mergeSort times the recursion depth of log(n) plus 20 bytes for the variables of merge thus: 12*log(n)+20 bytes.

P.S. Code tested on 1 million items (takes 1200ms). Also DoublyLinkedList is a container that stores the first ListElement of the list.

Update: I have answered a similar question about Quicksort using the same data structures, however in comparison to this Mergesort implementation it runs much slower. Here are some updated timings for reference:

Mergesort:

1.000.000 Items:  466ms
8.300.000 Items: 5144ms

Quicksort:

1.000.000 Items:  696ms  
8.300.000 Items: 8131ms

Note the timings are specific to my hardware and you might get different results.

Solution 3

It depends on what DoublyLinkedList is - is it a concrete user-defined type, or just an alias name for a linked list type?

In the first case, you should have indexed get/set methods and/or an iterator defined in it, which make the task simple.

In the latter case, why not use the standard java.util.LinkedList?

In terms of the List interface, the operation could be implemented like this:

<T> List<T> merge(List<T> first, List<T> second, List<T> merged) {
  if (first.isEmpty())
    merged.adAll(second);
  else if (second.isEmpty())
    merged.adAll(first);
  else {
    Iterator<T> firstIter = first.iterator();
    Iterator<T> secondIter = second.iterator();
    T firstElem = firstIter.next();
    T secondElem = secondIter.next();

    do {
      if (firstElem < secondElem) {
        merged.add(firstElem);
        firstElem = firstIter.hasNext() ? firstIter.next() : null;
      } else {
        merged.add(secondElem);
        secondElem = secondIter.hasNext() ? secondIter.next() : null;
      }
    } while (firstIter.hasNext() && secondIter.hasNext());
    //copy remaining elements to the tail of merged
    if (firstElem != null)
      merged.add(firstElem);
    if (secondElem != null)
      merged.add(secondElem);
    while (firstIter.hasNext()) {
      merged.add(firstIter.next());
    }
    while (secondIter.hasNext()) {
      merged.add(secondIter.next());
    }
  }
}

This implementation is a bit more tedious than it would be with arrays, mostly since iterators are "consumed" by the next operation, so one must keep account of the current item in each list. With get, the code would be simpler, quite similar to the array solution, however it would be way slower for big lists, as @sepp2k pointed out.

A couple more notes:

  • the Java tradition is to use lowercase variable names, hence localDoublyLinkedList
  • Java has no pointers, only references.

Solution 4

First of all, you must NOT use indexes when dealing with linked lists. Do it like this:

while (i < in.size/2){
  listOne.addLast( in.remove(in.first()) );
  i++
}
while(!in.isEmptly){
  listTwo.addLast( in.remove(in.first()) );
}

And for merging

merge(a, b, out){
  while(!a.empty && !b.empty){
    if(a.first() >= b.first())
      out.addLast( a.remove(a.first()) );
    else
     out.addLast( b.remove(b.first()) );

  //remember to take care of the remaining elements 
  while(!a.empty)
    out.addLast( a.remove(a.first()) );
  while(!b.empty)
    out.addLast( b.remove(b.first()) );
}

This way it will still be O(n log n)

Share:
13,158
user329820
Author by

user329820

Updated on July 19, 2022

Comments

  • user329820
    user329820 almost 2 years

    I have found this code in the internet and it was for arrays ,I want to change it for doubly linked list(instead of index we should use pointer) would you please help me that how can i change merge method(I have changed sort method by myself) also this is not my home work ,I love working with linked list!!

    public class MergeSort {
    
    private DoublyLinkedList LocalDoublyLinkedList;
    
    public MergeSort(DoublyLinkedList list) {
        LocalDoublyLinkedList = list;
    
    }
    
    public void sort() {
    
        if (LocalDoublyLinkedList.size() <= 1) {
            return;
        }
        DoublyLinkedList listOne = new DoublyLinkedList();
        DoublyLinkedList listTwo = new DoublyLinkedList();
        for (int x = 0; x < (LocalDoublyLinkedList.size() / 2); x++) {
            listOne.add(x, LocalDoublyLinkedList.getValue(x));
    }
    for (int x = (LocalDoublyLinkedList.size() / 2) + 1; x < LocalDoublyLinkedList.size`(); x++) {`
        listTwo.add(x, LocalDoublyLinkedList.getValue(x));
    }
    //Split the DoublyLinkedList again
        MergeSort sort1 = new MergeSort(listOne);
        MergeSort sort2 = new MergeSort(listTwo);
        sort1.sort();
        sort2.sort();
    
        merge(listOne, listTwo);
    }
    
    private void merge(DoublyLinkedList a, DoublyLinkedList b) {
        int x = 0;
        int y = 0;
        int z = 0;
        while (x < first.length && y < second.length) {
            if (first[x] < second[y]) {
                a[z] = first[x];
                x++;
            } else {
                a[z] = second[y];
                y++;
            }
            z++;
        }
    //copy remaining elements to the tail of a[];
        for (int i = x; i < first.length; i++) {
            a[z] = first[i];
            z++;
        }
        for (int i = y; i < second.length; i++) {
            a[z] = second[i];
            z++;
        }
    }
    }
    
  • sepp2k
    sepp2k almost 14 years
    Mentioning indexed get/set methods without also mentioning that they're O(n) for linked lists seems a bit dangerous to me. You should definitely not use get and set when writing a sorting algorithm.
  • Eyal Schneider
    Eyal Schneider almost 14 years
    The split operation is expensive indeed, but note that the overall complexity is still optimal. The recurrence relation is T(N) = 2T(N/2)+1.5N, and it can be shown easily that T(N) = O(N log N)