Why is binary search not possible in a sorted linked list?


Solution 1

Binary search on a sorted array gives us the result in O(log N) comparisons and O(1) memory utilization. Linear search on a sorted array gives us the result in O(N) comparisons and O(1) memory utilization.

Beyond the normal memory and comparison measurements, we also have the idea of traversal steps. This is important for data structures with no random access. For example, in a linked list, to get to element j from the head, we would need to take j steps forward. These steps can happen without any comparison. As pointed out in the comments, the cost for making a traversal step may be different from the cost for making a comparison. A traversal step here translates to a memory read.
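As a minimal sketch of what a traversal step means here (the `advance` helper and its name are mine, not part of the answers below), reaching element j from the head costs j pointer dereferences and zero comparisons against the search key:

```cpp
// Minimal node shape for illustration (same shape as the Node used below).
struct Node {
        Node* next;
        int value;
};

// Reach element j from the head: j traversal steps (j memory reads),
// no comparisons against any search key. Returns nullptr if the list
// is shorter than j+1 elements.
Node* advance(Node* head, int j) {
        while (j-- > 0 && head != nullptr)
                head = head->next;
        return head;
}
```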

The question is what happens when our data structure is a sorted singly linked list? Is it worth doing binary search?

To address this, we need to look at the performance of binary search on a sorted singly linked list. The code looks like this:

struct Node {
        Node* next;
        int value;
};

// Helper: number of nodes in the list.
int count(Node* n) {
        int c = 0;
        for (; n != nullptr; n = n->next) ++c;
        return c;
}

Node* binarySearch(Node* n, int v) {
        if (n == nullptr || v <= n->value) return n;

        // right must start non-null so the single-element case is safe.
        Node *right = n, *left = n;
        int size = count(n);

        while (size > 1)
        {
                int newSize = (size / 2);

                // Walk newSize steps forward from left to the midpoint.
                right = left;
                for (int i = 0; (i < newSize) && (right->next != nullptr); i++)
                        right = right->next;

                if (v == right->value) return right;
                else if (v > right->value) left = right;

                size -= newSize;
        }

        if (v < right->value) return right;
        else if (right->next) return right->next;
        else return nullptr;
}

The function binarySearch returns the node with element equal to or just greater than v. The parameter n is the head node in a sorted singly linked list.

It is clear that the outer loop iterates O(log N) times where N = size of the list. For each iteration, we make 2 comparisons, so the total # of comparisons is O(log N).

The number of traversal steps is the number of times right = right->next; gets executed, which is O(N). This is because the # of iterations in the inner loop decreases by half at each iteration of the outer loop, so N/2 + N/4 + ... + 1 = N (plus or minus some wiggle room).
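The sum can be checked directly with a small sketch that mirrors the outer loop's size bookkeeping (the `traversalSteps` helper is hypothetical, not part of the code above):

```cpp
// Total inner-loop trip count for a list of the given size,
// mirroring the outer loop of binarySearch: each pass walks
// newSize = size/2 nodes and then shrinks size by newSize.
int traversalSteps(int size) {
        int steps = 0;
        while (size > 1) {
                int newSize = size / 2;
                steps += newSize;   // inner loop takes newSize steps
                size -= newSize;
        }
        return steps;
}
```

For any N >= 1 this evaluates to N - 1, which matches N/2 + N/4 + ... + 1 up to the wiggle room mentioned above.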

Memory usage is still O(1).

In contrast, linear search through a sorted singly linked list is O(n) traversal steps, O(n) comparisons, and O(1) memory.
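For reference, the linear-search counterpart can be sketched as follows (`linearSearch` is my naming; like the binary version above, it returns the node with value equal to or just greater than v):

```cpp
// Minimal node shape for illustration (same shape as the Node above).
struct Node {
        Node* next;
        int value;
};

// Linear search in a sorted singly linked list: returns the first node
// whose value is >= v, or nullptr if every value is smaller.
// Worst case: O(N) traversal steps and O(N) comparisons.
Node* linearSearch(Node* n, int v) {
        while (n != nullptr && n->value < v)
                n = n->next;
        return n;
}
```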

So is it worth doing binary search on a singly linked list? The answer is usually yes, but not always.

Disregarding the cost of count, what happens if the element we are looking for is the 2nd element in the list? Linear search takes 1 step and 1 comparison. Binary search takes ~N steps and ~log N comparisons. Reality isn't so clear-cut.

So here's the summary:

Sorted Array

 Binary: O(log N) comparisons, O(1) memory, O(log N) traversal steps
 Linear: O(N) comparisons, O(1) memory, O(N) traversal steps

Although, technically, the number of required traversal steps is 0 for sorted arrays: indexing is random access, so we never have to step forward or backward, and the idea doesn't really apply.

Sorted Singly Linked List

 Binary: O(log N) comparisons, O(1) memory, O(N) traversal steps
 Linear: O(N) comparisons, O(1) memory, O(N) traversal steps

And these are worst-case run times. However, the glass may not always be half empty :p

Solution 2

A linked list only allows sequential access, so binary search is impossible even if the list is sorted.

Edit: As others have pointed out, binary search is possible, but would be pointless.

We can emulate random access in a linked list, but that is slow: each access has a time complexity of O(n) on average, so a binary search (which is normally O(log n)) will take O(n log n).
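A sketch of that slow emulation (the names `nodeAt` and `indexedBinarySearch` are illustrative, not from any standard API): every probe restarts from the head, so the O(log n) probes each pay O(n) traversal.

```cpp
// Minimal node shape for illustration.
struct Node {
        Node* next;
        int value;
};

// Emulated random access: reaching index i costs i steps from the head.
Node* nodeAt(Node* head, int i) {
        while (i-- > 0) head = head->next;
        return head;
}

// Textbook binary search on indices, with nodeAt standing in for arr[mid].
// O(log n) comparisons, but O(n log n) traversal overall.
Node* indexedBinarySearch(Node* head, int n, int v) {
        int lo = 0, hi = n - 1;
        while (lo <= hi) {
                int mid = lo + (hi - lo) / 2;
                Node* m = nodeAt(head, mid);   // O(mid) traversal per probe
                if (m->value == v) return m;
                if (m->value < v) lo = mid + 1;
                else hi = mid - 1;
        }
        return nullptr;
}
```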

Edit 2: as @ethang pointed out, if it's a doubly linked list, the traversal cost of binary search drops to O(n). Each step can start from the previous probe position instead of from the head/tail, so the distance moved halves each time.
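A sketch of that idea, assuming a doubly linked list of known length (the `DNode` type and `dllBinarySearch` name are illustrative):

```cpp
// Doubly linked node for illustration.
struct DNode {
        DNode* prev;
        DNode* next;
        int value;
};

// Binary search over a doubly linked list of known length n.
// Instead of restarting from the head, each probe walks from the
// previous probe position; the distances halve (n/2, n/4, ...),
// so total traversal is O(n) while comparisons stay O(log n).
DNode* dllBinarySearch(DNode* head, int n, int v) {
        DNode* cur = head;
        int curIdx = 0;
        int lo = 0, hi = n - 1;
        while (lo <= hi) {
                int mid = lo + (hi - lo) / 2;
                while (curIdx < mid) { cur = cur->next; ++curIdx; }
                while (curIdx > mid) { cur = cur->prev; --curIdx; }
                if (cur->value == v) return cur;
                if (cur->value < v) lo = mid + 1;
                else hi = mid - 1;
        }
        return nullptr;
}
```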

If you must use a linked list, you'd better use a linear search, whose complexity is only O(n) and which is simpler than a binary search.

If you want to both search and insert/remove efficiently, you may use other data structures such as a binary search tree.

Solution 3

ethang shows how to perform binary search in a singly linked list with just O(1) extra space, O(n) traversal time, and O(log n) comparisons. I did not previously believe that this was possible. For fun, I figured I'd implement a similar algorithm in Haskell:

bs :: Ord k => Int -> k -> [(k,v)] -> Maybe v
bs 0 _ _ = Nothing
bs 1 needle ((k,v) : _)
  | k == needle = Just v
  | otherwise = Nothing
bs size needle left = case drop size' left of
    right@((k,_):_)
      | needle >= k -> bs (size - size') needle right
    _ -> bs size' needle left
  where size' = size `quot` 2

search :: Ord k => k -> [(k,v)] -> Maybe v
search k kvs = bs (length kvs) k kvs

This can be adjusted to use O(log i) comparisons and O(i) traversal time, where i is the distance from the beginning of the list to the location where the sought key is or would be. This implementation could be improved, but the gist is quite simple—replace the search above with this version:

import Control.Applicative ((<|>))

search :: Ord k => k -> [(k,v)] -> Maybe v
-- The 10 can be replaced by any positive integer
search = go 10
  where
    -- The 2 can be replaced by any integer > 1
    go lim needle kvs@((k,_):_) | k <= needle =
      bs lim needle kvs <|> go (lim*2) needle (drop lim kvs)
    go _ _ _ = Nothing
Author: Faysal Hasan, updated on June 05, 2022

Comments

  • Faysal Hasan
    Faysal Hasan almost 2 years

Is it possible to search an element with binary search in a sorted linked list? If it is not possible, then the question is: why is it not possible?

  • Admin
    Admin about 9 years
    Not impossible (you can simply emulate random access with sequential access), just slow.
  • thang
    thang about 9 years
Actually it wouldn't be O(n log n). It would be O(n) traversal steps and O(log n) comparisons. As a matter of fact, it has an upper bound of n (+/- 1 or 2). This is assuming a doubly linked list.
  • thang
    thang about 9 years
    Note that by comparison, linear search would be O(n) traversal steps and O(n) comparisons. However, it is simpler, and the average constant on the O(n) traversal steps would probably be smaller (although someone should work out the math here).
  • thang
    thang about 9 years
Even with a singly linked list, it would also be O(n) traversal steps. Getting O(n) there requires some bookkeeping, but it should be possible.
  • dfeuer
    dfeuer about 9 years
    This C (C++?) code always looks like line noise, but very nice.
  • dfeuer
    dfeuer about 9 years
    Unless you're in a RAM model, those array accesses will not be quite that cheap.
  • thang
    thang about 9 years
    @dfeuer, well it actually depends on your specific use case. You can imagine a scenario where the entire list fits into cache. In this case traversal is fast. On the other hand, comparisons are bad because branching can cause misprediction (and flushing of pipeline and instruction cache) and other penalties. So in this scenario, you would definitely want to do binary.
  • Bergi
    Bergi about 9 years
+1 for Haskell just as for your old answer :-) I do wonder though whether it wouldn't be more efficient if we used O(log n) extra space to store the heads of the lefts we'd come by already, so we would need to do only n traversal steps instead of 2n. What do you think?
  • dfeuer
    dfeuer about 9 years
    @Bergi, I thought about your comment all through my shower, but I couldn't figure out which heads you were talking about (aside from the fact that they must surely have been heads of rights rather than lefts).
  • Bergi
    Bergi about 9 years
    Yeah, I mean the "middles" which you named right in the code. In case we go left (multiple times), drop is called multiple times on the same list, doing unnecessary traversals over the same nodes again. If we could build a list from the heads of these "middle nodes" in the leftmost branch while traversing until right, we could save some other node traversals. I guess if we limit the length of that list we could still take constant space but avoid lots of drop traversals.
  • dfeuer
    dfeuer about 9 years
    @Bergi, if we save all of them, that's O(n) extra. I guess you could find a way to choose just log n of them, but I'm not sure how to decide which.
  • thang
    thang about 9 years
Note that as it is, in the worst-case scenario you'd make n traversal steps, not 2n. There is really no way to make this any better, even if you use up O(n) in space. Suppose the element you want to find is the last element. There is no way to get to it without touching every other element.
  • dfeuer
    dfeuer about 9 years
    @ethang, the remaining interesting challenge is to change it from O(n)+O(log n) to O(i)+O(log i), making sure that elements near the front are fast to find by giving up the strictly binary search. Of course, this makes more distant elements more expensive to find.
  • thang
    thang about 9 years
@dfeuer, I have a feeling this isn't possible. With some thought, it may be possible to come up with a rigorous proof. Loosely speaking, if your element is at 1, then there is no way to quickly reach it. You can reduce this by increasing the # of compares. There may be a middle ground where you increase the # of comparisons and decrease the traversal steps. For example, instead of bisecting, you can step in 1/3 intervals. Alternatively, you can selectively sample k elements as you walk to 1/2. This would increase the constant of O(log n) and decrease O(n) for small i.
  • dfeuer
    dfeuer about 9 years
    @ethang, consider the difference between a standard search tree and a search tree under the left spine view. The idea here would be to artificially limit the search zone, first to, say, 10 elements, then the next 20, then the next 40, etc. The constant factors all get a bit worse.
  • Bergi
    Bergi about 9 years
    @dfeuer: Yeah, I'd only save O(log n) of them at most. I guess you could either use an arbitrary exponential scale (the first, second, fourth, eighth etc) or compute them beforehand by repeatedly dividing size… oh, but actually we want to avoid using length.
  • Bergi
    Bergi about 9 years
    @dfeuer: Thinking about it, I guess my proposed solution is equivalent to simply building the binary search tree structure from the sorted list (O(n)) and then searching in it. Since we're consuming it lazily, it would only take O(log n) space overhead.
  • Mark Ransom
    Mark Ransom about 9 years
    Arrays have traversal steps too, it's just that the mechanism is simpler - adding or subtracting from a pointer or index, rather than dereferencing a pointer. Order analysis doesn't get into the cost of individual operations, just their count.
  • thang
    thang about 9 years
@MarkRansom, array traversal doesn't require touching any data. I suppose you're right that there is something there, which is why I included those numbers. However, +1, +n can all happen in less than a clock cycle on most modern processors. That is not the case with list traversal. Both are O(1) per step, but the actual constant is vastly larger in one case versus the other.