Ways to get the middle of a list in Haskell?

11,209

Solution 1

Two versions

  1. Using pattern matching, tail and init:

    middle :: [a] -> [a]
    middle l@(_:_:_:_) = middle $ tail $ init l
    middle l           = l
    
  2. Using length, take, signum, mod, drop and div:

    middle :: [a] -> [a]
    middle xs = take (signum ((l + 1) `mod` 2) + 1) $ drop ((l - 1) `div ` 2) xs
      where l = length xs
    

The second one is basically a one-liner (but uses where for readability).

Solution 2

I don't make any performance claims, though it only processes the elements of the list once (my assumption is that computing length t is an O(N) operation, so I avoid it), but here's my solution:

mid [] = []                      -- Base case: the list is empty ==> no midpt
mid t = m t t                    -- The 1st t is the slow ptr, the 2nd is fast
  where m (x:_) [_] = [x]        -- Base case: list tracked by the fast ptr has
                                    -- exactly one item left ==> the first item
                                    -- pointed to by the slow ptr is the midpt.
        m (x:y:_) [_,_] = [x,y]  -- Base case: list tracked by the fast ptr has
                                    -- exactly two items left ==> the first two
                                    -- items pointed to by the slow ptr are the 
                                    -- midpts
        m (_:t) (_:_:u) = m t u  -- Recursive step: advance slow ptr by 1, and
                                    -- advance fast ptr by 2.

The idea is to have two "pointers" into the list, one that increments one step at each point in the recursion, and one that increments by two.

(which is essentially what Carl Smotricz suggested)

Solution 3

Just for your amusement, a solution from someone who doesn't speak Haskell:

Write a recursive function that takes two arguments, a1 and a2, and pass your list in as both of them. At each recursion, drop 2 from a2 and 1 from a1. If you're out of elements for a2, you'll be at the middle of a1. You can handle the case of just 1 element remaining in a2 to answer whether you need 1 or 2 elements for your "middle".

Solution 4

I've tried to write a function that returns the middle of a list (2 items for even lengths, 1 for odd) and I'd like to hear how others would implement it in

The right datastructure for the right problem. In this case, you've specified something that only makes sense on a finite list, right? There is no 'middle' to an infinite list. So just reading the description, we know that the default Haskell list may not be the best solution: we may be paying the price for the laziness even when we don't need it. Notice how many of the solutions have difficulty avoiding 2*O(n) or O(n). Singly-linked lazy lists just don't match a quasi-array-problem too well.

Fortunately, we do have a finite list in Haskell: it's called Data.Sequence.

Let's tackle it the most obvious way: 'index (length / 2)'.

Data.Seq.length is O(1) according to the docs. Data.Seq.index is O(log(min(i,n-i))) (where I think i=index, and n=length). Let's just call it O(log n). Pretty good!

And note that even if we don't start out with a Seq and have to convert a [a] into a Seq, we may still win. Data.Seq.fromList is O(n). So if our rival was a O(n)+O(n) solution like xs !! (length xs), a solution like

middle x = let x' = Seq.fromList x in Seq.index(Seq.length x' `div` 2)

will be better since it would be O(1) + O(log n) + O(n), which simplifies to O(log n) + O(n), obviously better than O(n)+O(n).

(I leave as an exercise to the reader modifying middle to return 2 items if length be even and 1 if length be odd. And no doubt one could do better with an array with constant-time length and indexing operations, but an array isn't a list, I feel.)

Solution 5

Haskell solution inspired by Carl's answer.

middle = m =<< drop 1
   where m []  = take 1
         m [_] = take 2
         m (_:_:ys) = m ys . drop 1
Share:
11,209

Related videos on Youtube

Matt Ellen
Author by

Matt Ellen

I write software for kidneys.

Updated on April 15, 2022

Comments

  • Matt Ellen
    Matt Ellen about 2 years

    I've just started learning about Functional Programming, using Haskel.

    I'm slowly getting through Erik Meijer's lectures on Channel 9 (I've watched the first 4 so far) and in the 4th video Erik explains how tail works, and it fascinated me.

    I've tried to write a function that returns the middle of a list (2 items for even lengths, 1 for odd) and I'd like to hear how others would implement it in

    • The least amount of Haskell code
    • The fastest Haskell code

    If you could explain your choices I'd be very grateful.

    My beginners code looks like this:

    middle as | length as > 2   = middle (drop 2 (reverse as))
              | otherwise       = as
    
    • CS_12319
      CS_12319 over 14 years
      Did you test your function thoroughly? middle "abcd" = "ba" doesn't seem quite right :).
    • Matt Ellen
      Matt Ellen over 14 years
      I hadn't tested on small lists! back to the drawing board....
  • Matt Ellen
    Matt Ellen over 14 years
    I like it :D I might use that in normal programming!
  • Long
    Long over 14 years
    Of course Suppressingfire's solution is superior in runtime as well as in code size, so forget this one :D
  • Carl Smotricz
    Carl Smotricz over 14 years
    We seem to agree that Haskell programming is not normal programming :) But glad you liked it.
  • yfeldblum
    yfeldblum over 14 years
    Take a look at Data.List.catMaybes to get rid of the map . filter.
  • jberryman
    jberryman over 14 years
    I would move the pattern match on [] to the mid function so that we don't have to test it more than once: mid [] = []. Looks good though!
  • codebliss
    codebliss over 14 years
    Meh, it's naive, doesn't work if the list's head and tail share an equal element n-k and k, for n is the number of elements in the list. Damn. Aw well it was fun!
  • codebliss
    codebliss over 14 years
    Justice: I was trying to even do this without nub (just thought about it and it solves a small problem but a larger more hidden one remained a commented about) because it isn't in prelude. But thank you, I did not know that!
  • Matt Ellen
    Matt Ellen over 14 years
    I really like the first one. So short and simple. As an aside; how do back ticks work in Haskell?
  • Matt Ellen
    Matt Ellen over 14 years
    Functional programming is causing my mind to try and bend in ways that it couldn't before. Thanks for your answer, it's really making me think!
  • Stephan202
    Stephan202 over 14 years
    @Matt: thanks :). The backticks convert a binary function to an infix operator. So, fun :: a -> b -> c can be called as fun a b but also as a `fun` b.
  • Apocalisp
    Apocalisp over 14 years
    It's nicer to say: middle . tail . init $ l
  • Stephan202
    Stephan202 over 14 years
    Did you test this code? length xs == odd compares an Int to a function. length xs $ `div` 2 doesn't even parse.
  • Carl Smotricz
    Carl Smotricz over 14 years
    Don't understand a word of it but it looks nice and concise and appears to reflect the spirit of my algorithm. +1!
  • Jonno_FTW
    Jonno_FTW over 14 years
    I didn't please forgive me. I'll fix it up and it is way too late in the morning to be thinking straight.
  • Jonno_FTW
    Jonno_FTW over 14 years
    Fixed. I do like simplicity and readability when it comes about.
  • Svante
    Svante over 14 years
    You actually cannot avoid it.
  • Suppressingfire
    Suppressingfire over 14 years
    @jberryman, good point, since that pattern won't be hit in recursion, only in the initial call to mid. I updated the code to reflect your suggestion
  • Jonno_FTW
    Jonno_FTW over 14 years
    The obvious improvement I can see is that should be able to deal with a list of any length
  • Rayne
    Rayne over 14 years
    It can as of 15 hours ago. I just added the error because anything south of 3 elements is kind of pointless to deal with. How do you deal with a list of 2 elements? I could return 1 and 2 I guess, but I'm not sure why I would.
  • Rayne
    Rayne over 14 years
    Got rid of the error line and added some matches for length 1 and two lists.
  • Matt Ellen
    Matt Ellen over 14 years
    Thanks gwern. That's a useful explanation. I'll look into Data.Squence.
  • carnieri
    carnieri over 14 years
    Great explanation! The maxim "right datastructure for the right problem" says it all. Just a side note: O(log n) + O(n) is not obviously better than O(n) + O(n). Both simplify to O(n), so you would have to know the constant factors to really tell which is better.
  • gwern
    gwern over 14 years
    Carnieri: I did think about that for a while, but one O(n) cancels out another, leaving log(n) vs n; offhand, I can't think of any non-negative numbers where log2(n) >= n or log10(n) >= n. (Natural logs would be the counterexample, though.) Given the general fuzziness of Big-O, I think it's safe to assume that the former is better.
  • Ben
    Ben almost 11 years
    Nope, you can't cancel O(n) terms that way. Otherwise you could reason that O(n) + O(n) is still O(n), and then cancel, and conclude that the side with no log term is better! Both algorithms are simply O(n); you need analysis of exactly the things that are ignored in asymptotic analysis to know which is better, so you can't do it from the big-oh terms.
  • Doomjunky
    Doomjunky almost 10 years
    The first one has a time complexity of O(n^3), which is bad for this kind of operation. You could do it in O(1), see gwerns answer.
  • dfeuer
    dfeuer almost 10 years
    m =<< drop 1 is the same as drop 1 >>= m. Since drop 1 is a function, the funny-looking instance of Monad for (r->) applies, so this equals \r -> m ((drop 1) r) r, or \r -> m (drop 1 r) r. We can write the definition more transparently as middle r = m (drop 1 r) r. What does m do? Well, m [] r = take 1 r, so if r has one element, m (drop 1 r) r = r. m [_] r = take 2 r, so if r has two elements, m (drop 1 r) r = r. (cont)
  • dfeuer
    dfeuer almost 10 years
    Otherwise, m (drop 1 r) r = m (drop 2 (drop 1 r)) (drop 1 r) = m (drop 3 r) (drop 1 r). Now we see this is the tortoise-hare approach: the first argument to m is the hare and the second is the tortoise.
  • dfeuer
    dfeuer almost 10 years
    If the list (and whatever crud comes with it) does not fit in the CPU cache, but around half of it does, then the tortoise-hare approach may be faster—conses pulled in by the hare will still be in cache when the tortoise reaches them. That approach also has a potential memory advantage, especially if the list is generated lazily: the tortoise can eat the list as it goes.
  • dfeuer
    dfeuer almost 10 years
    May I suggest replacing the last line with m (_ : _ : ys) = m ys . drop 1 for the sake of symmetry?
  • Isti115
    Isti115 almost 2 years
    May I ask what role signum plays in the second solution? As far as I can see, the result of the modulus will be either 0 or 1 anyway. Or am I overlooking something?