How do I take the last n elements of a list

list haskell

25,088

Solution 1

This should have the property of only iterating the length of the list once. N for drop n and n - 1 for zipLeftover.

zipLeftover :: [a] -> [a] -> [a]
zipLeftover []     []     = []
zipLeftover xs     []     = xs
zipLeftover []     ys     = ys
zipLeftover (x:xs) (y:ys) = zipLeftover xs ys

lastN :: Int -> [a] -> [a]
lastN n xs = zipLeftover (drop n xs) xs

Here is an alternative shorter and perhaps better since as Satvik pointed out it is often better to use recursion operators then explicit recursion.

import Data.Foldable

takeLeftover :: [a] -> t -> [a]
takeLeftover [] _ = []
takeLeftover (x:xss) _ = xss

lastN' :: Int -> [a] -> [a]
lastN' n xs = foldl' takeLeftover xs (drop n xs)

Also note Will Ness's comment below that takeLeftover is just:

takeLeftover == const . drop 1

Which makes things rather tidy:

lastN' :: Int -> [a] -> [a]
lastN' n xs = foldl' (const . drop 1) xs (drop n xs)
-- or
-- lastN' n xs = foldl' (const . drop 1) <*> drop n

Solution 2

From what I can tell you can use something like

lastN :: Int -> [a] -> [a]
lastN n xs = drop (length xs - n) xs

But with any implementation on inbuilt list you can not perform better than O(length of list - n).

It looks like you are trying to use list for something it was not meant to perform efficiently. Use Data.Sequence or some other implementation of list which allows operations to be performed at the end of the list efficiently.

Edit:

Davorak's implementation looks like to be the most efficient implementation you can get from inbuilt list. But remember there are intricacies other than just the running time of a single function like whether it fuse well with other functions etc.

Daniel's solution uses inbuilt functions and has the same complexity as of Davorak's and I think has better chances to fuse with other functions.

Solution 3

Not sure whether it's terribly fast, but it's easy:

lastR n xs = snd $ dropWhile (not . null . fst) $ zip (tails $ drop n xs) (tails xs)

Solution 4

Be aware that whatever you do, you are going to need to iterate through the entire list. That said, you can do a bit better than reverse (take n (reverse xs)) by computing the length of the list first, and dropping the appropriate number of elements:

lastN :: Int -> [a] -> [a]
lastN n xs = let m = length xs in drop (m-n) xs

Solution 5

Here's a simplification of Davorak's first solution:

-- dropLength bs = drop (length bs)
dropLength :: [b] -> [a] -> [a]
dropLength [] as = as
dropLength _ [] = []
dropLength (_ : bs) (_ : as) = dropLength bs as

lastR :: Int -> [a] -> [a]
lastR n as = dropLength (drop n as) as

When n <= length as, length (drop n as) = length as - n, so dropLength (drop n as) as = drop (length (drop n as)) as = drop (length as - n) as, which is the last n elements of as. When n > length as, dropLength (drop n as) as = dropLength [] as = as, which is the only sensible answer.

If you want to use a fold, you can write

dropLength :: [b] -> [a] -> [a]
dropLength = foldr go id
  where
     go _b _r [] = []
     go _b r (_a : as) = r as

That won't make any difference for lastR, but in other applications it could win you some list fusion.

View more solutions

25,088

Joachim Breitner

Updated on July 09, 2022

Comments

Joachim Breitner almost 2 years

To obtain the last n elements of a list xs, I can use reverse (take n (reverse xs)), but that is not very good code (it keeps the complete list in memory before returning anything, and the result is not shared with the original list).

How do I implement this lastR function in Haskell?
- Boris Dalstein almost 11 years
  
  Never tried Haskell but know OCaml and one line has striken me: how the result can be shared with the original list? Ain't the objects immutable in Haskell? (as in most functional language)
- Satvik almost 11 years
  
  @Boris Yes. But when you do something like drop 1 xs dropping the first element from list. Instead of creating a new list of 1 less element, haskell usually optimizes this by just changing the pointer to point to the second element of the original list.
- Boris Dalstein almost 11 years
  
  Thx for the clarification. In fact, I must have been tired ;-) Indeed, it -because- it is immutable that I guess you can do this optimization: sharing the address is still safe and preserves the copy-by-value semantics since you know that the object will not change anyway. Correct me if I'm wrong, not been playing with those things since years.
- fp_mora about 6 years
  
  tailn n xs = drop (length xs - n) xs ---- take away what you don't want from the front of the list, leaving what you do want, the remainder of the list.
- Will Ness almost 5 years
  
  a curiosity: \n xs -> [x | [x] <- transpose [drop n xs, xs]]. with sharing: (\n xs -> foldr (\_ r (_:z) -> r z) id (drop n xs) xs) (same as Davorak's foldl' solution)
Joachim Breitner almost 11 years

That’s the one I was looking for, see joachim-breitner.de/blog/archives/… for a long treatment of the issue. Note that you can reduce the number of cases in zipLeftover, but that is just cosmetics.
Davorak almost 11 years

Thanks for the link I will check it out. I default to being explicit if I hesitate on a choice and do not want to spend much time thinking about it, hence the extra case.
Davorak almost 11 years

"like whether it fuse well with other functions etc." can you comment on what about Daniel's function makes it more likely to fuse? Or rather what hints are that might make it so.
Satvik almost 11 years

@Davorak I am not an expert on fusion, but a function written using inbuilt functions is more likely to fuse. I think Daniel can comment on this with more information.
Davorak almost 11 years

thanks, that was my impression as well specifically using recursion operators rather then explicit recursion, since the fusion rewrite rules are based off of the operators.
Joachim Breitner almost 11 years

I am not sure that the variant is really good; usually foldl has stack overflow issues. Maybe you meant to use foldr?
Joachim Breitner almost 11 years

Yes, the second variant is not good. Compare lastN 1 [1..] (loops indefinitely, but does not allocate noticable memory) with lastN' 1 [1..] (which fills up the heap quite quickly).
Davorak almost 11 years

Yes but that is because I forgot to make it strict. I should have used foldl' rather then the lazy foldl. The memory profile on lastN' and lastN when using foldl' is the same at least up to 100M elements.
Will Ness almost 11 years

you can write it down as lastN' n = foldl' (const . drop 1) <*> drop n.
Joachim Breitner almost 11 years

You can even use tail instead of drop 1, as takeLeftover will never be called with an empty argument.
Joachim Breitner over 7 years

Any code with xs ++ [y] in a loop is certainly not O(n).
leandromoh over 7 years

but didn't I iterate through the list once?
Joachim Breitner over 7 years

But every step is not constant in size.
Joachim Breitner almost 5 years

Note that you lose sharing: You know have two copies of the (spine of) the suffix in memory, one as part of the original list and one in the returned list.
Nolan almost 5 years

@JoachimBreitner redacted to mention this.
dfeuer almost 5 years

This also allocates the whole reversed list when it's likely that only a portion will even be used.
dfeuer almost 5 years

@JoachimBreitner, I think it's even better to write it so it's efficient whichever argument is longer and give it a proper name. See my answer.
dfeuer almost 5 years

This isn't great because it retains memory longer than necessary. In particular, it holds on to the beginning of the list even after counting more than n elements. The two-pointer solutions avoid this.
windmaomao over 3 years

i can't understand why this is not the answer, so easy to understand and implement, and should be part of the prelude as well.
Will Ness over 2 years

what exactly is O(n)? the proper take 2 $ lastN 10000000 [1..10000000] is O(1), yours is indeed O(n). big difference.
Nolan over 2 years

@WillNess There's no way to reach end of singly linked list in O(1). The code you wrote is O(n) too. There might be performance issue in the solution I proposed but it's not worse than the others from the perspective of algorithmic complexity.
dfeuer over 2 years

Fusion won't help you here, because the list is used twice. This implementation is strictly less efficient. How much data is kept alive is really important for garbage collection performance. When the Int argument is small, this keeps much more of the list live during its calculation.