Haskell- Two lists into a list of tuples

10,454

Solution 1

This is a great exercise!

If we lay out the pairs of your input in an infinite table:

(0,A)  (1,A)  (2,A)  (3,A) ...
(0,B)  (1,B)  (2,B)  (3,B) ...
(0,C)  (1,C)  (2,C)  (3,C) ...
(0,D)  (1,D)  (2,D)  (3,D) ...
...

The trick is to traverse the table in upward diagonal stripes. Trace the table with your eyes. The stripes of this table are:

(0,A)
(0,B) (1,A)
(0,C) (1,B) (2,A)
(0,D) (1,C) (2,B) (3,A)
...

All the stripes are finite, yet every element of the table is in one of them, so if you concatenate them together every element will appear at a finite position in the concatenated result.

Here's the gameplan I'd suggest:

Implement stripes :: [[a]] -> [[a]] which extracts the list of stripes from an infinite array like above (start by assuming that all lists are infinite, i.e. don't worry about the [] cases; once you have that working, correct it to work on lists that might be finite).

Using stripes, implement diagonal :: [[a]] -> [a] which concatenates all the stripes (this is a one-liner).

Finally, implement your function by applying diagonal of a particular 2D table [[(a,b)]], which is the table I started this answer with (and can be constructed using a nested list comprehension, among other various ways).

Notes:

  • The name zip is misleading. This is more like a cartesian product.

  • You know you can match patterns inside patterns, right? I.e. if f :: [[a]] -> something

    f ((x:xs):xss) = ...
    

    Gives you x as the first element of the first row, xs is the rest of the first row, and xss is the rest of the table.

Solution 2

Here's the helper function you posted:

oneList :: [a] -> [b] [(a,b)]
oneList [] _ = []
oneList x:xs y:ys = [(x,y)] ++ oneList

And here are the syntax errors it contains:

  • you left out an arrow in the type annotation; it should be

    oneList :: [a] -> [b] -> [(a,b)]
    
  • you need to enclose non-trivial patterns in parens, so the second equation should start

    oneList (x:xs) (y:ys) =
    
  • oneList takes two arguments before giving back a list, but in the rhs of the second equation you try to use it as a list without giving it any arguments

(Btw, it usually helps us if you post the error messages instead of just saying it doesn't compile. Compare the errors I've point out above to the error messages the compiler gave you.)


But as you note, your algorithm is wrong.

I sense this is homework, so I am only going to give you a hint.

zipInf should be

zipInf :: [a] -> [b] -> [(a,b)]
zipInf xs ys = thread (expand xs ys)

thread and expand are two helper functions I am leaving you to write, with type signatures

expand :: [a] -> [b] -> [[(a,b)]]
thread :: [[c]] -> [c]

expand is fairly simple. thread is where you have to be careful to include elements in the right order (hence you can't just say thread zs = concat zs, even though the type is right).

Solution 3

While this is a great exercise for getting to understand lists and Haskell in general, it is also a great exercise for understanding what the Applicative class is all about. In particular, the [] instance for Applicative. Your zipInf that you want is exactly liftA2 (,)

λ: :t liftA2
liftA2 :: Applicative f => (a -> b -> c) -> f a -> f b -> f c
λ: :t (,)
(,) :: a -> b -> (a, b)
λ: :t liftA2 (,)
liftA2 (,) :: Applicative f => f a -> f b -> f (a, b)

We just need to make sure that [] is an Applicative.

λ: :i []
...
instance Applicative [] -- Defined in `Control.Applicative'
...

So it's an Applicative. It might make it easier to understand if we annotate our type a bit

λ: :t liftA2 (,) `asAppliedTo` []
[a] -> [b] -> [(a, b)]

Yep, that's the same type.

λ: liftA2 (,) [0..2] ['A'..'C']
[(0,'A'),(0,'B'),(0,'C'),(1,'A'),(1,'B'),(1,'C'),(2,'A'),(2,'B'),(2,'C')]

Looks like it works! So you didn't have to do anything to write this, and it's arguably more understandable than a recursive definition would be. Plus, you didn't have to worry about edge cases like you would when rolling your own solution.

You can also write it a bit more idiomantically using <$> (or fmap) and <*>.

λ: (,) <$> [0..2] <*> ['A'..'C']
[(0,'A'),(0,'B'),(0,'C'),(1,'A'),(1,'B'),(1,'C'),(2,'A'),(2,'B'),(2,'C')]
λ: take 9 $ (,) <$> "A" <*> [0..]
[('A',0),('A',1),('A',2),('A',3),('A',4),('A',5),('A',6),('A',7),('A',8)]

Or you could leverage the full power of Monad (which is quite unnecessary in this case):

λ: do {n <- [0..2]; c <- ['A'..'C']; return (n, c)}
[(0,'A'),(0,'B'),(0,'C'),(1,'A'),(1,'B'),(1,'C'),(2,'A'),(2,'B'),(2,'C')]

Also, if you're wondering how you could get different semantics from Applicative for [] there is at least one other List instance for Applicative: ZipList

λ: :i ZipList
newtype ZipList a = ZipList {getZipList :: [a]}
    -- Defined in `Control.Applicative'
instance Functor ZipList -- Defined in `Control.Applicative'
instance Applicative ZipList -- Defined in `Control.Applicative'

This instance provides zipping style semantics for its Applicative instance.

λ: getZipList $ (,) <$> ZipList [0..2] <*> ZipList ['A'..'C']
[(0,'A'),(1,'B'),(2,'C')]

Both of these are good introductions to the Applicative typeclass, because they're readily available, fairly intuitive, help prevent you from making bugs, and show that there are cases where a single type has more than one instance of a typeclass.

Solution 4

IMPORTANT: See Will Ness' comment below.

Your question implies that the order doesn't matter. (But since the lists may be infinite, the order may be more important than you think!) Anyway, if the order doesn't matter, and you have encountered list comprehensions, that's one approach you could use. Here's an example.

λ> let xs = "abcdef"
λ> let ys = "ABCDEFGHI"
λ> [(x,y) | x <- xs, y <- ys]
[('a','A'),('a','B'),('a','C'),('a','D'),('a','E'),('a','F'),('a','G'),('a','H'),('a','I'),('b','A'),('b','B'),('b','C'),('b','D'),('b','E'),('b','F'),('b','G'),('b','H'),('b','I'),('c','A'),('c','B'),('c','C'),('c','D'),('c','E'),('c','F'),('c','G'),('c','H'),('c','I'),('d','A'),('d','B'),('d','C'),('d','D'),('d','E'),('d','F'),('d','G'),('d','H'),('d','I'),('e','A'),('e','B'),('e','C'),('e','D'),('e','E'),('e','F'),('e','G'),('e','H'),('e','I'),('f','A'),('f','B'),('f','C'),('f','D'),('f','E'),('f','F'),('f','G'),('f','H'),('f','I')]

Note that all of the tuples involving 'a' are printed first, then those involving 'b', and so on. Why does that matter? Well suppose the list is infinite. A query like this will return instantly:

(1,'a') `elem` [(x,y) | x <- [1..], y <- ['a'..]]

But one like this will take a LOOOOONG time:

(200000,'a') `elem` [(x,y) | x <- [1..], y <- ['a'..]]

If order matters, or you haven't encountered list comprehensions, or don't want to use them, luqui's approach is probably what you're looking for.

Share:
10,454
lopezrican304
Author by

lopezrican304

Updated on July 28, 2022

Comments

  • lopezrican304
    lopezrican304 almost 2 years

    I am trying to implement a function (described below) that takes two lists (each or both may be infinite) and return a list of tuples of all the possible pairs of elements between the lists

    zipInf :: [a] -> [b] -> [(a,b)]
    

    (e.g the output should be like this, but doesn't have to be exactly like this)

    zipInf [0 .. 2] ['A' .. 'C'] ~> [(0,'A'),(1,'A'),(0,'B'),(1,'B'),(0,'C'),(2,'A'),(2,'B'),(1,'C'),(2,'C')]
    
    zipInf [] [0 ..] ~> []
    
    zipInf [0 ..] [] ~> []
    
    take 9 (zipInf ['A'] [0 .. ]) ~> [('A',0),('A',1),('A',2),('A',3),('A',4),('A',5),('A',6),('A',7),('A',8)]
    

    I've started implementing it like this:

    zipInf :: [a] -> [b] -> [(a,b)]
    zipInf [] _ = []
    zipInf _ [] = []
    zipInf
    

    I wanted to feed the list into a helper function to produce the lists but the one I made fails to compile and don't know how to handle infinite lists

    Helper function-

    oneList :: [a] -> [b] [(a,b)]
    oneList [] _ = []
    oneList x:xs y:ys = [(x,y)] ++ oneList
    
  • Will Ness
    Will Ness about 11 years
    and how long will ((2,'a') `elem` ...) take? (assuming the infinite Character type of course (which it seems to be indeed)).
  • mhwombat
    mhwombat about 11 years
    Good point. If either list is infinite, then it takes forever.
  • luqui
    luqui about 11 years
    Only the second list being infinite causes problems.
  • luqui
    luqui almost 11 years
    @WillNess, Char has a lot of values, but it's not infinite. (maxBound :: Char) --> '\1114111'.
  • Will Ness
    Will Ness about 10 years
    asAppliedTo h x = let _=h x in h?
  • Will Ness
    Will Ness about 10 years
    ... but, Prelude Control.Applicative> take 10 $ pure (,) <*> [1..] <*> [100,200..] ==> [(1,100),(1,200),(1,300),(1,400),(1,500),(1,600),(1,700),(1,‌​800),(1,900),(1,1000‌​)]...
  • George Co
    George Co almost 6 years
    The question says the function should work for infinite lists, but some of these don't, e.g. take 5 $ (,) <$> [0..] <*> [0..] returns [(0,0),(0,1),(0,2),(0,3),(0,4)]
  • Will Ness
    Will Ness almost 4 years
    no, this isn't right. see this answer above. also, search for "dovetailing" or "interleaving". and "diagonals" or "diagonalization" like e.g. here. also this answer of mine.