When is it better to use zip instead of izip?


Solution 1

When you know you'll want the full list of items constructed (for instance, for passing to a function that would modify that list in-place). Or when you want to force the arguments you're passing to zip() to be completely evaluated at that specific point.
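Both cases can be sketched as follows. This is only an illustration: the `names` and `scores` lists are made up, and the `try`/`except` import is an assumption added so the snippet also runs on Python 3, where `izip` no longer exists and the built-in `zip` is already lazy. `list(izip(...))` stands in for what plain `zip()` returns on Python 2.

```python
try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip  # Python 3: the built-in zip is already lazy

# Hypothetical sample data
names = ["b", "a", "c"]
scores = [2, 1, 3]

# Build the full list now; on Python 2 this is what plain zip() gives you.
pairs = list(izip(names, scores))
# The lazy version has not read anything from the inputs yet.
lazy_pairs = izip(names, scores)

# Changing an input afterwards does not affect the list that was already built...
scores[0] = 99

# ...but the lazy version only reads the inputs when it is consumed.
lazy_result = list(lazy_pairs)

# A real list can be modified in place, for example sorted.
pairs.sort()

print(pairs)        # sorted list built before the mutation
print(lazy_result)  # reflects the mutated input
```

Here `pairs` still holds the original score for "b", while `lazy_result` sees 99, because the lazy pairing was evaluated only after the input changed.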

Solution 2

zip builds the whole list at once; izip computes each element only when it is requested.

One important difference is that 'zip' returns an actual list, 'izip' returns an 'izip object', which is not a list and does not support list-specific features (such as indexing):

>>> from itertools import izip
>>> l1 = [1, 2, 3, 4, 5, 6]
>>> l2 = [2, 3, 4, 5, 6, 7]
>>> z = zip(l1, l2)
>>> iz = izip(l1, l2)
>>> isinstance(zip(l1, l2), list)
True
>>> isinstance(izip(l1, l2), list)
False
>>> z[::2] #Get odd places
[(1, 2), (3, 4), (5, 6)]
>>> iz[::2] #Same with izip
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'itertools.izip' object is unsubscriptable

So, if you need a list (and not a list-like object), just use 'zip'.

Apart from this, 'izip' can be useful for saving memory or cycles.

E.g. the following code may exit after a few iterations, so there is no need to compute all the items of the combined list:

from itertools import izip

lst_a = ...  # list with a very large number of items
lst_b = ...  # list with a very large number of items
# Each iteration fetches only the next pair
for a, b in izip(lst_a, lst_b):
    if a == b:
        break
print a

Using zip would have computed all the (a, b) pairs before entering the loop.

Moreover, if lst_a and lst_b are very large (e.g. millions of records), zip(lst_a, lst_b) will build a third list holding one tuple per pair, which takes additional memory on top of the two inputs.

For small lists, though, zip may be faster.
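The early-exit point can be made visible with a small sketch. The `tracked` helper and the sample lists are hypothetical, introduced only to count how many items are actually pulled from the first input; the `try`/`except` import is an assumption so the snippet also runs on Python 3, and `list(izip(...))` stands in for Python 2's eager `zip()`.

```python
try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip  # Python 3: the built-in zip is already lazy

consumed = []

def tracked(values):
    """Yield values one by one, recording each item actually produced."""
    for v in values:
        consumed.append(v)
        yield v

# Hypothetical sample data; the first equal pair is at index 1
lst_a = [1, 2, 3, 4, 5, 6]
lst_b = [9, 2, 7, 8, 9, 0]

# Lazy pairing: stop as soon as a match is found.
match = None
for a, b in izip(tracked(lst_a), lst_b):
    if a == b:
        match = a
        break
lazy_count = len(consumed)

# Eager pairing: every pair is built before any of them is inspected.
del consumed[:]
eager = list(izip(tracked(lst_a), lst_b))
eager_count = len(consumed)

print(lazy_count, eager_count)
```

With the lazy version only the items up to the first match are read; the eager version reads the whole input regardless of where the match is.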

Solution 3

The itertools library provides iterator-returning versions of common Python functions. The itertools docs describe izip() as: "Like zip() except that it returns an iterator instead of a list." The "i" in izip() stands for "iterator".

Python iterators are lazily evaluated sequences that save memory compared with a regular in-memory list. So you would use itertools.izip(a, b) when the zipped result (or the inputs a and b, if they are themselves iterators) would be too big to keep in memory at one time.

Look up the Python concepts related to efficient sequential processing:

"generators" & "yield"
"iterators"
"lazy loading"
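As a rough sketch of how these concepts fit together, the snippet below pairs two unbounded streams. The `squares` generator is a made-up example, and the `try`/`except` import is an assumption so the code also runs on Python 3, where the built-in `zip` behaves like `izip`.

```python
from itertools import count, islice

try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip  # Python 3: the built-in zip is already lazy

def squares():
    """Generator: yields 0, 1, 4, 9, ... without ever building a list."""
    n = 0
    while True:
        yield n * n
        n += 1

# Both inputs are unbounded, so an eager, list-building zip could never finish.
# The lazy version produces one pair each time the consumer asks for it.
pairs = izip(count(), squares())

# islice takes just the first four pairs and then stops pulling.
first_four = list(islice(pairs, 4))
print(first_four)
```

Nothing is computed until `islice` asks for a pair, which is the "lazy" part; `yield` is what lets `squares` pause between items.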

Solution 4

In 2.x, when you need a list instead of an iterator.


Author: Neil G

Interested in machine learning and Python.

Updated on September 30, 2020

Comments

  • Neil G
    Neil G over 3 years

    When is it better to use zip instead of itertools.izip?

    • Causality
      Causality almost 8 years
      One reason in favor of zip, too obvious yet still worth pointing out, is that izip returns an iterator which can be traversed only once, i.e. in ii = izip(a,b) ; f(ii) ; g(ii), once f has consumed ii, g receives an exhausted iterator that yields nothing.
    • Charles L.
      Charles L. over 5 years
      FYI, Python 3's zip function is Python 2's izip. In general, Python 3 changed most functions to return iterators, like range, filter, the dict methods, etc.
  • Neil G
    Neil G about 13 years
    Can you give me an example where that might happen?
  • Ignacio Vazquez-Abrams
    Ignacio Vazquez-Abrams about 13 years
    Not really. Which is why I tend to prefer itertools.izip() except where the gains would be purely statistical.
  • Don
    Don about 13 years
    You're right. I started with good intentions and then fell into theoretical stuff...
  • Jan Vlcinsky
    Jan Vlcinsky over 10 years
    One case where you need a list is when you plan to access items of the result by index or need its total length. lst = zip(lst_a, lst_b) allows lst[1] or len(lst). However, with ilst = itertools.izip(lst_a, lst_b), both ilst[1] and len(ilst) will fail.
  • user1815201
    user1815201 over 10 years
    Would it not be better to use izip in the first case, since it's faster (it reuses the tuple) and there's no real reason not to use izip?
  • ShadowRanger
    ShadowRanger over 8 years
    @user1815201: izip only reuses the tuple if the tuple was released before the next iteration begins, so it doesn't gain you anything. That said, any loss is trivial too, so I agree that there is little reason not to use izip exclusively, wrapping with list if you need a list; you can actually do this the "proper" way by adding from future_builtins import zip to Py2 code, which makes plain zip into izip (preparing for Py3 transition).
  • Rahul
    Rahul over 7 years
    Nicely explained.
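The points raised in the comments about single traversal, indexing and len() can be sketched as below. The sample inputs are hypothetical, and the `try`/`except` import is an assumption so the snippet also runs on Python 3, where the built-in `zip` behaves like `izip`.

```python
try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip  # Python 3: the built-in zip is already lazy

# Hypothetical sample inputs
numbers = [1, 2, 3]
letters = "abc"

# An iterator can be traversed only once.
ii = izip(numbers, letters)
first_pass = list(ii)   # consumes the iterator
second_pass = list(ii)  # nothing left to yield

# A list can be traversed repeatedly, indexed, and measured.
as_list = list(izip(numbers, letters))
second_item = as_list[1]
length = len(as_list)

# The iterator supports neither len() nor indexing.
try:
    len(izip(numbers, letters))
    has_len = True
except TypeError:
    has_len = False

print(first_pass, second_pass, second_item, length, has_len)
```

Wrapping the iterator in list() when a list is really needed, as ShadowRanger suggests, gives the same result as Python 2's zip.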