Transpose/Unzip Function (inverse of zip)?


Solution 1

zip is its own inverse, provided you use the special * operator to unpack the arguments.

>>> zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4)])
[('a', 'b', 'c', 'd'), (1, 2, 3, 4)]

The way this works is by calling zip with the arguments:

zip(('a', 1), ('b', 2), ('c', 3), ('d', 4))

… except the arguments are passed to zip directly (after being converted to a tuple), so there's no need to worry about the number of arguments getting too big.
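As the comments point out, on Python 3 zip returns a lazy iterator instead of a list, so the session above needs a list() call to show the same output. A minimal sketch of the round trip under that assumption (the names pairs, transposed, letters and numbers are only illustrative):

```python
# Python 3: zip() returns an iterator, so wrap it in list() to see the result.
pairs = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]

# The * operator unpacks the list so each pair becomes a separate argument.
transposed = list(zip(*pairs))
print(transposed)  # [('a', 'b', 'c', 'd'), (1, 2, 3, 4)]

# Unpacking straight into names also works when the number of columns is known.
letters, numbers = zip(*pairs)

# Zipping the columns back together restores the original pairs.
assert list(zip(letters, numbers)) == pairs
```

Note that each column comes back as a tuple, not a list.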

Solution 2

You could also do

result = ([ a for a,b in original ], [ b for a,b in original ])

It should scale better. Especially if Python makes good on not expanding the list comprehensions unless needed.

(Incidentally, it makes a 2-tuple (pair) of lists, rather than the list of tuples that zip returns.)

If generators are acceptable instead of actual lists, this gives a pair of generators:

result = (( a for a,b in original ), ( b for a,b in original ))

The generators don't munch through the list until you ask for each element, but on the other hand, they do keep references to the original list.
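A small sketch of that laziness, assuming Python 3 and that original is a list (something that can be traversed twice, since each generator walks it independently). The variable names are illustrative. The last part shows the error you get from the a, b unpacking when a row has the wrong shape, which a comment below mentions as an advantage of this approach:

```python
original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]

# Two independent generators; neither has consumed any element yet.
firsts = (a for a, b in original)
seconds = (b for a, b in original)

first_item = next(firsts)    # reads only the first pair
all_seconds = list(seconds)  # walks the whole list
rest = list(firsts)          # continues where next() stopped

print(first_item)   # a
print(all_seconds)  # [1, 2, 3, 4]
print(rest)         # ['b', 'c', 'd']

# A misshapen row fails loudly at the tuple unpacking.
bad = [('a', 1), ('e',)]
try:
    [a for a, b in bad]
    unpack_failed = False
except ValueError:
    unpack_failed = True
print(unpack_failed)  # True
```

If original were a one-shot iterator rather than a list, the first generator to run would exhaust it and the second would see nothing.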

Solution 3

I like to use zip(*iterable) (which is the piece of code you're looking for) in my programs like so:

def unzip(iterable):
    return zip(*iterable)

I find unzip more readable.
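One caveat raised in the comments is that unzipping an empty input returns nothing at all rather than a set of empty columns. A possible variation for Python 3 is sketched below. The width parameter is a hypothetical addition, not part of the answer above, and the function returns a list instead of an iterator:

```python
def unzip(iterable, width=None):
    """Transpose an iterable of equal-length tuples into a list of tuples.

    `width` is a hypothetical extra parameter: the number of empty columns
    to return when `iterable` is empty, because zip() of nothing is empty.
    """
    columns = list(zip(*iterable))
    if not columns and width is not None:
        return [() for _ in range(width)]
    return columns


print(unzip([('a', 1), ('b', 2)]))  # [('a', 'b'), (1, 2)]
print(unzip([]))                    # []
print(unzip([], width=2))           # [(), ()]
```

Passing width only matters when the caller unpacks the result into a fixed number of names and the input may be empty.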

Solution 4

If you have lists that are not the same length, you may not want to use zip as per Patrick's answer. This works:

>>> zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4)])
[('a', 'b', 'c', 'd'), (1, 2, 3, 4)]

But with lists of different lengths, zip truncates the result to the length of the shortest list:

>>> zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', )])
[('a', 'b', 'c', 'd', 'e')]

In Python 2, you can use map with None as the function to fill missing values with None:

>>> map(None, *[('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', )])
[('a', 'b', 'c', 'd', 'e'), (1, 2, 3, 4, None)]

zip() is marginally faster though.
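The map(None, ...) form only exists in Python 2. On Python 3 the comments suggest itertools.zip_longest (izip_longest in Python 2), which pads the same way. A minimal sketch, with rows and padded as illustrative names:

```python
from itertools import zip_longest

rows = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e',)]

# zip() stops at the shortest row, so the numbers column is dropped entirely.
truncated = list(zip(*rows))
print(truncated)  # [('a', 'b', 'c', 'd', 'e')]

# zip_longest() pads missing positions with fillvalue (None by default).
padded = list(zip_longest(*rows))
print(padded)  # [('a', 'b', 'c', 'd', 'e'), (1, 2, 3, 4, None)]
```

A different placeholder can be chosen with the fillvalue keyword argument, for example zip_longest(*rows, fillvalue=0).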

Solution 5

>>> original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
>>> tuple([list(tup) for tup in zip(*original)])
(['a', 'b', 'c', 'd'], [1, 2, 3, 4])

Gives a tuple of lists as in the question.

list1, list2 = [list(tup) for tup in zip(*original)]

Unpacks the two lists.

Author by

the berserker

Updated on July 11, 2022

Comments

  • the berserker
    the berserker almost 2 years

    I am trying to use MySql and Entity Framework, using Connector/Net 6.1 with this as a reference:

    http://dev.mysql.com/doc/refman/5.4/en/connector-net-tutorials-entity-framework-winform-data-source.html

    However, my project is a WebApplication instead of WinForms. I have successfully created entities, but I am not able to create a Data Source for a WebApplication (or MVC), since the create command under the menus and the Data Sources window is missing (the window says: There are no data sources to show for the selected project). However, I can do it for a WinForms/Console application.

    I can't figure out why web projects don't allow me to create the Data Sources. What am I missing?

  • glglgl
    glglgl almost 13 years
    "Especially if Python makes good on not expanding the list comprehensions unless needed." mmm... normally, list comprehensions are expanded immediately - or do I get something wrong?
  • Anders Eurenius
    Anders Eurenius over 11 years
    @glglgl: No, you're probably right. I was just hoping some future version might start doing the right thing. (It's not impossible to change, the side-effect semantics that need changes are probably already discouraged.)
  • glglgl
    glglgl over 11 years
    What you hope to get is a generator expression - which exists already.
  • Anders Eurenius
    Anders Eurenius over 11 years
    No, what I hope to get is the perennial favourite "a sufficiently smarter compiler" (or interpreter in this case). I don't think there's anything sensible that would be broken by analysing the bejeebus out of the code and doing something wildly different. (like making a lazy collection) Python has never promised this feature, and will most likely never have it, but I can see that dream in the design.
  • Marcin
    Marcin over 10 years
    You could also use izip_longest
  • habnabit
    habnabit over 10 years
    This does not 'scale better' than the zip(*x) version. zip(*x) only requires one pass through the loop, and does not use up stack elements.
  • user2357112
    user2357112 about 10 years
    Oh, if only it were so simple. Unzipping zip([], []) this way does not get you [], []. It gets you []. If only...
  • Chris Hagmann
    Chris Hagmann about 10 years
    @user2357112 zip(*zip([list1], [list2])) gives you ([list1, list2]).
  • user2357112
    user2357112 about 10 years
    @cdhagmann: zip([list1], [list2]) is never what you want, though. That just gives you [(list1, list2)].
  • Chris Hagmann
    Chris Hagmann about 10 years
    @user2357112 I was using [list1] to mean any list named list1 and not as a list with a list with only one list as an entry. So given list1 = [1,2,3,4] and list2 = [1,2,3,4] then zip(*zip(list1, list2)) gives you ([1,2,3,4],[1,2,3,4])
  • JuanPi
    JuanPi almost 10 years
    @cdhagmann you get [(1, 2, 3, 4), (1, 2, 3, 4)] from your commands.
  • Tommy
    Tommy almost 10 years
    This does not work in Python3. See: stackoverflow.com/questions/24590614/…
  • denfromufa
    denfromufa over 9 years
    zip does not preserve elements in longer iterables, hence padding is required
  • trss
    trss over 9 years
    tuple(map(list, zip(*original))) to get precisely the mentioned result.
  • MJeffryes
    MJeffryes about 9 years
    @Tommy This is incorrect. zip works exactly the same in Python 3 except that it returns an iterator instead of a list. In order to get the same output as above you just need to wrap the zip call in a list: list(zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4)])) will output [('a', 'b', 'c', 'd'), (1, 2, 3, 4)]
  • Ekevoo
    Ekevoo over 8 years
    Whether it "scales better" or not depends on the lifecycle of the original data compared to the transposed data. This answer is only better than using zip if the use-case is that the transposed data is used and discarded immediately, while the original lists stay in memory for much longer.
  • zezollo
    zezollo about 8 years
    Known as zip_longest for python3 users.
  • Laurent LAPORTE
    Laurent LAPORTE over 7 years
    Note: you can run into memory and performance issues with very long lists.
  • cactus1
    cactus1 almost 7 years
    @GrijeshChauhan I know this is really old, but it's a weird built in feature: docs.python.org/2/library/functions.html#map "If function is None, the identity function is assumed; if there are multiple arguments, map() returns a list consisting of tuples containing the corresponding items from all iterables (a kind of transpose operation). The iterable arguments may be a sequence or any iterable object; the result is always a list."
  • Neil G
    Neil G about 6 years
    This answer provides much better error reporting when the data is misshapen.
  • Charlie Clark
    Charlie Clark over 5 years
    Continually recreating tuples doesn't seem that efficient to me but you could extend this approach using deques which could preallocate memory.
  • John P
    John P over 5 years
    It's probably a discussion for another thread, but what should you use if not lists? Should I be more concerned with quantifying "very long", and choosing or changing structures if they seem close, or preemptively using different structures for data that has the potential to scale?
  • ShadowRanger
    ShadowRanger over 4 years
    @JohnP: lists are fine. But if you try to realize the full result all at once (by listifying the result of zip), you might use a lot of memory (because all the tuples must be created at once). If you can just iterate over the result of zip without listifying, you'll save a lot of memory. The only other concern is if the input has many elements; the cost there is that it must unpack them all as arguments, and zip will need to create and store iterators for all of them. This is only a real problem with very long lists (think hundreds of thousands of elements or more).
  • YoungCoder5
    YoungCoder5 over 3 years
    This works in Python 3.9. Pastebin example here. I have to applaud Patrick's cleverness.
  • rusheb
    rusheb over 2 years
    I think this is the most accurate answer because, as the question asks, it actually returns a pair of lists (rather than a list of tuples).
  • mkearney
    mkearney about 2 years
    this should be the top answer. it's frustrating to see the other ones that are currently considered 'top'