Why is there no tuple comprehension in Python?

Solution 1

You can use a generator expression:

tuple(i for i in (1, 2, 3))

but parentheses were already taken for … generator expressions.
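As a small illustration of the point above (the variable names here are only for demonstration), a parenthesized comprehension-style expression produces a generator object, and tuple() turns it into a tuple:

```python
import types

# Parentheses around a comprehension-style expression yield a generator, not a tuple.
gen = (i for i in (1, 2, 3))
print(isinstance(gen, types.GeneratorType))  # True

# Passing a generator expression to tuple() consumes it and builds the tuple.
squares = tuple(i * i for i in (1, 2, 3))
print(squares)  # (1, 4, 9)
```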

Solution 2

Raymond Hettinger (one of the Python core developers) had this to say about tuples in a recent tweet:

#python tip: Generally, lists are for looping; tuples for structs. Lists are homogeneous; tuples heterogeneous. Lists for variable length.

This (to me) supports the idea that if the items in a sequence are related enough to be generated by a, well, generator, then it should be a list. Although a tuple is iterable and seems like simply an immutable list, it's really the Python equivalent of a C struct:

struct foo {
    int a;
    char b;
    float c;
};

struct foo x = { 3, 'g', 5.9 };

becomes in Python

x = (3, 'g', 5.9)
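If the struct analogy is taken one step further, the standard library's typing.NamedTuple gives the fields names while the value remains a tuple. This is only a sketch; the class name Foo and the field names are borrowed from the C example and are otherwise illustrative:

```python
from typing import NamedTuple


class Foo(NamedTuple):
    # Field names mirror the C struct above; the class itself is illustrative.
    a: int
    b: str
    c: float


x = Foo(3, 'g', 5.9)
print(x.b)                  # fields can be read by name
print(x == (3, 'g', 5.9))   # it still compares equal to the plain tuple
```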

Solution 3

Since Python 3.5, you can also use splat (*) unpacking syntax to unpack a generator expression:

*(x for x in range(10)),
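A short sketch of how this looks in an assignment (the variable names are illustrative). The trailing comma is what makes the expression a tuple; without it the starred expression is a syntax error:

```python
# The trailing comma makes this a tuple; the starred generator is unpacked into it.
digits = *(x for x in range(10)),
print(digits)

# The same unpacking inside explicit parentheses, mixed with other elements.
padded = (-1, *(x for x in range(3)), 99)
print(padded)
```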

Solution 4

As another poster, macm, mentioned, the fastest way to create a tuple from a generator is to write it as a list comprehension instead and pass the resulting list to tuple().


Performance Comparison

  • List comprehension:

    $ python3 -m timeit "a = [i for i in range(1000)]"
    10000 loops, best of 3: 27.4 usec per loop
    
  • Tuple from list comprehension:

    $ python3 -m timeit "a = tuple([i for i in range(1000)])"
    10000 loops, best of 3: 30.2 usec per loop
    
  • Tuple from generator:

    $ python3 -m timeit "a = tuple(i for i in range(1000))"
    10000 loops, best of 3: 50.4 usec per loop
    
  • Tuple from unpacking:

    $ python3 -m timeit "a = *(i for i in range(1000)),"
    10000 loops, best of 3: 52.7 usec per loop
    

My version of python:

$ python3 --version
Python 3.6.3

So, going by these measurements, you should create a tuple from a list comprehension whenever speed matters. If performance is not an issue, any of the forms above will do.
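To repeat the comparison on your own interpreter, something like the following sketch with the standard timeit module could be used. The statements are the ones timed above; the dictionary labels and the repeat counts are arbitrary choices, and the numbers will differ between machines and Python versions:

```python
import timeit

# Statements taken from the measurements above; the labels are illustrative.
candidates = {
    "tuple from list comprehension": "tuple([i for i in range(1000)])",
    "tuple from generator": "tuple(i for i in range(1000))",
    "tuple from unpacking": "(*(i for i in range(1000)),)",
}

loops = 200  # executions per timing run (arbitrary, kept small so the script is quick)
results = {}
for label, stmt in candidates.items():
    # repeat() returns the total time of each run; take the fastest run.
    best = min(timeit.repeat(stmt, number=loops, repeat=5))
    results[label] = best / loops * 1e6  # microseconds per loop
    print(f"{label}: {results[label]:.1f} usec per loop")
```

Running it several times gives a feel for how much the results fluctuate before drawing conclusions from small differences.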

Solution 5

Comprehension works by looping or iterating over items and assigning them into a container, and a tuple is unable to receive assignments.

Once a tuple is created, it cannot be appended to, extended, or assigned to. The only way to modify a tuple's contents is if one of its objects can itself be assigned to (is a non-tuple container), because the tuple only holds a reference to that object.

Also, a tuple has its own constructor, tuple(), which accepts any iterable. This means that to create a tuple, you could do:

tuple(i for i in (1,2,3))
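The immutability described above can be seen in a small sketch (the name point is just an example): assigning to a tuple slot fails, while a mutable object stored inside the tuple can still be changed through its reference.

```python
point = (1, [2, 3])

# Item assignment on a tuple raises TypeError.
try:
    point[0] = 10
except TypeError as exc:
    print("cannot assign:", exc)

# The list inside the tuple is still mutable; the tuple only holds a reference to it.
point[1].append(4)
print(point)  # (1, [2, 3, 4])
```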
Author by

Shady Xu

Life is rough, you gotta be tough. Life is short, you gotta use python.

Updated on November 21, 2021

Comments

  • Shady Xu
    Shady Xu over 2 years

    As we all know, there's list comprehension, like

    [i for i in [1, 2, 3, 4]]
    

    and there is dictionary comprehension, like

    {i:j for i, j in {1: 'a', 2: 'b'}.items()}
    

    but

    (i for i in (1, 2, 3))
    

    will end up in a generator, not a tuple comprehension. Why is that?

    My guess is that a tuple is immutable, but this does not seem to be the answer.

  • mgilson
    mgilson almost 11 years
    In some ways I agree (about it not being necessary because a list will do), but in other ways I disagree (about the reasoning being because it's immutable). In some ways, it makes more sense to have a comprehension for immutable objects. Who does lst = [x for x in ...]; lst.append()?
  • Inbar Rose
    Inbar Rose almost 11 years
    @mgilson I am not sure how that relates to what I said?
  • mgilson
    mgilson almost 11 years
    By this argument, we could say a list-comprehension is unnecessary too: list(i for i in (1,2,3)). I really think it's simply because there isn't a clean syntax for it (or at least nobody has thought of one)
  • Martijn Pieters
    Martijn Pieters almost 11 years
    A list or set or dict comprehension is just syntactic sugar to use a generator expression that outputs a specific type. list(i for i in (1, 2, 3)) is a generator expression that outputs a list, set(i for i in (1, 2, 3)) outputs a set. Does that mean the comprehension syntax is not needed? Perhaps not, but it is awfully handy. For the rare cases you need a tuple instead, the generator expression will do, is clear, and doesn't require the invention of another brace or bracket.
  • Charles Salvia
    Charles Salvia almost 11 years
    The answer is obviously because tuple syntax and parentheses are ambiguous
  • JKillian
    JKillian over 9 years
    @MartijnPieters Wanted to point out that list(generator expression) isn't exactly the same as [generator expression]: thecodingforums.com/threads/…
  • Martijn Pieters
    Martijn Pieters over 9 years
    @JKillian: the difference exists but is too subtle for the vast majority to have to care about. Playing with an iterator like that in the expression without handling the StopIteration exception is going to be rare enough. :-)
  • JKillian
    JKillian over 9 years
    @MartijnPieters Good point, although I stumbled across it the other day while trying to write my own zip function as an exercise to learn Python. Needless to say, it confused the heck out of me why tuple(gen exp) failed but tuple([gen exp]) worked perfectly. Anyways, good to have it noted here that StopIteration will be swallowed by generator expressions/comprehensions but will propagate out of list/set/dictionary comprehensions.
  • Scott
    Scott about 9 years
    @mgilson if a tuple is immutable that means the underlying implementation cannot "generate" a tuple ("generation" implying building one piece at a time). immutable means you can't build the one with 4 pieces by altering the one with 3 pieces. instead, you implement tuple "generation" by building a list, something designed for generation, then build the tuple as a last step, and discard the list. The language reflects this reality. Think of tuples as C structs.
  • pavon
    pavon over 8 years
    The immutability property can be important though and often a good reason to use a tuple when you would normally use a list. For example, if you have a list of 5 numbers that you want to use as a key to a dict, then tuple is the way to go.
  • dave
    dave almost 8 years
    That's a nice tip from Raymond Hettinger. I would still say there is a use case for using the tuple constructor with a generator, such as unpacking another structure, perhaps larger, into a smaller one by iterating over the attrs that you are interested in converting to a tuple record.
  • chepner
    chepner almost 8 years
    @dave You can probably just use operator.itemgetter in that case.
  • dave
    dave almost 8 years
    @chepner, I see. That is pretty close to what I mean. It does return a callable so if I only need to do it once I don't see much of a win vs just using tuple(obj[item] for item in items) directly. In my case I was embedding this into a list comprehension to make a list of tuple records. If I need to do this repeatedly throughout the code then itemgetter looks great. Perhaps itemgetter would be more idiomatic either way?
  • Justin Turner Arthur
    Justin Turner Arthur over 7 years
    The difference between using a comprehension and using a constructor+generator is more than subtle if you care about performance. Comprehensions result in faster construction compared to using a generator passed to a constructor. In the latter case you are creating and executing functions and functions are expensive in Python. [thing for thing in things] constructs a list much faster than list(thing for thing in things). A tuple comprehension would not be useless; tuple(thing for thing in things) has latency issues and tuple([thing for thing in things]) could have memory issues.
  • uchuugaka
    uchuugaka over 6 years
    although it would be reasonable for the syntactic sugar of comprehensions to work for tuples, since you cannot use the tuple until the comprehension is returned. Effectively it does not act like mutable, rather a tuple comprehension could behave much like string appending.
  • uchuugaka
    uchuugaka over 6 years
    Angle brackets unused.
  • mgilson
    mgilson over 6 years
    @uchuugaka -- Not completely. They're used for comparison operators. It could probably still be done without ambiguity, but maybe not worth the effort ...
  • uchuugaka
    uchuugaka over 6 years
    No, that is very different. The language grammar is very specific about those as tokens. They have a very clear semantic and lexical scope that would be unambiguous (or nearly enough to make it work with minor changes) if also applied to new tokens like exist for the other brackets. Parens are already used to delineate scope of lots of different things, so it should be very very doable. Were it only that somebody decided to. Honestly, sets also need some literal love and comprehensions should get sigils.
  • felixphew
    felixphew over 6 years
    This is great (and it works), but I can't find anywhere it's documented! Do you have a link?
  • polyglot
    polyglot over 6 years
    I see the relationship between frozenset and set analogous to that of tuple and list. It's less about heterogeneity and more about immutability - frozensets and tuples can be keys to dictionaries, lists and sets cannot due to their mutability.
  • Admin
    Admin about 6 years
    @uchuugaka Worth noting that {*()}, though ugly, works as an empty set literal!
  • uchuugaka
    uchuugaka about 6 years
    Absolutely HIDEOUS. I like it, but only because it is sick obscure stuff. Nobody should ever use that. Awesome.
  • uchuugaka
    uchuugaka about 6 years
    What version of python is that supposed to work in ?
  • Tom
    Tom almost 6 years
    There's also a common case where you use a generator to produce a struct-like thing: where you're processing text records such as CSV. This is often written line_values = tuple(int(x.strip()) for x in line.split(',')). As others have noted, using the tuple constructor here instead of a comprehension has performance implications, and parsing large datasets of this type is a case where you really care about performance.
  • jpp
    jpp over 5 years
    @MartijnPieters, Can you potentially reword A list or set or dict comprehension is just syntactic sugar to use a generator expression? It's causing confusion by people seeing these as equivalent means to an end. It's not technically syntactic sugar as the processes are actually different, even if the end product is the same.
  • Martijn Pieters
    Martijn Pieters over 5 years
    @jpp: that's in a comment, not in my answer. Comments are generally not editable. Technically I can edit mine still, but only because I am a moderator. And I stand by my comment, as the syntax is very close. Decorators are also syntactic sugar, and their implementation differs in important ways from the syntax they replaced, so this is not an isolated example. I am not convinced that one example of confusion equals general confusion.
  • jpp
    jpp over 5 years
  • Quantum Mechanic
    Quantum Mechanic over 5 years
    @M.I.Wright, I'd call that the Cyclops (sideways). Does it have a name?
  • Admin
    Admin over 5 years
    @QuantumMechanic Nope, no common name -- likely because it's not often used (and shouldn't be at all used!). From a purely-aesthetic standpoint, though, I admit I'm somewhat partial now to {*''}
  • mgilson
    mgilson over 5 years
    Ugh. From an aesthetic standpoint, I think I'm partial to set() :)
  • ShadowRanger
    ShadowRanger over 5 years
    Note: tuple of listcomp requires a peak memory usage based on the combined size of the final tuple and list. tuple of a genexpr, while slower, does mean you only pay for the final tuple, no temporary list (the genexpr itself occupying roughly fixed memory). Usually not meaningful, but it can be important when the sizes involved are huge.
  • ShadowRanger
    ShadowRanger over 5 years
    @QuantumMechanic: I came up with {*()} almost immediately after PEP 448 came out, and I've been calling it the one-eyed monkey operator. I doubt it's the only name people have come up with.
  • ShadowRanger
    ShadowRanger over 5 years
    Note: As an implementation detail, this is basically the same as doing tuple(list(x for x in range(10))) (the code paths are identical, with both of them building a list, with the only difference being that the final step is to create a tuple from the list and throw away the list when a tuple output is required). Means that you don't actually avoid a pair of temporaries.
  • Quantum Mechanic
    Quantum Mechanic over 5 years
    @ShadowRanger: It turns out that these all evaluate to the empty set: {*''}, {*""}, {*()}, {*[]}, {*{}}. So inadvertently, TIMTOWTDI. I guess I like {*[]} for its appearance as a posh Letterbox in Kent.
  • ShadowRanger
    ShadowRanger over 5 years
    @QuantumMechanic: Yeah, that's the point; the unpacking generalizations made the empty "set literal" possible. Note that {*[]} is strictly inferior to the other options; the empty string and empty tuple, being immutable, are singletons, so no temporary is needed to construct the empty set. By contrast, the empty list is not a singleton, so you actually have to construct it, use it to build the set, then destroy it, losing whatever trivial performance advantage the one-eyed monkey operator provides.
  • Lucubrator
    Lucubrator over 5 years
    To expand on the comment of @ShadowRanger, here's a question where they show that the splat+tuple literal syntax is actually quite a bit slower than passing a generator expression to the tuple constructor.
  • Ryan H.
    Ryan H. over 4 years
    I'm trying this in Python 3.7.3 and *(x for x in range(10)) doesn't work. I get SyntaxError: can't use starred expression here. However tuple(x for x in range(10)) works.
  • czheo
    czheo over 4 years
    @RyanH. you need to put a comma at the end.
  • mins
    mins about 3 years
    "Also a tuple comprehension is never necessary, a list can just be used instead with negligible speed differences" Calling C++ libraries with a list instead of a tuple may return an error. However it's not that difficult to convert the list into a tuple by tuple(list)
  • jamylak
    jamylak about 3 years
    @mins That appears to be the best option you can choose from here stackoverflow.com/a/48592299/1219006 based on timing
  • jamylak
    jamylak about 3 years
    Very informative. Tuple from a generator would not be the best choice in this case. I think tuple([i for i in range(1000)]) is the best in terms of readability and speed. Though ofc, not sure of the timings on smaller / bigger / different datasets
  • Utsav Patel
    Utsav Patel about 3 years
    When I tried tuple from list comprehension vs. tuple from generator with bigger data (roughly range(1_000_000)), tuple from generator took less time. The difference is not very significant, but with bigger data you end up saving both memory and time.
  • mahee96
    mahee96 over 2 years
    Just to add to comments, this is just a tuple expansion, since the tuple contains a generator expression, the <genexpr> gets expanded/unpacked while displaying. since tuple can be written without parentheses but just a comma, it works
  • mahee96
    mahee96 over 2 years
    ex: *range(10), NOTE: The comma at the end to indicate it is a tuple
  • mahee96
    mahee96 over 2 years
    NOTE: tuple(x for x in range(10)) works coz the inner expression is a generator expression and it first returns an iterator, which is a single argument for the tuple, so the tuple is happy and uses the iterator of the <genexpr> to iterate over the items and create the final tuple. But for tuple(*range(10)) the range <genexpr> is unpacked and tuple constructor receives 10 items which it doesn't like and says: tuple expected at most 1 arguments, got 10. so the correct syntax for constructor is tuple(<genexpr>). Hence using tuple(range(10)) works.
  • markusk
    markusk over 2 years
    one = (two,) and one = tuple(two) do not evaluate to the same value. The argument to tuple must be an iterable. one = (two,) is equivalent to one = tuple(i for i in (two,)), one = tuple((two,)), and one = tuple([two]).
  • Ryan
    Ryan about 2 years
    What you do when (1,2,3) isn't easy enough.
  • pabouk - Ukraine stay strong
    pabouk - Ukraine stay strong almost 2 years
    I did not see the speed advantage. Did you try to repeat the timing multiple times? The results can vary considerably on repeated runs.
  • B.R.
    B.R. almost 2 years
    I just checked on Python 3.5 and I could reproduce it. But this might be different for other Python versions. It seems to be somehow plausible because deque does not need the index-related overhead a list has.
  • pabouk - Ukraine stay strong
    pabouk - Ukraine stay strong almost 2 years
    Just for information: in Python 3.10.4 I am getting values around 6 for the list variant, 8 for the deque variant. The variation between individual runs is bigger than the 0.3 seconds difference between your results. I am running it inside WSL2 and the virtualization can possibly cause the large variation.
  • B.R.
    B.R. almost 2 years
    I rechecked now with python 3.9.2 on same computer like before and I got: First case: 5.848282200000003 and second case: 6.6902867000000015 I guess the implementation related to list is more improved compared with deque This means my statement is only valid for older python versions