Remove duplicate tuples from a list if they are exactly the same including order of items

python list duplicates tuples itertools

46,172

Solution 1

set will take care of that:

>>> a = [(1,2,2), (2,2,1), (1,2,2), (4,3,5), (3,3,5), (3,3,5), (3,4,5)]
>>> set(a)
set([(1, 2, 2), (2, 2, 1), (3, 4, 5), (3, 3, 5), (4, 3, 5)])
>>> list(set(a))
[(1, 2, 2), (2, 2, 1), (3, 4, 5), (3, 3, 5), (4, 3, 5)]
>>>

set will remove only exact duplicates.

Solution 2

What you need is unique permutations rather than combinations:

y = list(set(itertools.permutations(x,3)))

That is, (1,2,2) and (2,1,2) will be considered as same combination and only one of them will be returned. They are, however, different permutations. Use set() to remove duplicates.

If afterwards you want to sort elements within each tuple and also have the whole list sorted, you can do:

y = [tuple(sorted(q)) for q in y]
y.sort()

Solution 3

You were really close. Just get permutations, not combinations. Order matters in permutations, and it does not in combinations. Thus (1, 2, 2) is a distinct permutation from (2, 2, 1). However (1, 2, 2) is considered a singular combination of one 1 and two 2s. Therefore (2, 2, 1) is not considered a distinct combination from (1, 2, 2).

You can convert your list y to a set so that you remove duplicates...

import itertools

x = [1,1,1,2,2,2,3,3,3,4,4,5]
y = []

for x in itertools.permutations(x,3):
    y.append(x)
print(set(y))

And voila, you are done. :)

Solution 4

This will probably do what you want, but it's vast overkill. It's a low-level prototype for a generator that may be added to itertools some day. It's low level to ease re-implementing it in C. Where N is the length of the iterable input, it requires worst-case space O(N) and does at most N*(N-1)//2 element comparisons, regardless of how many anagrams are generated. Both of those are optimal ;-)

You'd use it like so:

>>> x = [1,1,1,2,2,2,3,3,3,4,4,5]
>>> for t in anagrams(x, 3):
...     print(t)
(1, 1, 1)
(1, 1, 2)
(1, 1, 3)
(1, 1, 4)
(1, 1, 5)
(1, 2, 1)
...

There will be no duplicates in the output. Note: this is Python 3 code. It needs a few changes to run under Python 2.

import operator

class ENode:
    def __init__(self, initial_index=None):
        self.indices = [initial_index]
        self.current = 0
        self.prev = self.next = self

    def index(self):
        "Return current index."
        return self.indices[self.current]

    def unlink(self):
        "Remove self from list."
        self.prev.next = self.next
        self.next.prev = self.prev

    def insert_after(self, x):
        "Insert node x after self."
        x.prev = self
        x.next = self.next
        self.next.prev = x
        self.next = x

    def advance(self):
        """Advance the current index.

        If we're already at the end, remove self from list.

        .restore() undoes everything .advance() did."""

        assert self.current < len(self.indices)
        self.current += 1
        if self.current == len(self.indices):
            self.unlink()

    def restore(self):
        "Undo what .advance() did."
        assert self.current <= len(self.indices)
        if self.current == len(self.indices):
            self.prev.insert_after(self)
        self.current -= 1

def build_equivalence_classes(items, equal):
    ehead = ENode()
    for i, elt in enumerate(items):
        e = ehead.next
        while e is not ehead:
            if equal(elt, items[e.indices[0]]):
                # Add (index of) elt to this equivalence class.
                e.indices.append(i)
                break
            e = e.next
        else:
            # elt not equal to anything seen so far:  append
            # new equivalence class.
            e = ENode(i)
            ehead.prev.insert_after(e)
    return ehead

def anagrams(iterable, count=None, equal=operator.__eq__):
    def perm(i):
        if i:
            e = ehead.next
            assert e is not ehead
            while e is not ehead:
                result[count - i] = e.index()
                e.advance()
                yield from perm(i-1)
                e.restore()
                e = e.next
        else:
            yield tuple(items[j] for j in result)

    items = tuple(iterable)
    if count is None:
        count = len(items)
    if count > len(items):
        return

    ehead = build_equivalence_classes(items, equal)
    result = [None] * count
    yield from perm(count)

Solution 5

No need to do for loop, combinations gives a generator.

x = [1,1,1,2,2,2,3,3,3,4,4,5]
y = list(set(itertools.combinations(x,3)))

View more solutions

46,172

5813

Updated on July 09, 2022

Comments

5813 almost 2 years
I know questions similar to this have been asked many, many times on Stack Overflow, but I need to remove duplicate tuples from a list, but not just if their elements match up, their elements have to be in the same order. In other words, (4,3,5) and (3,4,5) would both be present in the output, while if there were both(3,3,5) and (3,3,5), only one would be in the output.

Specifically, my code is:
```
import itertools

x = [1,1,1,2,2,2,3,3,3,4,4,5]
y = []

for x in itertools.combinations(x,3):
    y.append(x)
print(y)
```
of which the output is quite lengthy. For example, in the output, there should be both (1,2,1) and (1,1,2). But there should only be one (1,2,2).
- sashkello over 10 years
  
  I think you need permutations rather than combinations?
- Tim Peters over 10 years
  
  You cannot get (2, 2, 1) from combinations(), given that input. You'll need to use permutations() instead.
- 5813 over 10 years
  
  Sorry for the typos, my mistake. But I tried using permutations() and did not get the desired result. I don't want (2,2,1) anywhere in the output. I've revised the order of items in my list so as to give the tuples I want in the order I want, but the question stands.
- sashkello over 10 years
  
  @5813 combinations definitely will not give you the desired result because it won't include (1,2,1) and (1,1,2). Ordering things is just a matter of sorting which can be done afterwards - please see my answer.
AChampion over 10 years

You could also just make y a set and not need to convert it, i.e. y = set(); ... ; y.add(x)
TankorSmash over 10 years

Could someone explain the downvote? I'd happily improve my answer.
Ladmerc about 7 years

Bear in mind that set could change the order of the list itself
pylang almost 6 years

In Python 3.6+, list(dict.from_key(a)) to maintain order.
Raj Mehta over 5 years

@pylang: I am getting an error message if I am using your solution. "type object 'dict' has no attribute 'from_key'". I need to sort the list of tuples and also maintain the order. I am using Python 3.6.5.
pylang over 5 years

dict.fromkeys(). Pardon. Specifically, list(dict.fromkeys(a))
Dipen Gajjar over 4 years

But it will not keep the order, for the simplest and exact solution check out this answer from @roadrunner stackoverflow.com/questions/48300501/…