Count the duplicates in a list of tuples

Solution 1

Use the collections library. In the following code, val_1 and val_2 count how often each value occurs as the first and second element of the tuples, respectively.

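import collections

a = [(1,2),(1,4),(1,2),(6,7),(2,9)]  # the list from the question
val_1 = collections.Counter(x for (x, y) in a)  # counts of first elements
val_2 = collections.Counter(y for (x, y) in a)  # counts of second elements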

>>> print(val_1)
Counter({1: 3, 2: 1, 6: 1})

This is the number of occurrences of each value as the first element of a tuple.

>>> print(val_2)
Counter({2: 2, 9: 1, 4: 1, 7: 1})

This is the number of occurrences of each value as the second element of a tuple.
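If only the values that actually repeat are of interest, the counters can be filtered down to entries with a count above 1. This is a minimal sketch, assuming the list a from the question; the helper name duplicates_only is hypothetical and not part of collections.

```python
from collections import Counter

a = [(1, 2), (1, 4), (1, 2), (6, 7), (2, 9)]

def duplicates_only(counter):
    """Return a plain dict holding only the keys counted more than once."""
    return {key: n for key, n in counter.items() if n > 1}

# Count each position separately, then drop the values seen only once.
first_dupes = duplicates_only(Counter(x for x, y in a))
second_dupes = duplicates_only(Counter(y for x, y in a))

print(first_dupes)   # {1: 3}
print(second_dupes)  # {2: 2}
```

Here 1 is the only repeated first element and 2 is the only repeated second element.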

Solution 2

You can use a Counter:

from collections import Counter
a = [(1,2),(1,4),(1,2),(6,7),(2,9)]
counter=Counter(a)
print(counter)

This will output:

Counter({(1, 2): 2, (6, 7): 1, (2, 9): 1, (1, 4): 1})

It is a dictionary-like object with each item (a tuple in this case) as the key and the number of times that key was seen as the value. Your (1,2) tuple is seen twice, while all the others are seen only once.

>>> counter[(1,2)]
2

If you are interested in each individual portion of the tuple, you can utilize the same logic for each element in the tuple.

first_element = Counter([x for (x,y) in a])
second_element = Counter([y for (x,y) in a])

first_element and second_element are now Counters holding how many times each value is seen at the first and second position of the tuples, respectively.

>>> first_element
Counter({1: 3, 2: 1, 6: 1})
>>> second_element
Counter({2: 2, 9: 1, 4: 1, 7: 1})

Again, these are dictionary-like objects, so you can look up directly how often a specific value appears:

>>> first_element[2]
1

Among the first elements of your tuples, the value 2 appears once.
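The same Counter can rebuild the list that the question's nested loop produces (each tuple paired with the count of its first element) without comparing every tuple against every other one. This is a sketch under that reading of the question; coll_list is the name used in the question, while first_counts is a name introduced here for illustration.

```python
from collections import Counter

a = [(1, 2), (1, 4), (1, 2), (6, 7), (2, 9)]

# One pass to count how often each first element occurs...
first_counts = Counter(x for x, _ in a)

# ...and one more pass to pair every tuple with the count of its first element.
coll_list = [(t, first_counts[t[0]]) for t in a]

print(coll_list)
# [((1, 2), 3), ((1, 4), 3), ((1, 2), 3), ((6, 7), 1), ((2, 9), 1)]
```

This makes two linear passes over the list instead of one pass per tuple.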

Solution 3

You can build a dictionary, count_map, and store the count of each tuple as its value.

>>> count_map = {}
>>> for t in a:
...     count_map[t] = count_map.get(t, 0) + 1
... 
>>> count_map
{(1, 2): 2, (6, 7): 1, (2, 9): 1, (1, 4): 1}
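As the comments suggest, collections.defaultdict(int) removes the need for the get call, because a missing key starts at 0. A minimal sketch of that variant, assuming the same list a as in the question:

```python
from collections import defaultdict

a = [(1, 2), (1, 4), (1, 2), (6, 7), (2, 9)]

count_map = defaultdict(int)  # a missing key is created with the value 0
for t in a:
    count_map[t] += 1

# Convert to a plain dict for display.
print(dict(count_map))
```

Looking up count_map[(1, 2)] afterwards gives 2. Note that reading a missing key from a defaultdict inserts it, so use `in` to test for membership.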

Solution 4

A dictionary may work better. Your code traverses the list once for every tuple in it (a nested loop), which makes its complexity O(n^2). That is not a good thing :)

A better approach is to traverse the list once and use one or two conditions per iteration. Here is my first solution for this kind of problem.

a = [(1,2),(1,4),(1,2),(6,7),(2,9)]

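counts = {}  # maps each first element to its number of occurrences
for (i, j) in a:
    if i in counts:  # seen before: increment its count
        counts[i] += 1
    else:            # first occurrence
        counts[i] = 1

print(counts)  # key order may differ between Python versions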

For this code, this will give the output:

{1: 3, 2: 1, 6: 1}

I hope it will be helpful.
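The single traversal can cover both positions of the tuple at once. This is a sketch, not the answer's original code; the names first_counts and second_counts are introduced here for illustration.

```python
a = [(1, 2), (1, 4), (1, 2), (6, 7), (2, 9)]

first_counts = {}   # value -> occurrences as the first element
second_counts = {}  # value -> occurrences as the second element

# A single O(n) pass updates both dictionaries.
for i, j in a:
    first_counts[i] = first_counts.get(i, 0) + 1
    second_counts[j] = second_counts.get(j, 0) + 1

print(first_counts)
print(second_counts)
```

Each dictionary update is a constant-time operation on average, so the whole loop stays linear in the length of the list.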

Solution 5

Using pandas this is simple and very fast:

import pandas
print(pandas.Series(data=[(1,2),(1,4),(1,2),(6,7),(2,9)]).value_counts())

(1, 2)    2
(1, 4)    1
(6, 7)    1
(2, 9)    1
dtype: int64

Author: DimSarak

Updated on June 09, 2022

Comments

  • DimSarak
    DimSarak over 1 year

    I have a list of tuples: a = [(1,2),(1,4),(1,2),(6,7),(2,9)]. I want to check whether an individual element of each tuple matches the element at the same position in another tuple, and how many times this occurs.

    For example: if only the 1st element of some tuples has a duplicate, return the tuple and how many times it is duplicated. I can do that with the following code:

    a = [(1,2), (1,4), (1,2), (6,7), (2,9)]
    
    coll_list = []
    for t in a:
        coll_cnt = 0
        for b in a:
            if b[0] == t[0]:
                coll_cnt = coll_cnt + 1
        print("%s,%d" % (t, coll_cnt))
        coll_list.append((t,coll_cnt))
    
    print(coll_list)
    

    Is there a more efficient way to do this?

  • jonrsharpe
    jonrsharpe over 8 years
    You can also use collections.defaultdict(int) to avoid the awkward get.
  • doru
    doru over 8 years
    The OP didn't want the number of occurrences of each tuple.
  • Sudipta
    Sudipta over 8 years
    @doru The OP's code says otherwise. He/she has calculated the counts of all tuples one by one, making it an N^2 operation.
  • doru
    doru over 8 years
    @Sudipta Have you run the OP's code to see what the desired results are?
  • Rick
    Rick over 8 years
    You can use collections.defaultdict(int) to clean up the inside of the loop a bit
  • Sudipta
    Sudipta over 8 years
    @doru Yes, I have done that. The OP's code checks whether only the 1st element of the tuples in the list has a duplicate. But now the OP needs code to find whether either element of each tuple has a duplicate, and how many times.
  • cengineer
    cengineer over 8 years
    Yeah, that would be cleaner. But if I use collections, why write a loop at all, right? :) I'd prefer a solution like Andy's ;)