Efficiently determine if two of three items in a list are the same

10,162

Solution 1

You can see how many unique values are there with a set. If there is one less item in the set than in the list, one was a duplicate:

def has1dup(lst):
    return len(lst)-1 == len(set(lst))

Solution 2

>>> from collection import Counter
>>> 2 in Counter(["one", "one", "two"]).values()
True
>>> 2 in Counter(["one", "two", "three"]).values()
False

update
If you want there are only two same items

Counter(seq).values().count(2) == 1

The Counter works for Python 2.7+, for lower versions you could do it manually

def counter(seq): 
    r = {}
    for x in seq:
        r[x] = r.setdefault(x, 0) + 1 # or defaultdict
    return r

Solution 3

You can do this very easily and nicely using the any() builtin:

def has_duplicates(seq):
    return any(seq.count(x) > 1 for x in seq)

Example:

>>> has_duplicates([1, 1, 2])
True
>>> has_duplicates([1, 2, 2])
True
>>> has_duplicates([1, 2, 3])
False

If you only want to find where two and only two items are the same, just change the condition:

any(seq.count(x) == 2 for x in seq)

If you want to find where there is one, and only one instance of two, and only two items, we can do that too, although it requires more work:

def any_n(iterable, n):
    seen = 0
    for value in iterable:
        if value:
            if seen >= n:
                return False
            else:
                seen += 1
    return seen == n

def has_one_value_repeated_n_times(seq, n):
    return any_n((seq.count(x) == n for x in seq), n)

Some quick tests:

tests = [
    [1,2,2,3,3,3,4,4,4,4,5,5,5,5,5],
    [1,2,2,3,3,4,4,4,4,5,5,5,5,5],
    [1,2,2],
    [1,1,2],
    [1,2,3],
]

for test in tests:
    print(test, "-", has_one_value_repeated_n_times(test, 2))

Giving us:

[1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5] - True
[1, 2, 2, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5] - False
[1, 2, 2] - True
[1, 1, 2] - True
[1, 2, 3] - False

Solution 4

2 in collections.Counter(yourList).values()

Short and efficient.

If you mean "exactly" as in "among the multiplicities of elements, there is exactly one element with multiplicity 2", then you do:

Counter(Counter(yourList).values()).get(2)==1
Share:
10,162

Related videos on Youtube

JSW189
Author by

JSW189

Updated on May 25, 2022

Comments

  • JSW189
    JSW189 almost 2 years

    What is the most efficient way to determine if exactly two elements in a list are the same? For example:

    >>> has1dup(["one", "one", "two"])
    True
    >>> has1dup(["one", "two", "three"])
    False
    >>> has1dup(["one", "one", "one"])
    False
    

    I have successfully done this using if/else statements. However, if the list were larger, the task of writing out each possibility for a pair would become very difficult and time consuming. Is there a faster/simpler way to accomplish this?

    Here is what I have tried:

    def has1dup(lst):
        if lst[0] == lst[1] and lst[1] != lst[2]:
            return True
        elif lst[1] == lst[2] and lst[2] != lst[0]:
            return True
        elif lst[0] == lst[2] and lst[2] != lst[1]:
            return True
        else:
            return False
    
  • Michael Mior
    Michael Mior about 12 years
    This checks that the list only contains two distinct elements, which isn't the same thing.
  • Michael Mior
    Michael Mior about 12 years
    There's also the concern of dealing with larger lists.
  • Jochen Ritzel
    Jochen Ritzel about 12 years
    It checks if exactly two items in the list are the same. You cant generalize to find tripples though (they might just be two pairs), but that wasn't the question.
  • paxdiablo
    paxdiablo about 12 years
    I'm not sure that will work properly on the list 1,1,2,2. Isn't it supposed to be exactly two elements?
  • paxdiablo
    paxdiablo about 12 years
    pass is how you do that in Python, by the way.
  • paxdiablo
    paxdiablo about 12 years
    +1, same as the solution I eventually came up with (after misreading question) but yours was more succinct.
  • minopret
    minopret about 12 years
    The title of the question says two of three, and the question asks for the most efficient. > python -c 'from timeit import Timer; t1 = Timer("len(set(a)) == 2", setup="""a=["one","one","two"]"""); t2 = Timer("len(set(a)) == len(a) - 1", setup="""a=["one","one","two"]"""); print t1.timeit(), t2.timeit()' 2.32463097572 2.84631896019
  • paxdiablo
    paxdiablo about 12 years
    Still the same problem as with one other solution. The list 1,2,2,3,3,3,4,4,4,4,5,5,5,5,5 will return true in this case, no? That's not exactly two dupes. Even allowing for a list size of three, 1,1,1 will return true for your function despite it being wrong.
  • paxdiablo
    paxdiablo about 12 years
    And for one/two/two, what does this do (once you fix the indentation of course)?
  • paxdiablo
    paxdiablo about 12 years
    Sorry, okm, I have reversed my downvote because the question title does indeed specify two of three - it's only a section in the question body that asks about longer lists (as an additional point).
  • paxdiablo
    paxdiablo about 12 years
    And downvote reversed though you should probably make it clear that this works only for the three-item list.
  • Ashwini Chaudhary
    Ashwini Chaudhary about 12 years
    @paxdiablo indentation is perfect(for-else loop). code is running fine and i added one/two/two to the solution, it returns 'yes'.
  • paxdiablo
    paxdiablo about 12 years
    @MichaelMior, I've added to the end for larger lists but the question title explicitly stated two of three. The body asked for a solution where the list is larger which is why I added the set variant.
  • Gareth Latty
    Gareth Latty about 12 years
    Added the even narrower solution.
  • paxdiablo
    paxdiablo about 12 years
    for-else - well, bugger me, I never knew such a beast even existed. +1 for educating me. One question, why would you just not return no no matter what (ie, ditch the else line)?
  • jdferreira
    jdferreira about 12 years
    This only works if the elements in the list implement __hash__ (which is true of most builtin-types), but stuff like has1dup([[1], [1], [2]]) will raise an error, since lists are not hashable. Even user defined objects will not behave as expected if hash is not "significant". has1dup([X(), X(), Y()]) for X and Y defined as empty classes returns False. Since the question uses strings, this shouldn't be a problem, though.
  • Akavall
    Akavall about 12 years
    The question asked about exactly two elements, how would your code work for [1,1,2,2] ?
  • paxdiablo
    paxdiablo about 12 years
    @Akavall, the question title specified two from three. There's an extra question in the body about how to handle larger lists but I still consider an answer on the primary question to be useful. That's my thoughts on the matter though, as my wife would say (often and without much encouragement), I've been wrong before :-)
  • Akavall
    Akavall about 12 years
    @paxdiable, OK, I see. Thank You.
  • Ashwini Chaudhary
    Ashwini Chaudhary about 12 years
    @paxdiablo well the trick is, in for-else loop the else part runs only when the for loop completes successfully(without any break or return)
  • okm
    okm about 12 years
    @paxdiablo that's fine, the question is not that clear and determined, I've updated the answer also.
  • okm
    okm about 12 years
    @paxdiablo this latter solution is incomplete as it's assuming there is not any other items having same values.
  • okm
    okm about 12 years
    This solution is incomplete as it's assuming there are not any other items having same values. However for sequence having length of 3, its fine, because there indeed does not have space for other items having same values.
  • Michael Mior
    Michael Mior about 12 years
    The full question states "However, if the list were larger, the task of writing out each possibility for a pair would become very difficult and time consuming." and makes no mention of lists with only three elements. Perhaps some clarification from the OP is required.
  • paxdiablo
    paxdiablo about 12 years
    @okm, not sure I understand. Since the set removes dupes, the only situation in which the length of the set is one less than the length of the array is when every element in the array is unique except for one, which has exactly two entries. Any other case will involve a length difference of something other than 1. For example, 1,2,2,2,3 will result in a difference of 2, 1,2,3 gives a difference of 0, 1,2,2,3,3,3,4,4,4,4 gives a difference of 6 and so on.
  • paxdiablo
    paxdiablo about 12 years
    @okm, perhaps you might come up with an example of what you mean. As per my response to your comment on my answer, I believe this works fine on any size list.
  • paxdiablo
    paxdiablo about 12 years
    @Michael, as stated, the title states "two of three" as does the example in the question. I see if "if the list were larger" bit as a secondary question.
  • okm
    okm about 12 years
    @paxdiablo probably I misunderstood the words "exactly two elements in a list are the same": if it means there are ONLY two elements having same value, you and Jochen are right. Just thought it meant there are two elements having same value...
  • okm
    okm about 12 years
    @paxdiablo About your comment "the list 1,1,2,2. Isn't it supposed to be exactly two elements?", did you implicitly mean that [1, 1, 2, 2] is supposed to be exactly two elements or negative? I'm not native English speaker and want to make sure the actual meaning of your words. If it's the first one, seems you was confused as well =p
  • paxdiablo
    paxdiablo about 12 years
    I think 1,1,2,2 should return false since it doesn't have exactly two values that are identical - it actually has two groups of two. So 1,1,2,3,4,5,6,7 would be true but 1,1,2,2,3,4,5,6,7 would give false. That's my understanding but, as everyone points out, the question could be clearer.
  • gnr
    gnr almost 12 years
    [1,2,2,3,3] will return True, which doesn't fulfill the 'exactly' requirement
  • ninjagecko
    ninjagecko almost 12 years
    @mlauria: It does, unless you mean "exactly" as in "among the multiplicities of elements, there is exactly one element with multiplicity 2". In which case: Counter(Counter(yourList).values()).get(2)==1
  • ninjagecko
    ninjagecko almost 12 years
    This is an O(N^2) algorithm. Do not use this for large lists (e.g. 1000+ elements) or where performance is a factor.
  • Gareth Latty
    Gareth Latty almost 12 years
    @ninjagecko Indeed, given the OP is talking about length 3 lists and possibly longer, I presumed that performance would not be a huge issue.