Comparing two dictionaries and checking how many (key, value) pairs are equal

696,413

Solution 1

If you want to know how many values match in both the dictionaries, you should have said that :)

Maybe something like this:

shared_items = {k: x[k] for k in x if k in y and x[k] == y[k]}
print(len(shared_items))
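
If every value is hashable, the same count can also be read off the set-like views that items() returns in Python 3. This is only a sketch under that assumption (it raises TypeError for unhashable values such as lists or nested dicts), using the two example dictionaries x and y:

```python
x = dict(a=1, b=2)
y = dict(a=2, b=2)

# dict.items() is set-like when all values are hashable, so "&" yields
# the (key, value) pairs that are present in both dictionaries.
shared_pairs = x.items() & y.items()
print(len(shared_pairs))  # 1, because only ('b', 2) appears in both
```

The dict comprehension above remains the safer choice when values may be unhashable.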

Solution 2

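What you want to do is simply x == y.

What you are doing is not a good idea, because the items in a dictionary are not supposed to have any order. You might be comparing [('a',1),('b',1)] with [('b',1), ('a',1)] (same dictionaries, different order). Since Python 3.7, dicts preserve insertion order, but two equal dicts can still be built in different orders, so pairing items by position remains unreliable.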

For example, see this:

>>> x = dict(a=2, b=2,c=3, d=4)
>>> x
{'a': 2, 'c': 3, 'b': 2, 'd': 4}
>>> y = dict(b=2,c=3, d=4)
>>> y
{'c': 3, 'b': 2, 'd': 4}
>>> zip(x.iteritems(), y.iteritems())
[(('a', 2), ('c', 3)), (('c', 3), ('b', 2)), (('b', 2), ('d', 4))]

The difference is only one item, but your algorithm will see that all items are different.
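
The session above is Python 2. A rough Python 3 equivalent, assuming Python 3.7+ (where iteration follows insertion order), shows the same problem and contrasts it with a key-based count; the names positional_matches and key_matches are only illustrative:

```python
x = dict(a=2, b=2, c=3, d=4)
y = dict(b=2, c=3, d=4)

# Python 3: items() replaces iteritems(), and zip() is lazy, so wrap it in list().
pairs = list(zip(x.items(), y.items()))
positional_matches = sum(1 for x_item, y_item in pairs if x_item == y_item)

# Looking values up by key does not depend on iteration order.
key_matches = sum(1 for k in x if k in y and x[k] == y[k])

print(positional_matches)  # 0
print(key_matches)         # 3
```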

Solution 3

def dict_compare(d1, d2):
    d1_keys = set(d1.keys())
    d2_keys = set(d2.keys())
    shared_keys = d1_keys.intersection(d2_keys)
    added = d1_keys - d2_keys
    removed = d2_keys - d1_keys
    modified = {o : (d1[o], d2[o]) for o in shared_keys if d1[o] != d2[o]}
    same = set(o for o in shared_keys if d1[o] == d2[o])
    return added, removed, modified, same

x = dict(a=1, b=2)
y = dict(a=2, b=2)
added, removed, modified, same = dict_compare(x, y)
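
For the two example dictionaries, the four results can be inspected as in the sketch below. The function is repeated in a slightly more compact but equivalent form so that the snippet runs on its own; len(same) is the number of equal (key, value) pairs.

```python
def dict_compare(d1, d2):
    # Equivalent to the function above, repeated so this snippet is self-contained.
    d1_keys = set(d1)
    d2_keys = set(d2)
    shared_keys = d1_keys & d2_keys
    added = d1_keys - d2_keys      # keys only in d1
    removed = d2_keys - d1_keys    # keys only in d2
    modified = {k: (d1[k], d2[k]) for k in shared_keys if d1[k] != d2[k]}
    same = {k for k in shared_keys if d1[k] == d2[k]}
    return added, removed, modified, same

x = dict(a=1, b=2)
y = dict(a=2, b=2)
added, removed, modified, same = dict_compare(x, y)

print(added)      # set()
print(removed)    # set()
print(modified)   # {'a': (1, 2)}
print(same)       # {'b'}
print(len(same))  # 1
```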

Solution 4

dic1 == dic2

From the Python docs:

The following examples all return a dictionary equal to {"one": 1, "two": 2, "three": 3}:

>>> a = dict(one=1, two=2, three=3)
>>> b = {'one': 1, 'two': 2, 'three': 3}
>>> c = dict(zip(['one', 'two', 'three'], [1, 2, 3]))
>>> d = dict([('two', 2), ('one', 1), ('three', 3)])
>>> e = dict({'three': 3, 'one': 1, 'two': 2})
>>> a == b == c == d == e
True

Providing keyword arguments as in the first example only works for keys that are valid Python identifiers. Otherwise, any valid keys can be used.

The comparison is valid in both Python 2 and Python 3.
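
A few properties of == on dictionaries are worth knowing before relying on it. The sketch below assumes Python 3, and the variable names are only illustrative:

```python
from collections import OrderedDict

# Equality is recursive: nested dicts are compared by value as well.
nested_equal = {"a": {"a": {"a": "b"}}} == {"a": {"a": {"a": "b"}}}
print(nested_equal)  # True

# Insertion order does not matter for plain dicts.
order_ignored = {"one": 1, "two": 2} == {"two": 2, "one": 1}
print(order_ignored)  # True

# Values are compared with ==, so True and 1 count as equal.
bool_equals_int = {"key": True} == {"key": 1}
print(bool_equals_int)  # True

# Two OrderedDicts, however, are compared order-sensitively.
ordered_equal = OrderedDict([(1, 1), (2, 2)]) == OrderedDict([(2, 2), (1, 1)])
print(ordered_equal)  # False
```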

Solution 5

Since it seems nobody has mentioned deepdiff, I will add it here for completeness. I find it very convenient for getting the diff of (nested) objects in general:

Installation

pip install deepdiff

Sample code

import deepdiff
import json

dict_1 = {
    "a": 1,
    "nested": {
        "b": 1,
    }
}

dict_2 = {
    "a": 2,
    "nested": {
        "b": 2,
    }
}

diff = deepdiff.DeepDiff(dict_1, dict_2)
print(json.dumps(diff, indent=4))

Output

{
    "values_changed": {
        "root['a']": {
            "new_value": 2,
            "old_value": 1
        },
        "root['nested']['b']": {
            "new_value": 2,
            "old_value": 1
        }
    }
}

Note about pretty-printing the result for inspection: The above code works if both dicts have the same attribute keys (with possibly different attribute values, as in the example). However, if an "extra" attribute is present in one of the dicts, json.dumps() fails with

TypeError: Object of type PrettyOrderedSet is not JSON serializable

Solution: use diff.to_json() and json.loads() / json.dumps() to pretty-print:

import deepdiff
import json

dict_1 = {
    "a": 1,
    "nested": {
        "b": 1,
    },
    "extra": 3
}

dict_2 = {
    "a": 2,
    "nested": {
        "b": 2,
    }
}

diff = deepdiff.DeepDiff(dict_1, dict_2)
print(json.dumps(json.loads(diff.to_json()), indent=4))  

Output:

{
    "dictionary_item_removed": [
        "root['extra']"
    ],
    "values_changed": {
        "root['a']": {
            "new_value": 2,
            "old_value": 1
        },
        "root['nested']['b']": {
            "new_value": 2,
            "old_value": 1
        }
    }
}

Alternative: use pprint, which results in different formatting:

import pprint

# same code as above

pprint.pprint(diff, indent=4)

Output:

{   'dictionary_item_removed': [root['extra']],
    'values_changed': {   "root['a']": {   'new_value': 2,
                                           'old_value': 1},
                          "root['nested']['b']": {   'new_value': 2,
                                                     'old_value': 1}}}

Author: user225312

Updated on March 27, 2022

Comments

  • user225312
    user225312 about 2 years

    I have two dictionaries, but for simplification, I will take these two:

    >>> x = dict(a=1, b=2)
    >>> y = dict(a=2, b=2)
    

    Now, I want to compare whether each key, value pair in x has the same corresponding value in y. So I wrote this:

    >>> for x_values, y_values in zip(x.iteritems(), y.iteritems()):
    ...     if x_values == y_values:
    ...         print 'Ok', x_values, y_values
    ...     else:
    ...         print 'Not', x_values, y_values
    

    And it works since a tuple is returned and then compared for equality.

    My questions:

    Is this correct? Is there a better way to do this? Better not in speed, I am talking about code elegance.

    UPDATE: I forgot to mention that I have to check how many key, value pairs are equal.

  • user225312
    user225312 over 13 years
    @THC4k, sorry for not mentioning. But I have to check how many values match in both the dictionaries.
  • user225312
    user225312 over 13 years
    Ok, so based on my update, is my way of doing still incorrect?
  • Jochen Ritzel
    Jochen Ritzel over 13 years
    @A A: I added why yours doesn't work when you want to count.
  • user225312
    user225312 over 13 years
    I see, but in my case both the dictionaries are of same length. And they will always be, because that is how the program works.
  • Tim Tisdall
    Tim Tisdall almost 9 years
    Unfortunately, this doesn't work if the values in the dict are mutable (i.e. not hashable). For example, {'a':{'b':1}} gives TypeError: unhashable type: 'dict'.
  • Tim Tisdall
    Tim Tisdall almost 9 years
    This one actually handles mutable values in the dict!
  • Diego Tercero
    Diego Tercero over 8 years
    It seems that the task is not only to check if the contents of both are the same but also to give a report of the differences
  • Bruno
    Bruno over 8 years
    That's completely wrong. Just parsing the data into JSON is really slow, and then hashing that huge string you just created is even worse. You should never do that.
  • WoJ
    WoJ over 8 years
    @Bruno: quoting the OP: "Better not in speed, I am talking about code elegance"
  • Bruno
    Bruno over 8 years
    It's not elegant at all, it feels unsafe and it's overly complicated for a really simple problem
  • WoJ
    WoJ over 8 years
    @Bruno: elegance is subjective. I can understand that you do not like it (and probably downvoted). This is not the same as "wrong".
  • Mutant
    Mutant over 8 years
    Same error if there is a list element for the dict key. I think cmp is a better way to do it, unless I am missing anything.
  • Trey Hunner
    Trey Hunner over 8 years
    I believe this is identical to dict1 == dict2
  • nerdwaller
    nerdwaller over 8 years
    For anyone using Python 3.5: the cmp built-in has been removed (and should be treated as removed in earlier versions too). An alternative they propose: (a > b) - (a < b) == cmp(a, b) as a functional equivalent (or, better, __eq__ and __hash__).
  • AnnanFay
    AnnanFay over 8 years
    @Mutant that is a different issue. You cannot create a dictionary with a list key in the first place. x = {[1,2]: 2} will fail. The question already has valid dicts.
  • msc87
    msc87 about 8 years
    can we use cmp(dict1,dict2) for comparing dictionaries with values as unordered list like: dict1={'a':[0,1]} and dict2={'b':[1,0]}?
  • Natim
    Natim about 8 years
    This is a great answer. json.dumps(d, sort_keys=True) will give you canonical JSON, so you can be certain that both dicts are equivalent. It also depends on what you are trying to achieve: as soon as the values are not JSON serializable, it will fail. For those who say it is inefficient, have a look at the ujson project.
  • ribamar
    ribamar over 7 years
    @annan: wrong, the question is generic. the example in the question description has already "valid dicts". If I post a new question, with same title, but with a different "invalid" dict, somebody will mark it as duplicate. Downvoting.
  • AnnanFay
    AnnanFay over 7 years
    @ribamar the question is "Comparing two dictionaries [...]". The 'invalid dict' above with list keys is not valid python code - dict keys must be immutable. Therefore you are not comparing dictionaries. If you try and use a list as a dictionary key your code will not run. You have no objects for which to compare. This is like typing x = dict(23\;dfg&^*$^%$^$%^) then complaining how the comparison does not work with the dictionary. Of course it will not work. Tim's comment on the other hand is about mutable values, hence why I said that these are different issues.
  • ribamar
    ribamar over 7 years
    @annan: Ok, your comment about Mutant's question is correct (I misunderstood your concept of 'valid dict': there's nothing invalid doing l = [] ; d = {1 : l}. d is valid dictionary and does not work as the same with the proposed solution. But now I see you're not denying that)
  • NuclearPeon
    NuclearPeon over 7 years
    Does not work for python3. Do this as the last line instead: list(zip(x.items(), y.items()))
  • Xanthir
    Xanthir over 7 years
    Once you dump the string to JSON, you can just compare it directly. Hashing the two strings is just meaningless extra complexity. (Also, this only works if the dict is JSON-able, which lots aren't.)
  • Stefano
    Stefano over 7 years
    @nerdwaller - dicts are not orderable types, so dict_a > dict_b would raise a TypeError: unorderable types: dict() < dict()
  • nerdwaller
    nerdwaller over 7 years
    @Stefano: Good call, my comment was more for general comparison in python (I wasn't paying attention to the actual answer, my mistake) .
  • erik258
    erik258 over 7 years
    This may not do what was exactly requested, and pulls in the json std lib, but it does work ( as json.dumps is deterministic with the default settings ).
  • MikeyE
    MikeyE about 7 years
    @TimTisdall You can get around the TypeError if you test both keys and items like this: set(x.keys()) and set(x.values()). But, that only works if the dictionary has only one level of keys and values. If you have a nested {} or [] you'd need to perform the same operation recursively.
  • Tim Tisdall
    Tim Tisdall about 7 years
    @MikeyE - set requires values to be hashable and dict requires keys to be hashable. set(x.keys()) will always work because keys are required to be hashable, but set(x.values()) will fail on values that aren't hashable.
  • Afflatus
    Afflatus about 7 years
    When I run this, I still get an error when dealing with the mutable values: ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
  • Afflatus
    Afflatus about 7 years
    @TimTisdall For comparing dictionaries with unhashable items, see: stackoverflow.com/questions/43504568/…
  • Daniel Myers
    Daniel Myers almost 7 years
    @Afflatus - DataFrames by design don't allow truthy comparisons (unless it has a length of 1) as they inherit from numpy.ndarray. -credit to stackoverflow.com/a/33307396/994076
  • Qi Luo
    Qi Luo almost 7 years
    I don't agree with @ErkinAlpGüney. Could you provide a proof?
  • Matthew Nakayama
    Matthew Nakayama over 6 years
    I disagree with @ErkinAlpGüney. The official documentation shows that == does indeed compare dictionaries by value, not by address. docs.python.org/2/library/stdtypes.html#mapping-types-dict
  • Jesuisme
    Jesuisme over 6 years
    Works for Python 2.7.13
  • ankostis
    ankostis about 6 years
    Does not work with OrderedDict: odict=OrderedDict; assert odict([(1, 1), (2, 2)]) == odict([(2, 2), (1, 1)]) fails!
  • Pedro Lobito
    Pedro Lobito about 6 years
    @ankostis: OrderedDict != dict
  • EL_DON
    EL_DON about 6 years
    If you use not isinstance(dict1, dict) instead of type(dict1) is not dict, this will work on other classes based on dict. Also, instead of (dict1[key] == dict2[key]), you can do all(atleast_1d(dict1[key] == dict2[key])) to handle arrays at least.
  • styrofoam fly
    styrofoam fly almost 6 years
    @PedroLobito passing tests is not a proof for the code to be correct.
  • Pedro Lobito
    Pedro Lobito almost 6 years
    @styrofoamfly I'm not following your line of thought, care to clarify?
  • styrofoam fly
    styrofoam fly almost 6 years
    @PedroLobito you cannot prove that "for each input the algorithm produces the expected output" (the definition of correctness). The tests prove only that "for some inputs the algorithm produces the expected output".
  • Phil
    Phil almost 6 years
    As of Python 3.6, dict is ordered out-of-the-box.
  • nkadwa
    nkadwa over 5 years
    Let's improve it so it works both ways. Line 2: "for x1 in set(dict1.keys()).union(dict2.keys()):"
  • Pedro Lobito
    Pedro Lobito over 5 years
    Can you please provide an input where this isn't true?
  • James Hirschorn
    James Hirschorn over 5 years
    @PedroLobito make a dict with keys mapping to array if you want to see dict1 == dict2 fail with an error message.
  • Filnor
    Filnor over 5 years
    Welcome to Stack Overflow! While this code snippet may solve the question, including an explanation really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion. Please also try not to crowd your code with explanatory comments, this reduces the readability of both the code and the explanations!
  • zwep
    zwep over 4 years
    Thanks @nkadwa, it does now
  • Pedro Lobito
    Pedro Lobito over 4 years
    @JamesHirschorn The answer is only valid for the examples it shows.
  • James Hirschorn
    James Hirschorn over 4 years
    @PedroLobito That is irrelevant. You had asked "Can you please provide an input where this isn't true?", and this is what I was commenting on.
  • pfabri
    pfabri almost 4 years
    +1, but you could break out of your for loop as soon as your dicts_are_equal becomes false. There's no need to continue any further.
  • JavaSa
    JavaSa over 3 years
    @JochenRitzel: How does this work for deeply nested dictionaries inside dicts? I tried this on deeply nested dicts with exactly the same values but in a different order, and the comparison failed.
  • thiezn
    thiezn over 3 years
    I was surprised myself, but it seems I can just compare nested dicts out of the box with == (using Python 3.8):

    >>> dict2 = {"a": {"a": {"a": "b"}}}
    >>> dict1 = {"a": {"a": {"a": "b"}}}
    >>> dict1 == dict2
    True
    >>> dict1 = {"a": {"a": {"a": "a"}}}
    >>> dict1 == dict2
    False
  • Stevko
    Stevko almost 3 years
    this one is especially useful for unit testing: self.assertDictEqual(result, expected_result)
  • luca.giovagnoli
    luca.giovagnoli over 2 years
    What about this?

    In [6]: x = {"key": True}
    In [7]: y = {"key": 1}
    In [8]: x == y
    Out[8]: True
  • Admin
    Admin over 2 years
    Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.