Merge dictionaries retaining values for duplicate keys

python merge duplicates

12,574

Solution 1

def merge_dicts(*dicts):
    d = {}
    for dict in dicts:
        for key in dict:
            try:
                d[key].append(dict[key])
            except KeyError:
                d[key] = [dict[key]]
    return d

This retuns:

{'a': [1, 5], 'b': [2, 4], 'c': [3], 'd': [6]}

There is a slight difference to the question. Here all dictionary values are lists. If that is not to be desired for lists of length 1, then add:

    for key in d:
        if len(d[key]) == 1:
            d[key] = d[key][0]

before the return d statement. However, I cannot really imagine when you would want to remove the list. (Consider the situation where you have lists as values; then removing the list around the items leads to ambiguous situations.)

Solution 2

Python provides a simple and fast solution to this: the defaultdict in the collections module. From the examples in the documentation:

Using list as the default_factory, it is easy to group a sequence of key-value pairs into a dictionary of lists:

>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = defaultdict(list)
>>> for k, v in s:
... d[k].append(v)
...
>>> d.items() [('blue', [2, 4]), ('red', 1), ('yellow', [1, 3])]

When each key is encountered for the first time, it is not already in the mapping; so an entry is automatically created using the default_factory function which returns an empty list. The list.append() operation then attaches the value to the new list. When keys are encountered again, the look-up proceeds normally (returning the list for that key) and the list.append() operation adds another value to the list.

In your case, that would be roughly:

import collections

def merge_dicts(*dicts):
    res = collections.defaultdict(list)
    for d in dicts:
        for k, v in d.iteritems():
            res[k].append(v)
    return res

>>> merge_dicts(d1, d2, d3)
defaultdict(<type 'list'>, {'a': [1, 5], 'c': [3], 'b': [2, 4], 'd': [6]})

12,574

Author by

lagunazul

Updated on June 20, 2022

Comments

lagunazul almost 2 years

Given n dictionaries, write a function that will return a unique dictionary with a list of values for duplicate keys.

Example:

d1 = {'a': 1, 'b': 2}
d2 = {'c': 3, 'b': 4}
d3 = {'a': 5, 'd': 6}

result:

>>> newdict
{'c': 3, 'd': 6, 'a': [1, 5], 'b': [2, 4]}

My code so far:

>>> def merge_dicts(*dicts):
...     x = []
...     for item in dicts:
...         x.append(item)
...     return x
...
>>> merge_dicts(d1, d2, d3)
[{'a': 1, 'b': 2}, {'c': 3, 'b': 4}, {'a': 5, 'd': 6}]

What would be the best way to produce a new dictionary that yields a list of values for those duplicate keys?