How to turn a list into nested dict in Python

15,878

Solution 1

X = [['A', 'B', 'C'], ['A', 'B', 'D'],['W','X'],['W','Y','Z']]
d = {}

for path in X:
    current_level = d
    for part in path:
        if part not in current_level:
            current_level[part] = {}
        current_level = current_level[part]

This leaves us with d containing {'A': {'B': {'C': {}, 'D': {}}}, 'W': {'Y': {'Z': {}}, 'X': {}}}. Any item containing an empty dictionary is either a file or an empty directory.

Solution 2

Assuming that {'C', 'D'} means set(['C', 'D']) and your Python version supports dict comprehension and set comprehension, here's an ugly but working solution:

>>> tr = [[1, 2, 3], [1, 2, 4], [5, 6, 7]]
>>> {a[0]: {b[1]: {c[2] for c in [y for y in tr if y[1] == b[1]]} for b in [x for x in tr if x[0] == a[0]]} for a in tr}
{1: {2: set([3, 4])}, 5: {6: set([7])}}

As for your example:

>>> X = [['A', 'B', 'C'], ['A', 'B', 'D']]
>>> {a[0]: {b[1]: {c[2] for c in [y for y in X if y[1] == b[1]]} for b in [x for x in X if x[0] == a[0]]} for a in X}
{'A': {'B': set(['C', 'D'])}}

But please don't use it in a real-world application :)

UPDATE: here's one that works with arbitrary depths:

>>> def todict(lst, d=0):
...     print lst, d
...     if d > len(lst):
...         return {}
...     return {a[d]: todict([x for x in X if x[d] == a[d]], d+1) for a in lst}
...
>>> todict(X)
{'A': {'B': {'C': {}, 'D': {}}}}

Solution 3

There is a logical inconsistency in your problem statement. If you really want ['xyz/123/file.txt', 'abc/456/otherfile.txt']

to be changed to {'xyz': {'123': 'file.txt}, 'abc': {'456': 'otherfile.txt'}}

Then you have to answer how a path 'abc.txt' with no leading folder would be inserted into this data structure. Would the top-level dictionary key be the empty string ''?

Solution 4

I got asked about this question on twitter and came up with this slick solution using functional programming which I figure I might as well share here.

from functools import reduce
X = [['A', 'B', 'C'], ['A', 'B', 'D']]
Y = [reduce(lambda x, y: {y:x}, Y[::-1]) for Y in X]

which returns:

[{'A': {'B': 'C'}}, {'A': {'B': 'D'}}]

as desired.

For the simpler problem where you have one list that you want to represent as a dict with nested keys, this will suffice:

from functools import reduce
X = ['A', 'B', 'C']
reduce(lambda x, y: {y:x}, X[::-1])

which returns:

{'A': {'B': 'C'}}

Solution 5

This should be pretty close to what you need:

def path_to_dict(path):
    parts = path.split('/')

    def pack(parts):
        if len(parts) == 1:
            return parts
        elif len(parts):
            return {parts[0]: pack(parts[1:])}
        return parts

    return pack(parts)

if __name__ == '__main__':
    paths = ['xyz/123/file.txt', 'abc/456/otherfile.txt']
    for path in paths:
        print '%s -> %s' % (path, path_to_dict(path))

Results in:

xyz/123/file.txt -> {'xyz': {'123': ['file.txt']}}
abc/456/otherfile.txt -> {'abc': {'456': ['otherfile.txt']}}
Share:
15,878
Dmitry Grinberg
Author by

Dmitry Grinberg

Updated on June 05, 2022

Comments

  • Dmitry Grinberg
    Dmitry Grinberg almost 2 years

    Need to turn x:

    X = [['A', 'B', 'C'], ['A', 'B', 'D']]
    

    Into Y:

    Y = {'A': {'B': {'C','D'}}}
    

    More specifically, I need to create a tree of folders and files from a list of absolute paths, which looks like this:

    paths = ['xyz/123/file.txt', 'abc/456/otherfile.txt']
    

    where, each path is split("/"), as per ['A', 'B', 'C'] in the pseudo example.

    As this represents files and folders, obviously, on the same level (index of the array) same name strings can't repeat.

    • FogleBird
      FogleBird over 12 years
      What's this? {'C','D'}
    • Steven Rumbalski
      Steven Rumbalski over 12 years
      What problem are you trying to solve with the nested dict?
    • Steven Rumbalski
      Steven Rumbalski over 12 years
      What if folder A contains a directory B and also a file X? How should that be represented?
    • FogleBird
      FogleBird over 12 years
      Would this be acceptable? {'A': {'B': {'C': {}, 'D': {}}}}
    • Dmitry Grinberg
      Dmitry Grinberg over 12 years
      more consistent, Y = {'A': {'B': {'C':{},'D':{}}}}
    • Dmitry Grinberg
      Dmitry Grinberg over 12 years
      Steven, I'm actually trying to create a page with nested ul/li blocks, to render a javascript tree. the way content is saved, though, it's just a list of records, with URL property, that represents an absolute path. I can't just show a flat list, so need to show it as if it's a list of nested directories with files in them.
    • Karl Knechtel
      Karl Knechtel over 12 years
      Where does the input data come from? If you want to iterate over a directory structure, the normal approach is to just use os.walk.
  • agf
    agf over 12 years
    Also doesn't work for arbitrary depth, which is probably a requirement.
  • Attila O.
    Attila O. over 12 years
    @agf true - I didn't think that was a requirement, it probably is though. Should I come up with an even more horrible looking one-liner? :)
  • agf
    agf over 12 years
    Hah, that would just get scary. I think Matt's got the idea -- it has to be done recursively.
  • agf
    agf over 12 years
    How do you join all the dictionaries, like if you have other paths like xyz/123/b.a and abc/sss/ooo/1.a that overlap with those?
  • Dmitry Grinberg
    Dmitry Grinberg over 12 years
    if there is a file called abc.txt at top level, it'll be a peer with 'xyz' and 'abc', as in {'xyz':{...}, 'abc':{...}, 'abc.txt:{}} . with empty {}, which would indicate it's a leaf
  • Dmitry Grinberg
    Dmitry Grinberg over 12 years
    it doesn't matter if something "looks" like a file or folder. last one will be a file, rest are folders. this is not real file system, btw.
  • Steven Rumbalski
    Steven Rumbalski over 12 years
    What if 'xyz' contains directories and files?
  • Dmitry Grinberg
    Dmitry Grinberg over 12 years
    Matt, I was about to ask same question as agf. This far i got myself. I actually need one dict merged. there will be overlaps, so I can't just replace nodes wholesale, need to actually merge at every level.
  • Matt Williamson
    Matt Williamson over 12 years
  • Tyler Hilbert
    Tyler Hilbert about 7 years
    Where does the current_level get added to d?
  • tsveti_iko
    tsveti_iko over 5 years
    @Tyler Hilbert the current_level is the d! The expression current_level = d doesn't make a copy of d, it just creates a new reference to d named current_level ;) The code above wouldn't work if current_level was a copy, created like this: current_level = d[:] or like this: current_level = copy.deepcopy(d), but since it's not - it merely works!
  • ThePhi
    ThePhi almost 4 years
    @tsveti_iko If it's a reference, why doesn't the line current_level = current_level[part] modify only current_level and not d? I've checked in the debugger, and after this stage (for ex. when we are at the 1st element A), we'll get: current_level = {} while d = {'A': {}}. Thanks!
  • SpectreVert
    SpectreVert over 2 years
    above link is dead redirects to spam