Build tree out of list of parent/children in Python

15,401

Solution 1

I think the best is to do two pass. First, you create a dictionary that link the name to the node. Then you can add yours items efficiently.

My code :

nodes = dict((e["NAME"], node(e)) for e in l)
for e in l:
    if e["PARENT"] is not None:
        nodes[e["PARENT"]].add_children(nodes[e["NAME"])

If you want the roots, you can use the if above, or you can filter the nodes.

roots = [n for n in nodes.values() if d.value["PARENT"] is None]

Solution 2

I once represented a *ix process tree with just one dict, and a list of child process pid's for each parent pid. So you get:

dict_[1] = [2, 3, 4]
dict_[2] = [5, 100]
dict_[3] = [6, 200]
dict_[4] = [7, 300]
dict_[6] = [400]

It seemed to work quite well.

It's optional whether you want the leaf nodes to exist with empty lists, or to just not appear in the tree. I've shown them above not appearing in the tree at the dict level.

I believe this is only appropriate if a pid (node) can only appear in one place in the tree. EG, 100 can't be a child of 2 and 4.

Share:
15,401
Aaron
Author by

Aaron

Updated on June 04, 2022

Comments

  • Aaron
    Aaron almost 2 years

    Using Python, I have a list of dictionary objects that contain parent/child relationships between each other which I would like to build into a tree. For example:

    {'UI': 'T071', 'NAME': 'Entity', 'PARENT': None, 'CHILDREN': 'Conceptual Entity'}
    {'UI': 'T077', 'NAME': 'Conceptual Entity', 'PARENT': 'Entitity', 'CHILDREN': 'Organism Attribute, Finding, Idea or Concept'}
    {'UI': 'T032', 'NAME': 'Organism Attribute', 'PARENT': 'Conceptual Entity', 'CHILDREN': 'Clinical Attribute'}
    etc.
    

    There are a total of 4 root nodes in the dataset (with 'PARENT' set as None), which make 4 separate trees. So, I was planning to make a list of trees.

    The data is not necessarily in any kind of ordering (so the nodes higher up the hierarchy are not necessarily higher in the list). Also, the id's (UI) are in no particular order (T071 is not necessarily higher in the tree than T072). Their names are unique, and the dataset uses their names instead of the id's (UI) to show the relationships.

    I have this simple class:

    class node():
        def __init__(self, value):
            self.value = value
            self.children = []
    
        def add_child(self, obj):
            self.children.append(obj)
    

    I'm a bit stumped on how to approach this. Suggestions much appreciated.