Easy way to find Subtree in a Tree

14,692

Solution 1

Looks like a straightforward algorithm: Find the root of the search tree in the game tree and check whether the children of the search tree are a subset of the children in the game tree.

From your explanations, I'm not sure whether the search tree

  A
 / \
A   A

should match this tree:

  A
 /|\
A C A

(i.e. if non-matching children are supposed to be ignored.)

Anyway, here's the code I just toyed around with. It's a fully running example and comes with a main method and a simple Node class. Feel free to play with it:

import java.util.Vector;

public class PartialTreeMatch {
    public static void main(String[] args) {
        Node testTree = createTestTree();
        Node searchTree = createSearchTree();

        System.out.println(testTree);
        System.out.println(searchTree);

        partialMatch(testTree, searchTree);
    }

    private static boolean partialMatch(Node tree, Node searchTree) {
        Node subTree = findSubTreeInTree(tree, searchTree);
        if (subTree != null) {
            System.out.println("Found: " + subTree);
            return true;
        }
        return false;
    }

    private static Node findSubTreeInTree(Node tree, Node node) {
        if (tree.value == node.value) {
            if (matchChildren(tree, node)) {
                return tree;
            }
        }

        Node result = null;
        for (Node child : tree.children) {
            result = findSubTreeInTree(child, node);

            if (result != null) {
                if (matchChildren(tree, result)) {
                    return result;
                }
            }
        }

        return result;
    }

    private static boolean matchChildren(Node tree, Node searchTree) {
        if (tree.value != searchTree.value) {
            return false;
        }

        if (tree.children.size() < searchTree.children.size()) {
            return false;
        }

        boolean result = true;
        int treeChildrenIndex = 0;

        for (int searchChildrenIndex = 0;
                 searchChildrenIndex < searchTree.children.size();
                 searchChildrenIndex++) {

            // Skip non-matching children in the tree.
            while (treeChildrenIndex < tree.children.size()
                  && !(result = matchChildren(tree.children.get(treeChildrenIndex),
                                              searchTree.children.get(searchChildrenIndex)))) {
                treeChildrenIndex++;
            }

            if (!result) {
                return result;
            }
        }

        return result;
    }

    private static Node createTestTree() {
        Node subTree1 = new Node('A');
        subTree1.children.add(new Node('A'));
        subTree1.children.add(new Node('A'));

        Node subTree2 = new Node('A');
        subTree2.children.add(new Node('A'));
        subTree2.children.add(new Node('C'));
        subTree2.children.add(subTree1);

        Node subTree3 = new Node('B');
        subTree3.children.add(subTree2);

        Node root = new Node('A');
        root.children.add(subTree3);

        return root;
    }

    private static Node createSearchTree() {
        Node root = new Node('A');
        root.children.add(new Node('A'));
        root.children.add(new Node('A'));

        return root;
    }
}

class Node {
    char value;
    Vector<Node> children;

    public Node(char val) {
        value = val;
        children = new Vector<Node>();
    }

    public String toString() {
        StringBuilder sb = new StringBuilder();

        sb.append('(');
        sb.append(value);

        for (Node child : children) {
            sb.append(' ');
            sb.append(child.toString());
        }

        sb.append(')');

        return sb.toString();
    }
}

Solution 2

Are you looking for any particular constraints on a subtree? If not, a simple traversal should suffice for identifying subtrees (basically treat each node as the root of a subtree).

I believe you'll find that the API you'll want for a tree varies a great deal by your particular application -- so much that generic implementations are not very useful. Perhaps if you could tell us what kind of application the tree will be used for, we could provide particulars.

Also, if you're just using a tree for data storage, you might want to ask yourself why you need a tree. That answer should also answer the question in my previous paragraph.

Share:
14,692
cdmckay
Author by

cdmckay

CTO and Co-Founder at Process Street Open Source Programmer (https://github.com/cdmckay)

Updated on June 12, 2022

Comments

  • cdmckay
    cdmckay almost 2 years

    I'm writing some code that uses a Tree (a regular tree that can have an unlimited number of nodes, but no crossover, i.e. two parent nodes will not point the the same child node). Anyway, two things:

    1) Are there any well-known algorithms for finding a sub-tree within a tree.

    2) Are there any Java libraries (or any libraries for that matter) that already implement this algorithm? Even if there are none, can anyone recommend any good general purpose Java tree library?

    I want to use these trees for holding data in a tree format, not for their searching capabilities.

    To expand a bit: I'm using the tree as part of game to keep a history of what happens when a certain events happen. For example, an A can hit a B which can hit two A's which can hit another two A's etc.

    That would look something like:

        A
        |
        B
       /
      A 
     / \  
    A   A
       / \
      A   A
    

    Of course there's more than just A and B. What I want to do is (for an achievement system) is be able to tell when, say an A has hit two A's:

      A
     / \
    A   A
    

    I want to be able to easily know if the first tree contains that subtree. And I don't want to have to write all the code for doing so if I don't have to :)

  • Michael Borgwardt
    Michael Borgwardt about 15 years
    Pretty sure that's not possible, since all "tree algorithms" are based on having each node contain some information about its children (such as their maximum and minimum values), and that's not the case here.
  • Ken
    Ken about 15 years
    I don't see how that follows. A char in a str doesn't contain info about future chars. In this case, the root A matches the root of the target, but its child B doesn't match, and now I should be able to skip checking B against the root-A of the target pattern.
  • Pete Kirkham
    Pete Kirkham about 15 years
    citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.21.3770 uses a string representation of the tree to give sub-linear matching.