How to find the shortest simple path in a Tree in a linear time?

17,810

This problem is pretty much equivalent to the minimum sum subsequence problem, and can be solved in a similar manner by dynamic programming.

We will calculate the following arrays by using DF searches:

dw1[i] = minimum sum achievable by only using node i and its descendants.
pw1[i] = predecessor of node i in the path found for dw1[i].
dw2[i] = second minimum sum achevable by only using node i and its descendants,
         a path that is edge-disjoint relative to the path found for dw1[i].

If you can calculate these, take min(dw1[k], dw1[k] + dw2[k]) over all k. This is because your path will take one of these basic shapes:

  k              k
  |     or     /   \
  |           /     \
  | 

All of which are covered by the sums we're taking.

Calculating dw1

Run a DFS from the root node. In the DFS, keep track of the current node and its father. At each node, assume its children are d1, d2, ... dk. Then dw1[i] = min(min{dw1[d1] + cost[i, d1], dw1[d2] + cost[i, d2], ..., dw1[dk] + cost[i, dk]}, min{cost[i, dk]}). Set dw1[i] = 0 for leaf nodes. Don't forget to update pw1[i] with the selected predecessor.

Calculating dw2

Run a DFS from the root node. Do the same thing you did for dw1, except when going from a node i to one of its children k, only update dw2[i] if pw1[i] != k. You call the function recursively for all children however. It would look something like this in pseudocode:

df(node, father)
    dw2[node] = inf
    for all children k of node
        df(k, node)

        if pw1[node] != k
            dw2[node] = min(dw2[node], dw1[k] + cost[node, k], cost[node, k])
Share:
17,810
Admin
Author by

Admin

Updated on June 04, 2022

Comments

  • Admin
    Admin almost 2 years

    Here is a problem from Algorithms book by Vazirani

    The input to this problem is a tree T with integer weights on the edges. The weights may be negative, zero, or positive. Give a linear time algorithm to find the shortest simple path in T. The length of a path is the sum of the weights of the edges in the path. A path is simple if no vertex is repeated. Note that the endpoints of the path are unconstrained.

    HINT: This is very similar to the problem of finding the largest independent set in a tree.

    How can I solve this problem in linear time?

    Here is my algorithm but I'm wonder if it is linear time since it is nothing different than depth-first:

    1. Traverse tree (depth-first)
    2. Keep the indexes (nodes)
    3. add the values
    4. do (1) till the end of tree
    5. compare the sum and print the path and sum

    this problem is similar this topic but there is no certain answer.

  • Ankit Roy
    Ankit Roy about 13 years
    Looks promising but I have a few suggestions: (1) Under "Calculating dw1" when you say "descendants" I believe you mean "children" ("descendants" usually means all children and children's children, recursively); (2) The calculation of each of dw1[i], dw2[i] and up[i] should include 0 in the min{...} operation to allow the path to terminate there (so that for example a tree containing no negative-length edges will return a score of 0); (3) I believe your calculation of dw2[i] should exclude the optimal edge leading...
  • Ankit Roy
    Ankit Roy about 13 years
    ... from vertex i only, but at lower levels should include optimal edges (ask if you need clarification); (4) up[i] calculations can be simplified by noting that every path is of 1 of 2 kinds: either it is "straight down" or it contains an "uppermost corner" (a vertex v such that the parent of v is not in the path) -- any path containing an uppermost corner at some vertex v will be found by considering dw1[v] + dw2[v], so up[i] only needs to calculate costs on a straight path back towards the root (i.e. up[i] will be the minimum cost of a "straight-down" path ending at i).
  • IVlad
    IVlad about 13 years
    @j_random_hacker - (1) agreed, will update. (2) you mean a null path? I would consider a path to contain at least one edge, that's why I didn't add 0. It's easy enough to modify if the OP wants to accept null paths. (3) isn't that what happens? We're dealing with a tree, so if you exclude the optimal edge leading from i, you automatically exclude the rest as well. (4) I'm not sure what you mean here. In my solution, up doesn't store only straight paths from i to the root. If you do that, how will you get your answer? Becase the minimum mustn't necessarily be a straight path.
  • IVlad
    IVlad about 13 years
    More on (3) - if you mean the check shouldn't be done on lower levels, I think that's wrong, because then you will have overlapping min-cost and second min-cost paths for some descendants. They must be edge-disjoint for every node, otherwise you'll get wrong answers when doing dw1[k] + dw2[k], because the two paths will share common edges.
  • Ankit Roy
    Ankit Roy about 13 years
    @IVlad: (2) Indeed, including a 0 under the min() will allow null paths, but leaving it out completely will force all paths to end at a leaf -- definitely not what's wanted. If you want to allow paths that end before a leaf, but forbid null paths, this is certainly possible but requires keeping more scores for each node I think. (3) Not true -- remember it's a tree, so if you have 2 downward paths starting at v whose first edges are different, they will necessarily have no common vertices (except for v). ...
  • Ankit Roy
    Ankit Roy about 13 years
    ... So having dw2[i] avoid the optimal down-edge at level i is sufficient to guarantee that the down-paths represented by dw1[i] and dw2[i] are vertex-disjoint apart from i; avoiding optimal down-edges for dw2[i] at each level just makes dw2[i] worse than it needs to be.
  • Ankit Roy
    Ankit Roy about 13 years
    (4) I'm saying you can (and I think should) let up store only straight-up paths from i, because the case that you're worried about (when you say the minimum mustn't necessarily be a straight path) will be handled as follows: Suppose that the minimum path through vertex u goes through the edge to u's parent but is not "straight up". In that case, there is some vertex v above (closer to the root than) u which is the "uppermost corner" of this path, and when we scan through all vertices looking for the best path through each, we will find this path when we look at vertex v. Clear?
  • Ankit Roy
    Ankit Roy about 13 years
    In fact now that I think about it, up isn't needed at all if null paths are allowed, since any "straight-down" path is equivalent to a path with an "uppermost corner" in which one of the 2 "spokes" from that corner vertex is a null path. :)
  • IVlad
    IVlad about 13 years
    @j_random_hacker - true about (2). I think we can avoid null paths by also considering taking the single min-cost edge alone. Basically run minimum sum subsequence on the edges. This way the paths won't necessarily start and end at a leaf. As for (3), can you post some pseudocode? The function calls itself recursively for each child, and the paths ending at that child must also be edge-disjoint. dw2[i] won't be worse than it needs to be, because you will be using the dw1 of i 's children to compute it.
  • Ankit Roy
    Ankit Roy about 13 years
    "because you will be using the dw1 of i 's children to compute it" -- OK it sounds like you're already doing what I'm suggesting then. The description in your post could be read as saying that dw2 is computed recursively using dw2, which would give the suboptimality I was worried about. Maybe you could mention that the calculation for dw2[i] reverts to using dw1[i] on the children in the post?
  • IVlad
    IVlad about 13 years
    @j_random_hacker - done. Updated (5) too. Thanks for your help!
  • Ankit Roy
    Ankit Roy about 13 years
    Looking good, +1 at last :-P But do you see that up can be eliminated entirely by changing the overall answer for a node k to min(dw1[k], dw2[k], dw1[k] + dw2[k])? This changes the overall interpretation of the query on a node q from "What is the shortest path containing q?" to "What is the shortest path whose highest node is q?" -- but that's OK, because every path has a highest node. :)
  • IVlad
    IVlad about 13 years
    @j_random_hacker - I just realized that while you wrote your comment and got rid of it :). Is putting dw2[k] in the minimum necessary? It will never be smaller than dw1[k]. Some time ago I solved a problem which asked queries such as what's the shortest path containing q?, so I was set on using up. Realize it's not needed here though.
  • Ankit Roy
    Ankit Roy about 13 years
    Good point, dw2[k] isn't needed in the minimum as it can't be less than dw1[k]. Simpler and simpler!
  • becko
    becko over 11 years
    @j_random_hacker Any thoughts on what is the relationship between this problem and finding the largest independent set in a tree, as advertised in the hint?
  • Ankit Roy
    Ankit Roy over 11 years
    @becko: I don't actually see much connection I'm afraid... I do know that independent set in a tree can be solved in a broadly similar way, by calculating 2 values for each node with a DFS: the size of the largest independent set in the subtree rooted at node i that uses node i, and the size of the largest independent set in the subtree rooted at node i that doesn't. Call these x(i) and y(i) respectively. Then you can calculate x(i) by taking 1 + sum(y(j)) for each child j of i, and you can calculate y(i) by taking sum(x(j)) for each child j of i.
  • becko
    becko over 11 years
    @j_random_hacker: Thanks. Maybe that is the connection. Nice algorithm, by the way.
  • Ankit Roy
    Ankit Roy over 11 years
    @becko: Just noticed a bug! y(i) should be sum(max(x(j), y(j))) for each child j of i, since we only want to allow, not require, that the children are included in the independent set.