**Under construction!**Website upgrade pending. Some features / pages may be missing.

# Aladdin And The Return Journey

We are given a tree with $$N$$ nodes ($$1 \leq N \leq 30,000$$). On each node $$i$$ there is a number of "genies" $$v_i$$ ($$0 \leq v_i \leq 1000$$). For a pair of nodes $$i$$ and $$j$$, let $$sum(i,j)$$ denote the total number of genies on all nodes on the path from $$i$$ to $$j$$ (inclusive).

We would like to process a series of queries (at most $$100,000$$ queries will be given). There are two types of queries:

- "Update $$i,v$$": Set $$v_i$$ (the number of genies on node $$i$$) to the new value $$v$$
- "Count $$i,j$$": Print the total number of genies on all nodes on the path from node $$i$$ to node $$j$$ (inclusive). That is, print $$sum(i,j)$$

- Tree Problem
- Make into a rooted tree. Pick an arbitrary root.
- Queries ==> data structures
- Paths on a tree.
- Lowest Common Ancestor (LCA) Data Structure
- Heavy-Light Decomposition
- Fenwick Trees (Binary Indexed Trees)
- Fenwick Tree + Heavy-Light Decomposition
- DFS Ordering (Pre-Order Traversal) Array

Suppose we pick an arbitrary node to be the root of the tree (without loss of generality, let's choose node 0). Now, consider a pair of nodes $$i$$ and $$j$$. Then there is some node $$c$$ such that the path from $$i$$ to $$j$$ is the concatenation of the path from $$i$$ to $$c$$ and from $$c$$ to $$j$$, where $$c$$ is an "ancestor" of both $$i$$ and $$j$$ (according to the rooted tree).

Such a node $$c$$ is called the "lowest common ancestor" (or *LCA*) of $$i$$ and $$j$$.

Many problems on trees -- such as the current problem -- relate to "paths" in trees. Whenever we have such a problem on a rooted-tree, it is often beneficial to break down paths into "simple" paths from nodes to ancestors only (i.e.: paths that only travel directly down the tree, or directly up the tree, but do not otherwise "bend".)

These kinds of simpler paths are useful mostly because there are many data structures and techniques that exploit the "parent-child" relationship or the "ancestor-descendant" relationship. For example, I know of at least two techniques that allows one to efficiently "update all descendants of a node" or "update all ancestors of a node" (I will actually describe them below because they can be used to solve this problem). Lastly, there is a very efficient data structure to compute the "Lowest Common Ancestor" of any two nodes of a rooted tree (Google "Lowest Common Ancestor" and you should find out about it; I believe this was invented by Robert Tarjan!) And it is not too hard to see that the Key Observation from above is true for all rooted trees: every path from a node $$i$$ to node $$j$$ can be broken down into a path from $$i$$ to $$c$$ and a path from $$c$$ to $$j$$, where $$c = lca(i,j)$$.

Altogether, the existence of these techniques and data structures makes it *highly* desirable to convert our original definition of a path into equivalent definitions using the LCA.

This is something you learn from experience. However, let's call it another learning point!

Let's take this idea one step further. But first, some intuition.

Suppose we were only given an array of genies (rather than a tree). How would we solve our original problem?

Well, what we could do is compute the "Prefix Sum" array. That is, if we have an array $$A[0],A[1],...,A[N-1]$$ and we want to be able to quickly compute the sum $$A[i] + A[i+1] + ... + A[j]$$ for some pair $$i,j$$, the we can create a new array $$C$$ defined so that: \[C[0] = A[0]\] \[C[1] = A[0] + A[1]\] \[C[2] = A[0] + A[1] + A[2]\] \[...\] \[C[N-1] = A[0] + A[1] + A[2] + ... + A[N-1]\] $$C$$ is called the "Prefix Sum" array. Then, to compute any sum $$A[i] + A[i+1] + ... + A[j]$$, we simply take: \[C[j] - C[i-1]\] Although there are a couple of corner cases to handle, this works in general, and gives us an $$O(1)$$-time method to quickly find the sum of any contiguous sub-array of the original array $$A$$. In particular, we pre-compute the answers for queries $$A[0],...,A[i]$$ and we answer the questions for $$A[i],...,A[j]$$ by subtracting particular entries of the pre-computed data.

In the case of our given problem, we are not given an array, but we are given a tree. And we are not concerned about summing contiguous sub-arrays, but we instead want to find the sum of numbers along a path. From our earlier arguments, we only have to worry about paths of the form $$i,c$$ where $$c$$ is an ancestor of $$i$$. Now, is there any way to "generalize" the "Prefix Sum" array idea from arrays to trees like this?

In this case, the answer is *YES*! We have the following Key Observation.

The above lemma basically says that, if we can quickly compute $$sum(i,0)$$ for all nodes $$i$$, then we can quickly compute $$sum(i,c)$$ for any node $$i$$ and ancestor $$c$$. This is a tree-like analogue to using Prefix Sums for arrays.

**this section is crucial**. If you are a beginner, please read the previous section first.

In the last Idea Section, we saw how to model this problem using the Lowest Common Ancestor (LCA). In particular, to answer the query $$sum(i,j)$$ it suffices to answer the queries $$sum(i,c)$$ and $$sum(c,j)$$ where $$c$$ is the LCA of $$i$$ and $$j$$. We took this one step further and noted that, to answer $$sum(i,c)$$, it suffices to know $$sum(i,0)$$ and $$sum(c,0)$$.

In this section, we show how to efficiently answer the latter question: How do we quickly answer queries of the form $$sum(x,0)$$ for any node $$x$$.

First, notice that, if the problem were "static" (i.e.: the numbers were not changing), we could probably solve this by performing a depth first search (dfs) at the beginning, and pre-computing the answers. In particular, at any point in the dfs, if we are at node $$i$$, we would keep track of the sum of the number of genies along the path from the root (node $$0$$) to node $$i$$. When we traverse another edge, we add $$v_i$$ to the sum, and when we "backtrack" out of the edge, we subtract $$v_i$$. This traversal would maintain the correct number at each node.

Although this problem is not static, we can still draw a great deal of intuition from this special case. In particular, we can still (theoretically) perform this dfs at the beginning. Then, when an "Update" query comes, we notice that we *only have to update the descendants of the changing node.*

(See the Related Problems section at the bottom of this page.)

We formalize this with the following key observation.

Suppose we were to pre-compute the answer $$sum(i,0)$$ for each node $$i$$. When an "Update $$i,v$$" query is given (to change the number of genies on node $$i$$ to become value $$v$$), we notice that $$sum(x,0)$$ will only change for those nodes $$x$$ who are descendants of $$i$$. All other nodes will have their sum value stay the same. Moreover, the value of those $$x$$ that are children of $$i$$ changes uniformly (i.e.: by the same amount) -- exactly by the amount: $$v - v_i$$ (i.e.: the new value of node $$i$$, minus the old value). Note, the term "descendant" also includes $$i$$ itself (that is, $$i$$ is a descendant of itself, technically).

Suppose we have the array $$A$$ representing the Pre-Order-Traversal array. Let $$B$$ be another array so that $$B[k]$$ denotes the current known value of $$sum(A[k],0)$$. That is, let $$B[k]$$ denote the current sum of all genies from the $$k^{th}$$ node of the Pre-Order-Traversal, up to the root of the tree. Then, originally, we can compute $$B[k]$$ (by some kind of depth first search maybe?). And whenever an "Update $$i,v$$" query comes in, we only need to update a contiguous range of the $$B$$ array.

We can use Fenwick Trees to update a contiguous range of an array. By constructing the Fenwick Tree for the array $$B$$, and by knowing which node corresponds to which index (according to the Pre-Order-Traversal array $$A$$), this allows us to quickly answer the question we were interested in!!!

We are given a tree $$T$$, an initial value $$v_i$$ for each node $$i$$, and a series of queries of the forms: "Update $$i,v$$" and "Count $$i,v$$" (as described in the problem statement). Here is the final pseudocode:

```
# Pre-define some functions
Initialize(T):
Treat T as a rooted tree, and let node 0 be the root
Construct the LCA (lowest common ancestor) data structure on T
# PreOrderTraversal(i,A) populates the array A with the
# pre-order-traversal for the sub-tree rooted in node i
def PreOrderTraversal(i,A):
A.push(i)
for each child j of i:
PreOrderTraversal(j,A)
# lca(i,j) returns the lowest common ancestor of i and j
def lca(i,j):
...
# If B is an array, update(B,i,j,v) increments B[i..j] by value v (using a Fenwick Tree)
def update(B,i,j,v);
...
### MAIN ALGORITHM BEGINS HERE
solve(T,v[]):
let N = |T| #(the number of nodes in T)
let A[0..N-1] be an array
let B[0..N-1] be an array
# initialization
call Initialize(T)
call PreOrderTraversal(0,A)
for all i = 0..N-1
Set B[i] = v[i]
for each query:
if query is type "Update i,new_v":
Let L be the index (integer) so that A[L] = i (the index corresponding to node i in A)
Let R be the largest index such that A[R] is a descendant of i
# Note: A[L..R] now precisely contains the sub-tree rooted in i
Let v_change := v[i] - new_v
update(B,L,R, v_change)
Set v[i] = new_v
else (query is type "Count i,j"):
Let c := lca(i,j)
Let sum_ic = B[i] - B[c] + v[c] # sum(i,c) = sum(i,0) - sum(i,c) + v[c]
Let sum_jc = B[j] - B[c] + v[c] # sum(j,c) = ... (similarly)
Let ans = sum_ic + sum_jc - v[c] # sum(i,j) = sum(i,c) + sum(j,c) - v[c]
Print (ans)
```

Calling $$solve(T,v)$$ should solve the problem.
Just a quick note on the time/space complexity: all of the data structures use $$O(N)$$ memory altogether (maybe $$O(N \lg N)$$ at worst, depending on how you implement the $$B$$ array / data structure). Each query takes $$O(\lg(N))$$ time in total: $$O(\lg N)$$ to find $$lca(i,j)$$ (if necessary), and $$O(\lg N)$$ to update or access the $$B$$ array (assuming $$B$$ is implemented with a Fenwick Tree).

Hence, the overall algorithm takes $$O((N+Q)\lg N)$$ where $$N$$ is the number of nodes and $$Q$$ is the number of queries.

This is a beautiful illustration of many Data Structures on Trees. To solve this problem, you need knowledge of:

- The Lowest Common Ancestor data structure
- Prefix Sums and Range Queries
- Pre-Order-Traversal
- Fenwick Trees

These ideas allow you to (eventually) reduce the problem into a "range query" problem on a simple array. There are quite a few non-trivial "reductions" (problem transformations) needed to get there, but in the end, it is quite a neat solution!

First, we assume the tree is rooted, and that node $$0$$ is the root of the tree. This makes the problem easier to conceptualize and solve (because there are many data structures and techniques to solve problems on rooted trees).

The next "reduction" is to notice that, for rooted trees, the query $$sum(i,j)$$ (finding the number of genies on the path from $$i$$ to $$j$$) can be reduced to finding the answer to queries $$sum(i,c)$$ and $$sum(j,c)$$, where $$c$$ is an "ancestor" of both $$i$$ and $$j$$ (it is actually the "Lowest Common Ancestor" or $$lca(i,j)$$). These types of queries are easier to handle (because they are always concerning parents and children, or ancestors and descendants, rather than arbitrary nodes / paths). And the well-known LCA data structure makes it a routine task to find $$c$$ whenever we need to!

The precise formula is: \[sum(i,j) = sum(i,c) + sum(c,j) - v_c\] (where $$v_c$$ is the current number of genies on node $$c$$). The last subtraction of $$v_c$$ is to avoid double-counting it.

Next, we notice that queries of the form $$sum(i,c)$$ (where $$c$$ is an ancestor of $$i$$) can be reduced to finding the queries $$sum(i,0)$$ and $$sum(c,0)$$, where $$0$$ is the root of the tree. There are only $$O(N)$$ queries of the form $$sum(i,0)$$, so this is much more manageable.

The precise formula is:

\[sum(i,c) = sum(i,0) - sum(c,0) + v_c\]So, putting together the above two formulas, we can see that all queries $$sum(i,j)$$ can be solved by: \[sum(i,j) = sum(i,0) + sum(j,0) - 2sum(c,0) + v_c\] (where this formula only depends on queries from nodes to the root.)

So we are only concerned with answering queries of the form $$sum(i,0)$$. Now, let's pretend we already know $$sum(i,0)$$ for all nodes $$i$$. If an "update" query comes on some node $$i$$ for some value $$v$$, which of the answers will change? Well, for a node $$x$$, if $$x$$ is a child or descendant of node $$i$$, then $$sum(x,0)$$ will change (because $$i$$ is on the path from $$x$$ to $$0$$). And if $$x$$ is NOT a descendant of $$i$$, then $$sum(x,0)$$ will NOT change. So, when an update query comes, we want to update precisely all of the nodes in the sub-tree of $$i$$. And when a "count" query comes, we just want to lookup the answer for the given node.

We can actually do this fairly efficiently. Consider the "Pre-Order-Traversal" array $$A$$ of the tree. That is, $$A$$ is a "permutation" of the numbers $$0$$ through $$N-1$$ defined recursively by a depth-first-search.

The key observation about the array $$A$$ is that, for a given node $$i$$, all of its descendants will appear consecutively (next to each other) in this array. So, when an "update" query comes in, since we would like to update all descendants of the node $$i$$, it is equivalent to updating all the nodes in a contiguous sub-array of $$A$$. So we do precisely that. We construct another array $$B[ ]$$ so that $$B[k] := sum(A[k],0)$$ (the number of genies on the path from node $$A[k]$$ to node 0). Whenever we want to update some node $$i$$, we find its index in $$A$$, and we find the indices of all its children in $$A$$. This defines a range $$L,R$$ of indices. We then increment/decrement the corresponding values in the $$B$$ array: $$B[L..R]$$ with a range update.

Lastly, in order to efficiently perform these range updates and range queries on the $$B$$ array, we must avoid "explicitly" storing the data. Instead, we construct a Fenwick Tree to represent $$B$$. A Fenwick Tree can support range updates and single-point queries (which is precisely what we need) in $$O(\lg N)$$ time. This is sufficient.

As long as we can maintain the $$B[ ]$$ array with a Fenwick Tree, we can answer a query $$sum(i,j)$$ using the formula above: \[sum(i,j) = B[i] + B[j] - 2*B[c] + v_c\] where $$c := lca(i,j)$$ and $$B[ ]$$ can be computed from the Fenwick Tree. This yields an overall $$O(N \lg N)$$ solution.

**Assume the tree is rooted at node $$0$$.**Without loss of generality.**For a pair of nodes $$i$$ and $$j$$, let $$c$$ denote $$lca(i,j)$$. Then $$sum(i,j) = sum(i,c) + sum(c,j) - v_c$$.**This shows that we can reduce queries on arbitrary paths to queries on "ancestor-descendant" paths.**For a pair of nodes $$i$$ and $$c$$ (where $$c$$ is an ancestor of $$i$$). Then $$sum(i,c) = sum(i,0) - sum(c,0) + v_c$$.**This shows that we can reduce*all*queries into those dealing with the root node.**Suppose we somehow "store" $$sum(i,0)$$ for all nodes $$i$$. When an "update" query comes for a specific node, we have to update exactly those nodes that are descendants of the changing node. Moreover, we increment/decrement all of these nodes by the same amount.**This can be seen by inspection. If $$v_i$$ is changing, then $$sum(x,0)$$ changes for some node $$x$$ if and only if $$x$$ is in the sub-tree of $$i$$.**Consider the Pre-Order-Traversal array of the tree. For any node $$i$$, all the descendants of $$i$$ (including $$i$$ itself) will appear consecutively in this array.**This makes the problem of "updating a sub-tree" into the problem of "updating a contiguous sub-array". The latter problem is much easier to solve.**Suppose we "relabel" the nodes according to their Pre-Order-Traversal, and we construct an array $$B$$ so that $$B[i] = sum(i,0)$$. Then to "update" a node, we simply have to update a contiguous range in $$B$$. This can be done with a Fenwick Tree.**This was the last step in the algorithm. Above we do not "relabel" the nodes, but instead of use new labelling (according to the Pre-Order-Traversal) as indices into the $$B$$ array (these are equivalent definitions).**The entire problem can be solved in $$O(\lg N)$$ time per query.**

- This problem requires a lot of prior knowledge. We needed to recognize that we should transform the tree into a rooted tree. We need to know about the Lowest Common Ancestor, Fenwick Trees, Prefix Sums, and Pre-Order-Traversal Array -- all non-trivial data structures. This was not an easy problem to solve without knowing about all of these things.
- Turning the tree into a rooted tree is an example of a problem transformation.
- Arbitrary queries $$sum(i,j)$$ were too hard to solve. Instead we repeatedly simplified our queries. By using the LCA of $$i$$ and $$j$$ (denoted by $$c$$), we could simplify $$sum(i,j)$$ into queries of the form $$sum(i,c)$$ and $$sum(j,c)$$. Taking this one step further, we were able to simplify all the queries into forms $$sum(i,0)$$, $$sum(j,0)$$ and $$sum(c,0)$$. This made our task much easier to solve.
- Although not explicitly stated above, it is helpful to clearly write down the formulas that we get after each simplification. This helps to avoid bugs later on!

**The First Rule of Tree Problems: Pick an arbitrary root!**This appears very often because "rooted" trees have much more structure and symmetry than un-rooted trees. Often-times the root doesn't matter. Sometimes the root matters, or sometimes you have to "try" all nodes as the root.**Whenever you encounter problems that relate to "paths on a rooted tree", it is very likely that the LCA data structure will be useful.**Why? Because every path from $$i$$ to $$j$$ can be partitioned into two paths, one from $$i$$ to $$c$$ and one from $$c$$ to $$j$$ where $$c$$ is the LCA of $$i$$ and $$j$$. And these kinds of paths are easier to deal with.**Whenever you encounter problems that relate to querying/updating "sub-trees of a rooted tree", it might be possible that the Pre-Order-Traversal Array will be useful!**Why? Because the Pre-Order-Traversal Array has the property that all nodes in a given sub-tree occur consecutively in the array. This makes it easier to perform range-updates, range-queries, and even "splitting" and "merging" sub-trees (i.e.: if we treat them like linked lists)! Arrays are much easier to deal with than trees (see some of the Related Problems below).

- Trees / Querying Paths on a Tree / Lowest Common Ancestor (LCA)
- Trees / Querying Sub-Trees of a Tree / DFS Ordering / Pre-Order-Traversal
- Data Structures / Range Queries / Fenwick Trees and Segment Trees
- Prefix Sums / Prefix Sums on a Tree / Sums of Ancestor Paths
- Range Queries on a Tree / Heavy-Light Decomposition (HLD)