Lecture 10 - Implementing a Tree in an Array

4. Implementing a Tree in an Array

How can we represent an arbitrary binary tree in an array? In fact, there are numerous ways to do this, we'll just look at one.

Because an array's length is fixed at compile time, if we use an array to implement a tree we have to set a limit on the number of nodes we will permit in the tree. Our strategy is to fix the maximum height of the tree (H), and make the array big enough to hold any binary tree of this height (or less). We'll need an array of size (2**H)-1. Here is the biggest binary tree of depth 3:

If we picked H=3 as our limit, then every tree we might build will be a subtree of this one - this is the key insight behind our implementation.

What we do now is assign each of nodes to a specific position in the array. This could be done any way you like, but a particular easy and useful way is:

root of the tree (A): array position 1
root's left child (B): array position 2
root's right child (C): array position 3
...
left child of node in array position K: array position 2K
right child of node in array position K: array position 2K+1

So D is in position 2*2 (4), E is in position 2*2+1 (5), F is in position 2*3 (6), G is in position 2*3+1 (7).

This figure shows the array position associated with each node:

This particular arrangement makes it easy to move from a node to its children, just double the node's index (and add 1 to go right). It also makes it easy to go from a node to its parent: the parent of node I has index (I div 2).

Using this strategy, a tree with N nodes does not necessarily occupy the first N positions in the array. For example, the tree:

Somehow we need to keep track of which array elements contain valid information. Two possibilities:

as usual, each node stores information saying which of its children exist
each array position stores information saying if it is a valid node.

However, if we restrict ourselves to complete trees, these problems go away. Because of the way we assigned nodes to positions, if there are N nodes in a complete tree, they will correspond to the first N positions of the array. So:

only need to keep track of N in order to know which array positions contain valid information.
if we add a new value, it must go in position N+1; if we delete a value, we must re-organize the tree so that the `gap' created by deleting is filled.
can traverse the tree by going through the array from first to Nth position.
```
      for(i=0;i<N;i++) { process node in position i; }
```
This gets us level-by-level traversal!
To see this `amazing' fact, look again at the earlier picture.

We have insisted that heaps be complete trees precisely so that they will have this very nice implementation in arrays. It is a useful exercise to work through the insert and delete operations for heaps in this array representation: the textbook gives code implementing INSERT and DELETE for this representation ((pp.279-80)).