Implementing a Binary Tree

Article
07/12/2006

The following sections discuss in some detail how the code that implements the binary tree class operates. You'll find the code for this section in TREE.CLS, TREEITEM.CLS, and TREETEST.BAS.

The TreeItem Class

As with the structure items in the previous sections, the TreeItem class is simple. It includes just the three necessary data items: the value to be stored at the current node, the pointer to the left child node, and the pointer to the right child node.

Public Value As Variant
Public LeftChild As TreeItem
Public RightChild As TreeItem

Of course, there's nothing stopping you from storing more information in the TreeItem class. For example, you may need to write a program that can parse a text file, create a binary tree containing all the distinct words in the file, and store each word in its own node, along with a list of all the page numbers on which that word occurred. In this case, you might want to store a pointer to a linked list in the TreeItem class, along with the text item. That linked list could store the list of all the page numbers on which the word was found.

The Tree Class

As with the previous data structures, the base Tree class stores the bulk of the code required to make the data structure work. The class contains but a single data item:

Dim tiHead As TreeItem

As with the other data structures, tiHead is an anchor for the entire data structure. It points to the first item in the binary tree, and from there, the items point to other items.

In addition, the Tree class module contains two module variables:

' These private variables are used when
' adding new nodes.
Private mfAddDupes As Boolean
Private mvarItemToAdd As Variant

The method that adds items to the binary tree uses these global variables. If they weren't global, the code would have to pass them as parameters to the appropriate methods. What's wrong with that? Because the Add method is recursive, the procedure might call itself many times. Each call takes up memory that isn't released until the entire procedure has completed. If your tree is very deep, you could eat up a large chunk of stack space adding a new item. To avoid that issue, the Tree class doesn't pass these values as parameters; it just makes them available to all the procedures, no matter where they're called.

Adding a New Item

When adding items to a binary tree, you may or may not want to add an item if its value already appears in the data structure. To make it easy to distinguish between those two cases, the Tree class contains two separate methods, Add and AddUnique, shown in Listing 6.13. Each of the methods ends up calling the AddNode procedure, shown in Listing 6.14.

Listing 6.13: The Tree Class Provides Two Ways to Add New Items

Public Sub Add(varNewItem As Variant)
    ' Add a new node, allowing duplicates.
    ' Use module variables to place as little as
    ' possible on the stack in recursive procedure calls.
    mfAddDupes = True
    mvarItemToAdd = varNewItem
    Call AddNode(tiHead)
End Sub
Public Sub AddUnique(varNewItem As Variant)
    ' Add a new node, skipping duplicate values.
    ' Use module variables to place as little as
    ' possible on the stack in recursive procedure calls.
    mfAddDupes = False
    mvarItemToAdd = varNewItem
    Call AddNode(tiHead)
End Sub

The recursive AddNode procedure adds a new node to the binary tree pointed to by the TreeItem pointer it receives as a parameter. Once you get past the recursive nature of the procedure, the code is trivial:

If the TreeItem pointer, ti, is Nothing, it sets the pointer to a new TreeItem and sticks the value into that new node.
If the pointer isn't Nothing, then:
- If the new value is less than the value in ti*,* the code calls AddNode with the left child pointer of the current node.
- If the new value is greater than the value in ti, the code calls AddNode with the right child pointer of the current node.
- If the new value is equal to the current value, then, if you've instructed the code to add duplicates, the code arbitrarily calls AddNode with the right child pointer. (You could use the left instead, if you wanted.) If you don't want to add duplicates, the procedure just returns.
- Sooner or later, after calling AddNode for each successive child node, the code will find a pointer that is Nothing, at which point it takes the action in the first step. Because nothing follows the recursive call to AddNode in the procedure, after each successive layer has finished processing, the code just works its way back up the list of calls.

Listing 6.14: The Recursive AddNode Procedure Adds a New Node to the Tree

Private Function AddNode(ti As TreeItem) As TreeItem
    ' Add a node to the tree pointed to by ti.
    ' Module variables used:
    '    mvarItemToAdd: the value to add to the tree.
    '    mfAddDupes: Boolean indicating whether to add items
    '      that already exist or to skip them.
    If ti Is Nothing Then
        Set ti = New TreeItem
        ti.Value = mvarItemToAdd
    Else
        If mvarItemToAdd < ti.Value Then
            Set ti.LeftChild = AddNode(ti.LeftChild)
        ElseIf mvarItemToAdd > ti.Value Then
            Set ti.RightChild = AddNode(ti.RightChild)
        Else
            ' mvarItemToAdd = ti.Value
            ' You're adding a node that already exists.
            ' You could add it to the left or to the right,
            ' but this code arbitrarily adds it to the right.
            If mfAddDupes Then
                Set ti.RightChild = AddNode(ti.RightChild)
            End If
        End If
    End If
End Sub

Adding a New Node: Walking the Code

Suppose you were to try adding a new node to the tree shown in Figure 6.35 with the value “m”. Table 6.2 outlines the process involved in getting the node added. (This discussion assumes that the class module's tiHead member points to the tree shown in Figure 6.35.) For each step, the table includes, in column 1, the recursion level—that is, the number of times the procedure has called itself.

Figure 6.35: Revisiting the alphabetic tree, attempting to add a new node

Table 6.2: Recursive Steps to Add “m” to the Sample Tree

Level	Action
0	You call the Add method, passing the value “m”
0	The Add method sets fAddDupes to True and sets varNewItem to the value “m”. It then calls the AddNode method, passing the pointer to the first item in the tree (a node with the value “f”, in this case) [Call to Level 1]
1	AddNode checks to see whether ti is Nothing. It's not (It points to the node containing “f”.)
1	Because “m” is greater then “f”, AddNode calls itself, passing the right child pointer of the node ti currently points to (That is, it passes a pointer to the node containing “i”.) [Call to Level 2]
2	AddNode checks to see whether ti is Nothing. It's not (It points to the node containing “i”.)
2	Because “m” is greater then “i”, AddNode calls itself, passing the right child pointer of the node ti currently points to (That is, it passes a pointer to the node containing “k”.) [Call to Level 3]
3	AddNode checks to see whether ti is Nothing. It's not (It points to the node containing “k”.)
3	Because “m” is greater then “k”, AddNode calls itself, passing the right child pointer of the node ti currently points to (that is, the right child pointer of the node containing “k”, which is Nothing) [Call to Level 4]
4	AddNode checks to see whether ti is Nothing. It is, so it creates a new node, sets the pointer passed to it (the right child of the node containing “k”) to point to the new node, and returns
4	There's nothing else to do, so the code returns [Return to Level 3]
3	There's nothing else to do, so the code returns [Return to Level 2]
2	There's nothing else to do, so the code returns [Return to Level 1]
1	The code returns back to the original caller

Traversing the Tree

As mentioned earlier in this discussion, there are three standard methods for traversing a tree: inorder, preorder, and postorder. Because of the recursive nature of these actions, the code for each is simple; it is shown in Listing 6.15. The class provides three public methods (WalkInOrder, WalkPreOrder, WalkPostOrder). Each calls a private procedure, passing to it the pointer to the head of the tree. From then on, each of the private procedures follows the prescribed order in visiting nodes in the tree.

Of course, in your own applications, you'll want to do something with each node besides print its value to the Debug window. In that case, modify the three private procedures to do what you need done with each node of your tree.

Listing 6.15: Because of Recursion, the Code to Traverse the Tree Is Simple

Public Sub WalkInOrder()
    Call InOrder(tiHead)
End Sub
Public Sub WalkPreOrder()
    Call PreOrder(tiHead)
End Sub
Public Sub WalkPostOrder()
    Call PostOrder(tiHead)
End Sub
Private Sub InOrder(ti As TreeItem)
    If Not ti Is Nothing Then
        Call InOrder(ti.LeftChild)
        Debug.Print ti.Value; " ";
        Call InOrder(ti.RightChild)
    End If
End Sub
Private Sub PreOrder(ti As TreeItem)
    If Not ti Is Nothing Then
        Debug.Print ti.Value; " ";
        Call PreOrder(ti.LeftChild)
        Call PreOrder(ti.RightChild)
    End If
End Sub
Private Sub PostOrder(ti As TreeItem)
    If Not ti Is Nothing Then
        Call PostOrder(ti.LeftChild)
        Call PostOrder(ti.RightChild)
        Debug.Print ti.Value; " ";
    End If
End Sub

Share via