[CS Basics] Trees and Tree traversal
Tree algorithms are very common in many Computer Science problems and by knowing the basic concepts and algorithms you can apply them to solve specific problems.
Let’s start to define what a Tree is. According to 1, a tree is:
[…] a data structure that stores elements hierarchically. With the exception of the top element, each element in a tree has a parent element and zero or more children elements.
Some examples of trees could be:
- Your own family tree
- Your company’s organizational chart
- A filesystem
And here a graphical example:
graph TD;
1 --> 2
1 --> 3
1 --> 4
2 --> 5
2 --> 6
3 --> 7
3 --> 8
4 --> 9;
4 --> 10;
10 --> 11;
Definitions
- Leaf: a
node
without children - Height: the number of levels until the
leaf
at the very bottom
In the previous example,
- Leafs:
5, 6, 7, 8, 9, 11
- Height:
3
Traversal
As if we had an array, we can “iterate” through the tree nodes, called tree traversal. There are two common ways to do it:
- Depth-First Search (DFS)
- Breadth-First Search (BFS)
DFS
As it name suggests, depth first search visits all the nodes from the top to the leafs, branch by branch, i.e. it selects a branch and goes deeper until reach a leaf, then it selects the next branch and so on. In the previous example, a DFS starting from 1 will visit the following nodes in order:
Iterative algorithm
This kind of visit can be implemented using a stack, performing the following actions (with the assumption you start at the root):
- Append the
root
to the stack - While the stack is not empty:
- Pop the
node
at the top of the stack - Perform any action to the
node
- Push all the
children
ofnode
in the stack (in reverse order for an ordered visits) - Repeat the process
- Pop the
Recursive algorithm
A recursive algorithm could be implemented as follows:
DFS(node):
- Perform any action to
node
- for each
node
innode.children
:DFS(node)
DFS(root)
BFS
Also known as level order traversal. Instead of exhausting branch by branch as in DFS, BFS goes through all the branches in each “step” by visiting all the nodes on the same level and goes doing that until it reaches the last level, i.e. when all the nodes on the current level are leafs.
In the previous example, BFS would visit the nodes in the following order:
Level 0: 1
Level 1: 2 -> 3 -> 4
Level 2: 5 -> 6 -> 7 -> 8 -> 9 -> 10
Level 3: 11
Algorithm
- Create an empty
queue
- Append to the
queue
theroot
- While the
queue
is not empty:- Pop the
node
at the front of thequeue
- Perform any action to the
node
- Enqueue (append) all the
children
ofnode
to the queue - Repeat the process
- Pop the
Code example
There are many ways to code a tree, depending on the type of tree you will be using you can store it in an array, using linked lists or simple chained objects.
I put an example of an object oriented approach to code a tree datastructure, using python:
from collections import namedtuple
from collections import deque
Node = namedtuple('Node', 'value, children')
def dfs(node, callback):
callback(node)
if not node.children: return
for child in node.children: dfs(child, callback)
def dfs_iterative(root, callback):
stack = [root]
while stack:
node = stack.pop()
callback(node)
if node.children:
for child in reversed(node.children): stack.append(child)
def bfs(root, callback):
queue = deque([root])
while queue:
node = queue.popleft()
callback(node)
if node.children:
for child in node.children: queue.append(child)
Instead of creating a class, I wrapped the Node data structure into a namedtuple2 and implemented the previous mentioned algorithms with the help of the Python’s collection deque API.
To see the code running I created two online notebooks:
References
-
Goodrich, M. T., & Tamassia, R. (2015). Algorithm Design and Application. Wiley. ↩
-
https://docs.python.org/3.5/library/collections.html#collections.namedtuple ↩