# Binary trees¶

Let us consider a list of $n$ different real numbers and we want to find a given number in this list. If we iterate over the elements to find a particular one then it takes as much comparisons as the length of the list at most. So the necessary number of operations is a linear function of $n$, that is it can be estimated above by a constant times $n$. We express it with saying that the time complexity of this algorithm is linear, or $O(n)$ (this is the so called big O notation)

But you can do better with the binary search if the list is already sorted. The times complexity of it is $O(\log(n))$.

Binary search

• The list is sorted increasingly. Look at the middle term of the list and compare it with the searched number.
• If the middle is smaller, then the searched number must be in the second half of the list.
• In this case continue your search in this half of the list.
• If the middle is greater, then the searched number must be in the first half.
• In this case continue your search in this half of the list.
• If they are equal, then you found it.

stderr (short for standard error) is the default file descriptor where a process can write error messages. In jupyter we can separate the stderr messages from the stdout (short for standard output) messages. The background color of stdout messages is different. We will use it to trace midle term indexes. Writing to stdout we import the sys module.

This search takes at most ceiling (upper integer part) of $\log_2(n)$ steps, since the list get halved at every step.

In this algoritm the numbers have to be pre-sorted. A question: how could we manage this list if we can insert or remove some elements.

Let us write a class for storing the elements in a sorted list. The class will keep the order of the elements by inserting a new element to its correct place. (There is no problem when we remove an element.)

## Binary trees¶

A binary tree is a directed graph which has a root (a vertex with 0 indegree) and where every node have at most two outgoing edges (that is the outdegree is 0, 1 or 2 for every vertex). These 1 or 2 neighbours sometimes called child of the node.

We will implement a tree for storing numbers keeping the order in a similar way than in the previous ordered list.

This implementation keeps the tree sorted in a way that smaller numbers always go to the left and greater numbers to the right.

When we initialize a new tree, we add two children to it with the value None. If you want to insert an elements then do the following.

• If the new number is smaller then the current node then you should insert it to the left
• If the left child is None then insert a new node there.
• If the left subtree is not empty, then call the insert method to that node recursively.
• If the new number is greater than the current node, then do the same for the right subtree.
• If they are equal then you don't have to do anything because the number was already in the tree.

No we add a to_list() method which retrieves the elements in order.

• Concatenate the result of to_list on the left subtree (recursive)
• Append the current element
• Concatenate the result of to_list on the right subtree (recursive)

The result is the list of ordered elements.

       5
/ \
4   7
/
1

This is good for storing the elements ordered and also you can find and add elements with $O(\log(n))$ time complexity.

Although this technique is slow if the tree is very uneven (unbalanced). The actual time complexity is depending on the depth (height) of the tree. The tree will be very unbalanced if the element come in order.

        5
/
4
/
3
/
2
/
1

There are self-balancing binary trees (AVL trees) and you will learn about them on Algorithm theory class.

Now we implement the find(data) method, which finds the subtree with data as a root. It is quite straight forward.

       5
/ \
3   6
/ \
1   4
\
2

How could we delete a node? It is simple if the node is a leaf (has no child), or if it has one child only: delete it and substitute the child if it has at all. If it has two children, we search the smallest value greater than this node. Delete that smallest value from the right subtree, and write this value into the place of the node.

Python tutor

       5               5
/               /
3        =>     4
/ \             /
1   4           1
    5            6           8          8       8      4
/ \          / \         / \        /       /
4   8   =>   4   8   =>  4   9  =>  4   =>  4   =>
/   / \      /     \     /          /
1   6   9    1       9   1          1

## An application¶

A very useful application of binary trees are expression trees where the nodes are operations and the leaves are numbers.

If a node is an operation, then store the operation in self.data and the two branches are the two operands. The leaves are numbers without outgoing edges.

(7*(2+3))+(-3)

                 +          x0
/ \         / \
*  -3      x1   x4
/ \         / \
7   +      x2   x3
/ \          / \
2   3       x5   x6