

SLIDE 1

Binary Search Trees

SLIDE 2

- Understand tree terminology
- Understand and implement tree traversals
- Define the binary search tree property
- Implement binary search trees
- Implement the TreeSort algorithm

October 2004 John Edgar 2

SLIDE 4

- A set of nodes (or vertices) with a single starting point
  - called the root
- Each node is connected by an edge to another node
- A tree is a connected graph
  - There is a path to every node in the tree
  - A tree has one fewer edge than it has nodes
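The two conditions above (connected, and one fewer edge than nodes) can be checked directly. The following is only a sketch under stated assumptions: the name `isTree`, the numbering of vertices as 0..n-1 and the edge-list representation are illustrative choices, not part of the slides.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper: returns true if the undirected graph with vertices
// 0..n-1 and the given edges is a tree, i.e. it has exactly n - 1 edges
// and every vertex is reachable from vertex 0.
// Assumes every edge endpoint is in the range 0..n-1.
bool isTree(int n, const std::vector<std::pair<int, int>>& edges) {
    if (n <= 0 || static_cast<int>(edges.size()) != n - 1) return false;

    // Build an adjacency list.
    std::vector<std::vector<int>> adj(n);
    for (const auto& e : edges) {
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }

    // Depth-first search from vertex 0, counting the vertices reached.
    std::vector<bool> seen(n, false);
    std::vector<int> stack{0};
    seen[0] = true;
    int reached = 1;
    while (!stack.empty()) {
        int v = stack.back();
        stack.pop_back();
        for (int w : adj[v]) {
            if (!seen[w]) {
                seen[w] = true;
                ++reached;
                stack.push_back(w);
            }
        }
    }
    return reached == n;
}
```

A graph with 5 nodes and 5 edges fails the edge count immediately, and a graph with the right number of edges but a disconnected node fails the reachability check.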

SLIDE 5

[Figure: five example graphs, each annotated with whether it is a tree]

- yes!
- NO! Not all the nodes are connected
- NO! There is an extra edge (5 nodes and 5 edges)
- yes! (but not a binary tree)
- yes! (it’s actually the same graph as the blue one)

SLIDE 6

[Figure: a tree with nodes A, B, C, D, E, F and G; labels point out the root, an edge, and the parent of B, C and D]

- Node v is said to be a child of u, and u the parent of v, if
  - There is an edge between the nodes u and v, and
  - u is above v in the tree
- This relationship can be generalized
  - E and F are descendants of A
  - D and A are ancestors of G
  - B, C and D are siblings
  - F and G are?

SLIDE 7

- A leaf is a node with no children
- A path is a sequence of nodes $v_1 \ldots v_n$
  - where $v_i$ is a parent of $v_{i+1}$ ($1 \le i \le n-1$)
- A subtree is any node in the tree along with all of its descendants
- A binary tree is a tree with at most two children per node
  - The children are referred to as left and right
  - We can also refer to left and right subtrees

SLIDE 8

[Figure: the tree with nodes A to G, highlighting the leaves (C, E, F, G), the path from A to D to G, and the subtree rooted at B]

SLIDE 9

[Figure: a binary tree with nodes A to J, marking the left subtree of A, the right child of A, and the right subtree of C]

SLIDE 10

- The height of a node v is the length of the longest path from v to a leaf
  - The height of the tree is the height of the root
- The depth of a node v is the length of the path from v to the root
  - This is also referred to as the level of a node
- Note that there is a slightly different formulation of the height of a tree
  - Where the height of a tree is said to be the number of different levels of nodes in the tree (including the root)
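As a rough illustration of these two definitions, the sketch below measures path length in edges, so a leaf has height 0 and the root has depth 0. The `Node` layout (an `int` payload and two child pointers) and the function names are assumptions made for the example; `depth` searches down from the root because the nodes are assumed to have no parent pointer.

```cpp
#include <algorithm>

// Assumed node layout for illustration: an int payload and two child pointers.
struct Node {
    int data;
    Node* leftChild = nullptr;
    Node* rightChild = nullptr;
    explicit Node(int d) : data(d) {}
};

// Height in edges: a leaf has height 0; an empty subtree is given height -1
// so that the recurrence works without special cases.
int height(const Node* n) {
    if (n == nullptr) return -1;
    return 1 + std::max(height(n->leftChild), height(n->rightChild));
}

// Depth of target: the number of edges on the path from root to target,
// or -1 if target is not in the tree rooted at root.
int depth(const Node* root, const Node* target) {
    if (root == nullptr) return -1;
    if (root == target) return 0;
    int d = depth(root->leftChild, target);
    if (d < 0) d = depth(root->rightChild, target);
    return d < 0 ? -1 : d + 1;
}
```

Under the alternative formulation mentioned above (counting levels, including the root), the height of a tree would be `height(root) + 1`.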

SLIDE 11

[Figure: the binary tree with nodes A to J, with level 1, level 2 and level 3 marked]

- height of node B is? 2
- height of the tree is? 3
- depth of node E is? 2

SLIDE 13

[Figure: two example trees, both labelled "yes!"]

However, these trees are not “beautiful” (for some applications)

SLIDE 14

- A binary tree is perfect if
  - No node has only one child
  - And all the leaves have the same depth
- A perfect binary tree of height h has how many nodes?
  - $2^{h+1} - 1$ nodes, of which $2^h$ are leaves

[Figure: a perfect binary tree with nodes A to G]

SLIDE 15

[Figure: a perfect binary tree of height 3 whose 15 nodes are labelled by level and position: 01; 11, 12; 21 to 24; 31 to 38]

- Each level doubles the number of nodes
  - The root level (level 0) has 1 node
  - Level 1 has 2 nodes ($2^1$)
  - Level 2 has 4 nodes ($2^2$), or 2 times the number in Level 1
- Therefore a perfect tree of height h (levels 0 to h) has $2^{h+1} - 1$ nodes
- The bottom level has $2^h$ nodes, that is, just over ½ the nodes are leaves
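The node count can be checked by summing the levels. The derivation below is a short sketch that assumes the root is on level 0 and the leaves are on level h, which is the convention the formula on the slides implies.

```latex
\[
  \underbrace{2^0 + 2^1 + \cdots + 2^h}_{\text{levels } 0 \ldots h}
  = \sum_{i=0}^{h} 2^i
  = 2^{h+1} - 1,
  \qquad
  \frac{\text{leaves}}{\text{nodes}}
  = \frac{2^h}{2^{h+1} - 1} > \frac{1}{2}.
\]
```

For example, with h = 3 the tree has 1 + 2 + 4 + 8 = 15 nodes, of which 8 are leaves.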

SLIDE 16

- A binary tree is complete if
  - The leaves are on at most two different levels,
  - The second to bottom level is completely filled in, and
  - The leaves on the bottom level are as far to the left as possible
- Perfect trees are also complete

[Figure: a complete binary tree with nodes A to F]
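One common way to test the completeness conditions is a level-order (breadth-first) scan: in a complete tree, once an empty child position has been seen, no further node may appear. This is a sketch of that technique rather than anything taken from the slides; the `Node` layout and the name `isComplete` are assumptions.

```cpp
#include <queue>

// Assumed node layout for illustration.
struct Node {
    int data;
    Node* leftChild = nullptr;
    Node* rightChild = nullptr;
    explicit Node(int d) : data(d) {}
};

// Level-order scan that also enqueues empty (null) child positions.
// The tree is complete exactly when no node follows the first gap.
bool isComplete(const Node* root) {
    std::queue<const Node*> q;
    q.push(root);
    bool seenGap = false;
    while (!q.empty()) {
        const Node* n = q.front();
        q.pop();
        if (n == nullptr) {
            seenGap = true;               // an empty position in level order
        } else {
            if (seenGap) return false;    // a node after a gap: not complete
            q.push(n->leftChild);
            q.push(n->rightChild);
        }
    }
    return true;
}
```

The sketch treats an empty tree as complete, which is a convention chosen here for simplicity.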

SLIDE 17

- A binary tree is balanced if
  - Leaves are all about the same distance from the root
  - The exact specification varies
- Sometimes trees are balanced by comparing the height of nodes
  - e.g. the height of a node’s right subtree is at most one different from the height of its left subtree
- Sometimes a tree's height is compared to the number of nodes
  - e.g. red-black trees
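The height-comparison rule (left and right subtree heights differ by at most one at every node) can be sketched as below. Since the exact specification varies, this covers only that one definition; the `Node` layout, the sentinel `kUnbalanced` and the function names are assumptions for the example.

```cpp
#include <algorithm>
#include <cstdlib>

// Assumed node layout for illustration.
struct Node {
    int data;
    Node* leftChild = nullptr;
    Node* rightChild = nullptr;
    explicit Node(int d) : data(d) {}
};

// Sentinel meaning "some node below violates the height rule".
const int kUnbalanced = -2;

// Returns the height (in edges) of the subtree rooted at n, or kUnbalanced
// if any node in it has subtrees whose heights differ by more than one.
int checkedHeight(const Node* n) {
    if (n == nullptr) return -1;
    int lh = checkedHeight(n->leftChild);
    if (lh == kUnbalanced) return kUnbalanced;
    int rh = checkedHeight(n->rightChild);
    if (rh == kUnbalanced) return kUnbalanced;
    if (std::abs(lh - rh) > 1) return kUnbalanced;
    return 1 + std::max(lh, rh);
}

bool isBalanced(const Node* root) {
    return checkedHeight(root) != kUnbalanced;
}
```

Computing the height and the check in one pass keeps the cost linear in the number of nodes.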

SLIDE 18

[Figure: two example trees, one with nodes A to F and one with nodes A to G]

SLIDE 19

[Figure: two example trees, one with nodes A to D and one with nodes A to F]

SLIDE 21

- A traversal algorithm for a binary tree visits each node in the tree
  - Typically, it will do something while visiting each node!
- Traversal algorithms are naturally recursive
- There are three traversal methods
  - Inorder
  - Preorder
  - Postorder

SLIDE 22

C++

// InOrder traversal algorithm
void inOrder(Node *n) {
    if (n != nullptr) {
        inOrder(n->leftChild);   // traverse the left subtree
        visit(n);                // then process this node
        inOrder(n->rightChild);  // then traverse the right subtree
    }
}

SLIDE 23

InOrder Traversal

[Figure: an example binary tree with nodes A to F]

SLIDE 24

C++

// PreOrder traversal algorithm
void preOrder(Node *n) {
    if (n != nullptr) {
        visit(n);                 // process this node first
        preOrder(n->leftChild);   // then traverse the left subtree
        preOrder(n->rightChild);  // then traverse the right subtree
    }
}

SLIDE 25

[Figure: a preorder traversal traced on a seven-node tree; at each node the steps are visit(n), preOrder(n->leftChild), preOrder(n->rightChild)]

SLIDE 26

C++

// PostOrder traversal algorithm
void postOrder(Node *n) {
    if (n != nullptr) {
        postOrder(n->leftChild);   // traverse the left subtree
        postOrder(n->rightChild);  // then traverse the right subtree
        visit(n);                  // process this node last
    }
}
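To compare the three orders side by side, the sketch below runs them on one small tree. It assumes that visiting a node means appending its one-character label to a string, and the five-node tree used in the usage note is made up for the example rather than taken from the slides.

```cpp
#include <string>

// Assumed node layout for illustration: a one-character label and two children.
struct Node {
    char data;
    Node* leftChild = nullptr;
    Node* rightChild = nullptr;
    explicit Node(char d) : data(d) {}
};

// "Visiting" a node is assumed to mean appending its label to out.
void inOrder(const Node* n, std::string& out) {
    if (n != nullptr) {
        inOrder(n->leftChild, out);
        out += n->data;
        inOrder(n->rightChild, out);
    }
}

void preOrder(const Node* n, std::string& out) {
    if (n != nullptr) {
        out += n->data;
        preOrder(n->leftChild, out);
        preOrder(n->rightChild, out);
    }
}

void postOrder(const Node* n, std::string& out) {
    if (n != nullptr) {
        postOrder(n->leftChild, out);
        postOrder(n->rightChild, out);
        out += n->data;
    }
}
```

For a tree with root A, children B and C, and D and E as the children of B, the three functions produce DBEAC, ABDEC and DEBCA respectively.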

SLIDE 27

[Figure: a postorder traversal traced on a seven-node tree; at each node the steps are postOrder(n->leftChild), postOrder(n->rightChild), visit(n)]

SLIDE 29

- The binary tree can be implemented using a number of data structures
  - Reference structures (similar to linked lists)
  - Arrays
- We will look at three implementations
  - Binary search trees (reference / pointers)
  - Red-black trees (reference / pointers)
  - Heap (arrays)

SLIDE 30

- Consider maintaining data in some order
  - The data is to be frequently searched on the sort key, e.g. a dictionary
- Possible solutions might be:
  - A sorted array
    - Access in O(log n) using binary search
    - Insertion and deletion in linear time
  - An ordered linked list
    - Access, insertion and deletion in linear time
  - Neither of these is efficient for all of access, insertion and deletion

SLIDE 31

- The data structure should be able to perform all these operations efficiently
  - Create an empty dictionary
  - Insert
  - Delete
  - Look up
- The insert, delete and look up operations should be performed in at most O(log n) time

SLIDE 32

- A binary search tree (BST) is a binary tree with a special property
  - For all nodes in the tree:
    - All nodes in a left subtree have labels less than the label of the node
    - All nodes in a right subtree have labels greater than or equal to the label of the node
- Binary search trees are fully ordered
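The property applies to whole subtrees, not only to a node's immediate children, so a checker has to carry bounds down the tree. The sketch below is one way to do that, assuming `int` labels and the convention stated above (left strictly less, right greater than or equal); the names `isBST` and `isBSTInRange` are hypothetical.

```cpp
// Assumed node layout for illustration.
struct Node {
    int data;
    Node* leftChild = nullptr;
    Node* rightChild = nullptr;
    explicit Node(int d) : data(d) {}
};

// Checks that every label in the subtree satisfies low <= label < high.
// A null bound means "no limit on that side".
bool isBSTInRange(const Node* n, const int* low, const int* high) {
    if (n == nullptr) return true;
    if (low != nullptr && n->data < *low) return false;
    if (high != nullptr && n->data >= *high) return false;
    // Left subtree: labels must be less than this node's label.
    // Right subtree: labels must be greater than or equal to it.
    return isBSTInRange(n->leftChild, low, &n->data) &&
           isBSTInRange(n->rightChild, &n->data, high);
}

bool isBST(const Node* root) {
    return isBSTInRange(root, nullptr, nullptr);
}
```

Comparing each node only with its two children would wrongly accept a tree where, say, 8 sits in the left subtree of 7 as the right child of 4.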

SLIDE 34

[Figure: an inorder traversal traced on a seven-node binary search tree; at each node the steps are inOrder(n->leftChild), visit(n), inOrder(n->rightChild)]

An inorder traversal of a BST retrieves the data in sorted order

SLIDE 35

- Binary search trees can be implemented using a reference structure
- Tree nodes contain data and two pointers to nodes

[Figure: a node with the fields Node *leftChild and Node *rightChild (pointers to Nodes) and data (the data to be stored in the tree)]

SLIDE 36

- To find a value in a BST search from the root node:
  - If the target is less than the value in the node search its left subtree
  - If the target is greater than the value in the node search its right subtree
  - Otherwise return true, or return data, etc.
  - If the subtree to be searched is empty, the target is not in the tree
- How many comparisons?
  - One for each node on the path
  - Worst case: height of the tree + 1
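The search steps above translate almost line for line into a loop. This is a minimal sketch assuming `int` data and the node layout from the earlier slide (data plus `leftChild` and `rightChild` pointers); returning a pointer to the node, or `nullptr` when the target is absent, is a choice made here.

```cpp
// Assumed node layout for illustration, following the earlier slide:
// data plus two pointers to child nodes.
struct Node {
    int data;
    Node* leftChild = nullptr;
    Node* rightChild = nullptr;
    explicit Node(int d) : data(d) {}
};

// Returns the node holding target, or nullptr if target is not in the tree.
// One node is examined per level, so at most height + 1 nodes are visited.
const Node* search(const Node* root, int target) {
    const Node* n = root;
    while (n != nullptr && n->data != target) {
        n = (target < n->data) ? n->leftChild : n->rightChild;
    }
    return n;
}
```

An iterative loop is used instead of recursion only to keep the sketch short; a recursive version visits the same nodes.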

SLIDE 37

- The BST property must hold after insertion
- Therefore the new node must be inserted in the correct position
  - This position is found by performing a search
  - If the search ends at the (null) left child of a node make its left child refer to the new node
  - If the search ends at the (null) right child of a node make its right child refer to the new node
- The cost is about the same as the cost for the search algorithm, O(height)
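A possible shape for the insertion steps is sketched below, assuming `int` data and sending values equal to a node's value to the right, to match the "greater than or equal" rule of the BST property. The function name `insert` and its return-the-root convention are assumptions; freeing the allocated nodes is deliberately left out.

```cpp
// Assumed node layout for illustration.
struct Node {
    int data;
    Node* leftChild = nullptr;
    Node* rightChild = nullptr;
    explicit Node(int d) : data(d) {}
};

// Inserts value and returns the (possibly new) root of the tree.
// The caller owns the nodes; this sketch never frees them.
Node* insert(Node* root, int value) {
    Node* newNode = new Node(value);        // create new node
    if (root == nullptr) return newNode;    // empty tree: new node is the root

    Node* n = root;
    while (true) {                          // find position
        if (value < n->data) {
            if (n->leftChild == nullptr) {  // search ended at a null left child
                n->leftChild = newNode;
                break;
            }
            n = n->leftChild;
        } else {
            if (n->rightChild == nullptr) { // search ended at a null right child
                n->rightChild = newNode;
                break;
            }
            n = n->rightChild;
        }
    }
    return root;
}
```

Typical use is `root = insert(root, 43);`, starting from `Node* root = nullptr;` for an empty tree.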

SLIDE 38

[Figure: inserting 43 into an example BST rooted at 47 that holds the values 7, 10, 12, 19, 23, 30, 32, 37, 41, 44, 47, 53, 54, 57, 59, 63, 79, 91, 96 and 97]

- create new node
- find position
- insert new node

SLIDE 39

- After deletion the BST property must hold
- Deletion is not as straightforward as search or insertion
  - So much so that sometimes it is not even implemented!
  - Instead, deleted nodes are marked as deleted in some way
- There are a number of different cases that must be considered

SLIDE 40

- The node to be deleted has no children
- The node to be deleted has one child
- The node to be deleted has two children

SLIDE 41

- The node to be deleted has no children
  - Remove it (assigning null to its parent’s reference)

SLIDE 42

[Figure: deleting the leaf 30 from the example BST rooted at 47]

SLIDE 43

- The node to be deleted has one child
  - Replace the node with its subtree

SLIDE 44

[Figure: deleting 79 from the example BST rooted at 47: replace with subtree]

SLIDE 45

[Figure: the example BST after deleting 79]

SLIDE 46

- The node to be deleted has two children
  - Replace the node with its successor, the left most node of its right subtree
    - It is also possible to replace the node with its predecessor, the right most node of its left subtree
  - If that node has a child (and it can have at most one child) attach it to the node’s parent
    - Why can a predecessor or successor have at most one child?
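The three deletion cases can be sketched in one recursive function. Note the swapped-in technique: rather than relinking the successor node into the target's place as the slides describe, this sketch copies the successor's value into the target node and then removes the successor from the right subtree, which yields the same set of values with less pointer surgery. The name `removeValue`, the `int` data and the assumption that every node was allocated with `new` are illustrative.

```cpp
// Assumed node layout for illustration; nodes are assumed to be heap-allocated.
struct Node {
    int data;
    Node* leftChild = nullptr;
    Node* rightChild = nullptr;
    explicit Node(int d) : data(d) {}
};

// Removes one node holding target (if any) and returns the new subtree root.
Node* removeValue(Node* root, int target) {
    if (root == nullptr) return nullptr;               // target not found
    if (target < root->data) {
        root->leftChild = removeValue(root->leftChild, target);
    } else if (target > root->data) {
        root->rightChild = removeValue(root->rightChild, target);
    } else if (root->leftChild != nullptr && root->rightChild != nullptr) {
        // Two children: find the successor, the leftmost node of the right subtree.
        Node* succ = root->rightChild;
        while (succ->leftChild != nullptr) succ = succ->leftChild;
        root->data = succ->data;                       // copy the successor's value up
        root->rightChild = removeValue(root->rightChild, root->data);
    } else {
        // No children or one child: replace the node with its (possibly empty) subtree.
        Node* child = (root->leftChild != nullptr) ? root->leftChild
                                                   : root->rightChild;
        delete root;
        return child;
    }
    return root;
}
```

The successor has no left child by construction, so the recursive call that removes it always lands in the zero-or-one-child case.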

SLIDE 47

[Figure: deleting 32 from the example BST rooted at 47; temp refers to the successor, 37]

- find successor and detach

SLIDE 48

[Figure: deleting 32, continued]

- find successor
- attach target node’s children to successor

SLIDE 49

[Figure: deleting 32, continued]

- find successor
- attach target’s children to successor
- make successor child of target’s parent

SLIDE 50

[Figure: the example BST after deleting 32, with 37 in its place]

note: successor had no subtree

SLIDE 51

[Figure: deleting 63 from the example BST rooted at 47; temp refers to the predecessor, 59]

- find predecessor*: note it has a subtree

*predecessor used instead of successor to show its location; an implementation would have to pick one or the other

SLIDE 52

[Figure: deleting 63, continued]

- find predecessor
- attach predecessor’s subtree to its parent

SLIDE 53

[Figure: deleting 63, continued]

- find predecessor
- attach subtree
- attach target’s children to predecessor

SLIDE 54

[Figure: deleting 63, continued]

- find predecessor
- attach subtree
- attach children
- attach predecessor to target’s parent

SLIDE 55

[Figure: the example BST after deleting 63, with 59 in its place]

SLIDE 56

- The efficiency of BST operations depends on the height of the tree
- All three operations (search, insert and delete) are O(height)
- If the tree is complete the height is $\lfloor \log_2 n \rfloor$
- What if it isn’t complete?

SLIDE 57

- Insert 7
- Insert 4
- Insert 1
- Insert 9
- Insert 5
- It’s a complete tree!

[Figure: the resulting tree, with root 7, children 4 and 9, and 1 and 5 as the children of 4]

height = $\lfloor \log_2 5 \rfloor$ = 2

SLIDE 58

- Insert 9
- Insert 1
- Insert 7
- Insert 4
- Insert 5
- It’s a linked list with a lot of extra pointers!

[Figure: the resulting tree, a single chain 9, 1, 7, 4, 5 in which every node has at most one child]

height = n - 1 = 4 = O(n)
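The effect of insertion order on height can be reproduced with a few lines of code. This sketch uses `std::unique_ptr` for the child links so that the tree frees itself, which differs from the raw pointers on the slides; the names `insert`, `height` and `heightAfterInserting` are assumptions, and height is counted in edges.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Assumed node layout for illustration; unique_ptr children free themselves.
struct Node {
    int data;
    std::unique_ptr<Node> leftChild;
    std::unique_ptr<Node> rightChild;
    explicit Node(int d) : data(d) {}
};

// Plain (unbalanced) BST insertion; equal values go to the right.
void insert(std::unique_ptr<Node>& root, int value) {
    if (!root) {
        root.reset(new Node(value));
        return;
    }
    insert(value < root->data ? root->leftChild : root->rightChild, value);
}

// Height in edges; an empty tree has height -1.
int height(const std::unique_ptr<Node>& n) {
    if (!n) return -1;
    return 1 + std::max(height(n->leftChild), height(n->rightChild));
}

// Builds a BST by inserting the values in the given order and returns its height.
int heightAfterInserting(const std::vector<int>& values) {
    std::unique_ptr<Node> root;
    for (int v : values) insert(root, v);
    return height(root);
}
```

With the two insertion orders above, `heightAfterInserting({7, 4, 1, 9, 5})` gives 2 and `heightAfterInserting({9, 1, 7, 4, 5})` gives 4, even though both trees hold the same five values.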

SLIDE 59

- It would be ideal if a BST was always close to complete
  - i.e. balanced
- How do we guarantee a balanced BST?
  - We have to make the insertion and deletion algorithms more complex
    - e.g. red-black trees.

SLIDE 60

- It is possible to sort an array using a binary search tree
  - Insert the array items into an empty tree
  - Write the data from the tree back into the array using an InOrder traversal
- Running time = n * (insertion cost) + traversal
  - Insertion cost is O(h)
  - Traversal is O(n)
  - Total = O(n) * O(h) + O(n), i.e. O(n * h)
  - If the tree is balanced = O(n * log(n))
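The two steps of the sort can be sketched as follows. The sketch uses a plain, unbalanced BST, so it only achieves O(n * log(n)) when the input order happens to produce a balanced tree; the `std::unique_ptr` links and the names `insert`, `writeInOrder` and `treeSort` are assumptions made for the example.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Assumed node layout for illustration; unique_ptr children free themselves.
struct Node {
    int data;
    std::unique_ptr<Node> leftChild;
    std::unique_ptr<Node> rightChild;
    explicit Node(int d) : data(d) {}
};

// Plain (unbalanced) BST insertion; equal values go to the right.
void insert(std::unique_ptr<Node>& root, int value) {
    if (!root) {
        root.reset(new Node(value));
        return;
    }
    insert(value < root->data ? root->leftChild : root->rightChild, value);
}

// Inorder traversal that writes each value into out at position next.
void writeInOrder(const Node* n, std::vector<int>& out, std::size_t& next) {
    if (n == nullptr) return;
    writeInOrder(n->leftChild.get(), out, next);
    out[next++] = n->data;
    writeInOrder(n->rightChild.get(), out, next);
}

// Sorts items in ascending order: insert everything into an empty tree,
// then copy the values back with an inorder traversal.
void treeSort(std::vector<int>& items) {
    std::unique_ptr<Node> root;
    for (int v : items) insert(root, v);
    std::size_t next = 0;
    writeInOrder(root.get(), items, next);
}
```

On already sorted input this version degenerates to the chain-shaped tree, so the total cost becomes O(n * n).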

SLIDE 62

Tree Quiz I

- Write a recursive function to print the items in a BST in descending order

class Node {
public:
    int data;
    Node *leftc;
    Node *rightc;
};

SLIDE 63

Tree Quiz II

- Write a recursive function to delete a BST stored in dynamic memory

class Node {
public:
    int data;
    Node *leftc;
    Node *rightc;
};

SLIDE 65

Summary

- Trees
  - Terminology: paths, height, node relationships, …
- Binary search trees
  - Traversal
    - Post-order, pre-order, in-order
  - Operations
    - Insert, delete, search
- Balanced trees
  - Binary search tree operations are efficient for balanced trees

SLIDE 66

Readings

- Carrano Ch. 10