Data Structures: Balanced Tree



SLIDE 1

CADSL

Data Structures Balanced Tree

Virendra Singh

Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay

http://www.ee.iitb.ac.in/~viren/ E-mail: viren@ee.iitb.ac.in

EE-717/453: Advanced Computing for Electrical Engineers

Lecture 7 (12 Aug 2013)

SLIDE 2

Binary Search Tree - Analysis

  • The running time of these operations is O(d), where d is the depth of the node containing the accessed item.
  • What is the average depth of the nodes in a binary search tree? It depends on how well balanced the tree is.
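A minimal Python sketch of why an access costs O(d): the search loop below moves one level down per comparison, so the number of iterations equals the depth of the node it reaches. The `Node` class, the `insert` and `find_depth` helpers, and the sample keys are illustrative assumptions, not taken from the lecture.

```python
class Node:
    """A binary search tree node (hypothetical helper for this sketch)."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key into the tree rooted at root and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # duplicate keys are ignored

def find_depth(root, key):
    """Return the depth of key (the root has depth 0), or None if absent."""
    node, depth = root, 0
    while node is not None:
        if key == node.key:
            return depth
        node = node.left if key < node.key else node.right
        depth += 1  # one level further down per comparison
    return None

root = None
for k in [50, 30, 70, 60]:  # sample keys, chosen only for illustration
    root = insert(root, k)
print(find_depth(root, 50))  # 0: the root
print(find_depth(root, 60))  # 2: reached via 50 -> 70 -> 60
print(find_depth(root, 99))  # None: not in the tree
```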

12 Aug 2012 EE-717/EE-453@IITB

SLIDE 3

Average Depth of Nodes

          10
        /    \
       5      20
      / \    /  \
     1   8  13   34

Consider this very well-balanced binary search tree. What is the depth of its leaf nodes? N=7

Data Order: 10, 5, 1, 8, 20, 13, 34
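The question on this slide can be checked with a short Python sketch that inserts the keys in the stated data order and lists the depth of every leaf. The `Node`, `insert`, and `leaf_depths` names are assumptions made for this example.

```python
import math

class Node:
    """A binary search tree node (hypothetical helper for this sketch)."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key into the tree rooted at root and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def leaf_depths(node, depth=0):
    """List the depths of all leaves, from left to right."""
    if node is None:
        return []
    if node.left is None and node.right is None:
        return [depth]
    return leaf_depths(node.left, depth + 1) + leaf_depths(node.right, depth + 1)

root = None
for k in [10, 5, 1, 8, 20, 13, 34]:  # the data order given on the slide
    root = insert(root, k)
print(leaf_depths(root))         # [2, 2, 2, 2]
print(math.floor(math.log2(7)))  # 2
```

With N = 7 every leaf sits at depth 2, which is floor(log2 N).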

SLIDE 4

A Better Analysis

  • The analysis on the previous slide was for a particularly well-balanced binary search tree. However, not all binary search trees will be this well balanced.
  • In particular, binary search trees are created via insertions of data. Depending on the order of the data, various trees will emerge.

SLIDE 5

Effect of Data Order

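[Figure: a left-leaning chain of the nodes 4, 3, 2, 1] Obtained if data is 4, 3, 2, 1

[Figure: a right-leaning chain of the nodes 1, 2, 3, 4] Obtained if data is 1, 2, 3, 4

Note that in these cases the average depth of the nodes is about N/2, not log(N)!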
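The effect of data order can be reproduced with a small Python sketch. The helper `average_depth` and its list-based node layout are assumptions made for this example; it inserts the keys in the given order and returns the mean depth of the nodes. For a chain of N nodes the depths are 0, 1, ..., N-1, so the mean is (N-1)/2.

```python
def average_depth(keys):
    """Insert distinct keys in order into an empty BST; return the mean node depth."""
    root = None  # each node is a list [key, left, right]
    total = 0
    for key in keys:
        if root is None:
            root = [key, None, None]
            continue  # the root has depth 0
        node, depth = root, 0
        while True:
            side = 1 if key < node[0] else 2  # 1 = left child, 2 = right child
            depth += 1
            if node[side] is None:
                node[side] = [key, None, None]
                break
            node = node[side]
        total += depth
    return total / len(keys)

print(average_depth([4, 3, 2, 1]))  # 1.5: left-leaning chain
print(average_depth([1, 2, 3, 4]))  # 1.5: right-leaning chain
print(average_depth([2, 1, 3, 4]))  # 1.0: a better insertion order
```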

SLIDE 6

Depth of Nodes

  • In the best case, the depth will be O(log N).
  • In the worst case, if the data are already ordered, the depth will be O(N).

SLIDE 7

Effects of Data Order…

  • So, if the input data are randomly ordered, what is the average depth of the nodes?
  • The analysis is beyond the scope of this lecture, but it can be shown that the average depth is O(log N), which is a very nice result.
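The claim can be explored empirically. The Python sketch below is an experiment, not a proof: it compares the average node depth for already-ordered input with the average over several random insertion orders. The tree size, the number of trials, the seed, and the helper `average_depth` are all arbitrary choices made for this example.

```python
import math
import random

def average_depth(keys):
    """Insert distinct keys in order into an empty BST; return the mean node depth."""
    root = None  # each node is a list [key, left, right]
    total = 0
    for key in keys:
        if root is None:
            root = [key, None, None]
            continue
        node, depth = root, 0
        while True:
            side = 1 if key < node[0] else 2  # 1 = left child, 2 = right child
            depth += 1
            if node[side] is None:
                node[side] = [key, None, None]
                break
            node = node[side]
        total += depth
    return total / len(keys)

N = 1023
keys = list(range(N))
sorted_avg = average_depth(keys)  # already ordered input builds a chain

rng = random.Random(0)            # fixed seed so the run is repeatable
trials = []
for _ in range(20):
    rng.shuffle(keys)
    trials.append(average_depth(keys))
random_avg = sum(trials) / len(trials)

print(sorted_avg)                 # 511.0, i.e. (N - 1) / 2
print(random_avg, math.log2(N))   # compare the random-order average with log2(N)
```

The random-order average should come out as a small multiple of log2(N), far below the 511 of the ordered case.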

SLIDE 8

Binary Search Tree

  • For a binary search tree built from randomly ordered data, the average depth of the nodes is O(log N). This is quite remarkable: it indicates that the bad situations, which are O(N), don’t occur very often.
  • However, for those who are still concerned about the very bad situations, we can try to “balance” the trees.

SLIDE 9

Balancing Trees

  • What does it mean to “balance” trees? The basic idea is to make sure that the trees aren’t right-heavy or left-heavy.
  • When they are right-heavy or left-heavy, the trees need to be adjusted.

SLIDE 10

Example

[Figure: a right-heavy binary search tree with seven nodes]

Here is a right-heavy tree. How can we adjust it to be more balanced?

SLIDE 11

Rule #1

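[Figure: a binary search tree with seven nodes illustrating rule #1]

Rule #1: Require that the left and right subtrees of the root node have the same height. This constrains only the root, so the subtrees themselves may still be badly unbalanced. We can do better.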

SLIDE 12

Rule #2

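[Figure: a binary search tree with seven nodes illustrating rule #2]

Rule #2: Require that every node have left and right subtrees of the same height. Too restrictive: only perfectly balanced trees, with 2^k - 1 nodes, satisfy it.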

SLIDE 13

Rule #3

[Figure, left: a binary search tree that satisfies rule #3]

Rule #3: Require that, for every node, the heights of the left and right subtrees differ by at most one. The example on the left satisfies rule #3, while the one on the right does not. Why? This rule is a nice compromise between too lax and too restrictive.

[Figure, right: a binary search tree in which rule #3 is violated at node 4]
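Rule #3 can be checked mechanically. The Python sketch below computes subtree heights bottom-up and reports whether every node's two subtrees differ in height by at most one. The `Node` class, the function names, and the two sample trees are assumptions made for this example; the trees are not claimed to be the ones drawn on the slide.

```python
class Node:
    """A binary tree node (hypothetical helper for this sketch)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def check(node):
    """Return (ok, height); an empty subtree has height -1."""
    if node is None:
        return True, -1
    left_ok, left_h = check(node.left)
    right_ok, right_h = check(node.right)
    ok = left_ok and right_ok and abs(left_h - right_h) <= 1
    return ok, 1 + max(left_h, right_h)

def satisfies_rule3(root):
    """True if every node's subtrees differ in height by at most one."""
    return check(root)[0]

balanced = Node(3, Node(1, None, Node(2)), Node(5, Node(4), Node(6)))
lopsided = Node(3, Node(1, None, Node(2)),
                Node(4, None, Node(5, None, Node(6))))
print(satisfies_rule3(balanced))  # True
print(satisfies_rule3(lopsided))  # False: at node 4 the heights are -1 and 1
```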

SLIDE 14

Repair

Suppose the tree violates a balance condition. How and when can it be repaired?

  • Repair is accomplished via “tree rotations”.
  • Repair is done either during insertions, or after access of a node (because during access one notices the node is very deep and should be made more shallow).

SLIDE 15

AVL Trees

AVL (Adelson-Velskii and Landis) trees are binary search trees that follow rule #3. When a tree violates rule #3, a repair is done.

  • The repair is done during insertions, as soon as rule #3 is violated.
  • The repair is accomplished via “single” and “double” rotations.

SLIDE 16

Single Rotation

[Figure: nodes k2 and k1 with subtrees X, Y, Z, before and after the single rotation; the new item is at the bottom of X]

Suppose an item is added at the bottom of subtree X, thus causing an imbalance at k2. Then pull k1 up. Note that after the rotation, the height of the tree is the same as it was before the insertion.
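A minimal Python sketch of a single rotation. Because the slide's figure did not survive extraction, the orientation is an assumption: k1 is taken to be the left child of k2, with X the left subtree of k1, Y its right subtree, and Z the right subtree of k2. The `Node` class, the stored `height` field, and the function names are also assumptions made for this example.

```python
def height(node):
    """Height of a subtree; an empty subtree has height -1."""
    return -1 if node is None else node.height

class Node:
    """A tree node that caches its own height (hypothetical helper)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.height = 1 + max(height(left), height(right))

def rotate_with_left_child(k2):
    """Pull k1 (the left child of k2) up and return it as the new subtree root."""
    k1 = k2.left
    k2.left = k1.right  # subtree Y moves from k1 to k2
    k1.right = k2       # k2 becomes the right child of k1
    k2.height = 1 + max(height(k2.left), height(k2.right))
    k1.height = 1 + max(height(k1.left), height(k1.right))
    return k1

before = Node(3, Node(2))       # the tree before the insertion: height 1
k2 = Node(3, Node(2, Node(1)))  # after inserting 1: imbalance at node 3
root = rotate_with_left_child(k2)
print(root.key, root.left.key, root.right.key)  # 2 1 3
print(before.height, root.height)               # 1 1
```

In this small case the rotated subtree has the same height as the tree had before the insertion, matching the note on the slide.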

SLIDE 17

Example

[Figure: the example tree before and after the single rotation]

Imbalance at node 4 solved with single rotation.

SLIDE 18

Another Single Rotation

[Figure: nodes k2 and k1 with subtrees X, Y, Z, before and after the mirror-image single rotation]

Suppose an item is added at the bottom of subtree X, thus causing an imbalance at k2. Then pull k1 up. Note that after the rotation, the height of the tree is the same as it was before the insertion.
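The mirror-image case can be sketched the same way in Python. Here k1 is assumed to be the right child of k2 (the figure's orientation is not recoverable from the text), and the sample keys are illustrative only. The check at the end confirms that the rotation keeps the in-order sequence of keys, so the result is still a binary search tree.

```python
class Node:
    """A binary search tree node (hypothetical helper for this sketch)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(node):
    """Height computed on demand; an empty subtree has height -1."""
    return -1 if node is None else 1 + max(height(node.left), height(node.right))

def inorder(node):
    """Keys in sorted (in-order) sequence."""
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)

def rotate_with_right_child(k2):
    """Pull k1 (the right child of k2) up and return it as the new subtree root."""
    k1 = k2.right
    k2.right = k1.left  # the middle subtree moves from k1 to k2
    k1.left = k2        # k2 becomes the left child of k1
    return k1

# Inserting 40 below 30 leaves node 10 with subtree heights 0 and 2.
k2 = Node(10, Node(5), Node(20, Node(15), Node(30, None, Node(40))))
keys_before = inorder(k2)
height_before = height(k2)
root = rotate_with_right_child(k2)
print(root.key, root.left.key, root.right.key)  # 20 10 30
print(height_before, height(root))              # 3 2
print(inorder(root) == keys_before)             # True
```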

SLIDE 19

Another Example

[Figure: the example tree before and after the single rotation]

Imbalance at node 4 solved with single rotation.

SLIDE 20

Single Rotations

  • After single rotations, the new height of the entire subtree is exactly the same as the height of the original subtree prior to the insertion of the new data item that caused X to grow.
  • Thus no further updating of heights on the path to the root is needed, and consequently no further rotations are needed.

SLIDE 21

Double Rotation

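[Figure: nodes k3, k1, k2 with subtrees A, B, C, D, before and after the double rotation; afterwards k2 is on top with k1 and k3 as its children. The new item may be in either subtree below k2.]

Suppose an item is added below k2. This causes an imbalance at k3. Then pull k2 up. Note that after the rotation, the height of the tree is the same as it was before the insertion.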
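A minimal Python sketch of the double rotation, written as a direct rearrangement of pointers. The orientation is an assumption, since the figure is not recoverable: k1 is the left child of k3, k2 is the right child of k1, and A, B, C, D are the four subtrees from left to right. Names and sample keys are illustrative.

```python
class Node:
    """A binary search tree node (hypothetical helper for this sketch)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(node):
    """Height computed on demand; an empty subtree has height -1."""
    return -1 if node is None else 1 + max(height(node.left), height(node.right))

def inorder(node):
    """Keys in sorted (in-order) sequence."""
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)

def double_rotate_with_left_child(k3):
    """Pull k2 (the right child of k3's left child) up to the top."""
    k1 = k3.left
    k2 = k1.right
    k1.right = k2.left  # subtree B goes to k1
    k3.left = k2.right  # subtree C goes to k3
    k2.left = k1
    k2.right = k3
    return k2

before = Node(30, Node(10, Node(5), Node(20)), Node(40))  # height 2
# Insert 25 below k2 = 20: node 30 now has subtree heights 2 and 0.
k3 = Node(30, Node(10, Node(5), Node(20, None, Node(25))), Node(40))
root = double_rotate_with_left_child(k3)
print(root.key, root.left.key, root.right.key)  # 20 10 30
print(height(before), height(root))             # 2 2
print(inorder(root))                            # [5, 10, 20, 25, 30, 40]
```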

SLIDE 22

Another Double Rotation

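[Figure: nodes k3, k1, k2 with subtrees A, B, C, D, before and after the mirror-image double rotation; afterwards k2 is on top with k3 and k1 as its children. The new item may be in either subtree below k2.]

Suppose an item is added below k2. This causes an imbalance at k3. Then pull k2 up. Note that after the rotation, the height of the tree is the same as it was before the insertion.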

SLIDE 23

An Example

[Figure: the example tree before and after the double rotation]

Imbalance at node 8 solved with double rotation.

SLIDE 24

Which Rotation Do I Use?

Recognizing which rotation you have to use is the hardest part.

  • Find the imbalanced node.
  • Go down two nodes towards the newly inserted node.
  • If the path is straight, use single rotation.
  • If the path zig-zags, use double rotation.
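The steps above can be sketched in Python as a small decision function. It assumes the imbalanced node has already been found and that the new key lies at least two levels below it and differs from both keys on the way; `choose_rotation` and the sample trees are hypothetical names and data for this example.

```python
class Node:
    """A binary search tree node (hypothetical helper for this sketch)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def choose_rotation(imbalanced, new_key):
    """Walk two steps from the imbalanced node towards new_key and classify the path."""
    first = "left" if new_key < imbalanced.key else "right"
    child = imbalanced.left if first == "left" else imbalanced.right
    second = "left" if new_key < child.key else "right"
    # Same direction twice is a straight path; a change of direction is a zig-zag.
    return "single" if first == second else "double"

straight = Node(3, Node(2, Node(1)))       # 1 was inserted: path left, left
zigzag = Node(3, Node(1, None, Node(2)))   # 2 was inserted: path left, right
print(choose_rotation(straight, 1))  # single
print(choose_rotation(zigzag, 2))    # double
```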

SLIDE 25

Double Rotation = 2 Single Rotations

[Figure: nodes k1, k2, k3 with subtrees A, B, C, D, before the first rotation]

First do a single rotation of k2 and k1.

SLIDE 26

Double Rotation = 2 Single Rotations

[Figure: nodes k1, k2, k3 with subtrees A, B, C, D, after the first rotation]

But k3 is still imbalanced, so do a single rotation of k2 and k3.

SLIDE 27

Double Rotation = 2 Single Rotations

[Figure: nodes k1, k2, k3 with subtrees A, B, C, D, after the second rotation]

Now we are done.
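The two-step view can be written out as a short Python sketch that composes two single rotations. As elsewhere, the orientation is an assumption: k1 is the left child of k3 and k2 is the right child of k1. Function names and keys are illustrative.

```python
class Node:
    """A binary search tree node (hypothetical helper for this sketch)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def inorder(node):
    """Keys in sorted (in-order) sequence."""
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)

def rotate_with_left_child(top):
    """Single rotation that pulls the left child up."""
    child = top.left
    top.left = child.right
    child.right = top
    return child

def rotate_with_right_child(top):
    """Single rotation that pulls the right child up."""
    child = top.right
    top.right = child.left
    child.left = top
    return child

def double_rotate_with_left_child(k3):
    """Double rotation expressed as two single rotations."""
    k3.left = rotate_with_right_child(k3.left)  # step 1: rotate k2 and k1
    return rotate_with_left_child(k3)           # step 2: rotate k2 and k3

# k3 = 30, k1 = 10, k2 = 20; the new item 15 was inserted below k2.
k3 = Node(30, Node(10, Node(5), Node(20, Node(15))), Node(40))
root = double_rotate_with_left_child(k3)
print(root.key, root.left.key, root.right.key)  # 20 10 30
print(inorder(root))                            # [5, 10, 15, 20, 30, 40]
```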

SLIDE 28

Double Rotations

  • As with the single rotations, double rotations restore the height of the subtree to what it was before the insertion.
  • This guarantees that all rebalancing and height updating is complete.
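Putting the pieces together, here is a hedged Python sketch of AVL insertion in the common recursive textbook style: insert as in an ordinary binary search tree, then on the way back up apply a single or double rotation at the first node where rule #3 fails. This is one possible arrangement, not necessarily the one used in the course; all names are assumptions.

```python
class Node:
    """An AVL node that caches its height (hypothetical helper)."""
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 0

def height(node):
    return -1 if node is None else node.height

def update(node):
    node.height = 1 + max(height(node.left), height(node.right))

def rotate_with_left_child(k2):
    k1 = k2.left
    k2.left, k1.right = k1.right, k2
    update(k2)
    update(k1)
    return k1

def rotate_with_right_child(k2):
    k1 = k2.right
    k2.right, k1.left = k1.left, k2
    update(k2)
    update(k1)
    return k1

def insert(node, key):
    """Insert key and return the root of the rebalanced subtree."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
        if height(node.left) - height(node.right) == 2:
            if key > node.left.key:  # zig-zag path: double rotation
                node.left = rotate_with_right_child(node.left)
            node = rotate_with_left_child(node)
    elif key > node.key:
        node.right = insert(node.right, key)
        if height(node.right) - height(node.left) == 2:
            if key < node.right.key:  # zig-zag path: double rotation
                node.right = rotate_with_left_child(node.right)
            node = rotate_with_right_child(node)
    update(node)  # duplicate keys fall through unchanged
    return node

def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)

root = None
for k in range(1, 8):  # already ordered data, the worst case for a plain BST
    root = insert(root, k)
print(root.key, root.height)  # 4 2
print(inorder(root))          # [1, 2, 3, 4, 5, 6, 7]
```

Inserting 1 to 7 in sorted order would give a chain of height 6 in a plain binary search tree; with the rotations the result has height 2.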

SLIDE 29

Conclusions: AVL Tree

  • AVL trees maintain the balance of binary search trees while they are being created via insertions of data.
  • An alternative approach is to have trees that readjust themselves when data is accessed, so that frequently accessed data items move to the top of the tree. We won’t be covering these (splay trees).

SLIDE 30

Red Black Trees

Colored Nodes Definition

  • Binary search tree.
  • Each node is colored red or black.
  • Root and all external nodes are black.
  • No root-to-external-node path has two consecutive red nodes.
  • All root-to-external-node paths have the same number of black nodes.
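The definition can be turned into a checker. The Python sketch below assumes external nodes are represented by `None` and counted as black; the `Node` class, the color constants, and the function names are assumptions made for this example, and the sample trees are illustrative.

```python
RED, BLACK = "red", "black"

class Node:
    """A colored binary tree node (hypothetical helper for this sketch)."""
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color, self.left, self.right = key, color, left, right

def is_bst(node, lo=None, hi=None):
    """True if all keys in the subtree lie strictly between lo and hi, in BST order."""
    if node is None:
        return True
    if (lo is not None and node.key <= lo) or (hi is not None and node.key >= hi):
        return False
    return is_bst(node.left, lo, node.key) and is_bst(node.right, node.key, hi)

def black_height(node):
    """Black nodes on every path down to an external node, or None on a violation."""
    if node is None:
        return 1  # external nodes are black
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                return None  # two consecutive red nodes
    left, right = black_height(node.left), black_height(node.right)
    if left is None or right is None or left != right:
        return None  # unequal black counts, or a violation further down
    return left + (1 if node.color == BLACK else 0)

def is_red_black(root):
    """Check all the conditions of the definition above."""
    if root is not None and root.color != BLACK:
        return False
    return is_bst(root) and black_height(root) is not None

good = Node(2, BLACK, Node(1, RED), Node(3, RED))
red_red = Node(3, BLACK, Node(2, RED, Node(1, RED)))
uneven = Node(2, BLACK, Node(1, BLACK))
print(is_red_black(good))     # True
print(is_red_black(red_red))  # False: red node 2 has a red child
print(is_red_black(uneven))   # False: black counts differ between paths
```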
