SLIDE 1

ECE242 L25: More Binary Search Trees and Hash Tables October 30, 2009

ECE 242 Data Structures

Lecture 22

More Binary Search Trees and Hash Tables

Overview

° Problem: How do I represent data so that no data value is present more than once?

° Binary Search Tree

  • Insert
  • Search
  • Remove

° Methods to perform operations can be a little complicated

  • We need recursive methods

° Remove operation

  • Follows a standard order of actions
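
The recursive insert and search operations listed above can be sketched as follows. This is a minimal illustration, not the course's code: the class name RecursiveBst, the nested Node class, and the size helper are assumptions introduced here. Duplicate values are ignored on insert, which is what keeps each value present at most once.

```java
// Minimal recursive BST sketch; all names here are illustrative.
class RecursiveBst {
    static class Node {
        int id;
        Node left, right;
        Node(int id) { this.id = id; }
    }

    Node root;

    // Insert id; a value that is already present is ignored,
    // so no value is ever stored more than once.
    void insert(int id) { root = insert(root, id); }

    private Node insert(Node n, int id) {
        if (n == null) return new Node(id);          // empty spot found
        if (id < n.id) n.left = insert(n.left, id);
        else if (id > n.id) n.right = insert(n.right, id);
        return n;                                    // equal: nothing to do
    }

    boolean contains(int id) { return contains(root, id); }

    private boolean contains(Node n, int id) {
        if (n == null) return false;
        if (id == n.id) return true;
        return id < n.id ? contains(n.left, id) : contains(n.right, id);
    }

    // Number of stored values, used to show duplicates are not kept.
    int size() { return size(root); }

    private int size(Node n) {
        return n == null ? 0 : 1 + size(n.left) + size(n.right);
    }

    public static void main(String[] args) {
        RecursiveBst t = new RecursiveBst();
        for (int id : new int[] {10, 5, 14, 1, 7, 16, 7}) t.insert(id);
        System.out.println(t.contains(7));  // true
        System.out.println(t.contains(8));  // false
        System.out.println(t.size());       // 6 (the second 7 was ignored)
    }
}
```

Each call descends one level, so the number of recursive calls is bounded by the depth of the tree.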
SLIDE 2

Remove Method

° Remove a node from BST

° Three cases

  • Case 1: node is a leaf
  • Case 2: node has one child
  • Case 3: node has two children

[Figure: BST containing nodes 10, 5, 14, 1, 7, 16]

Remove Method: Case 1

° The node is a leaf

° Assume we want to remove node 7

  • just set the child pointer in node 7’s parent to null

[Figure: BST containing nodes 10, 5, 14, 1, 7, 16]

SLIDE 3

Remove Method: Case 2

° The node has one child

° Assume we want to remove node 14

  • just lift up node 14’s subtree

[Figure: BST containing nodes 10, 5, 14, 1, 7, 16, 15, 20]

Remove Method: Case 3

° The node has two children

° Assume we want to remove node 10

[Figure: BST containing nodes 10, 5, 14, 4, 7, 16, 15, 20, 1, 3]

SLIDE 4

Remove Method: Case 3 (cont.)

° After removing node 10, its two subtrees remain

° We cannot simply lift the left or right subtree up

° But we know that every node in the left subtree is less than every node in the right subtree

[Figure: BST containing nodes 10, 5, 14, 4, 7, 16, 15, 20, 1, 3]

Remove Method: Case 3 (cont.)

° How do we merge the two split subtrees into one tree?

° Find the maximum node in the left subtree

° Join the right subtree to that node as its right child

[Figure: three stages of removing node 10 from the BST containing nodes 10, 5, 14, 4, 7, 16, 15, 20, 1, 3; node 10 is absent in the last stage]

SLIDE 5

Summary For Remove Method

° Step 1:

  • find the node that needs to be deleted
  • prev is the parent of the node to be deleted

[Figure: BST containing nodes 10, 5, 14, 4, 7, 16, 15, 20, 1, 3, with the labels node and prev]

Summary For Remove Method

° Step 2:

  • find the maximum node tmp of node’s left subtree
  • attach node’s right subtree as tmp’s right child

[Figure: BST containing nodes 10, 5, 14, 4, 7, 16, 15, 20, 1, 3, shown before and after the merge, with the labels node, prev, and tmp]

SLIDE 6

Summary For Remove Method

° Step 3:

  • lift node’s left child up

[Figure: BST containing nodes 10, 5, 14, 4, 7, 16, 15, 20, 1, 3, shown before and after the lift, with the labels node and prev; node 10 is absent afterwards]

Summary For Remove Method

° Step 4:

  • if we remove the root, then set node as root
  • if we remove prev’s left child, then set node as prev’s left child
  • if we remove prev’s right child, then set node as prev’s right child

[Figure: resulting BST containing nodes 5, 14, 4, 7, 16, 15, 20, 1, 3, with the labels prev and node]

SLIDE 7

Code For Remove Method

public void remove(int id) {
    BSTNode tmp, node, p = root, prev = null;

    // find the node p which needs to be removed
    while (p != null && p.id != id) {
        prev = p;
        if (p.id < id)
            p = p.right;
        else
            p = p.left;
    }
    node = p;

[Figure: BST containing nodes 10, 5, 14, 4, 7, 16, 15, 20, 1, 3, with the labels node, prev, and p]

Code For Remove Method (cont.)

    if (p != null && p.id == id) {
        // case (1)/(2): node has no right child:
        // its left child (if any) is attached to its parent
        if (node.right == null)
            node = node.left;
        // case (1)/(2): node has no left child:
        // its right child (if any) is attached to its parent
        else if (node.left == null)
            node = node.right;

SLIDE 8

Code For Remove Method (cont.)

        else {
            // find the maximum node in the left subtree,
            // store the node to tmp
            tmp = node.left;
            while (tmp.right != null)
                tmp = tmp.right;
            // merge the right subtree to tmp's right child
            tmp.right = node.right;
            // lift node.left up
            node = node.left;
        }

Code For Remove Method (cont.)

        // attach the replacement subtree where the removed node was
        if (p == root)
            root = node;
        else if (prev.left == p)
            prev.left = node;
        else
            prev.right = node;
    }
    else if (root != null)
        System.out.println("ID " + id + " is not in the database");
    else
        System.out.println("The database is empty");
}
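
To try the remove method end to end, the fragments can be assembled into one small program. The sketch below follows the same four steps; the wrapper class BstRemoveDemo, the fields of BSTNode, the iterative insert helper, and the inorder helper are assumptions added for illustration, and remove returns a boolean instead of printing a message.

```java
// Runnable sketch of remove-by-merging; helper names are hypothetical.
class BstRemoveDemo {
    static class BSTNode {
        int id;
        BSTNode left, right;
        BSTNode(int id) { this.id = id; }
    }

    BSTNode root;

    // Assumed helper (not from the slides): iterative insert, duplicates ignored.
    void insert(int id) {
        if (root == null) { root = new BSTNode(id); return; }
        BSTNode p = root;
        while (p.id != id) {
            if (id < p.id) {
                if (p.left == null) p.left = new BSTNode(id);
                p = p.left;
            } else {
                if (p.right == null) p.right = new BSTNode(id);
                p = p.right;
            }
        }
    }

    // Same steps as the slides; returns false if id is not in the tree.
    boolean remove(int id) {
        BSTNode p = root, prev = null;
        while (p != null && p.id != id) {          // step 1: find node and parent
            prev = p;
            p = (p.id < id) ? p.right : p.left;
        }
        if (p == null) return false;
        BSTNode node = p;
        if (node.right == null) node = node.left;       // cases 1 and 2
        else if (node.left == null) node = node.right;  // case 2
        else {                                          // case 3
            BSTNode tmp = node.left;
            while (tmp.right != null) tmp = tmp.right;  // step 2: max of left subtree
            tmp.right = node.right;                     // merge right subtree
            node = node.left;                           // step 3: lift left child
        }
        if (p == root) root = node;                     // step 4: reattach
        else if (prev.left == p) prev.left = node;
        else prev.right = node;
        return true;
    }

    // In-order traversal; a valid BST yields the ids in ascending order.
    String inorder() {
        StringBuilder sb = new StringBuilder();
        inorder(root, sb);
        return sb.toString().trim();
    }

    private void inorder(BSTNode n, StringBuilder sb) {
        if (n == null) return;
        inorder(n.left, sb);
        sb.append(n.id).append(' ');
        inorder(n.right, sb);
    }

    public static void main(String[] args) {
        BstRemoveDemo t = new BstRemoveDemo();
        for (int id : new int[] {10, 5, 14, 4, 7, 16, 15, 20, 1, 3}) t.insert(id);
        t.remove(10);
        System.out.println(t.root.id);    // 5
        System.out.println(t.inorder());  // 1 3 4 5 7 14 15 16 20
    }
}
```

Note that merging hangs the whole right subtree below the maximum of the left subtree, so repeated removals can make the tree deeper.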

SLIDE 9

Find/Add Running Time

° How long do find and add take?

° We do one or two comparisons and go either right or left in the tree, so the running time depends on the depth of the tree.

° Let’s analyze binary search trees:

  • what is the worst-case depth?
  • what is the best-case depth?

Binary Trees

Given a set of n keys, the tree we construct depends on insertion order.

[Figure: two BSTs built from the keys 1 to 7. Insertion order 1, 2, 3, 4, 5, 6, 7 gives a chain of right children starting at 1; insertion order 4, 2, 1, 3, 6, 5, 7 gives a balanced tree rooted at 4.]
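
The effect of insertion order can be checked with a short program. This is a sketch under assumed names (DepthDemo, depthAfterInserting); depth is counted in links from the root to the deepest leaf, and an empty tree is given depth -1 by convention here.

```java
// Compare the depth of BSTs built from the same keys in different orders.
class DepthDemo {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        return n;
    }

    // Number of links on the longest root-to-leaf path (-1 for an empty tree).
    static int depth(Node n) {
        if (n == null) return -1;
        return 1 + Math.max(depth(n.left), depth(n.right));
    }

    static int depthAfterInserting(int... keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return depth(root);
    }

    public static void main(String[] args) {
        System.out.println(depthAfterInserting(1, 2, 3, 4, 5, 6, 7));  // 6
        System.out.println(depthAfterInserting(4, 2, 1, 3, 6, 5, 7));  // 2
    }
}
```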

SLIDE 10

Bounds on Tree-Depth

° The depth of a binary search tree (or any tree) is the maximum number of links between the root and any leaf.

° What is the maximum depth of a tree containing n elements?

° What is the minimum depth of a tree containing n elements?
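
For reference, both questions have closed-form answers for a binary tree under this definition of depth. The symbols d_max and d_min are names introduced here for illustration.

```latex
% Worst case: every node has at most one child, so the tree is a chain.
d_{\max}(n) = n - 1

% Best case: a binary tree of depth d holds at most 2^{d+1} - 1 nodes, so
n \le 2^{d+1} - 1
\;\Longrightarrow\;
d \ge \log_2(n + 1) - 1,
\qquad
d_{\min}(n) = \lfloor \log_2 n \rfloor

% Example: n = 7 gives d_max = 6 and d_min = 2.
```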

Running Time of Find/Insert

° The number of recursive calls depends on the tree depth.

° Each recursive call makes just a constant number of comparisons, so the running time of both is O(depth).

° What about removing an element?

SLIDE 11

Hash Table

° Problem: Is there a more efficient way to add/store values for a set?

  • Previously seen: ordered list and binary tree

°Hash Table

  • Involves a mathematical function called a “hash function”
  • Collision in storing/accessing data
  • Complexity

°Can be somewhat difficult to implement

Motivation For Hash Table

° We have to store some records and perform the following:

  • add a new record
  • delete a record
  • search for a record by key

° Find a way to do these efficiently!

What are some of the techniques we have seen so far?

SLIDE 12

Unsorted Array

° Use an array to store the records, in unsorted order

  • add: add the record as the last entry; fast, O(1)
  • delete a target: slow to delete a record because we need to find the target, O(n)
  • search: sequential search; slow, O(n)

Sorted Array

° Use an array to store the records, keeping them in sorted order

  • add: insert the record in its proper position; much record movement; slow, O(n)
  • delete a target: how to handle the hole after deletion? Much record movement; slow, O(n)
  • search: binary search; fast, O(log n)
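
The trade-off for a sorted array can be sketched as follows: search halves the range at each step, while add and delete must shift records to keep the order. The class SortedArrayDemo and its method names are assumptions for illustration, and only integer keys are stored to keep the example short.

```java
import java.util.Arrays;

// Sorted-array sketch: O(log n) search, O(n) add and delete.
class SortedArrayDemo {
    private int[] keys = new int[8];
    private int size = 0;

    // Binary search: index of key, or -(insertionPoint + 1) if absent.
    int find(int key) {
        int lo = 0, hi = size - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (keys[mid] < key) lo = mid + 1;
            else if (keys[mid] > key) hi = mid - 1;
            else return mid;
        }
        return -(lo + 1);
    }

    // Insert in sorted position; shifting the tail right is the O(n) part.
    boolean add(int key) {
        int pos = find(key);
        if (pos >= 0) return false;                   // already present
        int at = -(pos + 1);
        if (size == keys.length) keys = Arrays.copyOf(keys, size * 2);
        System.arraycopy(keys, at, keys, at + 1, size - at);
        keys[at] = key;
        size++;
        return true;
    }

    // Close the hole by shifting the tail left, again O(n).
    boolean delete(int key) {
        int pos = find(key);
        if (pos < 0) return false;
        System.arraycopy(keys, pos + 1, keys, pos, size - pos - 1);
        size--;
        return true;
    }

    int[] toArray() { return Arrays.copyOf(keys, size); }

    public static void main(String[] args) {
        SortedArrayDemo a = new SortedArrayDemo();
        a.add(56789);
        a.add(12345);
        a.add(33333);
        System.out.println(Arrays.toString(a.toArray()));  // [12345, 33333, 56789]
        a.delete(33333);
        System.out.println(Arrays.toString(a.toArray()));  // [12345, 56789]
        System.out.println(a.find(56789));                 // 1
    }
}
```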
SLIDE 13

Linked List

° Store the records in a linked list (sorted / unsorted)

  • add: fast if one can insert the node anywhere, O(1)
  • delete a target: fast at disposing of the node, but slow at finding the target, O(n)
  • search: sequential search; slow, O(n)

(If we use only a linked list, we cannot use binary search even if the list is sorted.)

Tree

° Better performance, but more complex

° Tree

  • What sort of complexity can we expect from a BST?
  • What types of data don’t support a BST?
SLIDE 14

Array As Table

Consider this problem. We want to store 1000 student records and search them by student ID.

ID       NAME   SCORE
9903030  tom    73
9802020  mary   100
9801010  peter  20
0056789  david  56.8
0012345  andy   81.5
0033333  betty  90
9908080  bill   49
...      ...    ...

Array As Table (Cont.)

One way is to store the records in a huge array (index 0…9999999). The index is used as the student ID, i.e. the record of the student with ID 0012345 is stored at A[12345].

ID       NAME   SCORE
:
12345    andy   81.5
:
33333    betty  90
:
56789    david  56.8
:
9908080  bill   49
:
9999999

SLIDE 15

Array As Table --- Not Good

° Store the records in a huge array where the index corresponds to the key

  • add - very fast O(1)
  • delete - very fast O(1)
  • search - very fast O(1)

° But it wastes a lot of memory! Not feasible.

We need to find a technique that efficiently uses available memory
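
A sketch of this direct-address approach makes the memory cost concrete. The class DirectAddressTable, the Record type, and the capacity helper are assumptions for illustration; the array holds one slot for every possible 7-digit ID even though only 1000 records exist.

```java
// Direct-address table sketch: the student ID itself is the array index.
// All names are illustrative, not from the slides.
class DirectAddressTable {
    static class Record {
        final String name;
        final double score;
        Record(String name, double score) { this.name = name; this.score = score; }
    }

    private final Record[] slots;

    // keySpace is the number of possible IDs (10,000,000 for 7-digit IDs).
    DirectAddressTable(int keySpace) { slots = new Record[keySpace]; }

    void add(int id, Record r) { slots[id] = r; }     // O(1)
    Record search(int id)      { return slots[id]; }  // O(1), null if absent
    void delete(int id)        { slots[id] = null; }  // O(1)
    int capacity()             { return slots.length; }

    public static void main(String[] args) {
        DirectAddressTable table = new DirectAddressTable(10_000_000);
        // IDs are written without leading zeros: a leading 0 marks an
        // octal literal in Java.
        table.add(12345, new Record("andy", 81.5));
        table.add(9908080, new Record("bill", 49));
        System.out.println(table.search(12345).name);  // andy
        System.out.println(table.search(33333));       // null
        // Only 1000 of the 10,000,000 slots would ever hold a record.
        System.out.println(table.capacity() / 1000 + " slots per record");
    }
}
```

Every operation is a single array access, but the table reserves 10,000 slots for each record it will ever hold.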

New Function For Key

int Hash(key) ---- returns an integer value

Imagine that we have such a magic function Hash. It maps the keys (IDs) of the 1000 records into the integers 0…999, one to one. No two different keys map to the same number.

H(‘0012345’) = 134
H(‘0033333’) = 67
H(‘0056789’) = 764
…
H(‘9908080’) = 3

SLIDE 16

Hash Table

[Figure: array with columns ID, NAME, SCORE; the indices 3, 67, 134, 764, and 999 are labelled]

To store a record, we compute Hash(ID) for the record and store it at the location Hash(ID) of the array. To search for a student, we only need to peek at the location Hash(target ID).

Hash table with Perfect Hash

° Such a magic function is called a perfect hash

  • add - very fast O(1)
  • delete - very fast O(1)
  • search - very fast O(1)

° But it is generally difficult to design a perfect hash (e.g. when the potential key space is large).

SLIDE 17

Hash Function

° A hash function maps a key to an index within a range

° Desirable properties:

  • simple and quick to calculate
  • even distribution, avoiding collisions as much as possible

Collision

° In most cases, we cannot avoid collisions

° How do we handle two different keys that map to the same index?

H(‘0012345’) = 134
H(‘0033333’) = 67
H(‘0056789’) = 764
…
H(‘9903030’) = 3
H(‘9908080’) = 3
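
One simple and quick hash function is the remainder of the key divided by the table size. The sketch below is an assumed example, not the function H from the slides (whose values are only illustrative); TABLE_SIZE and the ID 9912345 are hypothetical choices made here to produce a collision.

```java
// Modulo hash sketch: maps an integer ID to a slot in 0..TABLE_SIZE-1.
class ModHashDemo {
    static final int TABLE_SIZE = 1000;  // assumed table size

    static int hash(int id) {
        // floorMod keeps the result non-negative even for negative input.
        return Math.floorMod(id, TABLE_SIZE);
    }

    public static void main(String[] args) {
        System.out.println(hash(12345));    // 345
        System.out.println(hash(33333));    // 333
        System.out.println(hash(9908080));  // 80
        System.out.println(hash(9912345));  // 345, collides with 12345
    }
}
```

Any two IDs that share their last three digits collide under this function, which is why a collision-handling strategy is still needed.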

SLIDE 18

Hashing using Linked Structures

[Figure: array with entries 1, 2, 3, 4, 5, …, HASHMAX; several entries are null, and one entry points to a list node holding Key: 9903030, name: tom, score: 73]

One way to handle collisions is to store the collided records in a linked list. The array now stores pointers to such lists. If no key maps to a certain hash value, that array entry is null.

Chained Hash Table

° Hash table where collided records are stored in linked lists

  • with a good hash function and an appropriate hash size: few collisions; add, delete, search very fast, O(1)
  • otherwise: some hash value has a long list of collided records
  • add: just insert at the head; fast, O(1)
  • delete a target: delete from an unsorted linked list; slow, O(n)
  • search: sequential search; slow, O(n)
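
A chained hash table along these lines might look like the sketch below. The class ChainedHashTable, the Entry node, the modulo hash, and the record for the hypothetical ID 9912345 are all assumptions for illustration. As on the slide, add inserts at the head of the chain without checking for an existing key.

```java
// Chained hash table sketch: each array slot heads a linked list of entries.
class ChainedHashTable {
    static class Entry {
        final int id;
        final String name;
        Entry next;
        Entry(int id, String name, Entry next) {
            this.id = id; this.name = name; this.next = next;
        }
    }

    private final Entry[] buckets;

    ChainedHashTable(int size) { buckets = new Entry[size]; }

    private int hash(int id) { return Math.floorMod(id, buckets.length); }

    // Insert at the head of the chain: O(1), no duplicate check.
    void add(int id, String name) {
        int i = hash(id);
        buckets[i] = new Entry(id, name, buckets[i]);
    }

    // Sequential search along one chain; null if the id is absent.
    String search(int id) {
        for (Entry e = buckets[hash(id)]; e != null; e = e.next) {
            if (e.id == id) return e.name;
        }
        return null;
    }

    // Unlink the first entry with this id from its chain.
    boolean delete(int id) {
        int i = hash(id);
        Entry prev = null;
        for (Entry e = buckets[i]; e != null; prev = e, e = e.next) {
            if (e.id == id) {
                if (prev == null) buckets[i] = e.next;
                else prev.next = e.next;
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ChainedHashTable t = new ChainedHashTable(1000);
        t.add(12345, "andy");
        t.add(9912345, "zoe");                  // hypothetical record, same slot 345
        System.out.println(t.search(12345));    // andy
        System.out.println(t.search(9912345));  // zoe
        t.delete(12345);
        System.out.println(t.search(12345));    // null
        System.out.println(t.search(9912345));  // zoe
    }
}
```

Search and delete cost time proportional to the length of one chain, so they stay fast only while the chains stay short.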
SLIDE 19

Hashing using arrays

° Linear probing

  • If there is a collision during insertion, we simply move on to the next unoccupied position.
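
Linear probing can be sketched as below. The class LinearProbingTable, the modulo hash, and the wrap-around to index 0 at the end of the array are assumptions for illustration. Deletion is left out on purpose, because simply clearing a slot would break the probe sequence for keys stored after it.

```java
// Linear probing sketch: on a collision, step to the next slot (wrapping).
class LinearProbingTable {
    private final int[] keys;
    private final boolean[] used;

    LinearProbingTable(int capacity) {
        keys = new int[capacity];
        used = new boolean[capacity];
    }

    // Returns the slot where id is stored, or -1 if the table is full.
    int insert(int id) {
        int i = Math.floorMod(id, keys.length);
        for (int probes = 0; probes < keys.length; probes++) {
            if (!used[i]) {              // first unoccupied position
                keys[i] = id;
                used[i] = true;
                return i;
            }
            if (keys[i] == id) return i; // already stored
            i = (i + 1) % keys.length;   // move on to the next position
        }
        return -1;
    }

    // Follows the same probe sequence; stops at the first unused slot.
    int indexOf(int id) {
        int i = Math.floorMod(id, keys.length);
        for (int probes = 0; probes < keys.length && used[i]; probes++) {
            if (keys[i] == id) return i;
            i = (i + 1) % keys.length;
        }
        return -1;
    }

    public static void main(String[] args) {
        LinearProbingTable t = new LinearProbingTable(10);
        System.out.println(t.insert(12));   // 2
        System.out.println(t.insert(22));   // 3 (slot 2 is taken)
        System.out.println(t.insert(32));   // 4 (slots 2 and 3 are taken)
        System.out.println(t.insert(3));    // 5 (home slot 3 and slot 4 are taken)
        System.out.println(t.indexOf(32));  // 4
        System.out.println(t.indexOf(42));  // -1
    }
}
```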

Summary

° Recursive methods are used for search, insert, and remove

° Visualize operations using pictures of trees

° Be familiar with the code associated with this and the previous lecture

° BSTs are used widely in commercial coding

SLIDE 20

Summary

° Hash tables allow rapid access to data when indices are likely to be sparse

° Textbook includes a number of complicated examples

  • Only responsible for chaining technique

° Development of good hash functions is difficult

  • Attempt to reduce collisions

° Very useful for databases

° Consider the code needed to access data values in a hash table