Lecture 11: Introduction to Hash Tables
CSE 373: Data Structures and Algorithms
CSE 373 SU 19 - ROBBIE WEBER

Administrivia
When you’re submitting your group writeup to Gradescope, be sure to use the group submission option if you have a partner.
Project 1 part 2 is due Thursday night. Exercise 2 is due Friday night.
Project 2 will come out tonight, and Exercise 3 will come out Friday. Both are due in two weeks (Wednesday the 31st for Project 2, and Friday the 2nd for Exercise 3).
They “should” be one week assignments… but next Friday is the midterm! We’re leaving it to you to decide how/when to study for the midterm vs. doing homework.

Aside: How Fast is Θ(log n)?
If you just looked at a list of common running times:

Class       | Big O      | If you double N…           | Example algorithm
constant    | O(1)       | unchanged                  | Add to front of linked list
logarithmic | O(log n)   | increases slightly         | Binary search
linear      | O(n)       | doubles                    | Sequential search
“n log n”   | O(n log n) | slightly more than doubles | Merge sort
quadratic   | O(n^2)     | quadruples                 | Nested loops traversing a 2D array

You might think this was a small improvement. It was a HUGE improvement!

Logarithmic vs. Linear
If you double the size of the input,
- A linear time algorithm takes twice as long.
- A logarithmic time algorithm has a constant additive increase to its running time.
To make a logarithmic time algorithm take twice as long, how much do you have to increase n by? (pollEV.com/cse373su19)
How do you increase n? You have to square it: log(n^2) = 2 log(n).
A gigabyte worth of integer keys can fit in an AVL tree of height 60.
It takes a ridiculously large input to make a logarithmic time algorithm go slowly.
Log isn’t “that running time between linear and constant”; it’s “that running time that’s barely worse than a constant.”
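The doubling and squaring claims can be checked with a small sketch. It counts how many times n can be halved before reaching 1, which is roughly the number of steps a binary search takes. The class and method names (LogGrowthDemo, halvings) are illustrative assumptions, not code from the lecture.

```java
// Illustrative sketch; LogGrowthDemo and halvings are hypothetical names, not from the lecture.
class LogGrowthDemo {
    // Counts how many times n can be halved (integer division) before reaching 1.
    // For n >= 1 this equals floor(log2(n)).
    static int halvings(long n) {
        int steps = 0;
        while (n > 1) {
            n /= 2;
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        long n = 1L << 20; // about one million
        System.out.println("n:   " + halvings(n));
        System.out.println("2n:  " + halvings(2 * n)); // doubling n adds a single step
        System.out.println("n^2: " + halvings(n * n)); // squaring n doubles the step count
    }
}
```

Doubling the input adds one step; only squaring it doubles the work.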

Logarithmic Running Times
This identity is so important, one of my friends made me a cross-stitch of it.
Two lessons:
1. Log running times are REALLY REALLY FAST.
2. O(log(n^2)) is not simplified; it’s just O(log n).

Aside: Traversals
What if the heights of subtrees were corrupted? How could we calculate them from scratch?
We could use a “traversal”: a process that visits every piece of data in a data structure.

    int height(Node curr) {
        if (curr == null) return -1;   // an empty tree has height -1
        int h = Math.max(height(curr.left), height(curr.right));
        return h + 1;                  // one more than the taller subtree
    }

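Three Kinds of Traversals

    // In-order: left subtree, then the node, then right subtree.
    void InOrder(Node curr) {
        if (curr == null) return;   // base case, implied on the slide
        InOrder(curr.left);
        doSomething(curr);
        InOrder(curr.right);
    }

    // Pre-order: the node first, then both subtrees.
    void PreOrder(Node curr) {
        if (curr == null) return;
        doSomething(curr);
        PreOrder(curr.left);
        PreOrder(curr.right);
    }

    // Post-order: both subtrees first, then the node.
    void PostOrder(Node curr) {
        if (curr == null) return;
        PostOrder(curr.left);
        PostOrder(curr.right);
        doSomething(curr);
    }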
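To see how the three orders differ, here is a minimal runnable sketch on a three-node tree. The Node class, the TraversalDemo name, and the choice to record visited keys in a list (standing in for doSomething) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; Node and TraversalDemo are hypothetical, not the course's classes.
class TraversalDemo {
    static class Node {
        int key;
        Node left, right;
        Node(int key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    // "doSomething" here is appending the key to the output list.
    static void inOrder(Node curr, List<Integer> out) {
        if (curr == null) return;
        inOrder(curr.left, out);
        out.add(curr.key);
        inOrder(curr.right, out);
    }

    static void preOrder(Node curr, List<Integer> out) {
        if (curr == null) return;
        out.add(curr.key);
        preOrder(curr.left, out);
        preOrder(curr.right, out);
    }

    static void postOrder(Node curr, List<Integer> out) {
        if (curr == null) return;
        postOrder(curr.left, out);
        postOrder(curr.right, out);
        out.add(curr.key);
    }

    // Builds the tree with root 2, left child 1, right child 3.
    static Node sample() {
        return new Node(2, new Node(1, null, null), new Node(3, null, null));
    }

    public static void main(String[] args) {
        List<Integer> in = new ArrayList<>();
        List<Integer> pre = new ArrayList<>();
        List<Integer> post = new ArrayList<>();
        inOrder(sample(), in);
        preOrder(sample(), pre);
        postOrder(sample(), post);
        System.out.println("in-order:   " + in);   // [1, 2, 3]
        System.out.println("pre-order:  " + pre);  // [2, 1, 3]
        System.out.println("post-order: " + post); // [1, 3, 2]
    }
}
```

On a binary search tree the in-order list comes out sorted.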

Traversal Practice
For each of the following scenarios, choose an appropriate traversal:
1. Print out all the keys in an AVL-Dictionary in sorted order. Answer: in-order
2. Make a copy of an AVL tree. Answer: pre-order
3. Determine if an AVL tree is balanced (assume height values are not stored). Answer: post-order

Traversals
If we have n elements, how long does it take to calculate height? Θ(n) time.
The recursion tree (from the tree method) IS the AVL tree! We do a constant number of operations at each node.
In general, traversals take Θ(n ⋅ f(n)) time, where doSomething() takes Θ(f(n)) time.
Common question on technical interviews!
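As one way to approach the balance-checking scenario, the sketch below uses a post-order traversal: it computes both subtree heights before deciding anything about the current node. Each node does constant work, so the check runs in Θ(n) time in the worst case. The names BalanceCheck, checkedHeight, and the UNBALANCED sentinel are assumptions for illustration, not the course's solution.

```java
// Illustrative sketch; BalanceCheck, checkedHeight, and UNBALANCED are hypothetical names.
class BalanceCheck {
    static class Node {
        Node left, right;
        Node(Node left, Node right) {
            this.left = left;
            this.right = right;
        }
    }

    // Sentinel meaning "some node in this subtree violates the AVL balance condition".
    static final int UNBALANCED = Integer.MIN_VALUE;

    // Post-order: handle both children first, then the current node.
    // Returns the subtree height (-1 for an empty tree) or UNBALANCED.
    static int checkedHeight(Node curr) {
        if (curr == null) return -1;
        int left = checkedHeight(curr.left);
        if (left == UNBALANCED) return UNBALANCED;
        int right = checkedHeight(curr.right);
        if (right == UNBALANCED) return UNBALANCED;
        if (Math.abs(left - right) > 1) return UNBALANCED;
        return Math.max(left, right) + 1;
    }

    static boolean isBalanced(Node root) {
        return checkedHeight(root) != UNBALANCED;
    }

    public static void main(String[] args) {
        Node balanced = new Node(new Node(null, null), new Node(null, null));
        Node chain = new Node(new Node(new Node(null, null), null), null);
        System.out.println(isBalanced(balanced)); // true
        System.out.println(isBalanced(chain));    // false
    }
}
```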

Aside: Other Self-Balancing Trees
There are lots of flavors of self-balancing search trees:
- “Red-black trees” work on a similar principle to AVL trees.
- “Splay trees” get O(log n) amortized bounds for all operations.
- “Scapegoat trees”
- “Treaps”: a BST and heap in one (!)
- B-trees (see other 373 versions), optimized for huge datasets.
If you have an application where you need a balanced BST that also [does something], it might already exist. Google first; you might be able to use a library.

Hashing

Review: Dictionaries

Dictionary ADT
  state
    Set of items & keys
    Count of items
  behavior
    put(key, item): add item to collection indexed with key
    get(key): return item associated with key
    containsKey(key): return if key already in use
    remove(key): remove item and associated key
    size(): return count of items

ArrayDictionary<K, V>
  state
    Pair<K, V>[] data
    size
  behavior
    put: create new pair, add to next available spot, grow array if necessary
    get: scan all pairs looking for given key, return associated item if found
    containsKey: scan all pairs, return if key is found
    remove: scan all pairs, replace pair to be removed with last pair in collection
    size: return count of items in dictionary

LinkedDictionary<K, V>
  state
    front
    size
  behavior
    put: if key is unused, create new pair, add to front of list, else replace with new value
    get: scan all pairs looking for given key, return associated item if found
    containsKey: scan all pairs, return if key is found
    remove: scan all pairs, skip pair to be removed
    size: return count of items in dictionary

AVLDictionary<K, V>
  state
    overallRoot
    size
  behavior
    put: if key is unused, create new pair, place in BST order, rotate to maintain balance
    get: traverse through tree using BST property, return item if found
    containsKey: traverse through tree using BST property, return if key is found
    remove: traverse through tree using BST property, replace or nullify as appropriate
    size: return count of items in dictionary
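The ArrayDictionary behaviors listed above can be sketched in a few dozen lines. This is a minimal illustration, not the course's implementation: the class name ArrayDictionarySketch, the initial capacity of 4, replacing the value when the key already exists, and returning null for a missing key are all assumptions.

```java
import java.util.Arrays;
import java.util.Objects;

// Illustrative sketch of an array-backed dictionary; names and policies are assumptions.
class ArrayDictionarySketch<K, V> {
    private static class Pair<K, V> {
        final K key;
        V value;
        Pair(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private Pair<K, V>[] data;
    private int size;

    @SuppressWarnings("unchecked")
    ArrayDictionarySketch() {
        data = (Pair<K, V>[]) new Pair[4]; // assumed starting capacity
    }

    // Scans all pairs; returns the index of key, or -1 if absent. Θ(n) worst case.
    private int indexOf(K key) {
        for (int i = 0; i < size; i++) {
            if (Objects.equals(data[i].key, key)) return i;
        }
        return -1;
    }

    void put(K key, V value) {
        int i = indexOf(key);
        if (i >= 0) {                 // assumption: an existing key gets its value replaced
            data[i].value = value;
            return;
        }
        if (size == data.length) {    // grow array if necessary
            data = Arrays.copyOf(data, 2 * size);
        }
        data[size++] = new Pair<>(key, value);
    }

    V get(K key) {
        int i = indexOf(key);
        return i >= 0 ? data[i].value : null; // assumption: null signals a missing key
    }

    boolean containsKey(K key) {
        return indexOf(key) >= 0;
    }

    void remove(K key) {
        int i = indexOf(key);
        if (i < 0) return;
        data[i] = data[size - 1];     // replace removed pair with the last pair
        data[size - 1] = null;
        size--;
    }

    int size() {
        return size;
    }
}
```

Every operation except size() goes through the linear scan in indexOf, which is where the Θ(n) worst case comes from.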

Review: Dictionaries
Why are we so obsessed with Dictionaries? It’s all about data, baby!
When dealing with data:
• Adding data to your collection
• Getting data out of your collection
• Rearranging data in your collection
SUPER common in comp sci:
- Databases
- Network router tables
- Compilers and Interpreters

Operation      |       | ArrayList | LinkedList | BST  | AVLTree
put(key,value) | best  | Θ(1)      | Θ(1)       | Θ(1) | Θ(1)
put(key,value) | worst | Θ(n)      | Θ(n)       | Θ(n) | Θ(log n)
get(key)       | best  | Θ(1)      | Θ(1)       | Θ(1) | Θ(1)
get(key)       | worst | Θ(n)      | Θ(n)       | Θ(n) | Θ(log n)
remove(key)    | best  | Θ(1)      | Θ(1)       | Θ(1) | Θ(log n)
remove(key)    | worst | Θ(n)      | Θ(n)       | Θ(n) | Θ(log n)

“In-Practice” Case
For Hash Tables, we’re going to talk about what you can expect “in-practice”, instead of just what the best and worst scenarios are.
Other resources (and previous versions of 373) use “average case”.
There’s a lot of math (beyond the scope of the course) needed to make “average” statements precise, so we’re not going to do it that way.
For this class, we’ll just tell you what assumptions we’re making about how the “real world” usually works, and then do worst-case analysis under those assumptions.

Can we do better?
What if we knew exactly where to find our data?
Implement a dictionary that accepts only integer keys between 0 and some value k.
-> Leverage array indices! This is a “direct address map”.

DirectAccessMap<Integer, V>
  state
    Data[]
    size
  behavior
    put: put item at given index
    get: get item at given index
    containsKey: if data[] is null at index, return false; return true otherwise
    remove: nullify element at index
    size: return count of items in dictionary

Operation      |       | Array w/ indices as keys
put(key,value) | best  | O(1)
put(key,value) | worst | O(1)
get(key)       | best  | O(1)
get(key)       | worst | O(1)
remove(key)    | best  | O(1)
remove(key)    | worst | O(1)

Implement Direct Access Map

    // ensureIndexNotNull is a helper whose body is not shown on the slide.
    public V get(int key) {
        this.ensureIndexNotNull(key);
        return this.array[key];
    }

    public void put(int key, V value) {
        this.array[key] = value;
    }

    public void remove(int key) {
        this.ensureIndexNotNull(key);
        this.array[key] = null;
    }
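A self-contained version of the direct access map might look like the sketch below. Several details are assumptions, since the slide does not show them: the class name DirectAccessMapSketch, the constructor taking the maximum key k, the body of ensureIndexNotNull (here it throws NoSuchElementException), size bookkeeping in put and remove, and the rule that values must be non-null (because null marks an empty slot).

```java
import java.util.NoSuchElementException;

// Illustrative sketch; everything not on the slide is an assumption (see lead-in).
class DirectAccessMapSketch<V> {
    private final V[] array;
    private int size;

    // Accepts integer keys from 0 to k inclusive.
    @SuppressWarnings("unchecked")
    DirectAccessMapSketch(int k) {
        this.array = (V[]) new Object[k + 1];
    }

    public boolean containsKey(int key) {
        return key >= 0 && key < array.length && array[key] != null;
    }

    // Assumed behavior: reject lookups and removals of keys that hold no value.
    private void ensureIndexNotNull(int key) {
        if (!containsKey(key)) {
            throw new NoSuchElementException("no value stored for key " + key);
        }
    }

    public V get(int key) {
        ensureIndexNotNull(key);
        return array[key];
    }

    // An out-of-range key fails with ArrayIndexOutOfBoundsException from the array access.
    public void put(int key, V value) {
        if (value == null) {
            throw new IllegalArgumentException("null values are not supported");
        }
        if (array[key] == null) {
            size++; // only count keys that were previously unused
        }
        array[key] = value;
    }

    public void remove(int key) {
        ensureIndexNotNull(key);
        array[key] = null;
        size--;
    }

    public int size() {
        return size;
    }
}
```

Every operation is a single array access, which is why both best and worst cases are O(1). The cost is memory: the array needs k + 1 slots no matter how few keys are stored.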
