SLIDE 1

Learning From Data Lecture 17 Memory and Efficiency in Nearest Neighbor

Memory Efficiency

M. Magdon-Ismail

CSCI 4100/6100

SLIDE 2

recap: Similarity and Nearest Neighbor

Similarity: d(x, x′) = ∥x − x′∥

[Figure: 1-NN rule and 21-NN rule decision boundaries]

  • 1. Simple.
  • 2. No training.
  • 3. Near optimal Eout: k → ∞, k/N → 0 ⇒ Eout → E∗out.
  • 4. Good ways to choose k: k = 3; k = √N; validation/cross-validation.
  • 5. Easy to justify classification to customer.
  • 6. Can easily do multi-class.
  • 7. Can easily adapt to regression or logistic regression:

regression:          g(x) = (1/k) Σᵢ₌₁ᵏ y[i](x)
logistic regression: g(x) = (1/k) Σᵢ₌₁ᵏ ⟦y[i](x) = +1⟧,
where y[i](x) is the label of the i-th nearest neighbor of x. (A code sketch follows this list.)
  • 8. Computationally demanding.
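
A minimal sketch of these rules in code (brute force with numpy; the function name knn_predict and its arguments are illustrative, not from the lecture):

    import numpy as np

    def knn_predict(X, y, x, k=3, mode="classify"):
        """Brute-force k-NN at query x; X is the N x d data, y the +-1 labels or real targets."""
        dists = np.linalg.norm(X - x, axis=1)      # distance from x to every data point
        nn = np.argsort(dists)[:k]                 # indices of the k nearest neighbors
        if mode == "classify":
            return np.sign(np.sum(y[nn]))          # majority vote (odd k avoids ties)
        if mode == "regress":
            return np.mean(y[nn])                  # g(x) = (1/k) * sum of neighbor targets
        if mode == "logistic":
            return np.mean(y[nn] == +1)            # g(x) = fraction of +1 neighbors

For example, knn_predict(X, y, x, k=21, mode="logistic") estimates the probability that y = +1 at x.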

SLIDE 3

Computational Demands of Nearest Neighbor

Memory.

Need to store all the data, O(Nd) memory.

N = 10⁶, d = 100, double precision ≈ 1 GB

Finding the nearest neighbor of a test point.

Need to compute distance to every data point, O(Nd).

N = 10⁶, d = 100, 3 GHz processor ≈ 3 ms (compute g(x))
N = 10⁶, d = 100, 3 GHz processor ≈ 1 hr (compute CV error)
N = 10⁶, d = 100, 3 GHz processor > 1 month (choose best k from among 1000 using CV)
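
A back-of-envelope check of these numbers (a sketch; the array sizes are the point, not the exact timings):

    import numpy as np

    N, d = 10**6, 100
    X = np.random.randn(N, d)                  # N*d doubles = 8e8 bytes, about 0.8 GB
    print(X.nbytes / 1e9, "GB")

    x = np.random.randn(d)
    dists = np.linalg.norm(X - x, axis=1)      # O(Nd) ~ 10^8 operations per query: a few ms
    nearest = np.argmin(dists)

    # Leave-one-out CV repeats this for each of the N training points: ~10^6 queries, roughly an hour.
    # Repeating that for 1000 candidate values of k pushes the total past a month.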

SLIDE 4

Two Basic Approaches

Reduce the amount of data.

The 5-year-old does not remember every horse he has seen, only a few representative horses.

Store the data in a specialized data structure.

Ongoing research field to develop geometric data structures to make finding nearest neighbors fast.

SLIDE 5

Throw Away Irrelevant Data

[Figure: data set → condensed data set, k = 1]

SLIDE 6

Decision Boundary Consistent

[Figure: data set → condensed data set; g(x) unchanged]

SLIDE 7

Training Set Consistent

[Figure: data set → condensed data set; g(xn) unchanged on the training points]
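
In code, training-set consistency of a condensed set S can be checked as below (a sketch for the 1-NN rule; the helper nn_classify is illustrative):

    import numpy as np

    def nn_classify(Xref, yref, x):
        """1-NN prediction at x using the reference points (Xref, yref)."""
        return yref[np.argmin(np.linalg.norm(Xref - x, axis=1))]

    def training_set_consistent(X, y, S):
        """S holds the indices of the condensed set; consistency means g_S(xn) = g_D(xn) for every xn."""
        return all(nn_classify(X[S], y[S], xn) == nn_classify(X, y, xn) for xn in X)

For the 1-NN rule this reduces to reproducing the training labels; for k > 1 (as on the later slides) it does not.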

SLIDE 8

Decision Boundary Vs. Training Set Consistent

[Figure: DB-consistent condensation (g(x) unchanged) versus TS-consistent condensation (g(xn) unchanged)]

SLIDE 9

Consistent Does Not Mean g(xn) = yn

[Figure: DB-consistent and TS-consistent condensations, k = 3]

SLIDE 10

Training Set Consistent (k = 3)

[Figure: data set → condensed data set; g(xn) unchanged]

SLIDE 11

CNN: Condensed Nearest Neighbor (k = 3)

[Figure: current condensed set S; the annotation marks the red point to add]

Consider the solid blue point:

  • i. blue w.r.t. selected points
  • ii. red w.r.t. D

Add a red point:

  • i. not already selected
  • ii. closest to the inconsistent point
  • 1. Randomly select k data points into S.
  • 2. Classify all data according to S.
  • 3. Let x∗ be an inconsistent point and y∗ its class w.r.t. D.
  • 4. Add the closest point to x∗ not in S that has class y∗.
  • 5. Iterate until S classifies all points consistently with D.
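
A sketch of these steps in code (numpy, binary ±1 labels; the helpers and tie-breaking are illustrative, and the added point is taken to be one whose class w.r.t. D is y∗):

    import numpy as np

    def knn_label(Xref, yref, x, k):
        """k-NN class of x using the reference set (Xref, yref); labels are +-1."""
        nn = np.argsort(np.linalg.norm(Xref - x, axis=1))[:k]
        return np.sign(np.sum(yref[nn]))

    def condense(X, y, k=3, seed=0):
        N = len(X)
        target = np.array([knn_label(X, y, X[n], k) for n in range(N)])            # class of each xn w.r.t. D
        S = list(np.random.default_rng(seed).choice(N, size=k, replace=False))     # 1. random initial S
        while True:
            bad = [n for n in range(N)                                             # 2./3. points inconsistent with D
                   if knn_label(X[S], y[S], X[n], k) != target[n]]
            if not bad:                                                            # 5. S classifies all points consistently
                return np.array(S)
            n_star, y_star = bad[0], target[bad[0]]                                # an inconsistent point x* and its class y*
            cand = [m for m in range(N) if m not in S and target[m] == y_star]
            S.append(min(cand, key=lambda m: np.linalg.norm(X[m] - X[n_star])))    # 4. add the closest such point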

SLIDE 12

CNN: Condensed Nearest Neighbor

[Figure: the chosen red point has been added to S]

SLIDE 13

CNN: Condensed Nearest Neighbor


Minimum consistent set (MCS)? Finding it is NP-hard.

SLIDE 14

Nearest Neighbor on Digits Data

[Figure: 1-NN and 21-NN decision regions on the digits data]

SLIDE 15

Condensing the Digits Data

[Figure: 1-NN and 21-NN decision regions after condensing the digits data]

SLIDE 16

Finding the Nearest Neighbor

  • 1. S1, S2 are ‘clusters’ with centers µ1, µ2 and radii r1, r2.
  • 2. [Branch] Search S1 first → x̂[1].
  • 3. The distance from x to any point in S2 is at least ∥x − µ2∥ − r2.
  • 4. [Bound] So we are done if ∥x − x̂[1]∥ ≤ ∥x − µ2∥ − r2.

A branch and bound algorithm. Can be applied recursively.

[Figure: query point x with clusters S1 and S2]
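
A sketch of the two-cluster branch-and-bound step (cluster membership, centers, and radii are assumed given; not code from the lecture):

    import numpy as np

    def nn_in(points, x):
        """Brute-force nearest neighbor of x among the given points."""
        return points[np.argmin(np.linalg.norm(points - x, axis=1))]

    def branch_and_bound_nn(x, S1, S2, mu2, r2):
        """S1 is the cluster whose center is closer to x; S2 is the other cluster (center mu2, radius r2)."""
        x_hat = nn_in(S1, x)                                           # [Branch] search S1 first
        if np.linalg.norm(x - x_hat) <= np.linalg.norm(x - mu2) - r2:
            return x_hat                                               # [Bound] nothing in S2 can be closer
        x_hat2 = nn_in(S2, x)                                          # bound failed: search S2 as well
        if np.linalg.norm(x - x_hat2) < np.linalg.norm(x - x_hat):
            return x_hat2
        return x_hat

Applied recursively, each cluster is itself split into sub-clusters and the same test prunes sub-clusters that cannot contain the nearest neighbor.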

SLIDE 17

When Does the Bound Hold?

Bound condition: ∥x − x̂[1]∥ ≤ ∥x − µ2∥ − r2.

∥x − x̂[1]∥ ≤ ∥x − µ1∥ + r1.
So it suffices that r1 + r2 ≤ ∥x − µ2∥ − ∥x − µ1∥.
∥x − µ1∥ ≈ 0 means ∥x − µ2∥ ≈ ∥µ2 − µ1∥.

It suffices that r1 + r2 ≤ ∥µ2 − µ1∥.

[Figure: query point x in cluster S1, with cluster S2]

Within-cluster spread should be less than between-cluster spread.

SLIDE 18

Finding Clusters – Lloyd’s Algorithm

  • 1. Pick well separated centers for each cluster.
  • 2. Compute Voronoi regions as the clusters.
  • 3. Update the Centers.
  • 4. Update the Voronoi regions.
  • 5. Compute centers and radii:

µj = (1/|Sj|) Σ_{xn ∈ Sj} xn;   rj = max_{xn ∈ Sj} ∥xn − µj∥.
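
A sketch of these steps (a plain Lloyd/k-means iteration; the greedy far-apart initialization matches the figures, and a fixed number of iterations stands in for a convergence check):

    import numpy as np

    def lloyd(X, M, iters=10, seed=0):
        """Partition X into M clusters; return centers mu and radii r."""
        rng = np.random.default_rng(seed)
        centers = [X[rng.integers(len(X))]]
        for _ in range(M - 1):                     # 1. pick well separated centers: each new center
            d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
            centers.append(X[np.argmax(d)])        #    is the point furthest from those picked so far
        mu = np.array(centers)
        for _ in range(iters):
            # 2./4. Voronoi regions: assign each point to its nearest center
            assign = np.argmin([np.linalg.norm(X - c, axis=1) for c in mu], axis=0)
            # 3. update each center to the mean of its Voronoi region (assumes no empty region)
            mu = np.array([X[assign == j].mean(axis=0) for j in range(M)])
        # 5. centers and radii: r_j is the distance from mu_j to its furthest member
        assign = np.argmin([np.linalg.norm(X - c, axis=1) for c in mu], axis=0)
        r = np.array([np.linalg.norm(X[assign == j] - mu[j], axis=1).max() for j in range(M)])
        return mu, r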

SLIDE 19–24

Finding Clusters – Lloyd’s Algorithm (continued)

[Figures: successive builds on example data: the furthest-away point becomes the next center, then the next furthest, until all centers are picked; the Voronoi regions are constructed, then the centers and the Voronoi regions are updated.]
SLIDE 25

Radial Basis Functions (RBF)

k-Nearest Neighbor: only considers the k nearest neighbors; each neighbor has equal weight.

What about using all the data to compute g(x)?

RBF: use all the data; data further away from x have less weight.
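
A sketch of the contrast (Gaussian weights are one common choice; the kernel and the scale r are illustrative assumptions, not fixed by this slide):

    import numpy as np

    def rbf_predict(X, y, x, r=1.0):
        """Use all the data: each point votes with a weight that decays with its distance from x."""
        w = np.exp(-0.5 * (np.linalg.norm(X - x, axis=1) / r) ** 2)   # far points get little weight
        return np.sign(np.sum(w * y))                                  # weighted vote, labels +-1

Setting the weights to 1 on the k nearest points and 0 elsewhere recovers the k-NN rule.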
