
PAC Learning: Matt Gormley, Lecture 14, Oct. 17, 2018



  1. 10-601 Introduction to Machine Learning, Machine Learning Department, School of Computer Science, Carnegie Mellon University. PAC Learning. Matt Gormley, Lecture 14, Oct. 17, 2018

  2. ML Big Picture
     Learning Paradigms: What data is available and when? What form of prediction?
       • supervised learning
       • unsupervised learning
       • semi-supervised learning
       • reinforcement learning
       • active learning
       • imitation learning
       • domain adaptation
       • online learning
       • density estimation
       • recommender systems
       • feature learning
       • manifold learning
       • dimensionality reduction
       • ensemble learning
       • distant supervision
       • hyperparameter optimization
     Problem Formulation: What is the structure of our output prediction?
       • boolean: Binary Classification
       • categorical: Multiclass Classification
       • ordinal: Ordinal Classification
       • real: Regression
       • ordering: Ranking
       • multiple discrete: Structured Prediction
       • multiple continuous: (e.g. dynamical systems)
       • both discrete & cont.: (e.g. mixed graphical models)
     Application Areas: Key challenges?
       • NLP, Speech, Computer Vision, Robotics, Medicine, Search
     Facets of Building ML Systems: How to build systems that are robust, efficient, adaptive, effective?
       1. Data prep
       2. Model selection
       3. Training (optimization / search)
       4. Hyperparameter tuning on validation data
       5. (Blind) Assessment on test data
     Big Ideas in ML: Which are the ideas driving development of the field?
       • inductive bias
       • generalization / overfitting
       • bias-variance decomposition
       • generative vs. discriminative
       • deep nets, graphical models
       • PAC learning
       • distant rewards
     Theoretical Foundations: What principles guide learning?
       • probabilistic
       • information theoretic
       • evolutionary search
       • ML as optimization

  3. LEARNING THEORY

  4. Questions For Today
       1. Given a classifier with zero training error, what can we say about generalization error? (Sample Complexity, Realizable Case)
       2. Given a classifier with low training error, what can we say about generalization error? (Sample Complexity, Agnostic Case)
       3. Is there a theoretical justification for regularization to avoid overfitting? (Structural Risk Minimization)

  5. PAC/SLT models for Supervised Learning (slide from Nina Balcan)
       • Data Source: distribution D on X
       • Expert / Oracle: target function c* : X → Y
       • Learning Algorithm: receives labeled examples (x_1, c*(x_1)), …, (x_m, c*(x_m)) and outputs h : X → Y
     [Figure: labeled + and - examples, and an example hypothesis with tests x_1 > 5 and x_6 > 2 that predicts +1 or -1]

  6. Two Types of Error
       • True Error (a.k.a. expected risk)
       • Train Error (a.k.a. empirical risk)
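     The formulas on this slide did not survive in the transcript. As a hedged reconstruction, the standard definitions in the notation of the PAC/SLT slide (distribution D, target c*, m labeled examples) are given below; the symbols R and \hat{R} are assumptions, not taken from the transcript.

```latex
% True error (expected risk): probability of disagreeing with c^* on a fresh draw from D
R(h) = P_{x \sim D}\bigl( c^*(x) \neq h(x) \bigr)

% Train error (empirical risk): fraction of the m training examples that h gets wrong
\hat{R}(h) = \frac{1}{m} \sum_{i=1}^{m} \mathbb{1}\bigl[ c^*(x_i) \neq h(x_i) \bigr]
```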

  7. PAC / SLT Model

  8. Three Hypotheses of Interest

  9. PAC LEARNING

  10. Probably Approximately Correct (PAC) Learning
      Whiteboard:
        – PAC Criterion
        – Meaning of “Probably Approximately Correct”
        – PAC Learnable
        – Consistent Learner
        – Sample Complexity
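      The whiteboard content is not in the transcript. One common way to state the PAC criterion is sketched below, with R(h) the true error, \hat{R}(h) the train error, ε the accuracy parameter and δ the confidence parameter; the exact form used in the lecture is an assumption.

```latex
% "Approximately correct": train and true error differ by at most \epsilon.
% "Probably": this holds with probability at least 1 - \delta over the training sample.
\forall h \in H: \quad
P\bigl( \lvert R(h) - \hat{R}(h) \rvert \le \epsilon \bigr) \ge 1 - \delta
```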

  11. Generalization and Overfitting
      Whiteboard:
        – Realizable vs. Agnostic Cases
        – Finite vs. Infinite Hypothesis Spaces

  12. PAC Learning

  13. SAMPLE COMPLEXITY RESULTS

  14. Sample Complexity Results: Four Cases we care about… We’ll start with the finite case… [Table columns: Realizable, Agnostic]

  15. Sample Complexity Results: Four Cases we care about… [Table columns: Realizable, Agnostic]
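      The table body is missing from the transcript. For reference, the textbook bounds for the four cases (finite vs. infinite hypothesis space, realizable vs. agnostic) are sketched below. These are the standard statements and may differ in constants or notation from the slide; m is the number of labeled examples and VC(H) is the VC dimension of H.

```latex
% Finite H, realizable: w.p. >= 1 - \delta, every h with \hat{R}(h) = 0 has R(h) <= \epsilon
m \ge \frac{1}{\epsilon} \left[ \ln\lvert H \rvert + \ln\frac{1}{\delta} \right]

% Finite H, agnostic: w.p. >= 1 - \delta, every h has |R(h) - \hat{R}(h)| <= \epsilon
m \ge \frac{1}{2\epsilon^{2}} \left[ \ln\lvert H \rvert + \ln\frac{2}{\delta} \right]

% Infinite H, realizable
m = O\!\left( \frac{1}{\epsilon} \left[ \mathrm{VC}(H) \ln\frac{1}{\epsilon} + \ln\frac{1}{\delta} \right] \right)

% Infinite H, agnostic
m = O\!\left( \frac{1}{\epsilon^{2}} \left[ \mathrm{VC}(H) + \ln\frac{1}{\delta} \right] \right)
```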

  16. Example: Conjunctions
      In-Class Quiz: Suppose H = class of conjunctions over x in {0,1}^M. If M = 10, ε = 0.1, δ = 0.01, how many examples suffice? [Table columns: Realizable, Agnostic]
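      The quiz can be worked numerically with the standard finite-hypothesis-space bounds. The sketch below assumes |H| = 3^M (each of the M variables appears positively, appears negated, or is absent) and accuracy parameter ε = 0.1; the function names are hypothetical, and the lecture's intended count of |H| may differ slightly.

```python
import math


def conjunction_hypothesis_count(num_features: int) -> int:
    """Assumed size of H: each variable is positive, negated, or absent."""
    return 3 ** num_features


def realizable_sample_bound(h_size: int, epsilon: float, delta: float) -> int:
    """Finite H, realizable case: m >= (1/eps) * (ln|H| + ln(1/delta))."""
    return math.ceil((1.0 / epsilon) * (math.log(h_size) + math.log(1.0 / delta)))


def agnostic_sample_bound(h_size: int, epsilon: float, delta: float) -> int:
    """Finite H, agnostic case: m >= (1/(2 eps^2)) * (ln|H| + ln(2/delta))."""
    return math.ceil(
        (1.0 / (2.0 * epsilon ** 2)) * (math.log(h_size) + math.log(2.0 / delta))
    )


h_size = conjunction_hypothesis_count(10)
realizable = realizable_sample_bound(h_size, epsilon=0.1, delta=0.01)
agnostic = agnostic_sample_bound(h_size, epsilon=0.1, delta=0.01)
print(realizable)  # 156
print(agnostic)    # 815
```

      Under these assumptions roughly 156 examples suffice in the realizable case and roughly 815 in the agnostic case; the 1/ε² dependence is what makes the agnostic bound so much larger.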

  17. Learning Theory Objectives
      You should be able to…
        • Identify the properties of a learning setting and assumptions required to ensure low generalization error
        • Distinguish true error, train error, test error
        • Define PAC and explain what it means to be approximately correct and what occurs with high probability
        • Apply sample complexity bounds to real-world learning examples
        • Distinguish between a large sample and a finite sample analysis
        • Theoretically motivate regularization
