

SLIDE 1

Machine Learning

Supervised Learning: The Setup

SLIDE 2

Last lecture

We saw
– What is learning?
  • Learning as generalization
– The badges game

SLIDE 3

This lecture

• More badges
• Formalizing supervised learning
  – Instance space and features: What are the inputs to the learning problem?
  – Label space: What is the output of the learned function?
  – Hypothesis space: What is being learned?

Some slides based on lectures from Tom Dietterich, Dan Roth

SLIDE 4

The badges game

SLIDE 5

Let’s play

(Full data on the class website; you can stare at it longer if you want)

Name                   Label
Claire Cardie          -
Peter Bartlett         +
Eric Baum              +
Haym Hirsh             -
Leslie Pack Kaelbling  +
Yoav Freund            -

SLIDE 6

Let’s play

What is the label for Indiana Jones?

SLIDE 7

Let’s play

How were the labels generated?

SLIDE 8

Let’s play

How were the labels generated?

If the last letter of the first name is before the last letter of the last name: label = +
else: label = -
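The hidden rule can be sketched in Python (the function name and the first/last-name split are our own choices, not from the slides):

```python
# The badges rule: "+" if the last letter of the first name comes
# alphabetically before the last letter of the last name, else "-".

def badge_label(name: str) -> str:
    parts = name.split()
    first, last = parts[0], parts[-1]
    return "+" if first[-1].lower() < last[-1].lower() else "-"

print(badge_label("Peter Bartlett"))  # 'r' < 't' → +
print(badge_label("Claire Cardie"))   # 'e' < 'e' is False → -
```

Running it on the table above reproduces every label, including the held-out Indiana Jones question ('a' < 's' gives +).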
SLIDE 9

Questions to think about

• How could you be certain that you got the right function?
• How did you arrive at it?

Learning issues:
• Is this prediction or just modeling data? Is there a difference?
• How did you know that you should look at the letters?
• What background knowledge about letters did you use? How did you know that it is relevant?
• What “learning algorithm” did you use?

SLIDE 10

What is supervised learning?

SLIDE 11

Instances and Labels

Running example: Automatically tag news articles

SLIDE 12

Instances and Labels

Running example: Automatically tag news articles
An instance: a news article that needs to be classified
A label: the tag to be assigned to it


SLIDE 14

Instances and Labels

Running example: Automatically tag news articles
Instance Space: All possible news articles
Label Space: All possible labels

SLIDE 15

Instances and Labels

𝒴: Instance Space
The set of examples that need to be classified
E.g., the set of all possible names, documents, sentences, images, emails, etc.

SLIDE 16

Instances and Labels

𝒴: Instance Space
The set of examples that need to be classified
E.g., the set of all possible names, documents, sentences, images, emails, etc.

𝒵: Label Space
The set of all possible labels
E.g., {Spam, Not-Spam}, {+, -}, etc.

SLIDE 17

Instances and Labels

𝒴: Instance Space
The set of examples that need to be classified

Target function: 𝑧 = 𝑔(𝑦)

𝒵: Label Space
The set of all possible labels

SLIDE 18

Instances and Labels

Target function: 𝑧 = 𝑔(𝑦), mapping the instance space 𝒴 to the label space 𝒵

The goal of learning: Find this target function

Learning is search over functions

SLIDE 19

Supervised learning

Target function: 𝑧 = 𝑔(𝑦), mapping the instance space 𝒴 to the label space 𝒵

The learning algorithm only sees examples of the function 𝑔 in action

SLIDE 20

Supervised learning

The learning algorithm only sees examples of the function 𝑔 in action

Labeled training data:
(𝑦₁, 𝑔(𝑦₁))
(𝑦₂, 𝑔(𝑦₂))
(𝑦₃, 𝑔(𝑦₃))
⋮
(𝑦ₙ, 𝑔(𝑦ₙ))

SLIDE 21

Supervised learning

Labeled training data (𝑦₁, 𝑔(𝑦₁)), (𝑦₂, 𝑔(𝑦₂)), …, (𝑦ₙ, 𝑔(𝑦ₙ)) → Learning algorithm

SLIDE 22

Supervised learning

The learning algorithm only sees examples of the function 𝑔 in action

Labeled training data (𝑦₁, 𝑔(𝑦₁)), (𝑦₂, 𝑔(𝑦₂)), …, (𝑦ₙ, 𝑔(𝑦ₙ)) → Learning algorithm → A learned function ℎ: 𝒴 → 𝒵

SLIDE 23

Supervised learning

Labeled training data (𝑦₁, 𝑔(𝑦₁)), (𝑦₂, 𝑔(𝑦₂)), …, (𝑦ₙ, 𝑔(𝑦ₙ)) → Learning algorithm → A learned function ℎ: 𝒴 → 𝒵

This is the training phase.
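As a toy illustration of the training phase, a “learning algorithm” is any procedure that consumes labeled pairs and returns a function h. The majority-vote learner below is a deliberately crude stand-in of our own invention, not an algorithm from the slides:

```python
# Training-phase sketch: train() takes labeled pairs (y, g(y)) and
# returns a learned function h: Y -> Z. This toy learner ignores the
# instance and always predicts the most common training label.

from collections import Counter

def train(labeled_data):
    """labeled_data: list of (instance, label) pairs."""
    majority = Counter(label for _, label in labeled_data).most_common(1)[0][0]
    def h(y):
        return majority          # ignores the instance entirely!
    return h

h = train([("Eric Baum", "+"), ("Haym Hirsh", "-"), ("Peter Bartlett", "+")])
print(h("Indiana Jones"))        # → +
```

Even this degenerate learner fits the interface: training data in, function ℎ out. Better algorithms differ only in how cleverly they pick ℎ.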

SLIDE 24

Supervised learning

Labeled training data (𝑦₁, 𝑔(𝑦₁)), (𝑦₂, 𝑔(𝑦₂)), …, (𝑦ₙ, 𝑔(𝑦ₙ)) → Learning algorithm → A learned function ℎ: 𝒴 → 𝒵

Can you think of other training protocols?

SLIDE 25

Supervised learning: Evaluation

Target function: 𝑧 = 𝑔(𝑦)
Learned function: 𝑧 = ℎ(𝑦)

SLIDE 26

Supervised learning: Evaluation

Draw a test example 𝑦 ∈ 𝒴 and compare 𝑔(𝑦) with ℎ(𝑦): Are they different? How different?

SLIDE 27

Supervised learning: Evaluation

Apply the model to many test examples and compare its predictions to the target’s predictions. Aggregate these results to get a quality measure.

SLIDE 28

Supervised learning: Evaluation

Apply the model to many test examples and compare its predictions to the target’s predictions.

Can we use these test examples during the training phase?
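The evaluation protocol can be sketched as an accuracy computation; g and h below are hypothetical stand-ins for the target and learned functions:

```python
# Evaluation sketch: draw test examples, compare the learned function h
# to the target g on each, and aggregate into a quality measure (here:
# accuracy, the fraction of agreements).

def accuracy(h, g, test_examples):
    agree = sum(1 for y in test_examples if h(y) == g(y))
    return agree / len(test_examples)

g = lambda y: "+" if len(y) % 2 == 0 else "-"   # hypothetical target
h = lambda y: "+"                               # a (bad) learned function
print(accuracy(h, g, ["ab", "abc", "abcd", "abcde"]))  # → 0.5
```

Other aggregate measures (error rate, squared loss for regression, etc.) fit the same shape: per-example comparison, then aggregation.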

SLIDE 29

Supervised learning: General setting

Given: Training examples that are pairs of the form (𝑦, 𝑔(𝑦))

SLIDE 30

Supervised learning: General setting

Given: Training examples that are pairs of the form (𝑦, 𝑔(𝑦))

The function 𝑔 is unknown

SLIDE 31

Supervised learning: General setting

Given: Training examples that are pairs of the form (𝑦, 𝑔(𝑦)); the function 𝑔 is unknown

Typically the input 𝑦 is represented as a feature vector
• Example: 𝑦 ∈ {0,1}ᵈ or 𝑦 ∈ ℜᵈ (d-dimensional vectors)
• A deterministic mapping from instances in your problem (e.g., news articles) to features

SLIDE 32

Supervised learning: General setting

For a training example (𝑦, 𝑔(𝑦)), the value of 𝑔(𝑦) is called its label

SLIDE 33

Supervised learning: General setting

The goal of learning: Use the training examples to find a good approximation of 𝑔

SLIDE 34

Supervised learning: General setting

The label space determines the kind of problem we have
• Binary classification: label space = {-1, 1}
• Multiclass classification: label space = {1, 2, 3, …, K}
• Regression: label space = ℜ

SLIDE 35

Supervised learning: General setting

Questions?

SLIDE 36

Examples of binary classification
(the label space consists of two elements)

• Spam filtering: Is an email spam or not?
• Recommendation systems: Given a user’s movie preferences, will she like a new movie?
• Anomaly detection: Is a smartphone app malicious? Is a Twitter user a bot?
• Authorship identification: Were these two documents written by the same person?
• Time series prediction: Will the future value of a stock increase or decrease with respect to its current value?

SLIDE 37

On supervised learning

We should be able to decide:

1. What is our instance space?
   What are the inputs to the problem? What are the features?
2. What is our label space?
   What is the prediction task?
3. What is our hypothesis space?
   What functions should the learning algorithm search over?
4. What is our learning algorithm?
   How do we learn from the labeled data?
5. What is our loss function or evaluation metric?
   What is success?

SLIDE 38

1. The Instance Space 𝒴

𝒴: Instance Space
The set of examples that need to be classified
E.g., the set of all possible names, documents, sentences, images, emails, etc.

SLIDE 39

1. The Instance Space 𝒴

Designing an appropriate feature representation of the instance space is crucial

Instances 𝑦 ∈ 𝒴 are defined by features/attributes

Features could be Boolean
• Example: Does the email contain the word “free”?

Features could be real valued
• Example: What is the height of the person?
• Example: What was the stock price yesterday?

Features could be hand-crafted or themselves learned

SLIDE 40

Instances as feature vectors

A feature function maps an input to the problem (e.g., emails, names, images) to a feature vector

SLIDE 41

Instances as feature vectors

Feature functions, also known as feature extractors:
• Often deterministic, but could also be learned
• Convert the examples to a collection of attributes
• Typically thought of as high-dimensional vectors
• An important part of the design of a learning-based solution

SLIDE 42

1. The Instance Space 𝒴

Features are supposed to capture all the information needed for a learned system to make its prediction
– Think of them as the sensory inputs for the learned system

Not all information about the instances is necessary or relevant
– Bad features could even confuse a learner

What might be good features for the badges game?

SLIDE 43

Instances as feature vectors

• Feature functions convert inputs to vectors
• The instance space 𝒴 is a 𝑑-dimensional vector space (e.g., ℜᵈ or {0,1}ᵈ)
  – Each dimension is one feature; we have 𝑑 features in all
• Each 𝑥 ∈ 𝒴 is a feature vector
  – Each 𝑥 = [𝑥₁, 𝑥₂, ⋯, 𝑥_d] is a point in the 𝑑-dimensional vector space

SLIDE 44

Instances as feature vectors

(Figure: instances plotted as points in a feature space with axes x₁ and x₂)


SLIDE 46

Feature functions produce feature vectors

When designing feature functions, think of them as templates

– Feature: “The second letter of the name”
  • Naoki → a → [1 0 0 0 …]
  • Abe → b → [0 1 0 0 …]
  • Manning → a → [1 0 0 0 …]
  • Scrooge → c → [0 0 1 0 …]

– Feature: “The length of the name”
  • Naoki → 5
  • Abe → 3

– Feature: “The second letter of the name, the length of the first name, the length of the last name”
  • Naoki Abe → [1 0 0 0 … 5 3]


SLIDE 48

Feature functions produce feature vectors

What is the dimensionality of these feature vectors?

SLIDE 49

Feature functions produce feature vectors

What is the dimensionality of these feature vectors? 26 (one dimension per letter)

SLIDE 50

Feature functions produce feature vectors

Such vectors, where exactly one dimension is 1 and all others are zero, are called one-hot vectors.

This is the one-hot representation of the feature “The second letter of the name”
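The one-hot and length features can be sketched as follows (the function names are ours, and lowercase a–z letters are assumed):

```python
# Feature-function sketch: a one-hot vector for the second letter of
# the first name, concatenated with the lengths of the first and last
# names, as in the "Naoki Abe" example.

import string

def one_hot_second_letter(word):
    vec = [0] * 26                     # one dimension per letter
    vec[string.ascii_lowercase.index(word[1].lower())] = 1
    return vec

def features(name):
    parts = name.split()
    first, last = parts[0], parts[-1]
    # features are accumulated by concatenating the vectors
    return one_hot_second_letter(first) + [len(first), len(last)]

v = features("Naoki Abe")
print(v[0], v[-2:])   # second letter 'a' sets the first dimension; lengths [5, 3]
```

Concatenation is why each template adds dimensions: 26 for the one-hot letter plus 2 for the lengths gives a 28-dimensional vector here.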


SLIDE 52

Feature functions produce feature vectors

Features can be accumulated by concatenating the vectors

SLIDE 53

Good features are essential

• Good features decide how well a task can be learned
  – E.g., a bad feature for the badges game: “Is there a day of the week that begins with the last letter of the first name?”
  – Something to think about: Why would we think that this is a bad feature?

• Much effort goes into designing features
  – Or learning them

• We will touch upon general principles for designing good features
  – But feature definition is largely domain specific
  – Comes with experience

SLIDE 54

On supervised learning

✓ What is our instance space?
   What are the inputs to the problem? What are the features?
2. What is our label space?
   What is the learning task?
3. What is our hypothesis space?
   What functions should the learning algorithm search over?
4. What is our learning algorithm?
   How do we learn from the labeled data?
5. What is our loss function or evaluation metric?
   What is success?

SLIDE 55

2. The Label Space 𝒵

𝒵: Label Space
The set of all possible labels
E.g., {Spam, Not-Spam}, {+, -}, etc.


SLIDE 57

The label space depends on the nature of the problem

Classification: The outputs are categorical
– Binary classification: Two possible labels
  • We will see a lot of this
– Multiclass classification: K possible labels
  • We may see a bit of this if time permits
– Structured classification: Graph-valued outputs
  • A different class

Classification is the primary focus of this class

SLIDE 58

The label space depends on the nature of the problem

The output space can be numerical/ordinal
– Regression
  • The label space 𝒵 is the set (or a subset) of real numbers
– Ranking
  • Labels are ordinal; that is, there is an ordering over the labels
  • E.g., a Yelp 5-star review is only slightly different from a 4-star review, but very different from a 1-star review

SLIDE 59

On supervised learning

✓ What is our instance space?
   What are the inputs to the problem? What are the features?
✓ What is our label space?
   What is the learning task?
3. What is our hypothesis space?
   What functions should the learning algorithm search over?
4. What is our learning algorithm?
   How do we learn from the labeled data?
5. What is our loss function or evaluation metric?
   What is success?

SLIDE 60

3. The Hypothesis Space

The goal of learning: Find the target function 𝑔: 𝒴 → 𝒵

Learning is search over functions


SLIDE 62

Example of search over functions

Unknown function: inputs 𝑦₁, 𝑦₂ and output 𝑧 = 𝑔(𝑦₁, 𝑦₂)
(Truth table of observed examples shown on the slide)

Can you learn this function? What is it?

Assume that 1 stands for True and 0 stands for False

SLIDE 63

The fundamental problem: Machine learning is ill-posed!

Unknown function f: inputs x₁, x₂, x₃, x₄ and output y = f(x₁, x₂, x₃, x₄)
(Truth table of observed examples shown on the slide)

Can you learn this function? What is it?

SLIDE 64

Is learning possible at all?

There are 2¹⁶ = 65536 possible Boolean functions over 4 inputs
– Why? There are 16 possible input patterns. Each way to fill in these 16 output slots is a different function, giving 2¹⁶ functions.

• We have seen only 7 outputs
• How could we possibly know the rest without seeing every label?
  – Think of an adversary filling in the labels every time you make a guess at the function
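The counting argument can be checked with a few lines of Python:

```python
# Counting Boolean functions: over n inputs there are 2**n input
# patterns, and each assignment of outputs to those patterns is a
# distinct function, giving 2**(2**n) functions in total.

from itertools import product

n = 4
rows = list(product([0, 1], repeat=n))   # all 2**n input patterns
print(len(rows))                         # → 16
print(2 ** len(rows))                    # → 65536

# After observing outputs for 7 of the 16 rows, the other 9 remain
# unconstrained: 2**9 = 512 functions are still consistent.
print(2 ** (len(rows) - 7))              # → 512
```

The 512 consistent functions are exactly the adversary's room to maneuver: any guess you make about an unseen row can still be contradicted.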


SLIDE 67

Is learning possible at all?

How could we possibly learn anything?

SLIDE 68

Solution: Restrict the search space

A hypothesis space is the set of possible functions we consider
– We were looking at the space of all Boolean functions
– Instead, choose a hypothesis space that is not all possible functions, e.g.:
  • Only simple conjunctions (with four variables, there are only 16 conjunctions without negations)
  • m-of-n rules: Pick a set of n variables; at least m of them must be true
  • Linear functions
  • Deep neural networks

How do we pick a hypothesis space?
– Using some prior knowledge (or by guessing)
(The “When in doubt, make an assumption” school of thought!)

What if the hypothesis space is so small that nothing in it agrees with the data?
– We need a hypothesis space that is flexible enough

SLIDE 69

Hypothesis space 1: Simple conjunctions

There are only 16 simple conjunctive rules, of the form ℎ(𝑦) = 𝑦ᵢ ∧ 𝑦ⱼ ∧ 𝑦ₖ ⋯

SLIDE 70

Hypothesis space 1: Simple conjunctions

There are only 16 simple conjunctive rules, of the form ℎ(𝑦) = 𝑦ᵢ ∧ 𝑦ⱼ ∧ 𝑦ₖ ⋯

Rule           Counterexample    Rule                  Counterexample
Always False   1001              𝑦₂ ∧ 𝑦₃               0011
𝑦₁             1100              𝑦₂ ∧ 𝑦₄               0011
𝑦₂             0100              𝑦₃ ∧ 𝑦₄               1001
𝑦₃             0110              𝑦₁ ∧ 𝑦₂ ∧ 𝑦₃          0011
𝑦₄             0101              𝑦₁ ∧ 𝑦₂ ∧ 𝑦₄          0011
𝑦₁ ∧ 𝑦₂        1100              𝑦₁ ∧ 𝑦₃ ∧ 𝑦₄          0011
𝑦₁ ∧ 𝑦₃        0011              𝑦₂ ∧ 𝑦₃ ∧ 𝑦₄          0011
𝑦₁ ∧ 𝑦₄        0011              𝑦₁ ∧ 𝑦₂ ∧ 𝑦₃ ∧ 𝑦₄     0011

SLIDE 71

Hypothesis space 1: Simple conjunctions

Exercise: How many simple conjunctions are possible when there are n inputs instead of 4?
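One way to answer the exercise: each of the n variables is either in the conjunction or not, giving 2ⁿ rules (counting the empty rule, as the slide's table does with "Always False"). A sketch:

```python
# Enumerate simple conjunctions (no negations) over n variables: a rule
# is just the subset of variables it conjoins, so there are 2**n rules.

from itertools import combinations

def simple_conjunctions(n):
    rules = []
    for k in range(n + 1):
        rules.extend(combinations(range(n), k))  # variables in the conjunction
    return rules

print(len(simple_conjunctions(4)))   # → 16
```

Whether the empty conjunction counts as "Always True" or "Always False" is a convention; either way the count is 2ⁿ.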

SLIDE 72

Hypothesis space 1: Simple conjunctions

Is there a consistent hypothesis in this space?


SLIDE 74

Hypothesis space 1: Simple conjunctions

No simple conjunction explains the data! (Confirm each counterexample by going through the list afterwards)

Our hypothesis space is too small, and the true function we were looking for is not in it.


SLIDE 77

Hypothesis space 2: m-of-n rules

Pick a subset of 𝑛 variables. The label 𝑦 = 1 if at least 𝑚 of them are 1.

Example: If at least 2 of {x₁, x₃, x₄} are 1, then the output is 1. Otherwise, the output is 0.

Is there a consistent hypothesis in this space? Exercise: Check if there is one. First, how many m-of-n rules are there for four variables?
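An m-of-n rule can be sketched as a small higher-order function (the function name and index convention are ours, matching the slide's example):

```python
# An m-of-n rule as a hypothesis: choose n variables and predict 1
# when at least m of them are 1.

def m_of_n_rule(m, indices):
    def h(x):                      # x is a tuple/list of 0/1 inputs
        return 1 if sum(x[i] for i in indices) >= m else 0
    return h

# The slide's example: at least 2 of {x1, x3, x4} (0-based indices).
h = m_of_n_rule(2, [0, 2, 3])
print(h((1, 0, 1, 0)))             # x1 and x3 are 1 → 1
print(h((0, 1, 1, 0)))             # only x3 is 1 → 0
```

Note how this space strictly contains the simple conjunctions: an n-of-n rule over a variable set is exactly the conjunction of those variables.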

SLIDE 78

Restricting the hypothesis space

• Our guess of the hypothesis space may be incorrect

• General strategy
  – Pick an expressive hypothesis space for expressing concepts
    • Concept = the target classifier that is hidden from us. Sometimes we may call it the oracle.
    • Example hypothesis spaces: m-of-n functions, decision trees, linear functions, grammars, multi-layer deep networks, etc.
  – Develop algorithms that find an element of the hypothesis space that fits the data well (or well enough)
  – Hope that it generalizes

SLIDE 79

Perspectives on learning

Learning is the removal of remaining uncertainty over a hypothesis space
– If we knew that the unknown function is a simple conjunction, we could use the training data to figure out which one it is

Requires guessing a good, small hypothesis class
– And we could be wrong
– We could find a consistent hypothesis and still be incorrect on a new example!

SLIDE 80

On using supervised learning

✓ What is our instance space?
   What are the inputs to the problem? What are the features?
✓ What is our label space?
   What is the learning task?
✓ What is our hypothesis space?
   What functions should the learning algorithm search over?
4. What is our learning algorithm?
   How do we learn from the labeled data?
5. What is our loss function or evaluation metric?
   What is success?

Much of the rest of this semester