SLIDE 1

CS 6355: Structured Prediction

Training Strategies


SLIDE 2

So far we saw

  • What is structured output prediction?
  • Different ways for modeling structured prediction

– Conditional random fields, factor graphs, constraints

  • What we only occasionally touched upon:

– Algorithms for training and inference

  • Viterbi (inference in sequences)
  • Structured perceptron (training in general)


SLIDE 3

Rest of the semester

  • Strategies for training

– Structural SVM
– Stochastic gradient descent
– More on local vs. global training

  • Algorithms for inference

– Exact inference
– “Approximate” inference
– Formulating inference problems in general

  • Latent/hidden variables, representations and such


SLIDE 4

Up next

  • Structural Support Vector Machine

– How it naturally extends multiclass SVM

  • Empirical Risk Minimization

– Or: how structural SVM and CRF are solving very similar problems

  • Training Structural SVM via stochastic gradient descent

– And some tricks


SLIDE 5

Where are we?

  • Structural Support Vector Machine

– How it naturally extends multiclass SVM

  • Empirical Risk Minimization

– Or: how structural SVM and CRF are solving very similar problems

  • Training Structural SVM via stochastic gradient descent

– And some tricks


SLIDE 6

Recall: Binary and Multiclass SVM

  • Binary SVM

– Maximize margin
– Equivalently: minimize the norm of the weights such that the closest points to the hyperplane have a score of ±1

  • Multiclass SVM

– Each label has a different weight vector (like one-vs-all)
– Maximize the multiclass margin
– Equivalently: minimize the total norm of the weights such that the true label is scored at least 1 more than the second-best one
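The objectives themselves appear as images in the original deck. As a reference, the standard hard (separable-case) formulations being described are:

  Binary SVM:      \min_{\mathbf{w}} \; \tfrac{1}{2}\|\mathbf{w}\|^2 \quad \text{s.t.} \quad y_i \, \mathbf{w}^\top \mathbf{x}_i \ge 1 \;\; \forall i

  Multiclass SVM:  \min_{\mathbf{w}} \; \tfrac{1}{2}\sum_k \|\mathbf{w}_k\|^2 \quad \text{s.t.} \quad \mathbf{w}_{y_i}^\top \mathbf{x}_i \ge \mathbf{w}_k^\top \mathbf{x}_i + 1 \;\; \forall i, \; \forall k \neq y_i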



SLIDE 8

Multiclass SVM in the separable case

[Equation: the hard multiclass SVM, as above. Callouts: recall the hard binary SVM; we have a data set D = {(x_i, y_i)}; the score for the true label is higher than the score for any other label by 1; the size of the weights acts, effectively, as a regularizer.]


SLIDE 11

Structural SVM: First attempt

Suppose we have some definition of a structure (a factor graph)

– And feature definitions for each “part” p as \Phi_p(\mathbf{x}, \mathbf{y}_p)
– Remember: we can talk about the feature vector for the entire structure

  • Features sum over the parts:

  \Phi(\mathbf{x}, \mathbf{y}) = \sum_{p \in \mathrm{parts}(\mathbf{x})} \Phi_p(\mathbf{x}, \mathbf{y}_p)

We also have a data set D = \{(\mathbf{x}_i, \mathbf{y}_i)\}

What we want from training (following the multiclass idea), for each training example (\mathbf{x}_i, \mathbf{y}_i):

– The annotated structure \mathbf{y}_i gets the highest score among all structures
– Or, to be safe, \mathbf{y}_i gets a score that is at least one more than all other structures:

  \forall \mathbf{y} \neq \mathbf{y}_i: \quad \mathbf{w}^\top \Phi(\mathbf{x}_i, \mathbf{y}_i) \ge \mathbf{w}^\top \Phi(\mathbf{x}_i, \mathbf{y}) + 1


SLIDE 16

Structural SVM: First attempt

[Equation: maximize the margin by minimizing the norm of w. For every training example (an input with its gold structure), the score for the gold structure is constrained against the score for any other structure.]
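The formulation is an image in the deck. Putting together the constraint from SLIDE 11 with the callouts here, the intended problem reads:

  \min_{\mathbf{w}} \;\; \frac{1}{2}\mathbf{w}^\top\mathbf{w}

  \text{s.t.} \quad \mathbf{w}^\top \Phi(\mathbf{x}_i, \mathbf{y}_i) \ge \mathbf{w}^\top \Phi(\mathbf{x}_i, \mathbf{y}) + 1 \qquad \forall i, \; \forall \mathbf{y} \neq \mathbf{y}_i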


SLIDE 22

Structural SVM: First attempt

Problem

Gold structure
Other structure A: only one mistake
Other structure B: fully incorrect

Structure B is more wrong, but this formulation will be equally happy if both A and B are scored one less than the gold. No partial credit!

[Figure: the formulation from before, annotated with the gold structure and the two competing structures.]


SLIDE 24

Structural SVM: Second attempt

Hamming distance between structures: counts the number of differences between them.

[Figure: maximize the margin by minimizing the norm of w; the score for the gold structure vs. the score for other structures.]
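One standard way to write the Hamming distance the slide describes, counting part-by-part disagreements between a structure y and the gold y_i (with the zero-at-gold convention stated later on SLIDE 32), is:

  \Delta(\mathbf{y}, \mathbf{y}_i) = \sum_{p} \mathbf{1}\left[ y_p \neq (y_i)_p \right], \qquad \text{so that } \Delta(\mathbf{y}_i, \mathbf{y}_i) = 0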



SLIDE 32

Structural SVM: Second attempt

Intuition

  • It is okay for a structure that is close (in the Hamming sense) to the true one to get a score that is close to that of the true structure
  • Structures that are very different from the true structure should get much lower scores

[Figure: maximize the margin by minimizing the norm of w; for an input with its gold structure, the score for the gold must exceed the score for another structure y (which could be y_i) by the Hamming distance between them, defined to be zero if y = y_i.]
SLIDE 34

Structural SVM: Second attempt

Problem? What if the data is not separable? That is, what if these constraints are not satisfied by any w for a given dataset?

SLIDE 36

Structural SVM: Third attempt

[Figure: the full formulation. Maximize the margin by minimizing the norm of w, and also minimize the total slack, with one slack variable per example, which must be positive. For every input with its gold structure and all other structures, the score for gold is separated from the score for the other by the Hamming distance, softened by the slack.]
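The formulation itself is an image; reconstructing it from the callouts above and the notation of the earlier slides, the third attempt is:

  \min_{\mathbf{w}, \xi} \;\; \frac{1}{2}\mathbf{w}^\top\mathbf{w} + C \sum_i \xi_i

  \text{s.t.} \quad \mathbf{w}^\top \Phi(\mathbf{x}_i, \mathbf{y}_i) \ge \mathbf{w}^\top \Phi(\mathbf{x}_i, \mathbf{y}) + \Delta(\mathbf{y}, \mathbf{y}_i) - \xi_i \qquad \forall i, \; \forall \mathbf{y}

  \xi_i \ge 0 \qquad \forall i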



SLIDE 39

Structural SVM: Third attempt

For every labeled example (an input with its gold structure) and every competing structure, the score for the ground truth should be greater than the score for the competing structure by the Hamming distance between them.


SLIDE 42

Structural SVM: Third attempt

Slack variables allow some examples to be misclassified. Minimizing the slack forces this to happen as few times as possible.

[Figure callouts: maximize the margin and minimize the slack; C is the tradeoff parameter; one slack variable per example; all slacks must be positive; Hamming distance between gold and the other structure.]


SLIDE 44

Structural SVM

[The final formulation, unchanged from the third attempt: maximize the margin and minimize the slack, with C as the tradeoff parameter.]

SLIDE 45

Structural SVM

[The same problem, rewritten as an equivalent unconstrained formulation.]
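The unconstrained version is an image in the deck. Since at the optimum each slack equals the largest constraint violation, \xi_i = \max_{\mathbf{y}} \left[ \mathbf{w}^\top \Phi(\mathbf{x}_i, \mathbf{y}) + \Delta(\mathbf{y}, \mathbf{y}_i) \right] - \mathbf{w}^\top \Phi(\mathbf{x}_i, \mathbf{y}_i), eliminating the slacks gives:

  \min_{\mathbf{w}} \;\; \frac{1}{2}\mathbf{w}^\top\mathbf{w} + C \sum_i \left( \max_{\mathbf{y}} \left[ \mathbf{w}^\top \Phi(\mathbf{x}_i, \mathbf{y}) + \Delta(\mathbf{y}, \mathbf{y}_i) \right] - \mathbf{w}^\top \Phi(\mathbf{x}_i, \mathbf{y}_i) \right)

which matches the Structural SVM objective on SLIDE 53.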


SLIDE 47

Comments

  • Other slightly different formulations exist

– Generally the same principle

  • Multiclass is a special case of structure

– Structural SVM strictly generalizes multiclass SVM

  • Can be seen as minimizing a structured version of the hinge loss

– Remember empirical risk minimization? (Exercise: work it out)

  • Learning as optimization

– We have framed the optimization problem
– We haven’t seen how it can be solved yet

  • That is, we don’t have a learning algorithm yet

SLIDE 48

Where are we?

  • Structural Support Vector Machine

– How it naturally extends multiclass SVM

  • Empirical Risk Minimization

– Or: how structural SVM and CRF are solving very similar problems

  • Training Structural SVM via stochastic gradient descent

– And some tricks


SLIDE 49

Broader picture: Learning as loss minimization

  • Collect some annotated data. More is generally better
  • Pick a hypothesis class (also called model)

– Decide how the score decomposes over the parts of the output

  • Choose a loss function

– Decide on how to penalize incorrect decisions

  • Learning = minimize empirical risk + regularizer

– Typically an optimization procedure needed here


This must look familiar. We have seen this before for binary classification!

SLIDE 50

Empirical risk minimization

  • Suppose the function Loss scores the quality of a prediction with respect to the true structure

– Loss(f(x), y) tells us how good f is for this x by comparing it against y

  • Evaluate the quality of the predictor f by averaging over the unknown distribution P that generates the data

– Expected risk: [equation shown as an image]

  • We don’t know P, so use the empirical risk instead, possibly with a regularizer

Learning: minimizing the regularized risk; various algorithms exist
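The risk formulas did not survive extraction. The standard definitions the slide is pointing at, for a data distribution P and loss function Loss, are:

  \text{Expected risk:} \quad R(f) = \mathbb{E}_{(x, y) \sim P}\left[ \mathrm{Loss}(f(x), y) \right]

  \text{Empirical risk:} \quad \hat{R}(f) = \frac{1}{N} \sum_{i=1}^{N} \mathrm{Loss}(f(x_i), y_i)

Learning then minimizes \hat{R}(f), possibly plus a regularizer.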

SLIDE 51

The loss function zoo: binary classification

[Plot: the zero-one loss as a function of the margin.]

SLIDE 52

The loss function zoo

[Plot: the zero-one loss together with its surrogates: Perceptron, hinge (SVM), logistic regression, and exponential (AdaBoost).]
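The curves themselves are a plot in the deck. Written in terms of the margin s = y w^T x (a standard convention, not spelled out on the slide), the losses in the legend are:

  \text{Zero-one:} \quad \ell(s) = \mathbf{1}[s \le 0]
  \text{Perceptron:} \quad \ell(s) = \max(0, -s)
  \text{Hinge (SVM):} \quad \ell(s) = \max(0, 1 - s)
  \text{Logistic regression:} \quad \ell(s) = \log(1 + e^{-s})
  \text{Exponential (AdaBoost):} \quad \ell(s) = e^{-s}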

SLIDE 53

Structured classifiers: Different learning objectives

  • Structural SVM

  \min_{\mathbf{w}} \; \frac{1}{2}\mathbf{w}^\top\mathbf{w} + C \sum_i \left( \max_{\mathbf{y}} \left[ \mathbf{w}^\top \Phi(\mathbf{x}_i, \mathbf{y}) + \Delta(\mathbf{y}, \mathbf{y}_i) \right] - \mathbf{w}^\top \Phi(\mathbf{x}_i, \mathbf{y}_i) \right)

– The first term is the regularizer; the summation (the structured hinge loss) measures how badly w does on the training data

  • Conditional Random Field (via the maximum a posteriori criterion)

  \min_{\mathbf{w}} \; \frac{1}{2}\mathbf{w}^\top\mathbf{w} + C \sum_i - \log P(\mathbf{y}_i \mid \mathbf{x}_i, \mathbf{w})

– Same regularizer; here the summation is the log loss

where P is defined as

  P(\mathbf{y}_i \mid \mathbf{x}_i, \mathbf{w}) = \frac{\exp\left( \mathbf{w}^\top \Phi(\mathbf{x}_i, \mathbf{y}_i) \right)}{Z(\mathbf{x}_i, \mathbf{w})}


SLIDE 58

Structured classifiers: Different learning objectives

  • Structured Perceptron

  \min_{\mathbf{w}} \; \sum_i \left( \max_{\mathbf{y}} \; \mathbf{w}^\top \Phi(\mathbf{x}_i, \mathbf{y}) - \mathbf{w}^\top \Phi(\mathbf{x}_i, \mathbf{y}_i) \right)

– The summation is the structured Perceptron loss: the structured hinge loss without the Hamming term, and with no regularizer


SLIDE 60

Summary

  • Different structured training objectives are really different loss functions
  • The structured versions of the hinge, log, and Perceptron losses all involve inference

– Hinge, Perceptron: solve a maximization problem
– Log: solve an expectation problem

  • Learning as stochastic optimization, even for structures (a sketch follows below)

– But computing the loss (and the gradient) can be expensive
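The deck stops before spelling out an algorithm. As a minimal sketch (not from the slides) of what learning as stochastic optimization looks like for the structural SVM objective above, here is one epoch of stochastic subgradient descent in Python; phi and loss_augmented_inference are hypothetical callables the user must supply, since both the feature map and the inference routine depend on the structure being modeled.

import numpy as np

def sgd_epoch(examples, w, phi, loss_augmented_inference, C=1.0, lr=0.01):
    """One epoch of stochastic subgradient descent on the structural SVM
    objective  (1/2) w.w + C * sum_i [ max_y (w.phi(x_i, y) + Delta(y, y_i))
                                       - w.phi(x_i, y_i) ].

    examples: list of (x, y_gold) pairs
    phi(x, y): joint feature vector, a numpy array the same shape as w
    loss_augmented_inference(x, y_gold, w): solves the inner max, returning
        argmax_y  w . phi(x, y) + Delta(y, y_gold)
    """
    n = len(examples)
    for x, y_gold in examples:
        # The expensive step: inference inside the loss (see the summary above)
        y_hat = loss_augmented_inference(x, y_gold, w)

        # Subgradient for this example: the regularizer's share plus the hinge
        # term; the feature difference vanishes when loss-augmented inference
        # returns the gold structure itself
        grad = w / n + C * (phi(x, y_hat) - phi(x, y_gold))

        w = w - lr * grad
    return w

A decaying learning rate and shuffling of the examples between epochs are the usual refinements.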