SLIDE 1

Active Learning

October 15, 2009

Burr Settles
Machine Learning Department, Carnegie Mellon University

Reading the Web: Advanced Statistical Language Processing (ML 10-709)

SLIDE 2

Thought Experiment

• suppose you're the leader of an Earth convoy sent to colonize planet Mars

people who ate the round Martian fruits found them tasty! people who ate the spiked Martian fruits died!

SLIDE 3

Poison vs. Yummy Fruits

• problem: there's a range of spiky-to-round fruit shapes on Mars

you need to learn the "threshold" of roundness where the fruits go from poisonous to safe, and… you need to determine this while risking as few colonists' lives as possible!

SLIDE 4

Testing Fruit Safety…

this is just a binary search, so…

under the PAC model, assume we need O(1/ε) i.i.d. instances to train a classifier with error ε. using the binary search approach, we only needed O(log₂ 1/ε) instances!
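To make the binary-search arithmetic concrete, here is a minimal Python sketch (not from the slides): it locates the poison/safe roundness threshold to within ε using about log₂(1/ε) label queries. The `is_safe` oracle is a hypothetical stand-in for a colonist tasting one fruit.

```python
def find_threshold(is_safe, eps=0.01):
    """Binary search for the roundness value where fruits switch from
    poisonous (spiky) to safe (round), using ~log2(1/eps) label queries.

    is_safe(roundness) is a hypothetical oracle: True if a fruit with this
    roundness is safe to eat (each call = one colonist's life at risk).
    """
    lo, hi = 0.0, 1.0              # assume roundness is scaled to [0, 1]
    queries = 0
    while hi - lo > eps:
        mid = (lo + hi) / 2.0
        queries += 1
        if is_safe(mid):
            hi = mid               # boundary is at or below mid
        else:
            lo = mid               # boundary is above mid
    return (lo + hi) / 2.0, queries

# e.g., a true threshold of 0.37 is found with about 7 queries when eps=0.01,
# versus the ~1/eps = 100 i.i.d. examples suggested by the PAC-style bound
threshold, n_queries = find_threshold(lambda r: r >= 0.37, eps=0.01)
```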

SLIDE 5

Relationship to Active Learning

• key idea: the learner can choose training data
  – on Mars: whether a fruit was poisonous/safe
  – in general: the true label of some instance
• goal: reduce the training costs
  – on Mars: the number of "lives at risk"
  – in general: the number of "queries"

SLIDE 6

Active Learning Scenarios

[Figure: active learning scenarios; the pool-based scenario (next slide) is noted as most common in NLP applications]

SLIDE 7

Pool-Based Active Learning Cycle

[Figure: the pool-based active learning cycle: induce a model, inspect the unlabeled data, select "queries", label the new instances, and repeat]
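The cycle above maps directly onto a short loop. Below is a minimal, hypothetical Python sketch of pool-based active learning with scikit-learn; the model choice, the least-confident query strategy, and the `oracle_label` function are illustrative assumptions, not details from the slides.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def pool_based_active_learning(X_lab, y_lab, X_pool, oracle_label, budget=30):
    """One possible pool-based active learning loop (illustrative sketch).

    oracle_label(x) plays the role of the human annotator.
    """
    model = LogisticRegression(max_iter=1000)
    for _ in range(budget):
        model.fit(X_lab, y_lab)                    # 1. induce a model from L
        probs = model.predict_proba(X_pool)        # 2. inspect the unlabeled pool U
        uncertainty = 1.0 - probs.max(axis=1)      #    (least-confident scores)
        q = int(np.argmax(uncertainty))            # 3. select the query
        y_new = oracle_label(X_pool[q])            # 4. ask the oracle for a label
        X_lab = np.vstack([X_lab, X_pool[q:q+1]])  # 5. add (x, y) to L ...
        y_lab = np.append(y_lab, y_new)
        X_pool = np.delete(X_pool, q, axis=0)      #    ... remove x from U, repeat
    return model
```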

SLIDE 8

Learning Curves

[Figure: learning curves for text classification (baseball vs. hockey); the active learning curve is better than the passive learning curve]

SLIDE 9

Who Uses Active Learning?

• Sentiment analysis for blogs; noisy relabeling – Prem Melville
• Biomedical NLP & IR; computer-aided diagnosis – Balaji Krishnapuram
• MS Outlook voicemail plug-in [Kapoor et al., IJCAI'07]; "A variety of prototypes that are in use throughout the company." – Eric Horvitz
• "While I can confirm that we're using active learning in earnest on many problem areas… I really can't provide any more details than that. Sorry to be so opaque!" – David Cohn

SLIDE 10

How to Select Queries?

• let's try generalizing our binary search method using a probabilistic classifier:

[Figure: instances on a line with posterior estimates ranging from 0.0 to 1.0; the informative queries are those with posteriors near 0.5]

SLIDE 11

Uncertainty Sampling

• query instances the learner is most uncertain about

[Figure: 400 instances sampled from 2 class Gaussians; random sampling with 30 labeled instances (accuracy = 0.7) vs. active learning with 30 labeled instances (accuracy = 0.9)] [Lewis & Gale, SIGIR'94]

SLIDE 12

Generalizing to Multi-Class Problems

• least confident [Culotta & McCallum, AAAI'05]
• smallest margin [Scheffer et al., CAIDA'01]
• entropy [Dagan & Engelson, ICML'95]

note: for binary tasks, these are equivalent
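To pin down the three measures named above, here is a small illustrative NumPy sketch (the posteriors are made up for the example). For a binary posterior (p, 1−p), all three are monotone in |p − 0.5|, which is the sense in which they are equivalent for binary tasks.

```python
import numpy as np

def least_confident(p):
    """1 - P(y_hat | x): higher means more uncertain."""
    return 1.0 - np.max(p)

def smallest_margin(p):
    """P(y1|x) - P(y2|x) for the two most likely labels: lower means more uncertain."""
    top2 = np.sort(p)[-2:]
    return top2[1] - top2[0]

def entropy(p):
    """-sum_y P(y|x) log P(y|x): higher means more uncertain."""
    p = p[p > 0]                       # avoid log(0)
    return float(-np.sum(p * np.log(p)))

# hypothetical 3-class posteriors for two candidate queries
p_a = np.array([0.5, 0.3, 0.2])        # fairly uncertain
p_b = np.array([0.9, 0.05, 0.05])      # fairly confident
for name, f in [("least confident", least_confident),
                ("smallest margin", smallest_margin),
                ("entropy", entropy)]:
    print(name, f(p_a), f(p_b))
```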

SLIDE 13

Multi-Class Uncertainty Measures

[Figure: illustration of the posterior distributions preferred (darker) by the entropy, smallest-margin, and least-confident measures in a 3-label classification task]

[Körner & Wrobel, ECML'06]

SLIDE 14

Query-By-Committee (QBC)

• train a committee C = {θ1, θ2, ..., θC} of classifiers on the labeled data in L
• query instances in U for which the committee is in most disagreement
• key idea: reduce the model version space
  – expedites search for a model during training

[Seung et al., COLT'92]

SLIDE 15

QBC Example

SLIDE 16

QBC Example

SLIDE 17

QBC Example

SLIDE 18

QBC Example

SLIDE 19

QBC: Design Decisions

• how to build a committee:
  – "sample" models from P(θ|L) [Dagan & Engelson, ICML'95; McCallum & Nigam, ICML'98]
  – standard ensembles (e.g., bagging, boosting) [Abe & Mamitsuka, ICML'98]
• how to measure disagreement:
  – "XOR" committee classifications
  – view the vote distribution as probabilities, use uncertainty measures (e.g., entropy)
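To tie the two design decisions together, here is a hypothetical sketch that builds the committee by bagging (bootstrap resamples of L) and measures disagreement with vote entropy. The base learner and committee size are arbitrary choices made for illustration, not those of the cited papers.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def qbc_vote_entropy_query(X_lab, y_lab, X_pool, n_members=5, seed=0):
    """Pick the pool instance with the highest committee vote entropy.

    Committee construction: bagging, i.e. each member is trained on a
    bootstrap resample of L (assumes each resample contains every class).
    Disagreement: treat the vote distribution as probabilities, use entropy.
    """
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_members):
        idx = rng.integers(0, len(X_lab), size=len(X_lab))       # bootstrap sample of L
        member = LogisticRegression(max_iter=1000).fit(X_lab[idx], y_lab[idx])
        votes.append(member.predict(X_pool))
    votes = np.stack(votes)                                       # shape: (members, |U|)
    scores = []
    for col in votes.T:                                           # votes for one instance
        _, counts = np.unique(col, return_counts=True)
        p = counts / n_members                                    # vote distribution
        scores.append(float(-np.sum(p * np.log(p))))              # vote entropy
    return int(np.argmax(scores))                                 # most contentious instance
```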

SLIDE 20

Uncertainty vs. QBC

• QBC is a more general strategy, incorporating uncertainty over both:
  – the instance label
  – the model hypothesis
• theoretical guarantees…
  – QBC: O(log₂ d/ε) query complexity [Seung et al., ML'97]
  – uncertainty sampling: none

SLIDE 21

Pathological Case for Uncertainty

[Figure: the initial random sample fails to hit the right triangle, so uncertainty sampling only queries the left side!] [Cohn et al., ML'94]

SLIDE 22

Version-Space Sampling Instead

[Figure: 150 random samples vs. 150 active queries (a QBC variant)] [Cohn et al., ML'94]

SLIDE 23

Active vs. Semi-Supervised

• both try to attack the same problem: making the most of unlabeled data U
• each attacks from a different direction:
  – semi-supervised learning exploits what the model thinks it knows about the unlabeled data
  – active learning explores the unknown aspects of the unlabeled data

SLIDE 24

Active vs. Semi-Supervised

• active: uncertainty sampling – query instances the model is least confident about
  semi-supervised: self-training, expectation-maximization (EM), entropy regularization (ER) – propagate confident labelings among unlabeled data

• active: query-by-committee (QBC) – use ensembles to rapidly reduce the version space
  semi-supervised: co-training, multi-view learning – use ensembles with multiple views to constrain the version space

SLIDE 25

Problem: Outliers

• an instance may be uncertain or controversial (for QBC) simply because it's an outlier
• querying outliers is not likely to help us reduce error on more typical data

SLIDE 26

Solution 1: Density Weighting

• weight the uncertainty ("informativeness") of an instance by its density w.r.t. the pool U

  informativeness(x) = ("base" informativeness of x) × [ (1/|U|) Σ_{u∈U} sim(x, u) ]^β  (density term)
  [Settles & Craven, EMNLP'08]

  [McCallum & Nigam, ICML'98; Nguyen & Smeulders, ICML'04; Xu et al., ECIR'07]

• use U to estimate P(x) and avoid outliers
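As a rough illustration of the density-weighted score above, the sketch below multiplies a least-confident "base" score by the instance's mean cosine similarity to the pool, raised to the power β. The model, the similarity function, and the β value are assumptions made for the example, not prescriptions from the slide.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def density_weighted_query(model, X_pool, beta=1.0):
    """Select the pool instance with the highest density-weighted uncertainty."""
    probs = model.predict_proba(X_pool)
    base = 1.0 - probs.max(axis=1)                      # "base" informativeness (least confident)
    density = cosine_similarity(X_pool).mean(axis=1)    # average similarity to the rest of U
    return int(np.argmax(base * density ** beta))       # uncertain AND representative
```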

SLIDE 27

Solution 2: Estimated Error Reduction

• minimize the risk R(x) of a query candidate
  – the expected uncertainty over U if x is added to L

  R(x) = Σ_y P_θ(y|x) Σ_{u∈U} H_θ′(Y|u)
  (expectation over possible labelings of x; sum over unlabeled instances; H_θ′ is the uncertainty of u after retraining with (x, y))

  [Roy & McCallum, ICML'01; Zhu et al., ICML-WS'03]
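Read literally, R(x) says: for each candidate x and each possible label y, retrain on L ∪ {(x, y)} and add up the entropy that remains over U, weighting by P(y|x). The naive sketch below does exactly that with a generic scikit-learn classifier; it is illustrative only, and, as a later slide notes, practical systems subsample U and use approximate retraining.

```python
import numpy as np
from copy import deepcopy

def entropy(p):
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def expected_error_reduction_query(model, X_lab, y_lab, X_pool):
    """Return the index of the candidate x that minimizes the risk R(x):
    the expected total uncertainty over U after retraining with (x, y)."""
    risks = []
    for i, x in enumerate(X_pool):
        p_y = model.predict_proba(x.reshape(1, -1))[0]    # P(y|x) under the current model
        risk = 0.0
        for j, y in enumerate(model.classes_):            # expectation over possible labels
            m = deepcopy(model)
            m.fit(np.vstack([X_lab, [x]]), np.append(y_lab, y))   # retrain with (x, y) added
            probs_u = m.predict_proba(np.delete(X_pool, i, axis=0))
            risk += p_y[j] * sum(entropy(p) for p in probs_u)     # uncertainty left over U
        risks.append(risk)
    return int(np.argmin(risks))
```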

SLIDE 28

Text Classification Examples

[Roy & McCallum, ICML'01]

SLIDE 29

Text Classification Examples

[Roy & McCallum, ICML'01]

SLIDE 30

Relationship to Uncertainty Sampling

• a different perspective: aim to maximize the information gain over U

[Equation, annotated: the expected information gain over U decomposes into (the uncertainty before the query) minus (a risk term); if we assume x is representative of U and that the risk term evaluates to zero, the criterion…]

…reduces to uncertainty sampling!

SLIDE 31

"Error Reduction" Scoresheet

• pros:
  – more principled query strategy
  – can be model-agnostic
    • literature examples: naïve Bayes, LR, GP, SVM
• cons:
  – too expensive for most model classes
    • some solutions: subsample U; use approximate training
  – intractable for structured outputs

SLIDE 32

Alternative Query Types

• so far, we assumed queries are instances
  – e.g., for document classification the learner queries documents
• can the learner do better by asking different types of questions?
  – multiple-instance active learning
  – feature active learning

SLIDE 33

Multiple-Instance (MI) Learning

• multiple-instance (MI) learning is one approach to problems like this [Dietterich et al., 1997]

[Figure: a bag is a document = { instances: paragraphs }] [TREC Genomics Track 2004]

[Andrews et al., NIPS'03; Ray & Craven, ICML'05]

SLIDE 34

MI Active Learning

• traditional MI learning
  – high ambiguity vs. low cost
• in some MI domains (e.g., text classification), labels can be obtained at the instance level
  – low ambiguity vs. high cost
• MI active learning
  – obtain low-cost bag labels, selectively query instances
  – reduce ambiguity and overall labeling cost

SLIDE 35

MI Uncertainty (MIU)

• weight the uncertainty of an instance by its "relevance" to the bag-level output
  ("base" uncertainty × "relevance" term)

[Figure: bags doc1 and doc2 with paragraph instances (par1,1, par1,2, …), each annotated with example "base" uncertainty and "relevance" values] [Settles, Craven, & Ray, NIPS'07]

SLIDE 36

MI Active Learning Results

[Settles, Craven, & Ray, NIPS'07]

SLIDE 37

Feature Active Learning

• in NLP tasks, we can often intuitively label features
  – the feature word "puck" indicates the class hockey
  – the feature word "strike" indicates the class baseball
• tandem learning exploits this by asking both instance-label and feature-relevance queries [Raghavan et al., JMLR'06]
  – e.g., "is puck an important discriminative feature?"

SLIDE 38

Tandem Learning: Text Classification

[Figure: instance queries only vs. +j iterations of feature feedback] [Raghavan et al., JMLR'06]

SLIDE 39

Feature Labeling

• recent, alternative forms of "supervision" combine "feature labels" with U for semi-supervised learning
  – prototype-driven learning [Haghighi & Klein, NAACL'06]
  – generalized expectation (GE) criteria [Mann & McCallum, ACL'08; Druck et al., SIGIR'08]
• can we actively solicit these feature labels?

SLIDE 40

Example: Information Extraction

  feature        label
  [PHONE]        contact
  lease          rent
  bedroom        size
  large          size / features
  water          utilities
  east           neighborhood
  non-smoking    restrictions

SLIDE 41

Weighted Uncertainty for Features

[Equation, annotated: the score of a candidate feature query is a sum over tokens of (an indicator in {0,1} of whether token x_t has this feature) × (the uncertainty of that token), together with the count of the feature's occurrences in the corpus] [Druck et al., EMNLP'09]

a form of density-weighted uncertainty sampling
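One assumption-laden reading of the annotated equation above is to score each candidate feature by the summed label uncertainty of the tokens it fires on, so that frequent features are weighted up. The sketch below only illustrates that reading, not the exact criterion of Druck et al.; the binary token-feature matrix and per-token marginals are hypothetical inputs.

```python
import numpy as np

def feature_query_scores(token_feature_matrix, token_label_marginals):
    """Score candidate feature queries by summed token uncertainty.

    token_feature_matrix: (n_tokens, n_features) binary indicators, 1 if the
        token has the feature (an assumption about how features are encoded).
    token_label_marginals: (n_tokens, n_labels) per-token P(y_t | x) from a
        trained sequence model.
    Summing over all occurrences weights frequent features up, i.e. a form
    of density-weighted uncertainty sampling.
    """
    p = np.clip(token_label_marginals, 1e-12, 1.0)
    token_entropy = -np.sum(p * np.log(p), axis=1)     # uncertainty of each token
    return token_feature_matrix.T @ token_entropy      # one score per feature
```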

SLIDE 42

"Grid" Labeling Interface

[Druck et al., EMNLP'09]

SLIDE 43

User Experiments

[Figure: results from five 2-minute labeling sessions with real human annotators] [Druck et al., EMNLP'09]

SLIDE 44

Real-World Annotation Costs

• so far, we've assumed that queries are equally expensive to label
  – for many tasks, labeling "costs" vary

[Haertel et al., ACL'08]

SLIDE 45

Annotation Time As Cost

• where does this variance come from?
  – sometimes annotator-dependent
  – stochastic effects

[Settles et al., NIPS'08]

• do annotation times vary among instances?

SLIDE 46

Can Labeling Times be Predicted?

cost predictor: a regression model using meta-features

[Settles et al., NIPS'08]
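One simple way to realize such a cost predictor is an ordinary regression model over meta-features of each instance. The meta-features and model below are hypothetical stand-ins chosen for illustration; they are not the features or model of Settles et al.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def make_meta_features(sentences):
    """Hypothetical meta-features for annotation-time prediction:
    length in tokens, length in characters, count of capitalized words."""
    return np.array([[len(s.split()),
                      len(s),
                      sum(w[0].isupper() for w in s.split())]
                     for s in sentences], dtype=float)

def fit_cost_predictor(train_sentences, annotation_seconds):
    """Train on sentences whose annotation times (in seconds) were recorded."""
    return LinearRegression().fit(make_meta_features(train_sentences),
                                  annotation_seconds)

# predicted costs can then, e.g., divide an informativeness score
# to rank queries by expected benefit per second of annotator time
```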

SLIDE 47

Can Predicted Times Improve AL?

[Settles et al., NIPS'08]

several other negative/ambiguous results in NLP domains [Arora et al., ALNLP'09; Tomanek et al.]

SLIDE 48

But All Is Not Lost!

• some positive results using predicted costs have been obtained in the vision community

[Vijayanarasimhan & Grauman, CVPR'09]

SLIDE 49

Other Interesting Issues

• many non-expert annotators [Sheng et al., KDD'08]
• user interface issues [Culotta et al., AI'06]
• data reusability [Baldridge & Osborne, EMNLP'04]
• batch-mode active learning [Hoi et al., ICML'06]
• multi-task active learning [Reichart et al., ACL'08]

SLIDE 50

HW3 Discussion

• What are some active learning opportunities in the context of a NELL system?
• How are these opportunities similar to or different from problem settings in previous work on active learning? Think about the kinds of queries the learner can ask and how the labels are obtained.
• Identify some practical issues for active learning in a never-ending learning system like NELL. How might we address these issues?