Approximating Learning Curves for Active-Learning-Driven Annotation
Katrin Tomanek and Udo Hahn
Jena University Language and Information Engineering (JULIE) Lab

Agenda
– Introduction to Active Learning
– Stopping Conditions
[Figure: two annotation scenarios. Passive annotation scenario (aka Random Sampling): corpus selection, then annotation of randomly chosen examples, moving them from the unlabeled to the labeled set. Active annotation scenario (aka Active Learning): corpus selection, then "intelligent" example selection before annotation.]
[Figure: committee-based Active Learning. A committee C1, C2, ..., Cn is (1) trained on subsets of the labeled examples (bagging); (2) each member predicts labels P1, P2, ..., Pn for every example u1, ..., uk in the AL pool U of unlabeled examples; (3) a disagreement score Di(P1, P2, ..., Pn) is calculated per example; (4) the examples with the highest disagreement are selected and (5) annotated, then added to the labeled examples.]
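Steps (1)-(4) of the committee-based selection above can be sketched as follows. This is a minimal illustration, not the authors' implementation: committee members are assumed to be plain callables mapping an example to a label, `train_fn` is a hypothetical training routine, and vote entropy is used as one common disagreement measure.

```python
import random
from collections import Counter
from math import log

def vote_entropy(labels):
    """Disagreement of the committee on one example: entropy of the label votes."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in counts.values())

def train_committee(labeled, n_members, train_fn):
    """Step (1): train each member on a bootstrap sample of the labeled data (bagging)."""
    return [train_fn([random.choice(labeled) for _ in labeled])
            for _ in range(n_members)]

def select_by_disagreement(committee, pool, batch_size):
    """Steps (2)-(4): predict, score disagreement, pick the most disputed examples."""
    scored = []
    for u in pool:
        votes = [member(u) for member in committee]   # (2) predict labels
        scored.append((vote_entropy(votes), u))       # (3) calculate disagreement
    scored.sort(key=lambda s: s[0], reverse=True)
    return [u for _, u in scored[:batch_size]]        # (4) select by disagreement
```

The selected examples would then go to the annotator (step 5) and be appended to the labeled set before the next iteration.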
[Figure: learning curves for Active Learning vs. Random Sampling. AL reaches F = 0.83 after ~60K tokens, Random Sampling only after ~130K tokens: a reduction of more than 50% in annotation effort.]
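The ">50% reduction" claim can be read off the two curves as the ratio of tokens needed to reach the same F-score. The helper below is a sketch with hypothetical curve points; only the F = 0.83 crossings at ~60K (AL) and ~130K (Random Sampling) tokens are taken from the slide.

```python
def tokens_to_reach(curve, target_f):
    """First token count at which a learning curve reaches the target F-score.
    `curve` is a list of (tokens, f_score) points, sorted by tokens."""
    for tokens, f in curve:
        if f >= target_f:
            return tokens
    return None

# Hypothetical curve points; only the F=0.83 crossings come from the slide.
al = [(20_000, 0.70), (40_000, 0.79), (60_000, 0.83), (80_000, 0.84)]
rs = [(40_000, 0.70), (90_000, 0.79), (130_000, 0.83)]
reduction = 1 - tokens_to_reach(al, 0.83) / tokens_to_reach(rs, 0.83)
# reduction is about 0.54, i.e. more than 50% fewer tokens to annotate
```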
[Figure: where to stop on the learning curve. Stopping on the steep slope is bad: one could still gain a lot by further annotation. Stopping after convergence is better: further annotation will not increase classifier performance significantly.]
➔ But not applicable in practice, as no gold standard is available to plot the learning curve
– Estimate the (progression of the) learning curve without need for a gold standard
– Based on agreement among committee members
– Does not require extra labeling effort
– Agreement curve approximates the progression of the learning curve
➔ We can tell the relative position in the annotation process from it:
– Steep slope?
– Convergence?
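Reading off "steep slope vs. convergence" can be automated with a simple heuristic; the sketch below (my own illustration, with assumed window and threshold values) declares convergence once the last few increments of a curve all stay small.

```python
def has_converged(curve, window=3, threshold=0.005):
    """Convergence heuristic for a curve sampled once per AL iteration
    (e.g. the agreement curve): True when the last `window` increments
    all stay below `threshold` in absolute value."""
    if len(curve) <= window:
        return False
    deltas = [abs(curve[-i] - curve[-i - 1]) for i in range(1, window + 1)]
    return all(d < threshold for d in deltas)
```

A steep slope corresponds to `has_converged(...)` returning False: the curve is still rising, so further annotation would still gain a lot.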
– Agreement among the committee:
➔ When agreement among committee members converges, annotation can be stopped
– Measure agreement on a separate validation set in each AL iteration
– Otherwise the agreement curve is often not reliable
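The per-iteration agreement measurement could be sketched as below. This is a minimal illustration, not the authors' exact metric: committee members are assumed to be callables, and agreement is taken as the fraction of members voting for the majority label, averaged over a fixed (unlabeled) validation set.

```python
from collections import Counter

def committee_agreement(committee, validation_set):
    """Mean per-example agreement: fraction of committee members voting
    for the majority label, averaged over a fixed validation set."""
    total = 0.0
    for x in validation_set:
        votes = [member(x) for member in committee]
        total += Counter(votes).most_common(1)[0][1] / len(votes)
    return total / len(validation_set)
```

Recording this value once per AL iteration yields the agreement curve; computing it on a separate validation set rather than on the AL pool is what keeps the curve reliable.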
– Approximation of the learning curve usually works well in simulation scenarios, because...
» few hard cases are left in later AL iterations (perfect agreement)
– But it fails in real-world annotation scenarios, because...
» in practice AL will always find tricky cases...
– Newspaper text, MUC entities (PERS, LOC, ORG)
– AL pool: ~ 14,000 sentences
– Gold standard: ~ 3,500 sentences
[Figure: learning curves and the corresponding agreement curve for this experiment.]
➔ Method to monitor progress of annotation needed
– Works well: good approximation of the learning curve
– No extra annotation effort: does not require a labeled gold standard
http://www.julielab.de/