Machine Learning - 10601

Boosting

Geoff Gordon, Miroslav Dudík

(partly based on slides of Rob Schapire and Carlos Guestrin)

http://www.cs.cmu.edu/~ggordon/10601/ November 9, 2009

Ensembles of trees

BAGGING and RANDOM FORESTS

  • learn many big trees
  • each tree aims to fit the same target concept
    – random training sets
    – randomized tree growth
  • voting ≈ averaging: DECREASE in VARIANCE

BOOSTING

  • learn many small trees (weak classifiers)
  • each tree ‘specializes’ to a different part of the target concept
    – reweight training examples
    – higher weights where errors remain
  • voting increases expressivity: DECREASE in BIAS
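To make the contrast concrete, here is a minimal scikit-learn sketch (assuming scikit-learn is installed; the synthetic dataset and hyperparameters are illustrative choices, not from the lecture): a random forest of big trees whose votes are averaged, versus AdaBoost over decision stumps that reweights the training examples round by round.

```python
# Illustrative comparison only: bagging/random forests (many big trees, averaged)
# vs. boosting (many small trees, reweighted). Dataset and settings are arbitrary.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Random forest: each big tree fits the same concept on a randomized training set;
# averaging the votes mainly reduces variance.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# AdaBoost: small trees (decision stumps); each round reweights the training
# examples so later stumps focus on what is still misclassified, reducing bias.
# (The parameter is called base_estimator in older scikit-learn versions.)
ada = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=200,
    random_state=0,
).fit(X_tr, y_tr)

print("random forest test accuracy:", rf.score(X_te, y_te))
print("AdaBoost (stumps) test accuracy:", ada.score(X_te, y_te))
```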

slide-2
SLIDE 2

11/9/2009 2

Boosting

  • boosting = general method of converting rough rules of thumb (e.g., decision stumps) into a highly accurate prediction rule

  • technically:
    – assume we are given a “weak” learning algorithm that can consistently find classifiers (“rules of thumb”) at least slightly better than random, say, accuracy ≥ 55% (in a two-class setting)
    – given sufficient data, a boosting algorithm can provably construct a single classifier with very high accuracy, say, 99%


AdaBoost

[Freund-Schapire 1995]

weak classifiers = decision stumps (vertical or horizontal half-planes)
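The original slides walk through AdaBoost’s rounds on a toy two-dimensional dataset as pictures, and the algorithm box itself did not survive extraction. The following is a minimal NumPy sketch of AdaBoost with decision stumps in the standard Freund-Schapire formulation; the exhaustive stump search and the helper names are illustrative assumptions, not code from the lecture.

```python
import numpy as np

def best_stump(X, y, w):
    # Exhaustive search for the axis-aligned stump with smallest weighted error.
    best = (np.inf, 0, 0.0, 1)               # (error, feature, threshold, polarity)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(X[:, j] > thr, pol, -pol)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, T=50):
    # y must be labeled in {-1, +1}; returns a list of (alpha, feature, threshold, polarity).
    n = len(y)
    w = np.full(n, 1.0 / n)                   # D_1: uniform distribution over examples
    ensemble = []
    for _ in range(T):
        err, j, thr, pol = best_stump(X, y, w)
        err = np.clip(err, 1e-12, 1 - 1e-12)  # guard against division by zero
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(X[:, j] > thr, pol, -pol)
        w *= np.exp(-alpha * y * pred)        # up-weight mistakes, down-weight correct ones
        w /= w.sum()                          # renormalize so D_{t+1} is a distribution
        ensemble.append((alpha, j, thr, pol))
    return ensemble

def predict(ensemble, X):
    # Final classifier: sign of the alpha-weighted vote of all stumps.
    votes = sum(alpha * np.where(X[:, j] > thr, pol, -pol)
                for alpha, j, thr, pol in ensemble)
    return np.sign(votes)
```

Each round fits a stump to the current weight distribution, then multiplies the weights of misclassified examples up and of correctly classified ones down, which is exactly the ‘specialization’ behavior described on the ensembles slide.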



A typical run of AdaBoost

  • training error rapidly drops
    (combining weak learners increases expressivity)
  • test error does not increase with number of trees T
    (robustness to overfitting)
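The reason training error drops so fast is the standard AdaBoost training-error bound; it is not in the surviving text, so the statement below is reconstructed from the usual Freund-Schapire analysis. If round t’s weak classifier has weighted error ε_t = 1/2 − γ_t, then

```latex
\widehat{\Pr}_{\text{train}}\big[H(x) \neq y\big]
\;\le\; \prod_{t=1}^{T} 2\sqrt{\epsilon_t(1-\epsilon_t)}
\;=\; \prod_{t=1}^{T} \sqrt{1-4\gamma_t^2}
\;\le\; \exp\!\Big(-2\sum_{t=1}^{T}\gamma_t^2\Big),
```

so even a small but consistent edge γ_t over random guessing drives the training error to zero exponentially fast.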



Bounding true error

[Freund-Schapire 1997]

  • T = number of rounds
  • d = VC dimension of weak learner
  • m = number of training examples
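The bound itself appears only as an image in the slides; the usual form of the Freund-Schapire 1997 result, reconstructed here for reference, holds with high probability over the draw of the m training examples:

```latex
\Pr_{\text{true}}\big[H(x) \neq y\big]
\;\le\; \widehat{\Pr}_{\text{train}}\big[H(x) \neq y\big]
\;+\; \tilde{O}\!\left(\sqrt{\frac{Td}{m}}\right).
```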

Bounding true error (a first guess)

A typical run contradicts a naïve bound

  • the complexity term grows with T, so the naïve bound predicts that test error should eventually rise as more rounds are added, yet in the typical run it keeps decreasing


Finer analysis: margins

[Schapire et al. 1998]

Empirical evidence: margin distribution


Theoretical evidence: large margins ⇒ simple classifiers

More technically…

Bound depends on:

  • d = VC dimension of weak learner
  • m = number of training examples
  • entire distribution of training margins
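The margin definition and the bound are images in the original slides; the standard statements from Schapire et al. 1998 are reconstructed below. The margin of a training example is its normalized weighted vote for the correct label, and, unlike the previous bound, the resulting guarantee does not depend on the number of rounds T:

```latex
\operatorname{margin}(x, y) \;=\; \frac{y \sum_{t} \alpha_t h_t(x)}{\sum_{t} \alpha_t} \;\in\; [-1, 1],
\qquad
\Pr_{\text{true}}\big[H(x) \neq y\big]
\;\le\; \widehat{\Pr}_{\text{train}}\big[\operatorname{margin}(x, y) \le \theta\big]
\;+\; \tilde{O}\!\left(\sqrt{\frac{d}{m\,\theta^2}}\right)
\quad \text{for any } \theta > 0.
```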



Practical advantages of AdaBoost

Application: detecting faces

[Viola-Jones 2001]


Caveats

“Hard” predictions can slow down learning!


Confidence-rated Predictions

[Schapire-Singer 1999]



Confidence-rated predictions help a lot!
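The Schapire-Singer formulation itself is not in the surviving text; reconstructed from the cited paper, a confidence-rated weak hypothesis outputs a real number whose sign is the predicted label and whose magnitude is its confidence, and that real value enters the weight update directly:

```latex
h_t : X \to \mathbb{R}, \qquad
D_{t+1}(i) \;=\; \frac{D_t(i)\,\exp\!\big(-\alpha_t\, y_i\, h_t(x_i)\big)}{Z_t}, \qquad
H(x) \;=\; \operatorname{sign}\!\Big(\sum_{t=1}^{T} \alpha_t h_t(x)\Big),
```

with α_t (and h_t itself) chosen to make the normalizer Z_t as small as possible, which is what speeds up learning relative to hard ±1 predictions.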

Loss in logistic regression


Loss in AdaBoost

Logistic regression vs AdaBoost
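The loss plots on these slides did not survive extraction; written out in their standard forms, with margin z = y f(x), the two losses being compared are

```latex
\text{logistic regression (log loss):} \quad \ell(z) \;=\; \log\!\big(1 + e^{-z}\big),
\qquad
\text{AdaBoost (exponential loss):} \quad \ell(z) \;=\; e^{-z}.
```

Both decrease with the margin; the exponential loss penalizes confidently wrong predictions much more aggressively than the log loss.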


Benefits of model-fitting view

What you should know about boosting

  • weak classifiers → strong classifiers
    – weak: slightly better than random on training data
    – strong: eventually zero error on training data

  • AdaBoost prevents overfitting by increasing margins
  • regimes when AdaBoost overfits
    – weak learner too strong: use small trees or stop early
    – data noisy: stop early

  • AdaBoost vs Logistic Regression
    – exponential loss vs log loss
    – single-coordinate updates vs full optimization