Boosting Neural Networks

Holger Schwenk
LIMSI-CNRS, Orsay cedex, FRANCE

Yoshua Bengio
DIRO, University of Montreal, Succ. Centre-Ville, Montreal, Qc, CANADA

To appear in Neural Computation.

Abstract

Boosting is a general method for improving the performance of learning algorithms. A recently proposed boosting algorithm is AdaBoost. It has been applied with great success to several benchmark machine learning problems, using mainly decision trees as base classifiers. In this paper we investigate whether AdaBoost also works as well with neural networks, and we discuss the advantages and drawbacks of different versions of the AdaBoost algorithm. In particular, we compare training methods based on sampling the training set and on weighting the cost function. The results suggest that random resampling of the training data is not the main explanation of the success of the improvements brought by AdaBoost. This is in contrast to Bagging, which directly aims at reducing variance and for which random resampling is essential to obtain the reduction in generalization error. Our system achieves about ...% error on a data set of on-line handwritten digits from more than ... writers. A boosted multilayer network achieved ...% error on the UCI Letters data set and ...% error on the UCI satellite data set, which is significantly better than boosted decision trees.

Keywords: AdaBoost, boosting, Bagging, ensemble learning, multilayer neural networks, generalization
Introduction

Boosting is a general method for improving the performance of a learning algorithm. It is a method for finding a highly accurate classifier on the training set by combining weak hypotheses (Schapire), each of which needs only to be moderately accurate on the training set. (See Perrone for an earlier overview of different ways to combine neural networks.) A recently proposed boosting algorithm is AdaBoost (Freund), which stands for Adaptive Boosting. During the last two years, many empirical studies have been published that use decision trees as base classifiers for AdaBoost (Breiman; Drucker and Cortes; Freund and Schapire (a); Quinlan; Maclin and Opitz; Bauer and Kohavi; Dietterich (b); Grove and Schuurmans). All these experiments have shown impressive improvements in generalization behavior and suggest that AdaBoost tends to be robust to overfitting. In fact, in many experiments it has been observed that the generalization error continues to decrease towards an apparent asymptote after the training error has reached zero. Schapire et al. suggest a possible explanation for this unusual behavior based on the definition of the margin of classification. Other attempts to understand boosting theoretically can be found in Schapire et al., Breiman (a), Breiman, Friedman et al., and Schapire. AdaBoost has also been linked with game theory (Freund and Schapire (b); Breiman (b); Grove and Schuurmans; Freund and Schapire) in order to understand its behavior and to propose alternative algorithms. Mason and Baxter propose a new variant of boosting based on the direct optimization of margins. Additionally, there is recent evidence that AdaBoost may very well overfit if we combine several hundred thousand classifiers (Grove and Schuurmans). It also seems that the performance of AdaBoost degrades a lot in the presence of significant amounts of noise (Dietterich (b); Rätsch et al.).

Although much useful work has been done, both theoretically and experimentally, there is still a lot that is not well understood about the impressive generalization behavior of AdaBoost. To the best of our knowledge, applications of AdaBoost have all been to decision trees, and no applications to multilayer artificial neural networks have been reported in the literature. This paper extends and provides a deeper experimental analysis of our first experiments with the application of AdaBoost to neural networks (Schwenk and Bengio; Schwenk and Bengio). In this paper we consider the following questions. Does AdaBoost work as well for neural networks as for decision trees? (Short answer: yes, sometimes even better.) Does it behave in a similar way as was observed previously in the literature? (Short answer: yes.) Furthermore, are there particulars in the way neural networks are trained with gradient backpropagation which should be taken into account when choosing a particular version of AdaBoost? (Short answer: yes, because it is possible to directly weight the cost function of neural networks.) Is overfitting of the individual neural networks a concern? (Short answer: not as much as when not using boosting.) Is the random resampling used in previous implementations of AdaBoost critical, or can we get similar performance by weighting the training criterion, which can easily be done with neural networks? (Short answer: it is not critical for generalization, but it helps to obtain faster convergence of the individual networks when coupled with stochastic gradient descent.)
The paper is organized as follows. In the next section we describe the AdaBoost algorithm and discuss several implementation issues when using neural networks as base classifiers. We then present results that we have obtained on three medium-sized tasks: a data set of handwritten on-line digits, and the Letters and Satimage data sets of the UCI repository. The paper finishes with a conclusion and perspectives for future research.

AdaBoost

It is well known that it is often possible to increase the accuracy of a classifier by averaging the decisions of an ensemble of classifiers (Perrone; Krogh and Vedelsby). In general, more improvement can be expected when the individual classifiers are diverse and yet accurate. One can try to obtain this result by taking a base learning algorithm and invoking it several times on different training sets. Two popular techniques exist that differ in the way they construct these training sets: Bagging (Breiman) and boosting (Freund; Freund and Schapire).

In Bagging, each classifier is trained on a bootstrap replicate of the original training set. Given a training set S of N examples, the new training set is created by resampling N examples uniformly with replacement. Note that some examples may occur several times, while others may not occur in the sample at all; on average only about two thirds of the examples occur in each bootstrap replicate. Note also that the individual training sets are independent, so the classifiers could be trained in parallel. Bagging is known to be particularly effective when the classifiers are unstable, i.e., when perturbing the learning set can cause significant changes in the classification behavior of the classifiers. Formulated in the context of the bias/variance decomposition (Geman et al.), Bagging improves generalization performance due to a reduction in variance while maintaining or only slightly increasing bias. Note, however, that there is no unique bias/variance decomposition for classification tasks (Kong and Dietterich; Breiman; Kohavi and Wolpert; Tibshirani).
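To make the bootstrap step concrete, here is a minimal Python sketch (the function name and the toy check are ours, not the paper's); it also illustrates the claim that roughly two thirds of the examples appear in each replicate:

    import numpy as np

    rng = np.random.default_rng(0)

    def bootstrap_replicate(X, y):
        """One Bagging training set: N examples drawn uniformly with replacement."""
        idx = rng.integers(0, len(X), size=len(X))
        return X[idx], y[idx]

    # Quick check of the claim that roughly two thirds of the examples appear
    # in each replicate: fraction of distinct indices in one draw of size N.
    N = 10000
    print(len(np.unique(rng.integers(0, N, size=N))) / N)   # about 0.63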
AdaBoost, on the other hand, constructs a composite classifier by sequentially training classifiers while putting more and more emphasis on certain patterns. For this, AdaBoost maintains a probability distribution D_t(i) over the original training set. In each round t, the classifier is trained with respect to this distribution. Some learning algorithms do not allow training with respect to a weighted cost function. In this case, sampling with replacement using the probability distribution D_t can be used to approximate a weighted cost function: examples with high probability would then occur more often than those with low probability, while some examples may not occur in the sample at all although their probability is not zero.
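For base learners that cannot handle a weighted cost directly, this resampling approximation amounts to a weighted draw; a minimal sketch, assuming D is a vector of per-example probabilities (the names are ours):

    import numpy as np

    def resample_training_set(X, y, D, rng):
        """Approximate training under the distribution D by drawing N examples
        with replacement, example i being picked with probability D[i]."""
        idx = rng.choice(len(X), size=len(X), replace=True, p=D)
        return X[idx], y[idx]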
Input: a sequence of N examples (x_1, y_1), ..., (x_N, y_N) with labels y_i in Y = {1, ..., k}.

Init: let B = {(i, y) : i in {1, ..., N}, y != y_i} and D_1(i, y) = 1/|B| for all (i, y) in B.

Repeat for t = 1, 2, ...:
  1. Train the neural network with respect to the distribution D_t and obtain a hypothesis h_t : X x Y -> [0, 1].
  2. Calculate the pseudo-loss of h_t:
       epsilon_t = (1/2) sum_{(i, y) in B} D_t(i, y) (1 - h_t(x_i, y_i) + h_t(x_i, y)).
  3. Set beta_t = epsilon_t / (1 - epsilon_t).
  4. Update the distribution:
       D_{t+1}(i, y) = (D_t(i, y) / Z_t) beta_t^((1/2)(1 + h_t(x_i, y_i) - h_t(x_i, y))),
     where Z_t is a normalization constant.

Output: the final hypothesis
       f(x) = argmax_{y in Y} sum_t (log 1/beta_t) h_t(x, y).

Table 1: Pseudo-loss AdaBoost (AdaBoost.M2).
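As an illustration of the bookkeeping in Table 1, the following Python sketch implements the pseudo-loss AdaBoost loop; it assumes a user-supplied train_network routine that returns a hypothesis giving per-class scores in [0, 1], and all names are ours rather than the paper's:

    import numpy as np

    def adaboost_m2(X, y, n_classes, n_rounds, train_network):
        """Pseudo-loss AdaBoost (AdaBoost.M2) following Table 1.

        train_network(X, y, D) must return a scoring function h(x) giving
        n_classes confidence scores in [0, 1], trained with respect to the
        mislabel distribution D (by resampling or by weighting the cost).
        Assumes every pseudo-loss stays in (0, 1/2).
        """
        N = len(X)
        # Mislabel distribution D_t(i, y); entries for the correct label stay at 0.
        D = np.full((N, n_classes), 1.0 / (N * (n_classes - 1)))
        D[np.arange(N), y] = 0.0

        hypotheses, betas = [], []
        for t in range(n_rounds):
            h = train_network(X, y, D)                        # step 1
            scores = np.array([h(x) for x in X])              # scores[i, y] = h_t(x_i, y)
            correct = scores[np.arange(N), y]                 # h_t(x_i, y_i)
            eps = 0.5 * np.sum(D * (1.0 - correct[:, None] + scores))   # step 2
            beta = eps / (1.0 - eps)                          # step 3
            D = D * beta ** (0.5 * (1.0 + correct[:, None] - scores))   # step 4
            D /= D.sum()                                      # Z_t: renormalize
            hypotheses.append(h)
            betas.append(beta)

        def final_hypothesis(x):
            votes = sum(np.log(1.0 / b) * h(x) for h, b in zip(hypotheses, betas))
            return int(np.argmax(votes))
        return final_hypothesis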
After each AdaBoost round, the probability of incorrectly labeled examples is increased and the probability of correctly labeled examples is decreased. In the basic algorithm, the result of training the t-th classifier is a hypothesis h_t : X -> Y, where Y = {1, ..., k} is the space of labels and X is the space of input features. After the t-th round, the weighted error epsilon_t = sum_{i : h_t(x_i) != y_i} D_t(i) of the resulting classifier is calculated, and the distribution D_{t+1} is computed from D_t by increasing the probability of incorrectly labeled examples. The probabilities are changed so that the error of the t-th classifier using these new weights D_{t+1} would be exactly 1/2. In this way the classifiers are optimally decoupled. The global decision f is obtained by weighted voting. This basic AdaBoost algorithm converges (learns the training set) if each classifier yields a weighted error that is less than 1/2, i.e., better than chance in the two-class case.

In general, neural network classifiers provide more information than just a class label. It can be shown that the network outputs approximate the a posteriori probabilities of the classes, and it might be useful to use this information rather than performing a hard decision for one recognized class. This issue is addressed by another version of AdaBoost, called AdaBoost.M2 (Freund and Schapire). It can be used when the classifier computes confidence scores for each class (the scores do not need to sum to one). The result of training the t-th classifier is now a hypothesis h_t : X x Y -> [0, 1]. Furthermore, we use a distribution D_t(i, y) over the set of all mislabels:
  B = {(i, y) : i in {1, ..., N}, y != y_i},

where N is the number of training examples; therefore |B| = N(k - 1). AdaBoost.M2 modifies this distribution so that the next learner focuses not only on the examples that are hard to classify, but more specifically on improving the discrimination between the correct class and the incorrect class that competes with it. Note that the mislabel distribution D_t induces a distribution over the examples, P_t(i) = W_t(i) / sum_i W_t(i), where W_t(i) = sum_{y != y_i} D_t(i, y); P_t(i) may be used for resampling the training set. Freund and Schapire define the pseudo-loss of a learning machine as

  epsilon_t = (1/2) sum_{(i, y) in B} D_t(i, y) (1 - h_t(x_i, y_i) + h_t(x_i, y)).

It is minimized if the confidence scores of the correct labels are 1 and the confidence scores of all the wrong labels are 0. The final decision f is obtained by adding together the weighted confidence scores of all the machines (all the hypotheses h_1, ..., h_T). Table 1 summarizes the AdaBoost.M2 algorithm. This multiclass boosting algorithm converges if each classifier yields a pseudo-loss that is less than 1/2, i.e., better than any constant hypothesis.

AdaBoost has very interesting theoretical properties; in particular, it can be shown that the error of the composite classifier on the training data decreases exponentially fast to zero as the number of combined classifiers is increased (Freund and Schapire).

Many empirical evaluations of AdaBoost also provide an analysis of the so-called margin distribution. The margin is defined as the difference between the ensemble score of the correct class and the strongest ensemble score of a wrong class. In the case in which there are just two possible labels {-1, +1}, this is y * f(x), where f is the output of the composite classifier and y the correct label. The classification is correct if the margin is positive. Discussions about the relevance of the margin distribution for the generalization behavior of ensemble techniques can be found in Freund and Schapire (b), Schapire et al., Breiman (a), Breiman (b), Grove and Schuurmans, and Rätsch et al.
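As an illustration, the following sketch computes the per-example margins of a boosted ensemble from the weighted confidence scores defined above; the helper names are ours, and the normalization to [-1, 1] (dividing by the total voting weight) is our assumption about how such margins are usually plotted:

    import numpy as np

    def ensemble_scores(x, hypotheses, betas):
        """Weighted confidence scores: f(x, y) = sum_t log(1/beta_t) h_t(x, y)."""
        return sum(np.log(1.0 / b) * h(x) for h, b in zip(hypotheses, betas))

    def margins(X, y, hypotheses, betas, normalize=True):
        """Margin of each example: ensemble score of the correct class minus the
        strongest ensemble score of a wrong class (positive when the example is
        correctly classified). With normalize=True the scores are divided by the
        total voting weight so that margins fall in [-1, 1]."""
        total = sum(np.log(1.0 / b) for b in betas)
        out = np.empty(len(X))
        for i, (x, yi) in enumerate(zip(X, y)):
            s = ensemble_scores(x, hypotheses, betas)
            if normalize:
                s = s / total
            out[i] = s[yi] - np.delete(s, yi).max()
        return out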
In this paper an important focus is on whether the good generalization performance of AdaBoost is partially explained by the random resampling of the training sets generally used in its implementation. This issue will be addressed by comparing three versions of AdaBoost, described in the next section, in which randomization is used or not used in three different ways.

Applying AdaBoost to neural networks

In this paper we investigate different techniques for using neural networks as base classifiers for AdaBoost. In all cases we have trained the neural networks by minimizing a quadratic criterion that is a weighted sum of the squared differences (z_ij - ẑ_ij)^2, where z_i = (z_i1, ..., z_ik) is the desired output vector, with a low target value everywhere except at the position corresponding to the target class, and ẑ_i is the output vector of the network. A score for class j for pattern i can be directly obtained from the j-th element ẑ_ij of the output vector ẑ_i.
When a class must be chosen, the one with the highest score is selected. Let V_t(i, j) = D_t(i, j) / max_{k != y_i} D_t(i, k) for j != y_i, and V_t(i, y_i) = 1. These weights are used to give more emphasis to certain incorrect labels, according to the pseudo-loss AdaBoost. What we call an epoch is a pass of the training algorithm through all the examples in a training set. In this paper we compare three different versions of AdaBoost:

R: Training the t-th classifier with a fixed training set obtained by resampling with replacement once from the original training set. Before starting to train the t-th network, we sample N patterns from the original training set, each time with probability P_t(i) of picking pattern i. Training is performed for a fixed number of iterations, always using this same resampled training set. This is basically the scheme that has been used in the past when applying AdaBoost to decision trees, except that we used the pseudo-loss AdaBoost. To approximate the pseudo-loss, the training cost that is minimized for a pattern that is the i-th one from the original training set is sum_j V_t(i, j) (z_ij - ẑ_ij)^2.

E: Training the t-th classifier using a different training set at each epoch, by resampling with replacement after each training epoch. After each epoch, a new training set is obtained by sampling from the original training set with probabilities P_t(i). Since we used an on-line stochastic gradient in this case, this is equivalent to sampling a new pattern from the original training set with probability P_t(i) before each forward/backward pass through the neural network. Training continues until a fixed number of pattern presentations has been performed. As for R, the training cost that is minimized for a pattern that is the i-th one from the original training set is sum_j V_t(i, j) (z_ij - ẑ_ij)^2.

W: Training the t-th classifier by directly weighting the cost function (here, the squared error) of the t-th neural network, i.e., all the original training patterns are in the training set, but the cost is weighted by the probability of each example: sum_j D_t(i, j) (z_ij - ẑ_ij)^2. If we used this formula directly, the gradients would be very small, even when all the probabilities D_t(i, j) are identical. To avoid having to scale the learning rates differently depending on the number of examples, the following normalized error function was used (see the sketch after this list):

  (P_t(i) / max_k P_t(k)) sum_j V_t(i, j) (z_ij - ẑ_ij)^2.
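To make the three variants concrete, here is a minimal sketch of the per-pattern weighted squared-error cost and of the two ways of using the induced distribution P_t, resampling (R and E) versus direct weighting (W); the helper names, and the convention that D_t(i, y_i) is stored as zero, are ours:

    import numpy as np

    def example_distribution(D):
        """P_t(i) induced by the mislabel distribution: proportional to
        W_t(i) = sum_{y != y_i} D_t(i, y) (the correct-label entries are zero)."""
        W = D.sum(axis=1)
        return W / W.sum()

    def mislabel_weights(D, y):
        """V_t(i, j) = D_t(i, j) / max_{k != y_i} D_t(i, k), with V_t(i, y_i) = 1.
        The row maximum equals the maximum over wrong labels because D(i, y_i) = 0."""
        V = D / np.maximum(D.max(axis=1, keepdims=True), 1e-12)
        V[np.arange(len(y)), y] = 1.0
        return V

    def weighted_squared_error(z, z_hat, v, scale=1.0):
        """Cost for one pattern: scale * sum_j V_t(i, j) * (z_ij - zhat_ij)^2."""
        return scale * np.sum(v * (z - z_hat) ** 2)

    # R and E: draw the pattern to present from P_t (once per network for R,
    # before every forward/backward pass for E) and use the cost with scale=1.
    # W: loop over all patterns and use scale = P_t(i) / max_k P_t(k).
    def pattern_scales_W(P):
        return P / P.max()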
In E and W, what makes the combined networks essentially different from each other is the fact that they are trained with respect to different weightings D_t of the original training set. In R, an additional element of diversity is built in, because the criterion used for the t-th network is not exactly the errors weighted by P_t(i): more emphasis is put on certain patterns while others are completely ignored, because of the initial random sampling of the training set. The E version can be seen as a stochastic version of the W version, i.e., as the number of iterations through the data increases and the learning rate decreases, E becomes a very good approximation of W. W itself is closest to the recipe mandated by the AdaBoost algorithm, but as we will see below, it suffers from numerical problems.
Note that E is a better approximation of the weighted cost function than R, in particular when many epochs are performed. If random resampling of the training data explained a good part of the generalization performance of AdaBoost, then the weighted training version W should perform worse than the resampling versions, and the fixed-sample version R should perform better than the continuously resampled version E. Note that for Bagging, which directly aims at reducing variance, random resampling is essential to obtain the reduction in generalization error.

Results

Experiments have been performed on three data sets: a data set of on-line handwritten digits, the UCI Letters data set of off-line machine-printed alphabetical characters, and the UCI satellite data set, which is generated from Landsat Multispectral Scanner image data. All data sets have a predefined training and test set. All the p-values given in this section concern a pair (p̂_1, p̂_2) of test performance results on n test points for two classification systems with unknown true error rates p_1 and p_2. The null hypothesis is that the true expected performance of the two systems is not different, i.e., p_1 = p_2. Let p̂ = (p̂_1 + p̂_2)/2 be the estimator of the common error rate under the null hypothesis. The alternative hypothesis is that p_1 > p_2, so the p-value is obtained as the probability of observing such a large difference under the null hypothesis, i.e., P(Z > z) for a standard Normal Z, with

  z = (p̂_1 - p̂_2) / sqrt(2 p̂ (1 - p̂) / n).

This is based on the Normal approximation of the Binomial, which is appropriate for large n (however, see Dietterich (a) for a discussion of this and other tests to compare algorithms).
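A small sketch of this test under the stated normal approximation (the function name and the example numbers are ours):

    import math
    from statistics import NormalDist

    def p_value(err1, err2, n):
        """One-sided p-value for comparing two test error rates measured on n test
        points, using the pooled normal approximation described above
        (err1 and err2 are error rates in [0, 1], with err1 >= err2)."""
        p = (err1 + err2) / 2.0                     # common rate under the null
        z = (err1 - err2) / math.sqrt(2.0 * p * (1.0 - p) / n)
        return 1.0 - NormalDist().cdf(z)            # P(Z > z)

    # Example with arbitrary numbers: 4.1% vs 3.5% error on 5000 test points.
    print(p_value(0.041, 0.035, 5000))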
Results on the on-line data set

The on-line data set was collected at the University of Paris (Schwenk and Milgram). A WACOM tablet with a cordless pen was used in order to allow natural writing. Since we wanted to build a writer-independent recognition system, we tried to use many writers and to impose as few constraints as possible on the writing style. In total, ... students wrote down isolated digits, which have been divided into a learning set (... examples) and a test set (... examples). Note that the writers of the training and test sets are completely distinct. A particular property of this data set is the notable variety of writing styles, which are not all equally frequent: there are, for instance, many zeros written counterclockwise but only a few written clockwise. Figure 1 gives an idea of the great variety of writing styles in this data set. We applied only a simple preprocessing: the characters were resampled to a fixed number of points, centered, and size-normalized to an (x, y)-coordinate sequence. Table 2 summarizes the results on the test set before using AdaBoost.
Figure 1: Some examples of the on-line handwritten digits data set (test set).

Table 2: On-line digits data set: error rates for fully connected MLPs (not boosted). Columns: architecture, train error, test error.

Note that the differences among the test results of the last three networks are not statistically significant (p-value ...), whereas the difference with the first network is significant (p-value ...). Cross-validation within the training set was used to find the optimal number of training epochs (typically about ...). Note that if training is continued until ... epochs, the test error increases by up to ...%.

Table 3 shows the results of bagged and boosted multilayer perceptrons with 10, 30 or 50 hidden units, trained for either ... or ... epochs, and using either the ordinary resampling scheme (R), resampling with different random selections at each epoch (E), or training with weights D_t on the squared error criterion for each pattern (W). (The notation 22-h-10 designates a fully connected neural network with 22 input nodes, one hidden layer with h neurons, and a 10-dimensional output layer.) In all cases ... neural networks were combined. AdaBoost improved the generalization error of the MLPs in all cases, for instance from ...% to about ...% for the ... architecture. Note that the improvement with ... hidden units, from ...% without AdaBoost to ...% with AdaBoost, is significant (p-value of ...) despite the small number of examples. Boosting was also always superior to Bagging, although the differences are not always very significant because of the small number of examples.
Table 3: On-line digits test error rates for bagged and boosted MLPs (versions R, E and W for each of the three architectures; Bagging and AdaBoost trained for various numbers of epochs per network).

Furthermore, it seems that the number of training epochs of each individual classifier has no significant impact on the results of the combined classifier, at least on this data set. AdaBoost with weighted training of the MLPs (the W version), however, does not work as well if the learning of each individual MLP is stopped too early (... epochs): the networks did not learn the weighted examples well enough, and epsilon_t rapidly approached 1/2. When training each MLP for ... epochs, however, the weighted training (W) version achieved the same low test error rate. AdaBoost is less useful for very big networks (... or more hidden units) on this data, since an individual classifier can achieve zero error on the original training set using the E or W method. Such large networks probably have a very low bias but a high variance. This may explain why Bagging, a pure variance reduction method, can do as well as AdaBoost, which is believed to reduce bias and variance. Note, however, that AdaBoost can achieve the same low error rates with the smaller networks.

Figure 2 shows the error rates of some of the boosted classifiers as the number of networks is increased. AdaBoost brings the training error to zero after only a few steps, even with an MLP with only 10 hidden units. The generalization error is also considerably improved, and it continues to decrease towards an apparent asymptote after zero training error has been reached. The surprising effect of continuously decreasing generalization error even after the training error reaches zero has already been observed by others (Breiman; Drucker and Cortes; Freund and Schapire (a); Quinlan). This seems to contradict Occam's razor, but a recent theorem (Schapire et al.) suggests that the margin distribution may be relevant to the generalization error. Although previous empirical results (Schapire et al.) indicate that pushing the margin cumulative distribution to the right may improve generalization, other recent results (Breiman (a); Breiman (b); Grove and Schuurmans) show that improving the whole margin distribution can also yield worse generalization. Figures 3 and 4 show several margin cumulative distributions, i.e., the fraction of examples whose margin is at most x, as a function of x. The networks had been trained for ... epochs (... for the W version).
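For reference, the margin cumulative distributions plotted in these figures can be computed from the per-example margins as sketched below, reusing the margins helper defined earlier (names are ours):

    import numpy as np

    def margin_cdf(margin_values, xs=None):
        """Fraction of examples whose margin is at most x, for each x in xs."""
        m = np.sort(np.asarray(margin_values))
        if xs is None:
            xs = np.linspace(-1.0, 1.0, 201)   # margins normalized to [-1, 1]
        return xs, np.searchsorted(m, xs, side="right") / len(m)

    # Example: compare the curves after 2 and after 100 boosting rounds.
    # xs, frac_2   = margin_cdf(margins(X, y, hypotheses[:2],   betas[:2]))
    # xs, frac_100 = margin_cdf(margins(X, y, hypotheses[:100], betas[:100]))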
[Figure 2 shows three panels, for the MLPs 22-10-10, 22-30-10 and 22-50-10, plotting the error in % against the number of networks (log scale) for the unboosted classifier, Bagging, and AdaBoost (R), (E) and (W), on both the training and the test set.]

Figure 2: Error rates of the boosted classifiers for an increasing number of networks. For clarity, the training error of Bagging is not shown; it overlaps with the test error rates of AdaBoost. The dotted constant horizontal line corresponds to the test error of the unboosted classifier. Small oscillations are not significant since they correspond to few examples.
[Figure 3 shows margin cumulative distributions for AdaBoost (R) and AdaBoost (E) applied to the MLPs, with one curve per number of combined networks.]

Figure 3: Margin distributions using ... and ... networks, respectively.
[Figure 4 shows margin cumulative distributions for AdaBoost (W) and for Bagging applied to the MLPs, with one curve per number of combined networks.]

Figure 4: Margin distributions using ... and ... networks, respectively.
It is clear from Figures 3 and 4 that the number of examples with a high margin increases when more classifiers are combined by boosting. When boosting neural networks with ... hidden units, for instance, there are some examples with a margin smaller than ... when only two networks are combined; however, all examples have a positive margin when ... nets are combined, and all examples have a margin higher than ... for ... networks. Bagging, on the other hand, has no significant influence on the margin distributions. There is almost no difference between the margin distributions of the R, E or W versions of AdaBoost either (one may note that the W and E versions achieve slightly higher margins than R). Note, however, that there is a difference between the margin distributions and the test set errors when the complexity of the neural networks (hidden layer size) is varied. Finally, it seems that sometimes AdaBoost must allow some examples with very high margins in order to improve the minimal margin; this can best be seen for the ... architecture. One should keep in mind that this data set contains only small amounts of noise. In application domains with high amounts of noise, it may be less advantageous to improve the minimal margin at any price (Grove and Schuurmans; Rätsch et al.), since this would mean putting too much weight on noisy or wrongly labeled examples.

Results on the UCI Letters and Satimage data sets

Similar experiments were performed with MLPs on the Letters data set from the UCI Machine Learning repository. It has ... training and ... test patterns, ... input features, and ... classes (A-Z) of distorted machine-printed characters from ... different fonts. A few preliminary experiments on the training set only were used to choose an architecture. Each input feature was normalized according to its mean and variance on the training set.
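A minimal sketch of this normalization (variable names are ours):

    import numpy as np

    def standardize(X_train, X_test):
        """Normalize each input feature by its mean and standard deviation
        computed on the training set only."""
        mean, std = X_train.mean(axis=0), X_train.std(axis=0)
        return (X_train - mean) / std, (X_test - mean) / std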
Two types of experiments were performed: (1) resampling after each epoch (E), using stochastic gradient descent, and (2) no resampling, but reweighting of the squared error (W), using conjugate gradient descent. In both cases a fixed number of training epochs was used. The plain, bagged and boosted networks are compared to decision trees in Table 4.

Table 4: Test error rates on the UCI data sets (rows: letter, satellite) for CART (results from Breiman), C4.5 (results from Freund and Schapire (a)) and MLPs, each reported alone, bagged and boosted.
[Figure 5 plots the error in % against the number of networks (log scale) for the unboosted classifier, Bagging, AdaBoost (SG+E) and AdaBoost (CG+W), on both the training and the test set of the UCI Letters data set.]

Figure 5: Error rates of the bagged and boosted neural networks for the UCI Letters data set (log scale). SG+E denotes stochastic gradient descent with resampling after each epoch; CG+W means conjugate gradient descent with weighting of the squared error. For clarity, the training error of Bagging is not shown; it flattens out at about ...%. The dotted constant horizontal line corresponds to the test error of the unboosted classifier.

In both cases (E and W) the same final generalization error was obtained (...% for E and ...% for W), but the training time using the weighted squared error (W) was about ... times greater. This shows that using random resampling, as in E or R, is not necessary to obtain good generalization, whereas it is clearly necessary for Bagging. However, the experiments show that it is still preferable to use a random sampling method such as R or E for numerical reasons: convergence of each network is faster. For this reason, ... networks were boosted in the E experiments with stochastic gradient descent, whereas we stopped training of the W networks after ... networks, when the generalization error seemed to have flattened out, which took more than a week on a fast processor (SGI Origin). We believe that the main reason for this difference in training time is that the conjugate gradient method is a batch method and is therefore slower than stochastic gradient descent on redundant data sets with many thousands of examples, such as this one (see comparisons between batch and on-line methods in Bourrely, and of conjugate gradients for classification tasks in particular in Moller).

For the W version with stochastic gradient descent, the weighted training error of the individual networks does not decrease as much as when using conjugate gradient descent, so AdaBoost itself did not work as well. We believe this is because it is difficult for stochastic gradient descent to approach a minimum when the output error is weighted with very different weights for different patterns: the patterns with small weights make almost no progress. The conjugate gradient method, on the other hand, can approach a minimum of the weighted cost function more precisely, but inefficiently when there are thousands of training examples. The results obtained with the boosted networks are extremely good (...% error, whether using the W version with conjugate gradients or the E version with stochastic gradient) and are, as far as the authors know, the best ever published to date for this data set.
[Figure 6 shows the margin cumulative distributions for Bagging and for AdaBoost (SG+E) on the UCI Letters data set, with one curve per number of combined networks.]

Figure 6: Margin distributions for the UCI Letters data set.

In a comparison with the boosted trees (...% error), the p-value of the null hypothesis is less than ... . The best performance reported in STATLOG (Feng et al.) is ...%. Note also that we need to combine only a few neural networks to get an immediate important improvement: with the E version, ... neural networks suffice for the error to fall under ...%, whereas boosted decision trees typically converge later. The W version of AdaBoost actually converged faster in terms of the number of networks (Figure 5): after about ... networks the ...% mark was reached, and after ... networks the apparent asymptote of ...% was reached; but it converged much more slowly in terms of training time.

Figure 6 shows the margin distributions for Bagging and AdaBoost applied to this data set. Again, Bagging has no effect on the margin distribution, whereas AdaBoost clearly increases the number of examples with large margins.

Similar conclusions hold for the UCI satellite data set (Table 4), although the improvements are not as dramatic as in the case of the Letters data set. The improvement due to AdaBoost is statistically significant (p-value ...), but the difference in performance between boosted MLPs and boosted decision trees is not (p-value ...). This data set has ... examples, with the first ... used for training and the last ... used for testing generalization. There are ... inputs and ... classes, and a ... network was used. Again, the two best training methods are epoch resampling (E) with stochastic gradient descent and the weighted squared error (W) with conjugate gradient descent.
Conclusion

As demonstrated here in three real-world applications, AdaBoost can significantly improve neural classifiers. In particular, the results obtained on the UCI Letters data set (...% test error) are significantly better than the best published results to date, as far as the authors know. The behavior of AdaBoost for neural networks confirms previous observations on other learning algorithms (e.g., Breiman; Drucker and Cortes; Freund and Schapire (a); Quinlan; Schapire et al.),
such as the continued generalization improvement after zero training error has been reached and the associated improvement in the margin distribution. It also seems that AdaBoost is not very sensitive to overtraining of the individual classifiers, so that the neural networks can be trained for a fixed (preferably high) number of training epochs. A similar observation was recently made with decision trees (Breiman (b)). This apparent insensitivity to overtraining of individual classifiers simplifies the choice of neural network design parameters.

Another interesting finding of this paper is that the weighted training version (W) of AdaBoost gives good generalization results for MLPs, but requires many more training epochs or the use of a second-order (and unfortunately batch) method such as conjugate gradients. We conjecture that this happens because of the weights on the cost function terms, especially when the weights are small, which could worsen the conditioning of the Hessian matrix. So in terms of generalization error, all three methods (R, E, W) gave similar results, but training time was lowest with the E method with stochastic gradient descent, which samples each new training pattern from the original data with the AdaBoost weights. Although our experiments are insufficient to conclude, it is possible that the weighted training method (W) with conjugate gradients might be faster than the others for small training sets (a few hundred examples).

There are various ways to define variance for classifiers (e.g., Kong and Dietterich; Breiman; Kohavi and Wolpert; Tibshirani). It basically represents how the resulting classifier varies when a different training set is sampled from the true generating distribution of the data. Our comparative results on the R, E and W versions add credence to the view that the randomness induced by resampling the training data is not the main reason for AdaBoost's reduction of the generalization error. This is in contrast to Bagging, which is a pure variance reduction method: for Bagging, random resampling is essential to obtain the observed variance reduction.

Another interesting issue is whether the boosted neural networks could be trained with a criterion other than the mean squared error, one that would better approximate the goal of the AdaBoost criterion, i.e., minimizing a weighted classification error. See Schapire and Singer for recent work that addresses this issue.

Acknowledgments

Most of the work was done while the first author was doing a postdoctorate at the University of Montreal. The authors would like to thank the National Science and Engineering Research Council of Canada and the Government of Quebec for financial support.
References

Bauer, E. and Kohavi, R. An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. To appear in Machine Learning.
Bourrely, J. Parallelization of a neural learning algorithm on a hypercube. In Hypercube and Distributed Computers. Elsevier Science Publishing, North Holland.
Breiman, L. Bagging predictors. Machine Learning.
Breiman, L. Bias, variance, and arcing classifiers. Technical report, Statistics Department, University of California at Berkeley.
Breiman, L. (a). Arcing the edge. Technical report, Statistics Department, University of California at Berkeley.
Breiman, L. (b). Prediction games and arcing classifiers. Technical report, Statistics Department, University of California at Berkeley.
Breiman, L. Arcing classifiers. Annals of Statistics.
Dietterich, T. (a). Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation.
Dietterich, T. G. (b). An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Submitted to Machine Learning.
Drucker, H. and Cortes, C. Boosting decision trees. In Touretzky, D. S., Mozer, M. C., and Hasselmo, M. E., editors, Advances in Neural Information Processing Systems. MIT Press.
Feng, C., Sutherland, A., King, R., Muggleton, S., and Henery, R. Comparison of machine learning classifiers to statistics and neural networks. In Proceedings of the Fourth International Workshop on Artificial Intelligence and Statistics.
Freund, Y. Boosting a weak learning algorithm by majority. Information and Computation.
Freund, Y. and Schapire, R. E. (a). Experiments with a new boosting algorithm. In Machine Learning: Proceedings of the Thirteenth International Conference.
Freund, Y. and Schapire, R. E. (b). Game theory, on-line prediction and boosting. In Proceedings of the Ninth Annual Conference on Computational Learning Theory.
Freund, Y. and Schapire, R. E. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences.
Freund, Y. and Schapire, R. E. Adaptive game playing using multiplicative weights. Games and Economic Behavior, to appear.
Friedman, J., Hastie, T., and Tibshirani, R. Additive logistic regression: a statistical view of boosting. Technical report, Department of Statistics, Stanford University.
Geman, S., Bienenstock, E., and Doursat, R. Neural networks and the bias/variance dilemma. Neural Computation.
Grove, A. J. and Schuurmans, D. Boosting in the limit: Maximizing the margin of learned ensembles. In Proceedings of the Fifteenth National Conference on Artificial Intelligence, to appear.
Kohavi, R. and Wolpert, D. H. Bias plus variance decomposition for zero-one loss functions. In Machine Learning: Proceedings of the Thirteenth International Conference.
Kong, E. B. and Dietterich, T. G. Error-correcting output coding corrects bias and variance. In Machine Learning: Proceedings of the Twelfth International Conference.
Krogh, A. and Vedelsby, J. Neural network ensembles, cross validation, and active learning. In Tesauro, G., Touretzky, D. S., and Leen, T. K., editors, Advances in Neural Information Processing Systems. MIT Press.
Maclin, R. and Opitz, D. An empirical evaluation of bagging and boosting. In Proceedings of the Fourteenth National Conference on Artificial Intelligence.
Mason, L. and Baxter, J. Direct optimization of margins improves generalization in combined classifiers. In Advances in Neural Information Processing Systems. MIT Press, in press.
Moller, M. Supervised learning on large redundant training sets. In Neural Networks for Signal Processing. IEEE Press.
Moller, M. Efficient Training of Feed-Forward Neural Networks. PhD thesis, Aarhus University, Aarhus, Denmark.
Perrone, M. P. Improving Regression Estimation: Averaging Methods for Variance Reduction with Extensions to General Convex Measure Optimization. PhD thesis, Brown University, Institute for Brain and Neural Systems.
Perrone, M. P. Putting it all together: Methods for combining neural networks. In Cowan, J. D., Tesauro, G., and Alspector, J., editors, Advances in Neural Information Processing Systems. Morgan Kaufmann Publishers.
Quinlan, J. R. Bagging, boosting, and C4.5. In Machine Learning: Proceedings of the Fourteenth International Conference.
Rätsch, G., Onoda, T., and Müller, K.-R. Soft margins for AdaBoost. Technical Report NC-TR, Royal Holloway College.
Schapire, R. E. The strength of weak learnability. Machine Learning.
Schapire, R. E. Theoretical views of boosting. In Computational Learning Theory: Fourth European Conference (EuroCOLT), to appear.
Schapire, R. E., Freund, Y., Bartlett, P., and Lee, W. S. Boosting the margin: A new explanation for the effectiveness of voting methods. In Machine Learning: Proceedings of the Fourteenth International Conference.
Schapire, R. E. and Singer, Y. Improved boosting algorithms using confidence-rated predictions. In Proceedings of the Annual Conference on Computational Learning Theory.
Schwenk, H. and Bengio, Y. AdaBoosting neural networks: Application to on-line character recognition. In International Conference on Artificial Neural Networks. Springer Verlag.
Schwenk, H. and Bengio, Y. Training methods for adaptive boosting of neural networks. In Jordan, M. I., Kearns, M. J., and Solla, S. A., editors, Advances in Neural Information Processing Systems. The MIT Press.
Schwenk, H. and Milgram, M. Constraint tangent distance for on-line character recognition. In International Conference on Pattern Recognition.
Tibshirani, R. Bias, variance and prediction error for classification rules. Technical report, Department of Statistics, University of Toronto.