The bits the whirlwind tour left out ... BMVA Summer School 2016


Slide 1

The bits the whirlwind tour left out ...

BMVA Summer School 2016 – extra background slides (from teaching material at Durham University)

Slide 2

Machine Learning

Definition:

– “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

[Mitchell, 1997]

Slide 3

Algorithm to construct decision trees ….

Slide 4

Building Decision Trees – ID3

 node = root of tree

 Main loop:

A = “best” decision attribute for next node .....

But which attribute is best to split on?

Slide 5

Entropy in machine learning

Entropy: a measure of impurity

– S is a sample of training examples
– p⊕ is the proportion of positive examples in S
– p⊖ is the proportion of negative examples in S

Entropy measures the impurity of S:

$Entropy(S) = -p_{\oplus} \log_2 p_{\oplus} - p_{\ominus} \log_2 p_{\ominus}$
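As a small runnable sketch of this definition (illustrative Python, not from the slides):

```python
import math

def entropy(pos: int, neg: int) -> float:
    """Entropy of a sample with `pos` positive and `neg` negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:  # 0 log 0 is taken as 0 by convention
            p = count / total
            result -= p * math.log2(p)
    return result

print(entropy(9, 5))   # ~0.940 (the classic [9+, 5-] sample from Mitchell, 1997)
print(entropy(7, 7))   # 1.0 - maximally impure
print(entropy(14, 0))  # 0.0 - pure
```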

Slide 6

Information Gain – reduction in Entropy

Gain(S, A) = expected reduction in entropy due to splitting on attribute A

– i.e. expected reduction in impurity in the data
– (improvement in consistent data sorting)

Slide 7

Information Gain – reduction in Entropy

– reduction in entropy in the set of examples S if split on attribute A
– Sv = subset of S for which attribute A has value v

$Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} \, Entropy(S_v)$

– i.e. Gain(S, A) = original entropy − weighted sum of the entropies of the sub-nodes created by splitting on A
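A corresponding Python sketch (illustrative, not the slides' code; examples are represented as attribute dicts):

```python
from collections import Counter
import math

def entropy(labels) -> float:
    """Multi-class entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def gain(rows, labels, attr) -> float:
    """Information gain of splitting `rows` (list of dicts) on `attr`."""
    g = entropy(labels)
    for v in set(r[attr] for r in rows):
        sub = [y for r, y in zip(rows, labels) if r[attr] == v]
        g -= (len(sub) / len(labels)) * entropy(sub)
    return g

rows = [{"wind": "weak"}, {"wind": "strong"}, {"wind": "weak"}, {"wind": "weak"}]
y = ["+", "-", "+", "-"]
print(gain(rows, y, "wind"))  # ~0.311
```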

Slide 8

Information Gain – reduction in Entropy

Information Gain:

– “information provided about the target function given the value of some attribute A”
– How well does A sort the data into the required classes?

Generalise to c classes:

– (not just ⊕ or ⊖)

EntropyS=−∑

i=1 c

pi log pi

Slide 9

Building Decision Trees

 Selecting the Next Attribute – which attribute should we split on next?

Slide 10

Building Decision Trees

 Selecting the Next Attribute – which attribute should we split on next?
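Putting the pieces together, a minimal ID3-style recursive tree builder might look like the sketch below (a hedged illustration, not the slides' code; the entropy and gain helpers repeat the earlier sketches so this runs standalone, and real implementations add stopping criteria, pruning and continuous-attribute handling):

```python
from collections import Counter
import math

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def gain(rows, labels, attr):
    g = entropy(labels)
    for v in set(r[attr] for r in rows):
        sub = [y for r, y in zip(rows, labels) if r[attr] == v]
        g -= (len(sub) / len(labels)) * entropy(sub)
    return g

def id3(rows, labels, attrs):
    """Grow a decision tree: nested dicts for nodes, class labels for leaves."""
    if len(set(labels)) == 1:            # pure node -> leaf
        return labels[0]
    if not attrs:                        # nothing left to split on -> majority leaf
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain(rows, labels, a))  # "best" = highest gain
    node = {best: {}}
    for v in set(r[best] for r in rows):
        branch = [(r, y) for r, y in zip(rows, labels) if r[best] == v]
        node[best][v] = id3([r for r, _ in branch], [y for _, y in branch],
                            [a for a in attrs if a != best])
    return node
```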

Slide 11

Boosting and Bagging …. + Forests

Slide 12

Learning using Boosting

Learning a boosted classifier (AdaBoost algorithm):

– Assign equal weight to each training instance
– For t iterations:
  • apply the learning algorithm to the weighted training set; store the resulting (weak) classifier
  • compute the classifier’s error e on the weighted training set
  • if e = 0 or e > 0.5: terminate classifier generation
  • for each instance in the training set: if classified correctly by the classifier, multiply the instance’s weight by e/(1−e)
  • normalise the weights of all instances

Classification using the boosted classifier:

– Assign weight = 0 to all classes
– For each of the t (or fewer) classifiers: for the class this classifier predicts, add −log(e/(1−e)) to that class’s weight
– Return the class with the highest weight

(e = error of the classifier on the training set)
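A hedged Python sketch of the two procedures above (not the slides' code; `learn_weak` is a placeholder for any weak learner that accepts instance weights and returns a classifier function):

```python
import math

def train_boosted(examples, labels, learn_weak, t=10):
    """AdaBoost-style training, following the slide's weighting scheme."""
    n = len(examples)
    weights = [1.0 / n] * n                      # equal initial weights
    ensemble = []                                # list of (classifier, error)
    for _ in range(t):
        clf = learn_weak(examples, labels, weights)
        preds = [clf(x) for x in examples]
        e = sum(w for w, p, y in zip(weights, preds, labels) if p != y)
        if e == 0 or e > 0.5:                    # terminate classifier generation
            break
        ensemble.append((clf, e))
        # down-weight correctly classified instances by e/(1-e)
        weights = [w * e / (1 - e) if p == y else w
                   for w, p, y in zip(weights, preds, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]   # normalise
    return ensemble

def classify_boosted(ensemble, x):
    """Weighted vote: each classifier adds -log(e/(1-e)) to its predicted class."""
    votes = {}
    for clf, e in ensemble:
        label = clf(x)
        votes[label] = votes.get(label, 0.0) - math.log(e / (1 - e))
    return max(votes, key=votes.get)
```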

Slide 13

Learning using Boosting

 Some things to note:

– Weight adjustment means the (t+1)th classifier concentrates on the examples the tth classifier got wrong
– Each classifier must be able to achieve greater than 50% success
  • (i.e. error below 0.5 in the normalised error range {0..1})
– Results in an ensemble of t classifiers
  • i.e. a boosted classifier made up of t weak classifiers
  • boosting/bagging classifiers are often called ensemble classifiers
– Training error decreases exponentially (theoretically)
  • prone to over-fitting (need diversity in the test set)
  • several additions/modifications exist to handle this
– Works best with weak classifiers

Boosted Trees

– a set of t decision trees of limited complexity (e.g. depth) .....
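As an illustration of boosted shallow trees (not from the slides), a minimal sketch using scikit-learn, assuming it is installed; note the `estimator` keyword is named `base_estimator` in scikit-learn versions before 1.2:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# t = 50 weak classifiers, each a depth-2 tree (limited complexity)
boosted = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=2), n_estimators=50)
boosted.fit(X_train, y_train)
print("test accuracy:", boosted.score(X_test, y_test))
```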

Slide 14

Decision Forests (a.k.a. Random Forests/Trees)

Bagging using multiple decision trees, where each tree in the ensemble classifier ...

– is trained on a random subset of the training data
– computes a node split on a random subset of the available attributes

Each tree is grown as follows:

– Select a training set T' (size N) by randomly selecting (with replacement) N instances from the training set T
– Select a number m < M, where a subset of m attributes out of the available M attributes is used to compute the best split at a given node (m is constant across all trees in the forest)
– Grow each tree using T' to the largest extent possible, without any pruning

[Breiman 2001]
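For comparison, a decision-forest sketch using scikit-learn's RandomForestClassifier (an illustration, not the slides' code), which bootstraps N instances per tree and considers m = `max_features` attributes at each split:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=16, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 trees, each grown on a bootstrap sample of the training data (bagging),
# each split computed over m = 4 of the M = 16 attributes, no pruning.
forest = RandomForestClassifier(n_estimators=100, max_features=4,
                                bootstrap=True, random_state=0)
forest.fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))
```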

Slide 15

Backpropagation Algorithm ….

Slide 16

Backpropagation Algorithm

Assume we have:

– input examples d = {1 ... D}
  • each is a pair {xd, td} = {input vector, target vector}
– node index n = {1 … N}
– weight wji connects node j → i
– input xji is the input on the connection node j → i
  • corresponding weight = wji
– output error for node n is δn
  • similar to (o – t)

[Figure: network with input layer (input x), hidden layer (node indices {1 … N}) and output layer (output vector Ok)]

Slide 17

Backpropagation Algorithm

(1) Input example d

(2) Output layer error, based on:
– the difference between output and target, (t − o)
– the derivative of the sigmoid function

(3) Hidden layer error
– proportional to the node’s contribution to the output error

(4) Update weights wji
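For reference, the standard sigmoid-unit forms of steps (2)-(4), following Mitchell (1997) and using the slides' notation where possible, are:

$\delta_k = o_k (1 - o_k)(t_k - o_k)$  (output layer error, output unit k)

$\delta_h = o_h (1 - o_h) \sum_k w_{hk} \, \delta_k$  (hidden layer error, hidden unit h)

$w_{ji} \leftarrow w_{ji} + \eta \, \delta_j \, x_{ji}$  (weight update, learning rate η)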

Slide 18

Backpropagation

Termination criteria

– number of iterations reached
– or error below a suitable bound

[Figure: output layer error and hidden layer error equations; all weights updated using the relevant error]

Slide 19

Backpropagation

[Figure: network with input layer (input x), hidden layer unit h, output layer unit k and output vector Ok]

Slide 20

Backpropagation

δh is expressed as a weighted sum of the output layer errors δk to which it contributes (i.e. whk > 0)

[Figure: network with input layer (input x), hidden layer unit h, output layer unit k and output vector Ok]

Slide 21

Backpropagation

Error is propagated backwards from the network output ....

– to the weights of the output layer ....
– to the weights of the hidden layer …

Hence the name: backpropagation

[Figure: network with input layer (input x), hidden layer unit h, output layer unit k and output vector Ok]

Slide 22

Backpropagation

Repeat these stages for every hidden layer in a multi-layer network (using error δi where xji > 0) .......

[Figure: network with input layer (input x), hidden layer(s) unit h, output layer unit k and output vector Ok]

Slide 23

Backpropagation

Error is propagated backwards from the network output ....

– to the weights of the output layer ....
– over the weights of all N hidden layers …

Hence the name: backpropagation .......

[Figure: network with input layer (input x), hidden layer(s) unit h, output layer unit k and output vector Ok]

Slide 24

Backpropagation

Will perform gradient descent over the weight space of {wji}, for all connections i → j in the network.

Stochastic gradient descent

– as updates are based on training one sample at a time
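A minimal runnable sketch of stochastic-gradient backpropagation for one hidden layer (illustrative numpy code, not from the slides; the XOR data, layer sizes and learning rate are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy network for XOR: 2 inputs (+ bias) -> 3 sigmoid hidden units (+ bias) -> 1 sigmoid output
W_h = rng.normal(scale=0.5, size=(3, 3))   # hidden weights (incl. bias column)
W_o = rng.normal(scale=0.5, size=(1, 4))   # output weights (incl. bias column)
eta = 0.5                                  # learning rate

data = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]

for epoch in range(10000):
    for x, t in data:                      # stochastic: one sample at a time
        x = np.append(x, 1.0)              # bias input
        t = np.asarray(t, dtype=float)
        # forward pass
        o_h = np.append(sigmoid(W_h @ x), 1.0)          # hidden outputs + bias
        o_k = sigmoid(W_o @ o_h)                        # network output
        # backward pass: output error, then hidden error
        d_k = o_k * (1 - o_k) * (t - o_k)
        d_h = o_h[:-1] * (1 - o_h[:-1]) * (W_o[:, :-1].T @ d_k)
        # weight updates: w <- w + eta * delta * input
        W_o += eta * np.outer(d_k, o_h)
        W_h += eta * np.outer(d_h, x)

for x, t in data:                          # outputs should approach 0, 1, 1, 0
    print(x, sigmoid(W_o @ np.append(sigmoid(W_h @ np.append(x, 1.0)), 1.0)))
```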

Slide 25

Future and current concepts

This is beyond the scope of this introductory tutorial, but the following are recommended as good places to start:

Convolutional Neural Networks

– http://deeplearning.net/tutorial/lenet.html

Deep Learning

– http://www.deeplearning.net/tutorial/

Slide 26

Understanding (and believing) the SVM stuff ….

Slide 27

Remedial Note: equations of 2D lines

Line: $\vec{w} \cdot \vec{x} + b = 0$, where $\vec{w}$ and $\vec{x}$ are 2D vectors.

– b: offset from origin
– $\vec{w}$: normal to line

2D LINES REMINDER

Slide 28

Remedial Note: equations of 2D lines

http://www.mathopenref.com/coordpointdisttrig.html

2D LINES REMINDER

Slide 29

Remedial Note: equations of 2D lines

For a defined line equation ($\vec{w}$ and b fixed), insert a point $\vec{x}$ into the equation …...

– Result is +ve if the point is on one side of the line (i.e. > 0)
– Result is -ve if the point is on the other side of the line (< 0)
– Result is the distance (+ve or -ve) of the point from the line, given by:

$d = \frac{\vec{w} \cdot \vec{x} + b}{\|\vec{w}\|}$

for: $\|\vec{w}\| = \sqrt{w_1^2 + w_2^2}$ ($\vec{w}$ = normal to line)

2D LINES REMINDER
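A quick numeric check (illustrative Python, not from the slides):

```python
import numpy as np

w = np.array([3.0, 4.0])   # normal to the line
b = -5.0                   # offset from origin

def signed_distance(x):
    """Signed distance of point x from the line w.x + b = 0."""
    return (w @ x + b) / np.linalg.norm(w)

print(signed_distance(np.array([3.0, 4.0])))   # +4.0 (on the +ve side)
print(signed_distance(np.array([0.0, 0.0])))   # -1.0 (on the -ve side)
```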

Slide 30

Linear Separator

Instances (i.e. examples) {xi, yi}

– xi = point in instance space ($R^n$) made up of n attributes
– yi = class value for classification of xi
  • classification of an example: function f(x) = y = {+1, −1}, i.e. 2 classes

Want a linear separator; N.B. we have a vector of weight coefficients $\vec{w}$. Can view this as a constraint satisfaction problem:

$\vec{w} \cdot \vec{x}_i + b \ge +1$ for $y_i = +1$
$\vec{w} \cdot \vec{x}_i + b \le -1$ for $y_i = -1$

Equivalently,

$y_i (\vec{w} \cdot \vec{x}_i + b) \ge 1$

Slide 31

Linear Separator

If we define the distance of the nearest point to the margin as 1, the width of the margin is $\frac{2}{\|\vec{w}\|}$ (i.e. equal width on each side).

We thus want to maximize $\frac{2}{\|\vec{w}\|}$, finding the parameters $\vec{w}$ and b, subject to $y_i(\vec{w} \cdot \vec{x}_i + b) \ge 1$, for the 2-class classification function f(x) = y = {+1, −1}.

Slide 32

which is equivalent to minimizing:

$\frac{1}{2}\|\vec{w}\|^2$ subject to $y_i(\vec{w} \cdot \vec{x}_i + b) \ge 1$

Slide 33

…............. back to main slides

Slide 34

So …. find the “hyperplane” (i.e. boundary) with:

a) maximum margin
b) minimum number of (training) examples on the wrong side of the chosen boundary (i.e. minimal penalties due to C)

Solve via optimization (in polynomial time/complexity).
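Standard libraries package this optimization; a minimal sketch (not from the slides) using scikit-learn's SVC with a linear kernel, where C penalises training examples on the wrong side of the boundary:

```python
import numpy as np
from sklearn.svm import SVC

# Tiny 2-class, 2D toy set; labels y in {+1, -1}
X = np.array([[1.0, 1.0], [2.0, 1.5], [2.5, 2.0],
              [4.0, 4.0], [5.0, 4.5], [5.5, 5.0]])
y = np.array([-1, -1, -1, +1, +1, +1])

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

print(clf.coef_, clf.intercept_)               # the learned w and b
print(clf.predict([[1.5, 1.0], [5.0, 5.0]]))   # -> [-1, +1]
```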

Slide 35

Example: non-linear separation (red/blue data items on a 2D plane).

– Kernel projection to a higher dimensional space
– Find a hyperplane separator (a plane in 3D) via optimization
– The non-linear boundary in the original dimension (e.g. a circle in 2D) is defined by the planar boundary (cut) in 3D

[Figure: 2D data projected to 3D and separated by a plane]

Slide 36

Non Linear SVMs

Suppose we have instance space X = $R^n$ and need a non-linear separator.

– project X into some higher dimensional space X' = $R^m$ where the data will be linearly separable
– let $\Phi : X \rightarrow X'$ be this projection

Interestingly,

– Training depends only on dot products of the form $\Phi(x_i) \cdot \Phi(x_j)$
  • i.e. dot products, computable from the instances in $R^n$
– So we can train in $R^m$ with the same computational complexity as in $R^n$, provided we can find a kernel function K such that:
  • $K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$
(the kernel trick)

– Classifying a new instance x now requires calculating the sign of:

$f(x) = \mathrm{sign}\!\left(\sum_i \alpha_i y_i K(x_i, x) + b\right)$

(where the $\alpha_i$ are the learned coefficients, non-zero only for the support vectors)
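A quick numeric check of the kernel trick (illustrative, not from the slides): for the quadratic kernel K(a, b) = (a · b)², the explicit projection Φ(a) = (a₁², √2·a₁a₂, a₂²) into $R^3$ gives exactly the same dot product while only ever computing in $R^2$:

```python
import math
import numpy as np

def phi(a):
    """Explicit projection R^2 -> R^3 for the quadratic kernel."""
    return np.array([a[0] ** 2, math.sqrt(2) * a[0] * a[1], a[1] ** 2])

def K(a, b):
    """Quadratic kernel (a . b)^2, computed entirely in R^2."""
    return float(a @ b) ** 2

a = np.array([1.0, 2.0])
b = np.array([3.0, 0.5])

print(K(a, b))                 # 16.0
print(float(phi(a) @ phi(b)))  # 16.0 - same value, without forming R^3 explicitly
```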

Slide 37

.... but it is all about the data!

Slide 38

Desirable Data Properties

Machine learning is a data-driven approach. The data is important! Ideally, the training/testing data used for learning must be:

– Unbiased

  • towards any given subset of the space of examples ...

– Representative

  • of the “real-world” data to be encountered in use/deployment

– Accurate

  • inaccuracies in training/testing data produce inaccurate results

– Available

  • the more training/testing data available the better the results
  • greater confidence in the results can be achieved
Slide 39

Data Training Methodologies

Simple approach: Data Splits

– split the overall data set into separate training and test sets
  • No established rule, but 80%:20%, 70%:30% or ⅔:⅓ training-to-testing splits are common
– Train on one, test on the other
– Test error = error on the test set
– Training error = error on the training set
– Weakness: susceptible to bias in the data sets or “over-fitting”
  • Also, less data is available for training
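A minimal sketch of such a split (illustrative scikit-learn code, not from the slides):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# 70%:30% training-to-testing split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = DecisionTreeClassifier().fit(X_train, y_train)
print("training error:", 1 - clf.score(X_train, y_train))
print("test error:    ", 1 - clf.score(X_test, y_test))
```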
Slide 40

Data Training Methodologies

More advanced (and robust): K-fold Cross Validation

– Randomly split (all) the data into k subsets
– For 1 to k:
  • train using all the data not in the kth subset
  • test the resulting learned [classifier|function …] using the kth subset
– Report the mean error over all k tests
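A minimal sketch (illustrative scikit-learn code, not from the slides):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# k = 10 cross-validation: 10 train/test rounds, one per held-out subset
scores = cross_val_score(DecisionTreeClassifier(), X, y, cv=10)
print("mean error:", 1 - scores.mean())
```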

Slide 41

Key Summary Statistics #1

tp = true positive / tn = true negative
fp = false positive / fn = false negative

Often quoted or plotted when comparing ML techniques
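The formulas themselves did not survive extraction; assuming the slide listed the usual measures, the standard definitions in terms of tp, tn, fp and fn are:

$\text{accuracy} = \frac{tp + tn}{tp + tn + fp + fn}$

$\text{precision} = \frac{tp}{tp + fp}$

$\text{recall (true positive rate)} = \frac{tp}{tp + fn}$

$\text{false positive rate} = \frac{fp}{fp + tn}$

$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$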

Slide 42

Kappa Statistic

A measure of the classification of “N items into C mutually exclusive categories”

– Pr(a) = probability of success of classification (= accuracy)
– Pr(e) = probability of success due to chance
  • e.g. 2 categories = 50% (0.5), 3 categories = 33% (0.33) ….. etc.
– Pr(e) can be replaced with Pr(b) to measure agreement between classifiers/techniques a and b

[Cohen, 1960]
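The formula itself did not survive extraction; for reference, Cohen's kappa combines these terms as:

$\kappa = \frac{\Pr(a) - \Pr(e)}{1 - \Pr(e)}$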