CS489/698 Lecture 22: March 27, 2017 - Bagging and Distributed Computing


  1. CS489/698 Lecture 22: March 27, 2017. Bagging and Distributed Computing. Readings: [RN] Sec. 18.10, [M] Sec. 16.2.5, [B] Chap. 14, [HTF] Chap. 15-16, [D] Chap. 11

  2. Boosting vs. Bagging • Review: boosting trains classifiers sequentially, reweighting the data to focus on past mistakes; bagging trains classifiers independently on random subsets and combines their outputs

  3. Independent classifiers/predictors • How can we obtain independent classifiers/predictors for bagging? • Bootstrap sampling – sample (with replacement) a subset of the data • Random projection – sample (without replacement) a subset of the features • Learn a different classifier/predictor from each data subset and feature subset

  4. Bagging • For k = 1 to K: sample a data subset D_k, sample a feature subset F_k, train classifier/predictor h_k based on D_k and F_k • Classification: y = majority vote of h_1(x), ..., h_K(x) • Regression: y = (1/K) Σ_k h_k(x) • Random forest: bag of decision trees
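
A minimal sketch of this loop, assuming scikit-learn's DecisionTreeClassifier as the base learner (the random-forest case named on the slide) and integer class labels; the names bagging_fit and bagging_predict are illustrative, not from the lecture:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def bagging_fit(X, y, K=25, feat_frac=0.6):
    """Train K trees: each sees a bootstrap sample of the rows
    and a random subset of the columns (features)."""
    n, d = X.shape
    m = max(1, int(feat_frac * d))
    models = []
    for _ in range(K):
        rows = rng.integers(0, n, size=n)            # bootstrap: WITH replacement
        cols = rng.choice(d, size=m, replace=False)  # features: WITHOUT replacement
        tree = DecisionTreeClassifier().fit(X[np.ix_(rows, cols)], y[rows])
        models.append((tree, cols))
    return models

def bagging_predict(models, X):
    """Classification: majority vote over the K trees."""
    votes = np.stack([t.predict(X[:, cols]) for t, cols in models]).astype(int)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```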

  5. Application: Xbox 360 Kinect • Microsoft Cambridge • Body part recognition: supervised learning

  6. Depth camera • [Figures: Kinect sensor, grayscale depth map, infrared image]

  7. Kinect Body Part Recognition • Problem: label each pixel with a body part

  8. Kinect Body Part Recognition • Features: depth differences between pairs of pixels • Classification: forest of decision trees
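
The depth-difference features here follow the form used in the Kinect body-part recognition work (Shotton et al., CVPR 2011): compare the depth at two offsets around a pixel, scaling the offsets by the pixel's own depth so the feature is roughly invariant to how far the person stands from the camera. A sketch, with illustrative names and assuming a positive depth map:

```python
import numpy as np

def depth_feature(depth, px, py, u, v):
    """Depth-difference feature for pixel (px, py).
    u and v are (dy, dx) offsets; dividing them by the pixel's own
    depth makes the probe pattern shrink for far-away bodies."""
    h, w = depth.shape
    d = depth[py, px]                       # assumed > 0 (valid depth reading)
    def probe(offset):
        oy, ox = int(py + offset[0] / d), int(px + offset[1] / d)
        if 0 <= oy < h and 0 <= ox < w:
            return depth[oy, ox]
        return 1e6                          # off-image probes get a large constant
    return probe(u) - probe(v)
```

Each internal node of each tree in the forest thresholds one such feature, with its own randomly chosen (u, v) offset pair.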

  9. Large Scale Machine Learning • Big data – large number of data instances – large number of features • Solution: distribute computation (parallel computation) – GPU (Graphics Processing Unit) – many cores

  10. GPU computation • Many machine learning algorithms consist of vector, matrix, and tensor operations – a tensor is a multidimensional array • GPUs (Graphics Processing Units) can perform arithmetic operations on all elements of a tensor in parallel • Packages that facilitate ML programming on GPUs: TensorFlow, Theano, Torch, Caffe, DL4J
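
As a concrete illustration, a minimal TensorFlow sketch (one of the packages listed above, using its current API): each line is a tensor operation that the framework dispatches across all elements in parallel when a GPU is available:

```python
import tensorflow as tf

# Two 1000x1000 random tensors (matrices are 2-D tensors)
a = tf.random.normal([1000, 1000])
b = tf.random.normal([1000, 1000])

c = a * b + tf.sin(a)    # elementwise multiply, sin, add: applied to every entry
d = tf.matmul(a, b)      # dense matrix product, also executed in parallel
print(c.shape, d.shape)  # (1000, 1000) (1000, 1000)
```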

  11. Multicore Computation • Idea: train a different classifier/predictor on each core, each with its own subset of the data • How can we combine the classifiers/predictors? • Should we take the average of their parameters? No, this can yield a worse classifier/predictor. It is especially problematic for models with hidden variables/units, such as neural networks and hidden Markov models

  12. Bad case of parameter averaging • Consider two threshold neural networks that each encode the exclusive-or (XOR) Boolean function • Averaging their weights yields a new network that does not encode XOR: hidden units can be permuted without changing the function, so two equivalent networks need not have nearby parameters (see the sketch below)
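
A concrete sketch of this failure with step-activation (threshold) units: networks A and B below both compute XOR and differ only by a permutation of their hidden units, yet their parameter average computes the constant-0 function.

```python
import numpy as np

def step(z):
    # Heaviside threshold: 1 if z > 0 else 0
    return (z > 0).astype(int)

def forward(x, W1, b1, w2, b2):
    h = step(x @ W1 + b1)      # hidden layer of threshold units
    return step(h @ w2 + b2)   # single threshold output unit

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Network A: h1 = OR(x1,x2), h2 = AND(x1,x2), y = h1 AND NOT h2
W1a = np.array([[1., 1.], [1., 1.]]); b1a = np.array([-0.5, -1.5])
w2a = np.array([1., -1.]);            b2a = -0.5

# Network B: same function, hidden units swapped (permutation symmetry)
W1b = np.array([[1., 1.], [1., 1.]]); b1b = np.array([-1.5, -0.5])
w2b = np.array([-1., 1.]);            b2b = -0.5

print(forward(X, W1a, b1a, w2a, b2a))  # [0 1 1 0] -> XOR
print(forward(X, W1b, b1b, w2b, b2b))  # [0 1 1 0] -> XOR

# Average the parameters of the two equivalent networks
W1m = (W1a + W1b) / 2; b1m = (b1a + b1b) / 2
w2m = (w2a + w2b) / 2; b2m = (b2a + b2b) / 2
print(forward(X, W1m, b1m, w2m, b2m))  # [0 0 0 0] -> constant 0, not XOR
```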

  13. Safely Combining Predictions • A safe approach to ensemble learning is to combine the predictions (not the parameters) • Classification: majority vote of the classes predicted by the classifiers • Regression: average of the predictions computed by the regressors
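
Continuing the XOR sketch above (reusing the variables defined there): combining the two networks' predictions behaves safely exactly where the parameter average failed:

```python
# Stack the two networks' predictions: shape (2, 4), one row per net
preds = np.stack([forward(X, W1a, b1a, w2a, b2a),
                  forward(X, W1b, b1b, w2b, b2b)])

mean_pred = preds.mean(axis=0)             # regression-style average: [0. 1. 1. 0.]
majority = (mean_pred > 0.5).astype(int)   # binary majority vote: [0 1 1 0] -> still XOR
```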
