

SLIDE 1

A quantile-based approach for hyperparameter transfer learning

David Salinas², Huibin Shen¹, Valerio Perrone¹

¹Amazon Research   ²NAVER LABS Europe, work done at Amazon

December 11, 2019

David Salinas, Huibin Shen, Valerio Perrone (Amazon Berlin) A quantile-based approach for hyperparameter transfer learning December 11, 2019 1 / 8

SLIDE 2

Transfer learning setting

Assume many HP evaluations {x_i^l, y_i^l}_{i=0}^{n_l} are available from previous datasets l:

- x_i^l ∈ R^d is a hyperparameter configuration
- y_i^l ∈ R is the objective to be minimized

Can we use them to speed up the tuning of a new dataset?


SLIDE 3

Transfer learning

Difficulties:

- Scales of the objectives y_i^l may vary significantly across tasks
- Noise may not be Gaussian
- Many observations: hard to apply (approximate) GPs

[Figure: objective value (log scale, spanning several orders of magnitude) vs. log number of gradient updates for the datasets electricity, exchange-rate, m4-Daily, m4-Hourly, m4-Monthly, m4-Quarterly, m4-Weekly, m4-Yearly, solar, traffic, and wiki-rolling.]


SLIDE 4

Gaussian Copula transform

If only every y^l were Gaussian... Apply the change of variable ψ = Φ⁻¹ ∘ F, where Φ is the Gaussian CDF and F is the marginal CDF of y^l (approximated with the empirical CDF). Setting z^l = ψ(y^l), every z^l becomes standard Gaussian: z^l ∼ N(0, 1).
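As a concrete illustration, the transform ψ = Φ⁻¹ ∘ F can be sketched in a few lines of NumPy/SciPy. The rank-based empirical-CDF estimator below, with the common n + 1 scaling that keeps Φ⁻¹ finite, is an assumption for illustration, not necessarily the exact estimator used in the paper:

```python
import numpy as np
from scipy.stats import norm

def copula_transform(y):
    """psi = Phi^{-1} o F: map objective values to standard-normal scores.

    F is approximated by the empirical CDF via ranks; dividing by
    (n + 1) keeps F strictly inside (0, 1), so Phi^{-1} stays finite.
    """
    y = np.asarray(y, dtype=float)
    ranks = np.argsort(np.argsort(y)) + 1    # ranks 1 .. n
    u = ranks / (len(y) + 1)                 # empirical CDF values in (0, 1)
    return norm.ppf(u)                       # inverse Gaussian CDF

# Objectives on wildly different scales land on the same N(0, 1) scale:
z = copula_transform([0.01, 0.5, 3.0, 120.0])
```

The transform is monotone, so it preserves the ranking of configurations while discarding the task-specific scale.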


SLIDE 5

Transfer learning

Parametric prior:

- Regress z(x) ≈ N(µθ(x), σθ(x))
- Parameters θ are learned with MLE on the evaluations
- Joint learning, as θ is tied across tasks (only possible because the z have comparable scales across tasks l)

Two HPO strategies:

- Thompson sampling with N(µθ(x), σθ(x))
- Gaussian Copula Process with the prior N(µθ(x), σθ(x))
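A minimal sketch of the Thompson-sampling strategy, with hypothetical `mu_theta`/`sigma_theta` standing in for the learned parametric prior (in practice these would be the MLE-fitted predictors tied across tasks, not the toy functions below):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the learned prior N(mu_theta(x), sigma_theta(x));
# the real mu_theta / sigma_theta would be fit by MLE on the transfer tasks.
def mu_theta(x):
    return (x - 0.3) ** 2            # toy predicted mean of z(x)

def sigma_theta(x):
    return 0.1 + 0.05 * np.abs(x)    # toy predictive std, strictly positive

def thompson_next(candidates):
    """Sample z ~ N(mu_theta(x), sigma_theta(x)) for each candidate x
    and return the candidate with the smallest sampled score."""
    samples = rng.normal(mu_theta(candidates), sigma_theta(candidates))
    return candidates[np.argmin(samples)]

x_next = thompson_next(np.linspace(0.0, 1.0, 100))
```

Sampling from the prior (rather than minimizing its mean) keeps exploration alive: candidates with high predictive uncertainty still get picked occasionally.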


SLIDE 6

Results

Evaluate on 3 blackboxes with precomputed evaluations (MLP [Klein 18], DeepAR [Salinas 17], XGBoost)

blackbox | # datasets | # hyperparameters | # evaluations | objectives
---------|------------|-------------------|---------------|--------------------
DeepAR   | 11         | 6                 | ∼ 220         | quantile loss, time
FCNET    | 4          | 9                 | 62208         | MSE, time
XGBoost  | 9          | 9                 | 5000          | 1-AUC

SLIDE 7

Results

[Figure: normalized distance to the minimum (log scale) vs. iteration (20–100) on fcnet, DeepAR, and xgboost, comparing RS, GP, ABLR, WS-best, auto-range-gp, CTS, and GCP.]


SLIDE 8

Results

Because all objectives are centered Gaussian, we can easily combine them! Multi-objective: optimize the accuracy/time trade-off with z_error(x) + z_runtime(x). More at our poster!
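Since every transformed objective lives on the same N(0, 1) scale, summing them is well defined. A small sketch, where the error/runtime values at four configurations are made up for illustration:

```python
import numpy as np
from scipy.stats import norm

def to_gaussian(y):
    # Empirical-CDF ranks followed by the inverse Gaussian CDF (psi = Phi^{-1} o F).
    ranks = np.argsort(np.argsort(y)) + 1
    return norm.ppf(ranks / (len(y) + 1))

# Hypothetical evaluations of both objectives at four configurations:
error   = np.array([0.12, 0.08, 0.30, 0.05])      # validation error
runtime = np.array([450.0, 900.0, 1600.0, 60.0])  # training time (s)

# Both scores are now comparable, so a plain sum encodes the trade-off:
score = to_gaussian(error) + to_gaussian(runtime)
best = int(np.argmin(score))  # configuration 3: lowest error AND fastest
```

Without the transform, summing a raw error (≈ 0.1) and a raw runtime (≈ hundreds of seconds) would let the runtime dominate entirely; the copula scores put both objectives on an equal footing.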
