Monte Carlo Tree Search for Algorithm Configuration: MOSAIC



  1. Monte Carlo Tree Search for Algorithm Configuration: MOSAIC. Herilalaina Rakotoarison and Michèle Sebag, TAU, CNRS − INRIA − LRI − Université Paris-Sud. NeurIPS Meta-Learning Workshop, Dec. 8, 2018.

  2. Monte Carlo Tree Search for Algorithm Configuration: MOSAIC — Tackling the Underspecified. Herilalaina Rakotoarison and Michèle Sebag, TAU, CNRS − INRIA − LRI − Université Paris-Sud. NeurIPS Meta-Learning Workshop, Dec. 8, 2018.

  3. AutoML: Algorithm Selection and Configuration
     A mixed optimization problem: find λ* ∈ arg min_{λ ∈ Λ} L(λ, P), with λ a pipeline and L the predictive loss on dataset P.
     Modes
     ◮ offline hyper-parameter setting
     ◮ online hyper-parameter setting
     Approaches
     ◮ Bayesian optimization: SMAC, Auto-SkLearn, AutoWeka, BOHB (Hutter et al. 11; Feurer et al. 15; Kotthoff et al. 17; Falkner et al. 18)
     ◮ Evolutionary computation (Olson et al. 16; Choromanski et al. 18)
     ◮ Bilevel optimization (Franceschi et al. 17, 18)
     ◮ Reinforcement learning (Andrychowicz et al. 16; Drori et al. 18)
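The mixed optimization problem above can be illustrated with a minimal sketch. The search space, algorithm names, and toy loss below are purely hypothetical placeholders for Λ and L(λ, P) — a real AutoML system would train each candidate pipeline on the dataset and return its validation error — and the solver is plain exhaustive search, not the Bayesian or bandit-based methods the slide lists.

```python
import math

# Hypothetical finite search space Λ: each pipeline λ is an
# (algorithm, hyper-parameter) pair mixing discrete and continuous choices.
SPACE = [("knn", k) for k in (1, 3, 5, 15)] + \
        [("svm", c) for c in (0.1, 1.0, 10.0)]

def toy_loss(pipeline):
    # Stand-in for the predictive loss L(λ, P); the constants are
    # invented so that ("svm", 1.0) is the optimum of this toy problem.
    algo, hp = pipeline
    base = {"knn": 0.30, "svm": 0.20}[algo]
    penalty = abs(math.log10(hp)) if algo == "svm" else abs(hp - 3)
    return base + 0.05 * penalty

# λ* ∈ arg min over the finite space, by brute force.
best = min(SPACE, key=toy_loss)
```

On this toy space the arg min is `("svm", 1.0)`; the point is only to make the λ / Λ / L(λ, P) notation concrete, since real configuration spaces are far too large to enumerate.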

  4. Monte Carlo Tree Search (Kocsis & Szepesvári 06; Gelly & Silver 07)
     For game playing when there is no good evaluation function and the search space is huge.
     ◮ Upper Confidence Tree (UCT): gradually grow the search tree
     ◮ Building blocks
       ◮ Select next action (bandit-based phase) (Auer et al. 02)
       ◮ Add a node (leaf of the search tree)
       ◮ Select remaining actions (random phase)
       ◮ Compute instant reward
       ◮ Update information in visited nodes
     ◮ Returned solution: the path visited most often
     Within learning: feature selection (Gaudel & Sebag 10); active learning (Rolet, Teytaud & Sebag 09)
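The building blocks above can be sketched as a minimal UCT loop. The toy problem (finding a 4-bit target string), the exploration constant, and the iteration budget are all illustrative assumptions — this is not MOSAIC's pipeline space — but each numbered step below matches a bullet on the slide.

```python
import math
import random

TARGET = (1, 0, 1, 1)   # hypothetical toy problem: recover this bit string
DEPTH = len(TARGET)

def reward(path):
    # Instant reward: fraction of positions matching the target.
    return sum(a == t for a, t in zip(path, TARGET)) / DEPTH

class Node:
    def __init__(self, path=()):
        self.path = path       # actions taken from the root
        self.children = {}     # action -> Node
        self.visits = 0
        self.value = 0.0       # sum of rewards backed up through this node

def ucb(parent, child, c=math.sqrt(2)):
    # Bandit-based selection rule (UCB1, Auer et al. 02).
    if child.visits == 0:
        return float("inf")
    return (child.value / child.visits
            + c * math.sqrt(math.log(parent.visits) / child.visits))

def uct_iteration(root):
    node, trail = root, [root]
    # 1. Bandit-based phase: descend fully-expanded nodes of the tree.
    while len(node.children) == 2 and len(node.path) < DEPTH:
        node = max(node.children.values(), key=lambda ch: ucb(node, ch))
        trail.append(node)
    # 2. Add a node (leaf of the search tree).
    if len(node.path) < DEPTH:
        action = random.choice([a for a in (0, 1) if a not in node.children])
        node.children[action] = Node(node.path + (action,))
        node = node.children[action]
        trail.append(node)
    # 3. Random phase: complete the path with random actions.
    path = list(node.path)
    while len(path) < DEPTH:
        path.append(random.choice((0, 1)))
    # 4. Compute instant reward; 5. update information in visited nodes.
    r = reward(tuple(path))
    for n in trail:
        n.visits += 1
        n.value += r

random.seed(0)
root = Node()
for _ in range(500):
    uct_iteration(root)

# Returned solution: follow the path visited most often.
node, best = root, []
while node.children:
    action, node = max(node.children.items(), key=lambda kv: kv[1].visits)
    best.append(action)
```

The final loop implements the "path visited most often" rule: at each level it commits to the child with the highest visit count rather than the highest mean reward, which is the standard robust choice for UCT.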

