Curiosity-Bottleneck: Exploration by Distilling Task-Specific Novelty
Youngjin Kim, Wontae Nam, Hyunwoo Kim, Jihoon Kim, and Gunhee Kim
Code available at: http://vision.snu.ac.kr/projects/cb

Motivation: Exploration under Distraction
[Figure: Navigating City. (a) Known Place; (b) Known Place and Strangers]
1. Distractive Environments are Widespread
- Real-world observations often contain novel but task-irrelevant information.

Motivation: Exploration under Distraction
[Figure: Navigating City. (a) Known Place, labeled Not Novel; (b) Known Place and Strangers, labeled Novel]
2. Degeneration of Prior Novelty-Based Exploration Strategies
- Prior strategies degenerate because their intrinsic reward is task-agnostic.
- Mechanisms are needed to prioritize task-related novelty.

Approach: Curiosity-Bottleneck
[Diagram components: Environment, Policy, Compressor, Value Predictor, Intrinsic Reward, External Reward]
Quantify the 'Degree of Compression' using a compressive value network.

Approach: Curiosity-Bottleneck
Compressor
- Encode a rare observation $x$ to a lengthy code and a common $x$ to a shorter code.
- Discard information about $x$ during compression.
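The link between rarity and code length can be illustrated with the ideal Shannon code length, $-\log_2 p(x)$. The sketch below is only an intuition aid under that assumption, not the authors' compressor; the function name `code_lengths` and the toy observations are hypothetical.

```python
import math
from collections import Counter


def code_lengths(observations):
    """Return the ideal Shannon code length -log2 p(x), in bits, per distinct observation.

    p(x) is estimated from empirical frequencies (an assumption made for this toy example).
    """
    counts = Counter(observations)
    total = sum(counts.values())
    return {x: -math.log2(c / total) for x, c in counts.items()}


# Toy data: a familiar street seen 7 times, a stranger seen once.
lengths = code_lengths(["street"] * 7 + ["stranger"])
for name, bits in lengths.items():
    print(f"{name}: {bits:.3f} bits")
```

The rare observation receives the longer code, which is the quantity a novelty-based intrinsic reward would respond to.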

Approach: Curiosity-Bottleneck
Value Predictor
- Prevent the Compressor from discarding task-related information.

Approach: Curiosity-Bottleneck
1. Objective Function
- Minimize the average code-length of the representation $Z$.
- Discard information about the observation $X$.
  $\min \; H(Z) - H(Z \mid X)$
- Preserve information related to the value estimate $Y$.
  $\max \; I(Z; Y)$
$\mathcal{L} = -I(Z; Y) + \beta \, I(X; Z)$
2. Intrinsic Reward: Per-instance Mutual Information
$r^i(x) = \int_z p(z \mid x) \log \frac{p(x, z)}{p(x)\, p(z)} \, dz$
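The step from the entropy difference to the per-instance reward can be spelled out as a short derivation. This is a standard identity written as a sketch, with $X$ denoting the observation and $Z$ its representation; it is not copied from the slides.

```latex
\begin{align*}
I(X; Z) &= H(Z) - H(Z \mid X) \\
        &= \iint p(x, z) \log \frac{p(x, z)}{p(x)\, p(z)} \, dz \, dx \\
        &= \mathbb{E}_{x \sim p(x)} \left[ \int p(z \mid x) \log \frac{p(z \mid x)}{p(z)} \, dz \right] \\
        &= \mathbb{E}_{x \sim p(x)} \left[ \mathrm{KL}\left[\, p(Z \mid x) \,\|\, p(Z) \,\right] \right]
\end{align*}
```

The third line uses $p(x, z) / (p(x)\, p(z)) = p(z \mid x) / p(z)$. The integrand inside the expectation is the per-instance quantity, so an observation whose representation departs strongly from the marginal receives a larger intrinsic reward.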

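Approach: Curiosity-Bottleneck
3. Approximation: Variational Information Bottleneck with Gaussian assumptions
$\mathcal{L}_{\theta,\phi} = \mathbb{E}_{x,y}\left[ -\log q_\phi(y \mid z) + \beta \, \mathrm{KL}[\, p_\theta(Z \mid x) \,\|\, q(Z) \,] \right]$
$r^i(x) = \mathrm{KL}[\, p_\theta(Z \mid x) \,\|\, q(Z) \,]$
Per-sample form: $\mathcal{L}^n_{\theta,\phi} = -\log q_\phi(y_n \mid z_n) + \beta \, \mathrm{KL}[\, p_\theta(Z \mid x_n) \,\|\, q(Z) \,]$, with $z_n \sim p_\theta(Z \mid x_n)$
[Diagram labels: Compressor with outputs $(\mu_\theta, \sigma_\theta)$; Value Predictor with outputs $(\mu_\phi, \sigma_\phi)$]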
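Under Gaussian assumptions the KL term has a closed form, which the following NumPy sketch illustrates. Several details are assumptions made here rather than taken from the slides: the prior $q(Z)$ is a standard normal, the compressor outputs a diagonal Gaussian with parameters `mu` and `sigma`, the value predictor outputs a scalar Gaussian, and all function names are hypothetical.

```python
import numpy as np


def gaussian_kl_to_standard_normal(mu, sigma):
    """KL[N(mu, diag(sigma^2)) || N(0, I)], summed over the last (latent) axis."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return 0.5 * np.sum(sigma**2 + mu**2 - 1.0 - 2.0 * np.log(sigma), axis=-1)


def gaussian_nll(y, mean, std):
    """Negative log-density -log N(y; mean, std^2) of a scalar value target."""
    y, mean, std = (np.asarray(a, dtype=float) for a in (y, mean, std))
    return 0.5 * np.log(2.0 * np.pi * std**2) + 0.5 * ((y - mean) / std) ** 2


def cb_loss(y, value_mean, value_std, mu, sigma, beta):
    """Batch average of the value-prediction term plus beta times the KL term."""
    kl = gaussian_kl_to_standard_normal(mu, sigma)
    return float(np.mean(gaussian_nll(y, value_mean, value_std) + beta * kl))


# Intrinsic reward is the KL term alone (assumed standard-normal prior).
common = gaussian_kl_to_standard_normal([0.0, 0.0], [1.0, 1.0])   # posterior equals prior
rare = gaussian_kl_to_standard_normal([2.0, -1.0], [0.5, 0.5])    # posterior far from prior
loss = cb_loss(y=[1.0], value_mean=[1.0], value_std=[1.0],
               mu=[[0.0, 0.0]], sigma=[[1.0, 1.0]], beta=0.1)
print(common, rare, loss)
```

An observation whose posterior matches the prior earns zero intrinsic reward, while one encoded far from the prior earns a positive reward; `beta` trades off compression against value prediction.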

Experiments: Static Environment
Detects novelty while being robust to distraction.
[Figure: results for three distraction types (Random Box, Object, Pixel Noise); both axes range from 0.1 to 0.9. Panels: (a) Input, (b) Ideal, (c) CB, (d) CB-noKL, (e) RND, (f) SimHash]

Experiments: Treasure-Hunt
Grad-CAM Visualization: the adaptive exploration strategy
[Figure panels: (a) Input, (b) CB-Early, (c) CB, (d) CB-noKL, (e) RND, (f) Dynamics, (g) SimHash]
The compression loss term $\mathrm{KL}[\, p_\theta(Z \mid x) \,\|\, q(Z) \,]$ induces task-agnostic exploration in early stages.

Experiments: Treasure-Hunt
Grad-CAM Visualization: the adaptive exploration strategy
The value prediction loss term $-\log q_\phi(y \mid z)$ induces task-specific exploration after external rewards have been collected.

Experiments: Treasure-Hunt
CB consistently outperforms the baselines across different distraction settings.
[Figure: mean episodic reward curves (x-axis scale 1e6) for SimHash, Dynamics, CB-noKL, RND, and CB. Panels: (a) Movement Condition, (b) Location Condition]

Experiments: Atari Hard-Exploration Games
[Figure: results for SimHash, Dynamics, CB-noKL, RND, and CB on Gravitar, Montezuma, and Solaris, with and without distraction]

Curiosity-Bottleneck: Exploration by Distilling Task-Specific Novelty
Thank You!
Poster @ Pacific Ballroom #48
Code available at: http://vision.snu.ac.kr/projects/cb
