Introduction to Reinforcement Learning
A. LAZARIC (SequeL Team @INRIA-Lille)
ENS Cachan - Master 2 MVA
SequeL – INRIA Lille
MVA-RL Course

A Bit of History: From Psychology to Machine Learning

A. LAZARIC – Introduction to Reinforcement Learning, Sept 29th, 2015

The law of effect [Thorndike, 1911]

“Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur; those which are accompanied or closely followed by discomfort to the animal will, other things being equal, have their connections with that situation weakened, so that, when it recurs, they will be less likely to occur. The greater the satisfaction or discomfort, the greater the strengthening or weakening of the bond.”

Experimental psychology

◮ Classical (human and) animal conditioning: “the magnitude and timing of the conditioned response changes as a result of the contingency between the conditioned stimulus and the unconditioned stimulus” [Pavlov, 1927].
◮ Operant conditioning (or instrumental conditioning): process by which humans and animals learn to behave in such a way as to obtain rewards and avoid punishments [Skinner, 1938].

Remark: reinforcement denotes any form of conditioning, either positive (rewards) or negative (punishments).

Computational neuroscience

◮ Hebbian learning: development of formal models of how the synaptic weights between neurons are reinforced by simultaneous activation. “Cells that fire together, wire together.” [Hebb, 1961].
◮ Emotions theory: model of how the emotional process can bias the decision process [Damasio, 1994].
◮ Dopamine and basal ganglia model: direct link with motor control and decision-making (e.g., [Doya, 1999]).

Remark: reinforcement denotes the effect of dopamine (and surprise).

Optimal control theory and dynamic programming

◮ Optimal control: formal framework to define optimization methods to derive control policies in continuous time control problems [Pontryagin and Neustadt, 1962].
◮ Dynamic programming: set of methods used to solve control problems by decomposing them into subproblems so that the optimal solution to the global problem is the conjunction of the solutions to the subproblems [Bellman, 2003].

Remark: reinforcement denotes an objective function to maximize (or minimize).

Reinforcement learning

Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal in an unknown uncertain environment. The learner is not told which actions to take, as in most forms of machine learning, but she must discover which actions yield the most reward by trying them (trial-and-error). In the most interesting and challenging cases, actions may affect not only the immediate reward but also the next situation and, through that, all subsequent rewards (delayed reward).

“Reinforcement Learning: An Introduction”, Sutton and Barto (1998).

A Multi-disciplinary Field

[Figure: Reinforcement Learning shown among related fields: A.I., Statistics, Clustering, Statistical Learning, Learning Theory, Cognitive Sciences, Neural Networks, Applied Math, Approximation Theory, Dynamic Programming, Neuroscience, Optimal Control, Automatic Control, Categorization, Active Learning, Psychology.]

A Machine Learning Paradigm

◮ Supervised learning: an expert (supervisor) provides examples of the right strategy (e.g., classification of clinical images). Supervision is expensive.
◮ Unsupervised learning: different objects are clustered together by similarity (e.g., clustering of images on the basis of their content). No actual performance is optimized.
