Reinforcement Learning 12 March 2007 Lecture 19 SOFMs

SELF-ORGANISING FEATURE MAPS (KOHONEN)
• Topographic maps
• Network architecture
• The SOFM algorithm (on-line)
• The SOFM algorithm (batch)
• Applications
• Properties of the SOFM
• Use in reinforcement learning

TOPOGRAPHIC MAPS
• Topographic = topology preserving
  – neighbouring places in the world are found in neighbouring places in the map
• Topographic maps are found in biological systems, e.g. the retinotopic map from the retina to the visual cortex, the somatosensory map and the tonotopic map
• In the visual cortex, adjacent neurons have adjacent visual receptive fields, and collectively they constitute a map of the retina
• SOFM (or SOM) developed by Kohonen since 1982
• Builds on ideas of Willshaw and von der Malsburg about how retinotopic maps are wired up

CLUSTERING
• A self-organising map is similar to a clustering algorithm, except that there is an additional constraint
• Cluster centres are embedded in another space (the output space), and points that are nearby in the input space must map to points that are nearby in the output space
• So when we update a cluster centre, we also update its neighbours
• This has the effect of keeping close-by units in the output space mapping to close-by regions of input space

KOHONEN ARCHITECTURE
[Figure: input units connected to a 2-d array of output (map) units; only a few interlayer connections shown]
• Input has dimension $d$, i.e. $d$ units
• Array (usually 1D or 2D) of grid/map/output units on a rectangular or hexagonal grid
• Each input unit is connected to each grid unit
• Neighbourhood relations are calculated on this grid
• An example input could be $(x, y, \dot{x}, \dot{y}, \theta)$ for position, velocity and orientation of a robot
• Each grid unit $j$ has a vector $w_j$ associated with it, of the same dimension as the input
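As a rough illustration of the architecture described above, the map can be held as a table from grid coordinates to weight vectors. This is only a sketch: the names `make_grid` and `grid_distance`, the rectangular 4 x 5 layout, and the uniform random initialisation are assumptions made for the example, not details from the lecture.

```python
import random

def make_grid(rows, cols, d, seed=0):
    """Map each grid coordinate (r, c) to a random d-dimensional weight vector."""
    rng = random.Random(seed)
    return {(r, c): [rng.random() for _ in range(d)]
            for r in range(rows) for c in range(cols)}

def grid_distance(a, b):
    """Distance between two units measured on the grid, not in input space."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

# Hypothetical sizes: a 4 x 5 rectangular map, d = 5 for (x, y, x_dot, y_dot, theta).
weights = make_grid(rows=4, cols=5, d=5)
```

Keeping the grid coordinates separate from the weight vectors makes the two spaces explicit: neighbourhoods are computed from the coordinates, matching is computed from the weights.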

THE SOM ALGORITHM
Initialise a grid of units to have weight vectors $w_j$ set to random values
Loop until weights change by only tiny amounts
    Take a sample input $x$
    Find the winning map node $i^*$ that best matches the input:
        $i^* = \arg\min_j \| x - w_j \|$
    Update the winning weight vector and the weights of those nodes in its neighbourhood:
        $w_j(t+1) = w_j(t) + \eta(t) \, N_t(j, i^*) \, (x - w_j(t))$
The learning rate $\eta(t)$ needs to decrease during the learning, as does the width of the neighbourhood function $N_t(j, i^*)$. We start with $N$ having a wide range and narrow it down, and $\eta$ starts large and is successively reduced to zero.
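The on-line loop above might be sketched as follows. Several choices here are assumptions for illustration only: a 1D map, a Gaussian neighbourhood, linear decay of both the learning rate and the neighbourhood width, and a fixed number of steps in place of the "weights change by only tiny amounts" test. The names `winner`, `train_online`, `eta0` and `sigma0` are hypothetical.

```python
import math
import random

def winner(weights, x):
    """Index i* of the unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: math.dist(x, weights[j]))

def train_online(data, n_units, steps=2000, eta0=0.5, sigma0=None, seed=0):
    """On-line SOM on a 1D map (assumed schedules: linear decay)."""
    rng = random.Random(seed)
    d = len(data[0])
    weights = [[rng.random() for _ in range(d)] for _ in range(n_units)]
    sigma0 = sigma0 if sigma0 is not None else n_units / 2
    for t in range(steps):
        frac = 1.0 - t / steps            # goes from 1 towards 0
        eta = eta0 * frac                 # learning rate eta(t)
        sigma = max(sigma0 * frac, 0.5)   # neighbourhood width, floored at 0.5
        x = rng.choice(data)              # take a sample input
        i_star = winner(weights, x)
        for j in range(n_units):
            # Gaussian neighbourhood N_t(j, i*) on the 1D grid
            n = math.exp(-((j - i_star) ** 2) / (2 * sigma ** 2))
            for k in range(d):
                # w_j(t+1) = w_j(t) + eta(t) N_t(j, i*) (x - w_j(t))
                weights[j][k] += eta * n * (x[k] - weights[j][k])
    return weights
```

For example, `train_online([[i / 9] for i in range(10)], n_units=5)` returns five 1-dimensional weight vectors. Because each update moves a weight a fraction of the way towards a data point, the weights stay within the range spanned by the data and the initial values.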

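NEIGHBOURHOOD FUNCTIONS
[Figure: two neighbourhood functions plotted against distance from the winning node in map grid units. Top-hat: vertical axis 0 to 1, horizontal axis -k to k. Gaussian: vertical axis 0 to 1, horizontal axis -2 to 2.]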
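The two neighbourhood shapes named above could be written as below. This is a sketch under assumptions: the top-hat is taken to include the boundary at distance `k`, and the Gaussian is parameterised by a width `sigma` that the slide does not name.

```python
import math

def top_hat(dist, k):
    """1 within k grid units of the winner, 0 outside (boundary included by assumption)."""
    return 1.0 if abs(dist) <= k else 0.0

def gaussian(dist, sigma):
    """Equals 1 at the winner and falls off smoothly with grid distance."""
    return math.exp(-dist ** 2 / (2 * sigma ** 2))
```

Shrinking `k` or `sigma` over time narrows the neighbourhood, so that late updates affect little more than the winning unit.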

SOM BATCH ALGORITHM
Luttrell (1990), Kohonen (1993)
Initialise the grid of units to have weight vectors $w_j$ set to random values
Loop until terminated
• for $k = 1$ to $K$ (number of data vectors)
      find the best matching (winning) unit $i^*(k) = \arg\min_j \| x_k - w_j \|$
  end for
• Update the weight vectors using
      $w_j = \frac{\sum_{k=1}^{K} N(i^*(k), j) \, x_k}{\sum_{k=1}^{K} N(i^*(k), j)}$
End loop
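A minimal sketch of the batch update above, assuming a 1D map, a Gaussian neighbourhood of fixed width, and a fixed number of passes as the termination rule. The function name `batch_som` and the parameters `epochs` and `sigma` are hypothetical.

```python
import math
import random

def batch_som(data, n_units, epochs=20, sigma=1.0, seed=0):
    """Batch SOM on a 1D map (fixed-width Gaussian neighbourhood is an assumption)."""
    rng = random.Random(seed)
    d = len(data[0])
    weights = [[rng.random() for _ in range(d)] for _ in range(n_units)]
    for _ in range(epochs):
        # Step 1: find the winning unit i*(k) for every data vector x_k,
        # using the weights as they stood at the start of this pass.
        winners = [min(range(n_units), key=lambda j: math.dist(x, weights[j]))
                   for x in data]
        # Step 2: w_j = sum_k N(i*(k), j) x_k / sum_k N(i*(k), j)
        for j in range(n_units):
            coeffs = [math.exp(-((i - j) ** 2) / (2 * sigma ** 2)) for i in winners]
            total = sum(coeffs)  # positive here, since the Gaussian never reaches 0
            weights[j] = [sum(c * x[k] for c, x in zip(coeffs, data)) / total
                          for k in range(d)]
    return weights
```

Each new weight vector is a neighbourhood-weighted mean of the data, so no learning rate is needed. With a top-hat neighbourhood the denominator can be zero for units with no nearby winners, and such units would need separate handling.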

PRACTICAL ISSUES
• Grid:
  – dimension?
  – size?
  – topology?
• Typical training regimes:
  – Sort out gross structure in early iterations
  – Fine structure later
• Preprocessing of input signals + scaling

PROPERTIES OF THE SOM
After convergence, the map will have the properties:
• Topological Ordering (as far as possible given the topology of the output space)
• Density Matching: there will be more units in high-density regions of the input space
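On the scaling point listed under practical issues: the winner is chosen by Euclidean distance, so an input dimension with a much larger numeric range will dominate the match. One common remedy, shown here only as an assumed example (the lecture does not say which scaling to use), is to standardise each dimension. The name `standardise` is hypothetical.

```python
import statistics

def standardise(data):
    """Rescale each input dimension to zero mean and unit standard deviation."""
    columns = list(zip(*data))
    means = [statistics.fmean(col) for col in columns]
    # Guard against constant columns, whose standard deviation is 0.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in data]

scaled = standardise([[0.0, 100.0], [1.0, 300.0]])
```

After this step both dimensions contribute on the same scale to the distance used for finding the winning unit.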

APPLICATIONS
• Many! Thousands of SOM papers
• Phonetic typewriter (early application by Kohonen)
  – Convert short (about 10 ms) slices of sound to 15 frequency bands + volume
  – Train network on 16D vectors
  – Label network with phoneme names
  – Rule-based post-processing improves recognition accuracy
• Robot map-making
