CS344M Autonomous Multiagent Systems
Patrick MacAlpine
Department of Computer Science
The University of Texas at Austin

Good Afternoon, Colleagues
Are there any questions?
• How is SMDP different from MDP?
• Advantages of tile coding vs other approaches?
• What about SPAR (Strategic Position by Attraction and Repulsion)?

Logistics
• Progress reports due at beginning of class today
• Progress report peer review due next Thursday
  – reports to review will be sent out shortly
• Prize for winning class tournament
• 10+ students went to Undergraduate Writing Center :)

Reinforcement Learning
[Image from Wikipedia]
Markov Decision Process (MDP)
Important questions:
• What is your state space?
• What is your action space?
• What is your reward function?

SARSA $(s_t, a_t, r_{t+1}, s_{t+1}, a_{t+1})$
[Image from Wikipedia]
Learn Q table (value function) for state-action pairs:
$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \right]$$
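The SARSA update above can be sketched in a few lines of Python. This is a minimal tabular illustration, not the course's implementation: the corridor task, the epsilon-greedy action selection, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions made for the example.

```python
import random
from collections import defaultdict


def epsilon_greedy(Q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one (ties broken at random)."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    best = max(Q[(state, a)] for a in actions)
    return rng.choice([a for a in actions if Q[(state, a)] == best])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    """Apply Q(s,a) <- Q(s,a) + alpha * [r + gamma * Q(s',a') - Q(s,a)] in place; return the TD error."""
    td_error = r + gamma * Q[(s_next, a_next)] - Q[(s, a)]
    Q[(s, a)] += alpha * td_error
    return td_error


def run_episode(Q, rng, alpha=0.5, gamma=0.9, epsilon=0.1, n_states=5, max_steps=1000):
    """Hypothetical corridor task: states 0..n_states-1, reward 1 on reaching the right end."""
    actions = [-1, +1]  # step left or step right
    s = 0
    a = epsilon_greedy(Q, s, actions, epsilon, rng)
    steps = 0
    while s != n_states - 1 and steps < max_steps:
        s_next = min(max(s + a, 0), n_states - 1)  # walls at both ends
        r = 1.0 if s_next == n_states - 1 else 0.0
        # On-policy: the next action is chosen before the update and then actually taken.
        a_next = epsilon_greedy(Q, s_next, actions, epsilon, rng)
        sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma)
        s, a = s_next, a_next
        steps += 1
    return steps


if __name__ == "__main__":
    rng = random.Random(0)
    Q = defaultdict(float)  # unseen state-action pairs start at 0
    for _ in range(200):
        run_episode(Q, rng)
    print({k: round(v, 3) for k, v in sorted(Q.items())})
```

The update uses the action the agent actually takes next, which is what makes SARSA on-policy; swapping `Q[(s_next, a_next)]` for a maximum over next actions would give Q-learning instead.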

Keepaway
• Keepaway videos
• Slides

Keepaway Discussion
• Could you use learned policies for full soccer game?
• Could we apply competitive co-evolution?
• Other sub-tasks in soccer that might be learnable?

Half Field Offense
<Slides>

Policy Search vs Value Function Based RL

              Policy Search                       Value Function Based
Learn         Policy parameters                   Value function
Good For      Tuning parameter values             Learning discrete actions
Evaluation    Fitness function                    Reward function
Algorithms    CMA-ES, genetic algorithms, etc.    SARSA, Q-learning, etc.
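To make the policy search column concrete, here is a minimal sketch of tuning parameter values against a fitness function. It uses plain stochastic hill climbing as a stand-in, which is much simpler than CMA-ES or a genetic algorithm; the `fitness` function, its optimum, and the step size `sigma` are hypothetical choices for illustration only.

```python
import random


def fitness(params):
    """Hypothetical fitness: higher is better, with its peak at (1.0, -2.0)."""
    x, y = params
    return -((x - 1.0) ** 2 + (y + 2.0) ** 2)


def hill_climb(fitness_fn, start, sigma=0.3, iterations=2000, seed=0):
    """Perturb the current best parameters with Gaussian noise; keep a candidate only if it scores higher."""
    rng = random.Random(seed)
    best = list(start)
    best_fit = fitness_fn(best)
    for _ in range(iterations):
        candidate = [p + rng.gauss(0.0, sigma) for p in best]
        candidate_fit = fitness_fn(candidate)
        if candidate_fit > best_fit:
            best, best_fit = candidate, candidate_fit
    return best, best_fit


if __name__ == "__main__":
    params, score = hill_climb(fitness, [0.0, 0.0])
    print(params, score)
```

Note that the search only ever sees one fitness score per parameter vector, with no per-step reward signal. That is the contrast with value function based methods such as SARSA, which learn from the reward at every step.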
