AI Basics (Heechul Yun)



SLIDE 1

AI Basics

Heechul Yun

Acknowledgement: Many slides are adopted from Berkeley’s CS188 AI slide deck.

SLIDE 2

Markov Decision Process (MDP)

  • In an MDP, “Markov” means action outcomes depend only on the current state

Andrey Markov (1856-1922)

SLIDE 3

Markov Decision Process (MDP)

  • Markov decision process

– Set of states S
– Start state s0
– Set of actions A
– Transitions P(s’|s,a) (or T(s,a,s’))
– Rewards R(s,a,s’) (and discount $\gamma$)

  • Policy

– Choice of action for each state

  • Utility

– Sum of (discounted) rewards
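The MDP components listed above can be written down as plain data structures. The sketch below is illustrative only: the two-state example, its state and action names, the probabilities, and the reward values are all assumptions made up for this example, not taken from the slides.

```python
from typing import Dict, List, Tuple

State = str
Action = str

# Hypothetical two-state MDP (all names and numbers are illustrative).
states: List[State] = ["cool", "warm"]
start_state: State = "cool"
actions: List[Action] = ["slow", "fast"]

# Transitions P(s'|s,a), stored as {(s, a): {s': probability}}.
P: Dict[Tuple[State, Action], Dict[State, float]] = {
    ("cool", "slow"): {"cool": 1.0},
    ("cool", "fast"): {"cool": 0.5, "warm": 0.5},
    ("warm", "slow"): {"cool": 0.5, "warm": 0.5},
    ("warm", "fast"): {"warm": 1.0},
}


def reward(s: State, a: Action, s_next: State) -> float:
    """R(s,a,s'): in this toy example the reward depends only on the action."""
    return 2.0 if a == "fast" else 1.0


# A policy is a choice of action for each state.
policy: Dict[State, Action] = {"cool": "fast", "warm": "slow"}


def discounted_utility(rewards: List[float], gamma: float) -> float:
    """Utility of a reward sequence: sum over t of gamma**t * r_t."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))
```

For instance, `discounted_utility([1.0, 1.0, 1.0], 0.5)` adds 1 + 0.5 + 0.25: each later reward counts for less.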

SLIDE 4

Q-Learning

  • Q-Learning: sample-based Q-value iteration
  • Learn Q(s,a) values as you go

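– Receive a sample (s,a,s’,r)
– Consider your old estimate: $Q(s,a)$
– Consider your new sample estimate: $\text{sample} = r + \gamma \max_{a'} Q(s',a')$
– Incorporate the new estimate into a running average with learning rate $\alpha$: $Q(s,a) \leftarrow (1-\alpha)\,Q(s,a) + \alpha \cdot \text{sample}$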
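The sample-based update described above can be sketched as a small tabular routine. This is a minimal sketch, not a full agent: the function name `q_update`, the default values of `alpha` and `gamma`, and the state and action labels are assumptions chosen for illustration.

```python
from collections import defaultdict


def q_update(Q, s, a, s_next, r, actions, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update for the sample (s, a, s_next, r)."""
    old_estimate = Q[(s, a)]
    # New sample estimate: immediate reward plus discounted best next Q-value.
    sample = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
    # Running average of the old estimate and the new sample.
    Q[(s, a)] = (1 - alpha) * old_estimate + alpha * sample
    return Q[(s, a)]


# Q-values start at 0 for every (state, action) pair.
Q = defaultdict(float)
actions = ["left", "right"]

q_update(Q, "A", "right", "B", r=1.0, actions=actions)
q_update(Q, "B", "left", "A", r=0.0, actions=actions)
```

The second call shows how value propagates backwards: the update for state "B" uses the estimate just learned for state "A".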

SLIDE 5

Q-learning

  • Problems

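– The Q(s,a) table grows exponentially with the number of state variables
– Tabular Q-learning therefore cannot deal with complex problems (large or continuous state spaces)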
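A rough count shows why the table becomes impractical. The sketch below assumes a state described by a fixed number of discrete variables; the helper name `q_table_entries` and the example sizes are illustrative assumptions.

```python
def q_table_entries(values_per_variable: int, num_variables: int, num_actions: int) -> int:
    """Number of Q(s,a) entries when the state is a tuple of discrete variables."""
    num_states = values_per_variable ** num_variables
    return num_states * num_actions


# 10 binary state variables and 4 actions: still a small table.
small = q_table_entries(2, 10, 4)

# A hypothetical 84x84 black-and-white image as the state: each pixel is one
# binary variable, so the table has 2**(84*84) * 4 entries.
huge = q_table_entries(2, 84 * 84, 4)
digits_in_huge = len(str(huge))  # the count itself has thousands of digits
```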

SLIDE 6

A Neuron

Image credit: Lex Fridman, MIT
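As a rough companion to the figure, a single artificial neuron can be sketched as a weighted sum of inputs plus a bias, passed through an activation function. The sigmoid activation and the example numbers below are assumptions; the slide does not specify them.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, squashed by a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# The weighted sum here is 1.0*0.5 + 2.0*(-0.25) + 0.0 = 0, and sigmoid(0) = 0.5.
output = neuron([1.0, 2.0], [0.5, -0.25], bias=0.0)
```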

SLIDE 7

Neural Network

  • Regular neural network
  • Convolutional neural network
SLIDE 8

DQN

  • Q function is estimated with a neural network
  • Plus a few “tricks”

– Experience replay to improve robustness
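Experience replay can be sketched as a bounded buffer of past transitions from which training batches are drawn at random, so that consecutive and highly correlated samples are not learned from in order. The class below is a minimal sketch under that assumption; the name `ReplayBuffer` and its methods are hypothetical and not taken from any particular DQN implementation.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (s, a, r, s_next, done) transitions."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once it is full.
        self.buffer = deque(maxlen=capacity)

    def add(self, s, a, r, s_next, done):
        self.buffer.append((s, a, r, s_next, done))

    def sample(self, batch_size):
        # Uniform random sample without replacement.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.add(s=step, a=0, r=1.0, s_next=step + 1, done=False)

batch = buffer.sample(batch_size=2)
```

In a DQN training loop, each sampled batch would be used to compute Q-learning targets for a network update.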

SLIDE 9

DQN Implementations

  • Plenty on the web
SLIDE 10

OpenAI Gym

https://gym.openai.com/
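Gym environments are typically driven through a reset-and-step loop. The sketch below does not use the Gym library: `ToyCorridorEnv` is a hypothetical stand-in that only mimics that interaction pattern, and its four-value `step()` return is an assumption modeled on the classic interface (the real library's signatures differ between versions).

```python
class ToyCorridorEnv:
    """Hypothetical environment, NOT part of Gym: walk right to reach the goal."""

    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.pos = 0
        return self.pos

    def step(self, action):
        """Action 1 moves right, any other action moves left (not below 0)."""
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        done = self.pos == self.length - 1
        reward = 1.0 if done else 0.0
        return self.pos, reward, done, {}


def run_episode(env, policy, max_steps=100):
    """Run one episode and return the total (undiscounted) reward."""
    obs = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        obs, reward, done, info = env.step(policy(obs))
        total_reward += reward
        if done:
            break
    return total_reward


total = run_episode(ToyCorridorEnv(), policy=lambda obs: 1)
```

The agent code only touches `reset()` and `step()`, so the same loop can be reused across environments that expose that interface.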

SLIDE 11

OpenAI Gym

  • Easy record & replay

https://gym.openai.com/

SLIDE 12

OpenAI Universe

https://universe.openai.com/

SLIDE 13

Udacity Self-Driving Car Simulator


SLIDE 14

Resources

  • OpenAI

– https://gym.openai.com/
– https://universe.openai.com/

  • Udacity self-driving car simulator

– https://github.com/udacity/self-driving-car-sim

  • AI lectures for autonomous systems

– http://ai.berkeley.edu/
– http://rll.berkeley.edu/deeprlcourse/
– http://selfdrivingcars.mit.edu/
– David Silver, Tutorial: Deep Reinforcement Learning