SLIDE 1

CSEP 573: Artificial Intelligence

Conclusion

Luke Zettlemoyer – University of Washington

[Many of these slides were created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]

SLIDE 2

Course Topics

§ Search

§ Problem spaces
§ BFS, DFS, UCS, A* (tree and graph), local search
§ Completeness and Optimality
§ Heuristics: admissibility and consistency; pattern DBs

§ Games

§ Minimax, Alpha-beta pruning
§ Expectimax
§ Evaluation Functions

§ MDPs

§ Bellman equations
§ Value iteration, policy iteration

§ Reinforcement Learning

§ Exploration vs. Exploitation
§ Model-based vs. model-free
§ Q-learning
§ Linear value function approx.
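To make the Q-learning bullet concrete, here is a minimal tabular-update sketch; the dict-based Q-table and the state/action names are illustrative assumptions, not from the slides:

```python
def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: move Q(s,a) a fraction alpha toward
    the sample r + gamma * max_a' Q(s', a')."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    sample = r + gamma * best_next
    Q[(s, a)] = (1 - alpha) * Q.get((s, a), 0.0) + alpha * sample
    return Q[(s, a)]
```

Starting from an empty table and observing reward 1, the new estimate is a 50/50 blend of the old value (0) and the sample (1).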

§ Hidden Markov Models

§ Markov chains, DBNs
§ Forward algorithm
§ Particle Filters

§ Bayesian Networks

§ Basic definition, independence (d-sep)
§ Variable elimination
§ Sampling (rejection, importance)

§ Learning

§ Naive Bayes
§ Perceptron
§ Neural Networks (not on exam)
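A minimal sketch of the binary perceptron from the learning topics above; the sparse feature dicts and labels in {+1, −1} are assumptions of this illustration:

```python
def perceptron_train(data, epochs=10):
    """Binary perceptron: on each mistake, add y * f to the weights."""
    w = {}
    for _ in range(epochs):
        for feats, y in data:                       # y is +1 or -1
            score = sum(w.get(f, 0.0) * v for f, v in feats.items())
            pred = 1 if score >= 0 else -1
            if pred != y:                           # mistake: update weights
                for f, v in feats.items():
                    w[f] = w.get(f, 0.0) + y * v
    return w

def perceptron_classify(w, feats):
    return 1 if sum(w.get(f, 0.0) * v for f, v in feats.items()) >= 0 else -1
```

On linearly separable data the mistake-driven updates are guaranteed to converge to a separating weight vector.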

SLIDE 3

What is intelligence?

§ (bounded) Rationality

§ Agent has a performance measure to optimize
§ Given its state of knowledge
§ Choose optimal action
§ With limited computational resources

§ Human-like intelligence/behavior

SLIDE 4

Search in Discrete State Spaces

§ Every discrete problem can be cast as a search problem.

§ states, actions, transitions, cost, goal-test

§ Types

§ uninformed systematic: often slow

§ DFS, BFS, uniform-cost, itera,ve deepening

§ Heuristic-guided: better

§ Greedy best-first, A*
§ relaxation leads to heuristics

§ Local: fast, fewer guarantees; often only locally optimal

§ Hill climbing and variations
§ Simulated Annealing: globally optimal (in the limit)

§ (Local) Beam Search
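As a concrete companion to the search taxonomy above, a minimal A* graph-search sketch; the `neighbors`/`h` callback interface is an assumption of this illustration:

```python
import heapq

def astar(start, goal, neighbors, h):
    """A* graph search. neighbors(s) yields (successor, step_cost) pairs;
    h is the heuristic (admissibility gives optimality for tree search,
    consistency for graph search). Returns cheapest path cost, or None."""
    frontier = [(h(start), 0, start)]          # (f = g + h, g, state)
    best_g = {start: 0}
    while frontier:
        f, g, s = heapq.heappop(frontier)
        if s == goal:
            return g
        if g > best_g.get(s, float('inf')):
            continue                           # stale queue entry
        for s2, cost in neighbors(s):
            g2 = g + cost
            if g2 < best_g.get(s2, float('inf')):
                best_g[s2] = g2
                heapq.heappush(frontier, (g2 + h(s2), g2, s2))
    return None
```

On a grid with unit step costs, Manhattan distance is an admissible, consistent heuristic for this search.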

SLIDE 5

Which Algorithm?

§ A*, Manhattan Heuristic:

SLIDE 6

Constraint Satisfaction Problems

§ Standard search problems:

§ State is a “black box”: arbitrary data structure
§ Goal test can be any function over states
§ Successor function can also be anything

§ Constraint sa,sfac,on problems (CSPs):

§ A special subset of search problems
§ State is defined by variables Xi with values from a domain D (sometimes D depends on i)
§ Goal test is a set of constraints specifying allowable combinations of values for subsets of variables

§ Making use of the CSP formulation allows for optimized algorithms

§ Typical example of trading generality for utility (in this case, speed)

SLIDE 7

Example: Sudoku

§ Variables:
§ Each (open) square
§ Domains:
§ {1,2,…,9}
§ Constraints:

9-way alldiff for each row
9-way alldiff for each column
9-way alldiff for each region
(or can have a bunch of pairwise inequality constraints)
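The three alldiff families can be checked directly against a partial assignment; the dict representation mapping (row, col) to a digit is an assumption of this sketch:

```python
def sudoku_consistent(assignment):
    """True iff a partial assignment {(row, col): digit} violates no
    row, column, or 3x3-region alldiff constraint (indices 0-8)."""
    def alldiff(cells):
        vals = [assignment[c] for c in cells if c in assignment]
        return len(vals) == len(set(vals))
    for i in range(9):
        if not alldiff([(i, j) for j in range(9)]):      # row i
            return False
        if not alldiff([(j, i) for j in range(9)]):      # column i
            return False
    for br in range(0, 9, 3):                            # 3x3 regions
        for bc in range(0, 9, 3):
            if not alldiff([(br + di, bc + dj)
                            for di in range(3) for dj in range(3)]):
                return False
    return True
```

A backtracking CSP solver would call a check like this (or maintain it incrementally) after each variable assignment.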

SLIDE 8

Adversarial Search

SLIDE 9

Adversarial Search

§ AND/OR search space (max, min)
§ minimax objective function
§ minimax algorithm (~dfs)

§ alpha-beta pruning

§ Utility function for partial search

§ Learning utility functions by playing against itself

§ Openings/Endgame databases
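A depth-limited minimax with alpha-beta pruning, as covered above; the `successors`/`evaluate` callbacks are assumptions of this sketch:

```python
def alphabeta(state, depth, alpha, beta, max_turn, successors, evaluate):
    """Minimax value of state with alpha-beta pruning, cutting off at
    depth 0 (or at a leaf) with the evaluation function."""
    kids = successors(state)
    if depth == 0 or not kids:
        return evaluate(state)
    if max_turn:
        v = float('-inf')
        for s2 in kids:
            v = max(v, alphabeta(s2, depth - 1, alpha, beta, False,
                                 successors, evaluate))
            alpha = max(alpha, v)
            if alpha >= beta:
                break                      # prune remaining children
        return v
    v = float('inf')
    for s2 in kids:
        v = min(v, alphabeta(s2, depth - 1, alpha, beta, True,
                             successors, evaluate))
        beta = min(beta, v)
        if alpha >= beta:
            break
    return v
```

With good child ordering, pruning lets the search reach roughly twice the depth of plain minimax in the same time.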

SLIDE 10

Big News Today!

SLIDE 11

Markov Decision Processes

§ An MDP is defined by:

§ A set of states s ∈ S
§ A set of actions a ∈ A
§ A transition function T(s, a, s’)

§ Probability that a from s leads to s’, i.e., P(s’ | s, a)
§ Also called the model or the dynamics

§ A reward function R(s, a, s’)

§ Sometimes just R(s) or R(s’)

§ A start state
§ Maybe a terminal state

§ MDPs are non-deterministic search problems

§ One way to solve them is with expectimax search
§ We’ll have new tools soon

[Demo – gridworld manual intro (L8D1)]
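One of those tools is value iteration, which repeatedly applies the Bellman backup; a minimal sketch (the dict-of-dicts transition encoding and the tiny two-state example are illustrative assumptions):

```python
def value_iteration(states, actions, T, R, gamma=0.9, iters=100):
    """Value iteration: repeatedly apply the Bellman backup
    V(s) <- max_a sum_s' T(s,a,s') [R(s,a,s') + gamma V(s')].
    T[s][a] is a list of (s_next, prob); terminal states have no actions."""
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        new_V = {}
        for s in states:
            acts = actions(s)
            if not acts:
                new_V[s] = 0.0                 # terminal state
            else:
                new_V[s] = max(sum(p * (R(s, a, s2) + gamma * V[s2])
                                   for s2, p in T[s][a])
                               for a in acts)
        V = new_V
    return V
```

For gamma < 1 the backup is a contraction, so the values converge to V* regardless of initialization.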

SLIDE 12

The Bellman Equations

§ Definition of “optimal utility” via expectimax recurrence gives a simple one-step lookahead relationship amongst optimal utility values
§ These are the Bellman equations, and they characterize optimal values in a way we’ll use over and over

[Figure: one-step lookahead tree over s, a, and s’; photo of Richard Bellman (1920–1984)]
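The equations themselves were lost in extraction; in the standard form used throughout the course, with V*(s) the optimal value and Q*(s, a) the optimal Q-value, they read:

```latex
V^*(s) = \max_a Q^*(s, a)
\qquad
Q^*(s, a) = \sum_{s'} T(s, a, s')\,\bigl[R(s, a, s') + \gamma\, V^*(s')\bigr]
```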

SLIDE 13

Partially Observable Markov Decision Processes

§ A POMDP is defined by:

§ A set of states s ∈ S
§ A set of actions a ∈ A
§ A set of observations o ∈ O
§ A transition function T(s, a, s’)

§ Probability that a from s leads to s’, i.e., P(s’ | s, a)
§ Also called the dynamics

§ An observation function O(s, a, o)

§ Probability of observing o, i.e., P(o | s, a)
§ T and O together are often called the model

§ A reward function R(s, a, s’)

§ Sometimes just R(s) or R(s’)

§ A start state
§ Maybe a terminal state
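Acting in a POMDP means tracking a belief over states; a minimal Bayes-filter update sketch using T and O as defined above (the dict belief representation and the two-state test model are illustrative assumptions):

```python
def belief_update(b, a, o, states, T, O):
    """After taking action a and observing o, return the new belief
    b'(s') proportional to O(s', a, o) * sum_s T(s, a, s') b(s).
    Assumes o has nonzero probability under the predicted belief."""
    new_b = {s2: O(s2, a, o) * sum(T(s, a, s2) * b[s] for s in states)
             for s2 in states}
    z = sum(new_b.values())                # P(o | b, a), the normalizer
    return {s2: p / z for s2, p in new_b.items()}
```

This is the same elapse-time/observe pattern as HMM filtering, with the action choosing which dynamics apply.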

SLIDE 14

Pac-Man Beyond the Game!

SLIDE 15

Pacman: Beyond Simulation?

Students at Colorado University: http://pacman.elstonj.com

SLIDE 16

Pacman: Beyond Simulation!

[VIDEO: Roomba Pacman.mp4]

SLIDE 17

KR&R: Probability

§ Representation: Bayesian Networks

§ encode probability distributions compactly

§ by exploiting conditional independences

§ Reasoning

§ Exact inference: var elimination
§ Approx inference: sampling-based methods

§ rejection sampling, likelihood weighting, MCMC/Gibbs

[Figure: alarm Bayes net — Earthquake, Burglary → Alarm → JohnCalls, MaryCalls]
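Of the sampling methods listed, rejection sampling is the simplest; a sketch (the `sample_joint` callback and the tiny rain/wet model in the usage below are illustrative assumptions, not the alarm network itself):

```python
import random

def rejection_sample(query_var, evidence, sample_joint, n=10000):
    """Estimate P(query_var | evidence) by drawing n prior samples from
    the joint and keeping only those consistent with the evidence."""
    counts, kept = {}, 0
    for _ in range(n):
        s = sample_joint()                      # dict: var -> value
        if all(s[v] == val for v, val in evidence.items()):
            kept += 1
            counts[s[query_var]] = counts.get(s[query_var], 0) + 1
    return {v: c / kept for v, c in counts.items()} if kept else {}
```

Rejection sampling is wasteful when the evidence is unlikely, which is exactly why likelihood weighting keeps every sample and weights it by the probability of the evidence instead.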

SLIDE 18

KR&R: Hidden Markov Models

§ Representa,on

§ Special form of BN
§ Sequence model
§ One hidden state, one observation per time step

§ Reasoning/Search

§ most likely state sequence: Viterbi algorithm
§ marginal prob of one state: forward-backward
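The forward algorithm alternates a time-elapse step with an observation-weighting step; a sketch using the classic umbrella-world numbers (the dict encodings are assumptions of this illustration):

```python
def forward(prior, trans, emit, observations):
    """HMM filtering: P(X_t | e_1..e_t). trans[s][s2] = P(s2 | s),
    emit[s][o] = P(o | s), prior[s] = P(X_1 = s)."""
    def normalize(d):
        z = sum(d.values())
        return {s: p / z for s, p in d.items()}
    # fold the first observation into the prior
    b = normalize({s: prior[s] * emit[s][observations[0]] for s in prior})
    for o in observations[1:]:
        pred = {s2: sum(trans[s][s2] * b[s] for s in b) for s2 in b}  # elapse time
        b = normalize({s2: pred[s2] * emit[s2][o] for s2 in pred})    # observe
    return b
```

Two umbrella observations in a row push the rain belief up to roughly 0.883, the standard textbook result for these parameters.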

SLIDE 19

Learning Bayes Networks

§ We focused on Naïve Bayes and Perceptron, but you could also:
§ Learn Structure of Bayesian Networks

§ Search through the space of BN structures

§ Learn Parameters for a Bayesian Network

§ Fully observable variables

§ Maximum Likelihood (ML), MAP & Bayesian estimation
§ Example: Naïve Bayes for text classification

§ Hidden variables

§ Expectation Maximization (EM)

SLIDE 20

Bayesian Learning

Use Bayes rule:

P(Y | X) = P(X | Y) P(Y) / P(X)

(posterior = data likelihood × prior / normalization)

Or equivalently: P(Y | X) ∝ P(X | Y) P(Y)
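A direct numeric use of the rule; the disease-test numbers below are illustrative assumptions:

```python
def posterior(prior, likelihood, x):
    """Bayes rule: P(y | x) = P(x | y) P(y) / P(x), where the
    normalizer P(x) = sum_y P(x | y) P(y)."""
    unnorm = {y: prior[y] * likelihood[y][x] for y in prior}
    z = sum(unnorm.values())
    return {y: p / z for y, p in unnorm.items()}
```

With a 1% prior and a 90%-sensitive, 90%-specific test, a positive result still leaves the posterior on "sick" under 10%: the prior dominates a single noisy observation.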

SLIDE 21

Personal Robotics

SLIDE 22

PR2 (autonomous)

[VIDEO: 5pile_200x.mp4] [Maitin-Shepard, Cusumano-Towner, Lei, Abbeel, 2010]

SLIDE 23

Autonomous tying of a knot for previously unseen situations

[VIDEO: knots_apprentice.mp4] [Schulman, Ho, Lee, Abbeel, 2013]

SLIDE 24

Experiment: Suturing

[VIDEO: suturing-short-sped-up.mp4] [Schulman, Gupta, Venkatesan, Tayson-Frederick, Abbeel, 2013]

SLIDE 25

Where to Go Next?

SLIDE 26

That’s It!

§ Help us out with some course evaluations
§ Have a great spring, and always maximize your expected utilities!
