
CS 473: Artificial Intelligence
Conclusion
Dan Weld – University of Washington
[Many of these slides were created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]

Final Exam
§ Wed 8:30-10:20
§ Closed book
§ One 8.5 x 11” sheet of notes allowed
§ No calculators

Studying
§ Practice exam & solutions on website
§ Review sessions
  § Today 10:30 – my office hour
  § Mon 1:30 – Gagan’s office hour
  § Tues – TBD
§ Use Canvas for questions

Exam Topics
§ Search
  § Problem spaces
  § BFS, DFS, UCS, A* (tree and graph), local search
  § Completeness and optimality
  § Heuristics: admissibility and consistency; pattern DBs
§ CSPs
  § Constraint graphs, backtracking search
  § Forward checking, AC3 constraint propagation, ordering heuristics
§ Games
  § Minimax, alpha-beta pruning
  § Expectimax
  § Evaluation functions
§ MDPs
  § Bellman equations
  § Value iteration, policy iteration
§ Reinforcement Learning
  § Exploration vs. exploitation
  § Model-based vs. model-free
  § Q-learning
  § Linear value function approx.
§ Hidden Markov Models
  § Markov chains, DBNs
  § Forward algorithm
  § Particle filters
§ Bayesian Networks
  § Basic definition, independence (d-sep)
  § Variable elimination
  § Sampling (rejection, importance)
§ Learning
  § BN parameters with complete data
  § Search through space of BN structures
  § Expectation maximization
§ Beneficial AI

What is intelligence?
§ (Bounded) rationality
  § Agent has a performance measure to optimize
  § Given its state of knowledge
  § Choose optimal action
  § With limited computational resources
§ Human-like intelligence/behavior

State-Space Search
§ X as a search problem
  § States, actions, transitions, cost, goal test
§ Types of search
  § Uninformed systematic: often slow
    § DFS, BFS, uniform-cost, iterative deepening
  § Heuristic-guided: better
    § Greedy best-first, A*
    § Relaxation leads to heuristics
  § Local: fast, fewer guarantees; often finds only a local optimum
    § Hill climbing and variations
    § Simulated annealing: can reach the global optimum given a slow enough cooling schedule
    § (Local) beam search
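The heuristic-guided case above can be sketched as follows. This is a minimal A* graph search using the Manhattan heuristic on a 4-connected grid. The grid encoding (`#` for walls), the unit step cost, and the names `astar` and `GRID` are illustrative assumptions, not taken from the slides.

```python
import heapq

def astar(grid, start, goal):
    """Return the cost of a cheapest path from start to goal, or None.

    Assumptions (hypothetical setup): grid is a list of strings, '#' is a
    wall, moves are up/down/left/right, and every step costs 1.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent for unit-cost grid moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]   # entries are (f = g + h, g, state)
    best_g = {start: 0}                 # cheapest known cost to each state
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g
        if g > best_g.get(cell, float("inf")):
            continue                    # stale queue entry, a cheaper path was found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if inside and grid[nr][nc] != "#":
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
    return None                         # goal unreachable

GRID = ["....",
        ".##.",
        "...."]
print(astar(GRID, (0, 0), (2, 3)))  # 5
```

Replacing `h` with a function that always returns 0 turns the same loop into uniform-cost search, which is one way to see how the heuristic focuses the search.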

Which Algorithm?
§ A*, Manhattan heuristic

Adversarial Search
§ AND/OR search space (max, min)
§ Minimax objective function
§ Minimax algorithm (~DFS)
  § Alpha-beta pruning
§ Utility function for partial search
  § Learning utility functions through self-play
§ Opening/endgame databases

Knowledge Representation and Reasoning
§ Representing: what agent knows
  § Propositional logic
  § Constraint networks
  § HMMs
  § Bayesian networks
  § …
§ Reasoning: what agent can infer
  § Search
  § Dynamic programming
  § Preprocessing to simplify
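Minimax with alpha-beta pruning can be sketched in a few lines. The nested-list tree encoding (a list is an internal node, a number is a leaf utility) and the names `alphabeta` and `TREE` are assumptions made for this illustration only.

```python
import math

def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf):
    """Minimax value of a game tree, skipping branches that cannot matter.

    Assumed encoding: a list is an internal node whose children alternate
    between max and min levels; anything else is a leaf utility.
    """
    if not isinstance(node, list):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:   # the min player above will never allow this branch
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:       # the max player above already has something better
            break
    return value

# Max root over three min nodes; the min values are 3, 2, 2.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(TREE))  # 3
```

In this tree the leaves 4 and 6 are never examined: once the second min node sees 2, it cannot beat the 3 already guaranteed to the max player.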

Knowledge Representation and Reasoning
[Figure: representations arranged along two axes, Uncertainty and Quantification: Prop Logic, Constraint Sat, Bayesian Networks, First-Order Logic, and an open "?" cell]

Constraint Satisfaction Problems
§ Representation
  § Variables, domains, constraints
§ Reasoning
  § Arc consistency (k-consistency)
§ Solving
  § Backtracking search: partial variable assignments
    § Heuristics: min remaining values, min conflicts
  § Local search: complete variable assignments

Trapped
§ Pacman is trapped! He is surrounded by mysterious corridors, each of which leads to either a pit (P), a ghost (G), or an exit (E). In order to escape, he needs to figure out which corridors, if any, lead to an exit and freedom, rather than the certain doom of a pit or a ghost.
§ The one sign of what lies behind the corridors is the wind: a pit produces a strong breeze (S) and an exit produces a weak breeze (W), while a ghost doesn’t produce any breeze at all. Unfortunately, Pacman cannot measure the strength of the breeze at a specific corridor. Instead, he can stand between two adjacent corridors and feel the max of the two breezes. For example, if he stands between a pit and an exit he will sense a strong (S) breeze, while if he stands between an exit and a ghost, he will sense a weak (W) breeze. The measurements for all intersections are shown in the figure below.
§ Also, while the total number of exits might be zero, one, or more, Pacman knows that two neighboring squares will not both be exits.

Variables? $X_1, \dots, X_6$
Domains? {P, G, E}

Trapped
§ A pit produces a strong breeze (S) and an exit produces a weak breeze (W), while a ghost doesn’t produce any breeze at all.
§ Pacman feels the max of the two breezes.
§ The total number of exits might be zero, one, or more.
§ Two neighboring squares will not both be exits.

Variables? $X_1, \dots, X_6$
Domains? {P, G, E}
Constraints?
§ $X_1 = P$ or $X_2 = P$
§ $X_2 = E$ or $X_3 = E$
§ $X_3 = E$ or $X_4 = E$
§ $X_4 = P$ or $X_5 = P$
§ $X_5 = P$ or $X_6 = P$
§ $X_6 = P$ or $X_1 = P$
§ $X_i = E$ nand $X_{i+1} = E$ for each pair of neighbors ($X_6$ is adjacent to $X_1$)
§ Also! $X_2 \neq P$, $X_3 \neq P$, $X_4 \neq P$

Trapped
§ With the variables, domains, and constraints above:
  § Arc consistent?
  § MRV heuristic?
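One way to explore the two questions above is to run arc consistency on the listed constraints and then enumerate what remains. The sketch below is an illustration, not the official solution. The `BREEZE` readings are inferred from the constraints on the slide because the figure is not available, and the names `pair_ok`, `ac3`, and `solutions` are hypothetical. The weak-breeze check folds in the unary constraints ($X_2, X_3, X_4 \neq P$), since a weak reading means neither neighbor is a pit.

```python
from itertools import product

# Assumed sensor readings between neighboring corridors (i, j), inferred from
# the slide's constraints: "S" = strong breeze, "W" = weak breeze.
BREEZE = {(1, 2): "S", (2, 3): "W", (3, 4): "W",
          (4, 5): "S", (5, 6): "S", (6, 1): "S"}

def pair_ok(kind, a, b):
    """Binary constraint between two neighboring corridors with values a, b."""
    if a == "E" and b == "E":
        return False                      # neighbors are never both exits
    if kind == "S":
        return "P" in (a, b)              # strong breeze: at least one pit
    return "E" in (a, b) and "P" not in (a, b)  # weak: an exit and no pit

def ac3(domains):
    """Prune values that have no support in a neighbor's domain (AC-3)."""
    arcs = list(BREEZE) + [(j, i) for (i, j) in BREEZE]
    queue = list(arcs)
    while queue:
        i, j = queue.pop()
        kind = BREEZE.get((i, j)) or BREEZE[(j, i)]
        keep = {a for a in domains[i]
                if any(pair_ok(kind, a, b) for b in domains[j])}
        if keep != domains[i]:
            domains[i] = keep
            # Re-check every other arc that points at the revised variable.
            queue.extend((k, m) for (k, m) in arcs if m == i and k != j)
    return domains

def solutions(domains):
    """Enumerate all complete assignments that satisfy every constraint."""
    found = []
    for values in product(*(sorted(domains[i]) for i in range(1, 7))):
        x = dict(zip(range(1, 7), values))
        if all(pair_ok(kind, x[i], x[j]) for (i, j), kind in BREEZE.items()):
            found.append("".join(values))
    return found

domains = ac3({i: set("PGE") for i in range(1, 7)})
for i in range(1, 7):
    print(f"X{i}: {sorted(domains[i])}")
print(len(solutions(domains)), "consistent assignments")
```

Under these assumptions, arc consistency reduces $X_1$ and $X_5$ to {P}, so the minimum-remaining-values heuristic would pick one of those two variables first.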

KR&R: Markov Decision Process
§ Representation
  § States, actions
  § Probabilistic outcomes: $T(s, a, s') = P(s' \mid s, a)$
  § Rewards
§ Reasoning: $V^*(s)$
  § Value iteration
    § Dynamic programming generalization of expectimax
  § Policy iteration

Bellman Equations
§ $Q^*(s, a) = \sum_{s'} T(s, a, s') \, [R(s, a, s') + \gamma V^*(s')]$
§ $V^*(s) = \max_a Q^*(s, a)$

Value Iteration
§ Forall $s$, initialize $V_0(s) = 0$ (no time steps left means an expected reward of zero)
§ Repeat (do Bellman backups), $k \mathrel{+}= 1$:
  § Forall $s, a$: $Q_{k+1}(s, a) = \sum_{s'} T(s, a, s') \, [R(s, a, s') + \gamma V_k(s')]$
  § Forall $s$: $V_{k+1}(s) = \max_a Q_{k+1}(s, a)$
  § Together these two steps are called a “Bellman backup”
§ Repeat until $|V_{k+1}(s) - V_k(s)| < \varepsilon$, forall $s$ (“convergence”)
§ Successive approximation; dynamic programming
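The value iteration loop above can be sketched directly. The dictionary encoding of the MDP (`T[s][a]` is a list of `(probability, next_state, reward)` triples, and a state with no actions is terminal) and the toy MDP itself are assumptions made for illustration. The 0.8/0.2 split only echoes the noise setting used in the gridworld slides.

```python
def value_iteration(T, gamma=0.9, eps=1e-6):
    """Repeat Bellman backups until no state value changes by eps or more.

    Assumed encoding: T[s][a] = [(prob, next_state, reward), ...];
    a state with an empty action dict is terminal and keeps value 0.
    """
    V = {s: 0.0 for s in T}                       # V_0(s) = 0 for all s
    while True:
        new_V = {}
        for s, actions in T.items():
            # Q_{k+1}(s, a) = sum_s' T(s,a,s') [R(s,a,s') + gamma V_k(s')]
            q_values = [sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                        for outcomes in actions.values()]
            new_V[s] = max(q_values) if q_values else 0.0
        if max(abs(new_V[s] - V[s]) for s in T) < eps:
            return new_V
        V = new_V

# Hypothetical three-state MDP: "go" reaches the goal with probability 0.8
# and slips back to the start with probability 0.2; exiting the goal pays 1.
TOY_MDP = {
    "start": {"stay": [(1.0, "start", 0.0)],
              "go": [(0.8, "goal", 0.0), (0.2, "start", 0.0)]},
    "goal": {"exit": [(1.0, "done", 1.0)]},
    "done": {},
}
V = value_iteration(TOY_MDP)
print(round(V["goal"], 3), round(V["start"], 3))
```

For this toy problem the fixed point can be checked by hand: $V(\text{start}) = 0.72 + 0.18\,V(\text{start})$, so $V(\text{start}) = 0.72 / 0.82$.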

k=1
§ If agent is in 4,3, it only has one legal action: get jewel. It gets a reward and the game is over.
§ If agent is in the pit, it has only one legal action: die. It gets a penalty and the game is over.
§ Agent does NOT get a reward for moving INTO 4,3.
Noise = 0.2, Discount = 0.9, Living reward = 0

k=2
§ 0.8 (0 + 0.9*1) + 0.1 (0 + 0.9*0) + 0.1 (0 + 0.9*0)
Noise = 0.2, Discount = 0.9, Living reward = 0
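The k=2 expression above is one Bellman backup written out term by term. As a worked sketch, assuming it refers to the square next to the jewel (written $s$ here, since the figure is not available) and to the action that heads toward 4,3: with noise 0.2 the intended move succeeds with probability 0.8 and each sideways slip has probability 0.1, the living reward is 0, the discount is 0.9, and after the first iteration $V_1(4,3) = 1$ while the other squares reached are still 0.

```latex
\begin{align*}
Q_2(s, a) &= \sum_{s'} T(s, a, s')\,\bigl[R(s, a, s') + \gamma V_1(s')\bigr] \\
          &= 0.8\,(0 + 0.9 \cdot 1) + 0.1\,(0 + 0.9 \cdot 0) + 0.1\,(0 + 0.9 \cdot 0) \\
          &= 0.72
\end{align*}
```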

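k=3
Noise = 0.2, Discount = 0.9, Living reward = 0

Policy Iteration
§ Let $i = 0$
§ Initialize $\pi_i(s)$ to random actions
§ Repeat
  § Step 1: Policy evaluation:
    § Initialize $k = 0$; forall $s$, $V_0^{\pi_i}(s) = 0$
    § Repeat until $V^{\pi_i}$ converges
      § For each state $s$: $V_{k+1}^{\pi_i}(s) = \sum_{s'} T(s, \pi_i(s), s') \, [R(s, \pi_i(s), s') + \gamma V_k^{\pi_i}(s')]$
      § Let $k \mathrel{+}= 1$
  § Step 2: Policy improvement:
    § For each state $s$: $\pi_{i+1}(s) = \arg\max_a \sum_{s'} T(s, a, s') \, [R(s, a, s') + \gamma V^{\pi_i}(s')]$
    § If $\pi_i = \pi_{i+1}$ then it’s optimal; return it.
    § Else let $i \mathrel{+}= 1$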
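The two-step loop above can be sketched as follows. The MDP encoding (`T[s][a]` is a list of `(probability, next_state, reward)` triples, terminal states have no actions) and the toy MDP are assumptions for illustration. To keep the run reproducible, the sketch starts from the first listed action in each state instead of a random one, which is a deliberate departure from the slide.

```python
def policy_iteration(T, gamma=0.9, eps=1e-8):
    """Alternate policy evaluation and policy improvement until stable.

    Assumed encoding: T[s][a] = [(prob, next_state, reward), ...];
    states with no actions are terminal and keep value 0.
    """
    def q(s, a, V):
        return sum(p * (r + gamma * V[s2]) for p, s2, r in T[s][a])

    # Deterministic stand-in for "random actions": take the first listed action.
    pi = {s: next(iter(T[s])) for s in T if T[s]}
    while True:
        # Step 1: policy evaluation, starting from V_0 = 0.
        V = {s: 0.0 for s in T}
        while True:
            new_V = {s: (q(s, pi[s], V) if s in pi else 0.0) for s in T}
            delta = max(abs(new_V[s] - V[s]) for s in T)
            V = new_V
            if delta < eps:
                break
        # Step 2: policy improvement, greedy with respect to V.
        new_pi = {s: max(T[s], key=lambda a: q(s, a, V)) for s in pi}
        if new_pi == pi:
            return pi, V            # policy is stable, so it is optimal
        pi = new_pi

# Hypothetical MDP; "stay" is listed first so the initial policy is poor.
TOY_MDP = {
    "start": {"stay": [(1.0, "start", 0.0)],
              "go": [(0.8, "goal", 0.0), (0.2, "start", 0.0)]},
    "goal": {"exit": [(1.0, "done", 1.0)]},
    "done": {},
}
policy, values = policy_iteration(TOY_MDP)
print(policy["start"], round(values["start"], 3))
```

On this example the first evaluation gives the start state a value of 0 under "stay", the improvement step switches it to "go", and the second round confirms that policy.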
