[PDF] - Introduction to Artificial Intelligence Marc Toussaint March 29, PDF Document

SLIDE 1

Introduction to Artificial Intelligence

Marc Toussaint March 29, 2016

The majority of slides of the earlier parts are adapted from Stuart Russell. This is a direct concatenation and reformatting of all lecture slides and exercises from the Artificial Intelligence course (winter term 2014/15, U Stuttgart), including indexing to help prepare for exams.

n

trees CSP graphical models MDPs sequential decision problems search BFS propositional logic FOL relational graphical models relational MDPs Reinforcement Learning HMMs ML multi-agent MDPs MCTS utilities deterministic learning probabilistic propositional relational sequential decisions games bandits UCB constraint propagation belief propagation

msg. passing

Active Learning Decision Theory dynamic programming V(s), Q(s,a) fwd/bwd chaining backtracking fwd/bwd

msg. passing

FOL sequential assignment alpha/beta pruning minimax

1 Introduction 6 2 Search 9 2.1 Problem Formulation & Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 Example: Romania . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 Problem Definition: Deterministic, fully observable . . . . . . . . . . . . . . . . . . 9 2.2 Basic Tree Search Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 Tree search implementation: states vs nodes . . . . . . . . . . . . . . . . . . . . . 10 Tree Search: General Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 Breadth-first search (BFS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 Complexity of BFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 Uniform-cost search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 Depth-first search (DFS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 Complexity of DFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 Iterative deepening search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 Complexity of Iterative Deepening Search . . . . . . . . . . . . . . . . . . . . . . . 12

1

SLIDE 2

2 Introduction to Artificial Intelligence, Marc Toussaint—March 29, 2016

Graph search and repeated states . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.3 Greedy and A∗Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 Best-first Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 Greedy Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 Complexity of Greedy Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 A∗ search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 A∗: Proof 1 of Optimality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 Complexity of A∗ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 A∗: Proof 2 of Optimality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 Admissible heuristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 Memory-bounded A∗ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 3 Probabilities 17 3.1 Intro . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 Probabilities as (subjective) information calculus . . . . . . . . . . . . . . . . . . . 17 Inference: general meaning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 Frequentist vs Bayesian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.2 Basic definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Definitions based on sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Random variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Probability distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Joint distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Marginal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Conditional distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Bayes’ Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Multiple RVs, conditional independence . . . . . . . . . . . . . . . . . . . . . . . . 18 3.3 Probability distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 Bernoulli and Binomial distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 Beta . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 Multinomial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 Dirichlet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 Conjugate priors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 3.4 Distributions over continuous domain . . . . . . . . . . . . . . . . . . . . . . . . . . 21 Dirac distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 Gaussian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 Particle approximation of a distribution . . . . . . . . . . . . . . . . . . . . . . . . . 21 Utilities and Decision Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 Kullback-Leibler divergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 3.5 Monte Carlo methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 Monte Carlo methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 Rejection sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 Importance sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 Student’s t, Exponential, Laplace, Chi-squared, Gamma distributions . . . . . . . . 23 4 Bandits, MCTS, & Games 24 4.1 Bandits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 Multi-armed Bandits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 4.2 Upper Confidence Bounds (UCB) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 Exploration, Exploitation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 Upper Confidence Bound (UCB1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 4.3 Monte Carlo Tree Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 Monte Carlo Tree Search (MCTS) . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 Upper Confidence Tree (UCT) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 4.4 Game Playing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 Minimax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 Alpha-Beta Pruning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 Evaluation functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 UCT for games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

SLIDE 3

Introduction to Artificial Intelligence, Marc Toussaint—March 29, 2016 3

5 Reinforcement Learning 30 5.1 Markov Decision Processes & Dynamic Programming . . . . . . . . . . . . . . . . 30 Markov Decision Process (MDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 Value Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 Bellman optimality equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 Value Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 Q-Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 Q-Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 Proof of convergence of Q-Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 5.2 Learning in MDPs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 Temporal difference (TD) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 Q-learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 Proof of convergence of Q-learning . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 Eligibility traces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 Model-based RL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 5.3 Exploration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 Epsilon-greedy exploration in Q-learning . . . . . . . . . . . . . . . . . . . . . . . . 37 R-Max . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 Bayesian RL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 Optimistic heuristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 5.4 Policy Search, Imitation, & Inverse RL* . . . . . . . . . . . . . . . . . . . . . . . . . 38 Policy gradients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 Imitation Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 Inverse RL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 6 Constraint Satisfaction Problems 41 6.1 Problem Formulation & Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 Inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 Constraint satisfaction problems (CSPs): Definition . . . . . . . . . . . . . . . . . . 41 Map-Coloring Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 6.2 Methods for solving CSPs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 Backtracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 Variable order: Minimum remaining values . . . . . . . . . . . . . . . . . . . . . . . 43 Variable order: Degree heuristic . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 Value order: Least constraining value . . . . . . . . . . . . . . . . . . . . . . . . . 43 Constraint propagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 Tree-structured CSPs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 7 Graphical Models 46 7.1 Bayes Nets and Conditional Independence . . . . . . . . . . . . . . . . . . . . . . 46 Bayesian Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 Conditional independence in a Bayes Net . . . . . . . . . . . . . . . . . . . . . . . 47 Inference: general meaning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 7.2 Inference Methods in Graphical Models . . . . . . . . . . . . . . . . . . . . . . . . 48 Inference in graphical models: overview . . . . . . . . . . . . . . . . . . . . . . . . 48 Monte Carlo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 Importance sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 Gibbs sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 Variable elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 Factor graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 Belief propagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 Message passing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 Loopy belief propagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 Junction tree algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 Maximum a-posteriori (MAP) inference . . . . . . . . . . . . . . . . . . . . . . . . . 53 Conditional random field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

SLIDE 4

4 Introduction to Artificial Intelligence, Marc Toussaint—March 29, 2016

8 Dynamic Models 54 Markov Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 Hidden Markov Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 Filtering, Smoothing, Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 HMM: Inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 HMM inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 Kalman filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 9 Propositional Logic 57 9.1 Syntax & Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 Knowledge base: Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 Wumpus World example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 Logics: Definition, Syntax, Semantics . . . . . . . . . . . . . . . . . . . . . . . . . 58 Entailment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 Inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 Propositional logic: Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 Propositional logic: Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 Logical equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 Validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 Satisfiability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 9.2 Inference Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 Horn Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 Modus Ponens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 Forward chaining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 Completeness of Forward Chaining . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 Backward Chaining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 Conjunctive Normal Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 Resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 Conversion to CNF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 10 First-Order Logic 66 10.1 The FOL language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 FOL: Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 Universal quantification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 Existential quantification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 10.2 FOL Inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 Reduction to propositional inference . . . . . . . . . . . . . . . . . . . . . . . . . . 68 Unification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 Generalized Modus Ponens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 Forward Chaining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 Backward Chaining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 Conversion to CNF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 Resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 11 Relational Probabilistic Modelling and Learning 73 11.1 STRIPS-like rules to model MDP transitions . . . . . . . . . . . . . . . . . . . . . . 73 Markov Decision Process (MDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 STRIPS rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 Planning Domain Definition Language (PDDL) . . . . . . . . . . . . . . . . . . . . 73 Learning probabilistic rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 Planning with probabilistic rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 11.2 Relational Graphical Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 Probabilistic Relational Models (PRMs) . . . . . . . . . . . . . . . . . . . . . . . . 75 Markov Logic Networks (MLNs) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 The role of uncertainty in AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

SLIDE 5

Introduction to Artificial Intelligence, Marc Toussaint—March 29, 2016 5

12 Exercises 78 12.1 Exercise 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 12.1.1 First Steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 12.1.2 Programmieraufgabe: Tree Search (50%) . . . . . . . . . . . . . . . . . . . 78 12.1.3 Votieraufgabe: Spezialf¨ alle der Suchstrategien (25%) . . . . . . . . . . . . 79 12.1.4 Votieraufgabe: A∗-Suche (25%) . . . . . . . . . . . . . . . . . . . . . . . . 79 12.1.5 Pr¨ asenzaufgabe: Beispiel f¨ ur Baumsuche . . . . . . . . . . . . . . . . . . 79 12.2 Exercise 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 12.2.1 Programmieraufgabe: Schach (75%) . . . . . . . . . . . . . . . . . . . . . . 79 12.2.2 Votieraufgabe: Bedingte Wahrscheinlichkeit (25%) . . . . . . . . . . . . . 80 12.2.3 Pr¨ asenzaufgabe: Bandits . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 12.3 Exercise 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 12.3.1 Votieraufgabe: Value Iteration (50%) . . . . . . . . . . . . . . . . . . . . . 80 12.3.2 Programmieraufgabe: Value Iteration (50%) . . . . . . . . . . . . . . . . . 81 12.3.3 Pr¨ asenzaufgabe: Eligibilities in TD-learning . . . . . . . . . . . . . . . . . 81 12.4 Exercise 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 12.4.1 Programmieraufgabe: Constrained Satisfaction Problems (75%) . . . . . . 81 12.4.2 Votieraufgabe: CSP (25%) . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 12.4.3 Pr¨ asenzaufgabe: Generalized Arc Consistency . . . . . . . . . . . . . . . 82 12.5 Exercise 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 12.5.1 Programmieraufgabe: Spamfilter mit Naive Bayes (50%) . . . . . . . . . . 82 12.5.2 Votieraufgabe: Hidden Markov Modelle (50%) . . . . . . . . . . . . . . . . 83 12.6 Exercise 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 12.6.1 Erf¨ ullbarkeit und allgemeine G¨ ultigkeit (Aussagenlogik) (30%) . . . . . . . . 84 12.6.2 Modelle enumerieren (Aussagenlogik) (30%) . . . . . . . . . . . . . . . . . 84 12.6.3 Unifikation (Pr¨ adikatenlogik) (40%) . . . . . . . . . . . . . . . . . . . . . . . 84 12.6.4 Pr¨ asenzaufgabe: Matching as Constraint Satisfaction Problem . . . . . . . 84 12.6.5 Pr¨ asenzaufgabe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 Index 86

SLIDE 6

6 Introduction to Artificial Intelligence, Marc Toussaint—March 29, 2016

1 Introduction

The rise of AI (again)

– The singularity – Ethics – The value problem – The outrageous inability of humans to define what is “good” – Paper clips

1:1

What’s the route to AI?

– Neuroscience? (EU Big Brain project) – Deep Learning? (Pure Machine Learning?, DeepMind (Lon- don)) – Social/Emotional/conciousnes/Feelings stuff? – Hardcore classical AI? Modern probabilistic/learning AI? – Robotics?

Why is there no university department for Intelligence Re-

search?!

1:2

Potted history of AI

1943 McCulloch & Pitts: Boolean circuit model of brain 1950 Turing’s “Computing Machinery and Intelligence” 1952–69 Look, Ma, no hands! 1950s Early AI programs, including Samuel’s checkers program, Newell & Simon’s Logic Theorist, Gelernter’s Geometry Engine 1956 Dartmouth meeting: “Artificial Intelligence” adopted 1965 Robinson’s complete algorithm for logical reasoning 1966–74 AI discovers computational complexity Neural network research almost disappears 1969–79 Early development of knowledge-based systems 1980–88 Expert systems industry booms 1988–93 Expert systems industry busts: “AI Winter” 1985–95 Neural networks return to popularity 1988– Resurgence of probability; general increase in technical depth “Nouvelle AI”: ALife, GAs, soft computing 1995– Agents, agents, everywhere . . . 2003– Human-level AI back on the agenda 1:3

What is intelligence?

Maybe it is easier to first ask what systems we actually

talk about:

– Decision making – Problem solving – Interacting with an environment

Then define objectives

1:4

Interactive domains

We assume the agent is in interaction with a domain.

– The world is in a state st ∈ S (see below on what that means) – The agent senses observations yt ∈ O – The agent decides on an action at ∈ A – The world transitions to a new state st+1

The observation yt describes all information received by

the agent (sensors, also rewards, feedback, etc) if not explicitly stated otherwise

(The technical term for this is a POMDP)

1:5

State

The notion of state is often used imprecisely
At any time t, we assume the world is in a state st ∈ S
st is a state description of a domain iff future observations

yt+, t+ > t are conditionally independent of all history

bservations yt−, t− < t given st and future actions at:t+:

agent

s0 s1 a0 s2 a1 s3 a2 a3 y0 y1 y2 y3

Notes:

– Intuitively, st describes everything about the world that is “relevant” – Worlds do not have additional latent (hidden) variables to the state st

1:6

Examples

What is a sufficient definition of state of a computer that

you interact with?

What is a sufficient definition of state for a thermostat

scenario? (First, assume the ’room’ is an isolated chamber.)

What is a sufficient definition of state in an autonomous

car case?

SLIDE 7

Introduction to Artificial Intelligence, Marc Toussaint—March 29, 2016 7

→ in real worlds, the exact state is practically not repre- sentable → all models of domains will have to make approximating assumptions (e.g., about independencies)

1:7

How can agents be formally described?

...or, what formal classes of agents do exist?

Basic alternative agent models:

– The agent maps yt → at (stimulus-response mapping.. non-optimal) – The agent stores all previous observations and maps f : y0:t, a0:t-1 → at f is called agent function. This is the most general model, including the others as special cases. – The agent stores only the recent history and maps yt−k:t, at−k:t-1 → at (crude, but may be a good heuristic) – The agent is some machine with its own internal state nt, e.g., a computer, a finite state machine, a brain... The agent maps (nt-1, yt) → nt (internal state update) and nt → at – The agent maintains a full probability distribution (belief) bt(st) over the state, maps (bt-1, yt) → bt (Bayesian belief update), and bt → at

1:8

POMDP coupled to a state machine agent

agent

s0 s1 s2 r1 r0 r2 a2 y2 a1 y1 a0 y0 n0 n1 n2

Is this a very limiting agent model?

1:9

Multi-agent domain models

(The technical term for this is a Decentralized POMDPs)

(from Kumar et al., IJCAI 2011)

This is a special type (simplification) of a general DEC-

POMDP

Generally, this level of description is very general, but

NEXP-hard Approximate methods can yield very good results, though

1:10

We gave above very general and powerful models (for-

malizations) of what it means that an agent takes deci- sions in an interactive environment. There are many flavors of this:

– Fully observable vs. partially observable – Single agent vs. multiagent – Deterministic vs. stochastic – Structure of the state space: Discrete, continuous, hybrid; factored; relational – Discrete vs. continuous time

Next we need to decide on Objectives

1:11

Objectives, optimal agents, & rationality

Utilities, rewards, etc
An optimal (=rational) agent is one that takes decisions

to maximize the (expected) objective

1:12

n

trees CSP graphical models MDPs sequential decision problems search BFS propositional logic FOL relational graphical models relational MDPs Reinforcement Learning HMMs ML multi-agent MDPs MCTS utilities deterministic learning probabilistic propositional relational sequential decisions games bandits UCB constraint propagation belief propagation

msg. passing

Active Learning Decision Theory dynamic programming V(s), Q(s,a) fwd/bwd chaining backtracking fwd/bwd

msg. passing

FOL sequential assignment alpha/beta pruning minimax

1:13

Organisation

1:14

Vorlesungen der Abteilung MLR

Bachelor:

– Grundlagen der K¨ unstlichen Intelligenz (3+1 SWS)

Master:

SLIDE 8

8 Introduction to Artificial Intelligence, Marc Toussaint—March 29, 2016

– Vertiefungslinie Intelligente Systeme (gemeinsam mit An- dres Bruhn) – WS: Maths for Intelligent Systems – WS: Introduction to Robotics – SS: Machine Learning – SS: Optimization – manchmal: Reinforcement Learning (Vien Ngo), Advanced Robotics – Hauptseminare: Machine Learning (WS), Robotics (SS)

1:15

Andres Bruhn’s Vorlesungen in der Vertiefungslinie

– WS: Computer Vision – SS: Correspondence Problems in Computer Vision – Hauptseminar: Recent Advances in Computer Vision

1:16

Vorraussetzungen f¨ ur die KI Vorlesung

Mathematik f¨

ur Informatiker und Softwaretechniker

außerdem hilfreich:

– Algorithmen und Datenstrukturen – Theoretische Informatik

1:17

Vorlesungsmaterial

Webseite zur Vorlesung:

https://ipvs.informatik.uni-stuttgart.de/mlr/ marc/teaching/ die Folien und ¨ Ubungsaufgaben werden dort online gestellt

Alle Materialien des letzten Jahres sind online – bitte

machen Sie sich einen Eindruck

Hauptliteratur:

Stuart Russell & Peter Norvig: Artificial Intelligence – A Modern Approach

– Many slides are adopted from Stuart

1:18

Pr¨ ufung

Schriftliche Pr¨

ufung, 60 Minuten

Termin zentral organisiert
keine Hilfsmittel erlaubt
Anmeldung: Im LSF / beim Pr¨

ufungsamt

Pr¨

ufungszulassung: 50% der Punkte der ¨ Ubungsaufgaben

1:19

¨ Ubungen

8 ¨

Ubungsgruppen (4 Tutoren)

2 Arten von Aufgaben: Coding- und Votier- ¨

Ubungen

Coding-Aufgaben: Teams von bis zu 3 Studenten geben

die Coding-Aufgaben zusammen ab

Votier-Aufgaben:

– Zu Beginn der ¨ Ubung eintragen, welche Aufgaben bear- beiten wurden/pr¨ asentiert werden k¨

nnen

– Zuf¨ allige Auswahl

Schein-Kriterium: 50% aller Aufgaben gel¨
st (Coding und

Voting).

Registrierung

http://uebungsgruppen.informatik.uni-stuttgart.de/

– username: KI, passwd: 10110 – ALS VORNAMEN ANGEBEN: ”Vorname TEAMNAME”, zB ”Peter frogs”

1:20

SLIDE 9

Introduction to Artificial Intelligence, Marc Toussaint—March 29, 2016 9

2 Search

Motivation & Outline

Search algorithms are core tool for decision making, especially when the domain is too complex to use alternatives like Dynamic Program- ming. In recent years, with the increase in computational power search methods became the method of choice for complex domains, like the game of Go, or certain POMDPs. Learning about search tree algorithms is an important background for several reasons:

The concept of decision trees, which represent the space of

possible future decisions and state transitions, is generally important for thinking about decision problems.

In probabilistic domains, tree search algorithms are a special

case of Monte-Carlo methods to estimate some expectation, typically the so-called Q-function. The respective Monte-Carlo Tree Search algorithms are the state-of-the-art in many domains.

Tree search is also the background for backtracking in CSPs

as well as forward and backward search in logic domains. We will cover the basic tree search methods (breadth, depth, iterative deepening) and eventually A∗

Outline

Problem formulation & examples
Basic search algorithms

2:1

2.1 Problem Formulation & Examples Example: Romania

On holiday in Romania; currently in Arad. Flight leaves tomorrow from Bucharest Formulate goal: be in Bucharest, Sgoal = {Bucharest} Formulate problem: states: various cities, S = {Arad, Timisoara, . . . } actions: drive between cities, A = {edges between states} Find solution: sequence of cities, e.g., Arad, Sibiu, Fagaras, Bucharest minimize costs with cost function, (s, a) → c

2:3

Example: Romania

2:4

Problem types

Deterministic, fully observable (“single-state problem”) Agent knows exactly which state it will be in; solution is a sequence First state and world known → the agent does not rely on observations Non-observable (“conformant problem”) Agent may have no idea where it is; solution (if any) is a sequence Nondeterministic and/or partially observable (“contingency problem”) percepts provide new information about current state solution is a reactive plan or a policy

ften interleave search, execution

Unknown state space (“exploration problem”)

2:5

Deterministic, fully observable problem def.

A deterministic, fully observable problem is defined by four items: initial state s0 ∈ S e.g., s0 = Arad successor function succ : S × A → S e.g., succ(Arad,Arad-Zerind) = Zerind goal states Sgoal ⊆ S e.g., s = Bucharest step cost function cost(s, a, s′), assumed to be ≥ 0 e.g., traveled distance, number of actions executed, etc. the path cost is the sum of step costs A solution is a sequence of actions leading from s0 to a goal An optimal solution is a solution with minimal path costs

2:6

Example: vacuum world state space graph

SLIDE 10

10 Introduction to Artificial Intelligence, Marc Toussaint—March 29, 2016

states??: integer dirt and robot locations (ignore dirt amounts etc.) actions??: Left, Right, Suck, NoOp goal test??: no dirt path cost??: 1 per action (0 for NoOp)

2:7

Example: The 8-puzzle

states??: integer locations of tiles (ignore intermediate positions) actions??: move blank left, right, up, down (ignore un- jamming etc.) goal test??: = goal state (given) path cost??: 1 per move [Note: optimal solution of n-Puzzle family is NP-hard]

2:8