Inductive Learning (1/2)

Decision Tree Method
(If it’s not simple, it’s not worth learning it)
R&N: Chap. 18, Sect. 18.1–3

Motivation
- An AI agent operating in a complex world requires an awful lot of knowledge: state representations, state axioms, constraints, action descriptions, heuristics, probabilities, ...
- More and more, AI agents are designed to acquire knowledge through learning

What is Learning?
- Mostly generalization from experience:
  “Our experience of the world is specific, yet we are able to formulate general theories that account for the past and predict the future”
  M.R. Genesereth and N.J. Nilsson, in Logical Foundations of AI, 1987
- Concepts, heuristics, policies
- Supervised vs. unsupervised learning

Contents
- Introduction to inductive learning
- Logic-based inductive learning:
  • Decision-tree induction
- Function-based inductive learning
  • Neural nets

Logic-Based Inductive Learning
- Background knowledge KB
- Training set D (observed knowledge) that is not logically implied by KB
- Inductive inference: find h such that KB and h imply D
- h = D is a trivial, but uninteresting solution (data caching)

Rewarded Card Example
- Deck of cards, with each card designated by [r,s], its rank and suit, and some cards “rewarded”
- Background knowledge KB:
  ((r=1) ∨ … ∨ (r=10)) ⇔ NUM(r)
  ((r=J) ∨ (r=Q) ∨ (r=K)) ⇔ FACE(r)
  ((s=S) ∨ (s=C)) ⇔ BLACK(s)
  ((s=D) ∨ (s=H)) ⇔ RED(s)
- Training set D:
  REWARD([4,C]) ∧ REWARD([7,C]) ∧ REWARD([2,S]) ∧ ¬REWARD([5,H]) ∧ ¬REWARD([J,S])

Learning a Predicate (Concept Classifier)
- Set E of objects (e.g., cards)
- Goal predicate CONCEPT(x), where x is an object in E, that takes the value True or False (e.g., REWARD)
Example: CONCEPT describes the precondition of an action, e.g., Unstack(C,A)
  • E is the set of states
  • CONCEPT(x) ⇔ HANDEMPTY ∈ x, BLOCK(C) ∈ x, BLOCK(A) ∈ x, CLEAR(C) ∈ x, ON(C,A) ∈ x
Learning CONCEPT is a step toward learning an action description

Rewarded Card Example (continued)
- There are several possible inductive hypotheses
- Possible inductive hypothesis:
  h ≡ (NUM(r) ∧ BLACK(s) ⇔ REWARD([r,s]))

Learning a Predicate (Concept Classifier)
- Set E of objects (e.g., cards)
- Goal predicate CONCEPT(x), where x is an object in E, that takes the value True or False (e.g., REWARD)
- Observable predicates A(x), B(x), … (e.g., NUM, RED)
- Training set: values of CONCEPT for some combinations of values of the observable predicates
- Find a representation of CONCEPT in the form:
  CONCEPT(x) ⇔ S(A,B,…)
  where S(A,B,…) is a sentence built with the observable predicates, e.g.:
  CONCEPT(x) ⇔ A(x) ∧ (¬B(x) ∨ C(x))

Example of Training Set
[Table not recovered from the slide; annotations: ternary attributes; goal predicate is PLAY-TENNIS]
Note that the training set does not say whether an observable predicate is pertinent or not

Learning an Arch Classifier
- These objects are arches: (positive examples)
- These aren’t: (negative examples)
[Figures of the example objects not recovered]
ARCH(x) ⇔ HAS-PART(x,b1) ∧ HAS-PART(x,b2) ∧ HAS-PART(x,b3) ∧ IS-A(b1,BRICK) ∧ IS-A(b2,BRICK) ∧ ¬MEET(b1,b2) ∧ (IS-A(b3,BRICK) ∨ IS-A(b3,WEDGE)) ∧ SUPPORTED(b3,b1) ∧ SUPPORTED(b3,b2)

Example set
- An example consists of the values of CONCEPT and the observable predicates for some object x
- An example is positive if CONCEPT is True, else it is negative
- The set X of all examples is the example set
- The training set is a subset of X (a small one!)

Hypothesis Space
- An hypothesis is any sentence of the form:
  CONCEPT(x) ⇔ S(A,B,…)
  where S(A,B,…) is a sentence built using the observable predicates
- The set of all hypotheses is called the hypothesis space H
- An hypothesis h agrees with an example if it gives the correct value of CONCEPT

Inductive Learning Scheme
[Diagram labels: Example set X {[A, B, …, CONCEPT]}; Training set D (positive and negative examples); Hypothesis space H {[CONCEPT(x) ⇔ S(A,B,…)]}; Inductive hypothesis h]

Size of Hypothesis Space
- n observable predicates
- 2^n entries in the truth table defining CONCEPT, and each entry can be filled with True or False
- In the absence of any restriction (bias), there are 2^(2^n) hypotheses to choose from
- n = 6 ⇒ about 2×10^19 hypotheses!

Multiple Inductive Hypotheses
- Deck of cards, with each card designated by [r,s], its rank and suit, and some cards “rewarded”
- Background knowledge KB:
  ((r=1) ∨ … ∨ (r=10)) ⇔ NUM(r)
  ((r=J) ∨ (r=Q) ∨ (r=K)) ⇔ FACE(r)
  ((s=S) ∨ (s=C)) ⇔ BLACK(s)
  ((s=D) ∨ (s=H)) ⇔ RED(s)
- Training set D:
  REWARD([4,C]) ∧ REWARD([7,C]) ∧ REWARD([2,S]) ∧ ¬REWARD([5,H]) ∧ ¬REWARD([J,S])
h1 ≡ NUM(r) ∧ BLACK(s) ⇔ REWARD([r,s])
h2 ≡ BLACK(s) ∧ ¬(r=J) ⇔ REWARD([r,s])
h3 ≡ ([r,s]=[4,C]) ∨ ([r,s]=[7,C]) ∨ ([r,s]=[2,S]) ⇔ REWARD([r,s])
h4 ≡ ¬([r,s]=[5,H]) ∧ ¬([r,s]=[J,S]) ⇔ REWARD([r,s])
agree with all the examples in the training set
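The counting argument and the claim about the four hypotheses can be checked with a short sketch. The function names and the `(rank, suit)` encoding are assumptions for illustration. One caveat on the encoding: h4 is written here as the conjunction of the two negations (reward every card except the two negative examples), since that is the reading under which it agrees with the training set.

```python
def truth_table_entries(n):
    """Number of rows in the truth table over n observable predicates."""
    return 2 ** n

def hypothesis_count(n):
    """Each row can be True or False, so there are 2^(2^n) distinct hypotheses."""
    return 2 ** truth_table_entries(n)

print(truth_table_entries(6))  # 64
print(hypothesis_count(6))     # 18446744073709551616, about 2 x 10^19

# Training set D of the rewarded card example: (rank, suit) -> REWARD.
D = {
    (4, "C"): True, (7, "C"): True, (2, "S"): True,
    (5, "H"): False, ("J", "S"): False,
}

def NUM(r):
    return r in range(1, 11)

def BLACK(s):
    return s in ("S", "C")

HYPOTHESES = {
    "h1": lambda r, s: NUM(r) and BLACK(s),
    "h2": lambda r, s: BLACK(s) and r != "J",
    "h3": lambda r, s: (r, s) in {(4, "C"), (7, "C"), (2, "S")},
    "h4": lambda r, s: (r, s) != (5, "H") and (r, s) != ("J", "S"),
}

def agrees_with_D(h):
    return all(h(r, s) == reward for (r, s), reward in D.items())

# All four hypotheses agree with every training example...
print({name: agrees_with_D(h) for name, h in HYPOTHESES.items()})
# ...yet they disagree on an unseen card such as the queen of spades.
print([h("Q", "S") for h in HYPOTHESES.values()])  # [False, True, False, True]
```

The last line is what motivates a preference among hypotheses: the training set alone cannot decide between them.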

Notion of Capacity
- It refers to the ability of a machine to learn any training set without error
- A machine with too much capacity is like a botanist with photographic memory who, when presented with a new tree, concludes that it is not a tree because it has a different number of leaves from anything he has seen before
- A machine with too little capacity is like the botanist’s lazy brother, who declares that if it’s green, it’s a tree
- Good generalization can only be achieved when the right balance is struck between the accuracy attained on the training set and the capacity of the machine

Multiple Inductive Hypotheses (continued)
- h1, h2, h3, and h4 all agree with all the examples in the training set
- Need for a system of preferences (called a bias) to compare possible hypotheses

Keep-It-Simple (KIS) Bias
- Examples
  • Use far fewer observable predicates than the training set
  • Constrain the learnt predicate, e.g., to use only “high-level” observable predicates such as NUM, FACE, BLACK, and RED and/or to have simple syntax
- Motivation
  • If an hypothesis is too complex it is not worth learning it (data caching does the job as well)
  • There are far fewer simple hypotheses than complex ones, hence the hypothesis space is smaller
- Einstein: “A theory must be as simple as possible, but not simpler than this”
- If the bias allows only sentences S that are conjunctions of k << n predicates picked from the n observable predicates, then the size of H is O(n^k)

Putting Things Together
[Diagram labels: Object set; Goal predicate; Observable predicates; Example set X; Training set D; Test set; Bias; Hypothesis space H; Learning procedure L; Induced hypothesis h; Evaluation (yes/no)]
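The effect of the KIS bias can be sketched on the rewarded card example. This is an illustrative sketch under stated assumptions, not the learning procedure L from the slides: the hypothesis space is restricted to conjunctions of at most k un-negated observable predicates, and the names `OBSERVABLES`, `conjunction`, and `learn` are hypothetical. Exhaustive search is feasible only because the biased space is so small.

```python
from itertools import combinations
from math import comb

# Observable predicates (assumed encoding: card = (rank, suit)).
OBSERVABLES = {
    "NUM":   lambda c: c[0] in range(1, 11),
    "FACE":  lambda c: c[0] in ("J", "Q", "K"),
    "BLACK": lambda c: c[1] in ("S", "C"),
    "RED":   lambda c: c[1] in ("D", "H"),
}

# Training set D: (card, REWARD) pairs.
D = [
    ((4, "C"), True), ((7, "C"), True), ((2, "S"), True),
    ((5, "H"), False), (("J", "S"), False),
]

def conjunction(names):
    """Hypothesis: CONCEPT holds iff every named predicate holds."""
    return lambda card: all(OBSERVABLES[name](card) for name in names)

def learn(k):
    """Return every conjunction of at most k predicates that agrees with D."""
    found = []
    for size in range(1, k + 1):
        for names in combinations(OBSERVABLES, size):
            h = conjunction(names)
            if all(h(card) == reward for card, reward in D):
                found.append(names)
    return found

n = len(OBSERVABLES)
# Size of the biased space for k = 2: C(4,1) + C(4,2) conjunctions.
print(sum(comb(n, size) for size in range(1, 3)))  # 10
# Size of the unrestricted space over the same 4 predicates: 2^(2^4).
print(2 ** (2 ** n))                               # 65536
print(learn(2))                                    # [('NUM', 'BLACK')]
```

With this bias, the only surviving hypothesis is NUM ∧ BLACK, which matches h1; the memorizing hypotheses h3 and h4 are not expressible in the restricted space at all.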
