20070607 Chap18 1
Chapter 18
Learning from Observations (Sec. 1-3)
Learning
- Essential for unknown environments, i.e. when the designer lacks omniscience.
- Learning modifies the agent's decision mechanisms to improve performance.
An example is a pair (x, f(x)), where f is the target function; h denotes the learner's hypothesis function, which should approximate f.
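This framing can be sketched in a few lines of Python (the particular f and h below are invented stand-ins, not from the slides):

```python
# Target function f (unknown to the learner) and a candidate hypothesis h.
# Both are illustrative stand-ins for this sketch.
def f(x):
    return x % 2 == 0   # target concept: "x is even"

def h(x):
    return x % 2 == 0   # a hypothesis the learner proposes

# An example is a pair (x, f(x)).
examples = [(x, f(x)) for x in range(6)]

# h is consistent with the training set if it agrees on every example.
consistent = all(h(x) == y for x, y in examples)
print(consistent)  # True: h agrees with f on all training examples
```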
Examples are described by attribute values (Boolean, discrete, continuous, etc.), e.g. situations where I will/won't wait for a table.
Problem: decide whether to wait for a table at a restaurant, based on the following attributes:
1. Alternate: is there an alternative restaurant nearby?
2. Bar: is there a comfortable bar area to wait in?
3. Fri/Sat: is today Friday or Saturday?
4. Hungry: are we hungry?
5. Patrons: number of people in the restaurant (None, Some, Full)
6. Price: price range ($, $$, $$$)
7. Raining: is it raining outside?
8. Reservation: have we made a reservation?
9. Type: kind of restaurant (French, Italian, Thai, Burger)
10. WaitEstimate: estimated waiting time (0-10, 10-30, 30-60, >60)
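One possible encoding of a single training example over these attributes, as a sketch (the specific values are invented):

```python
# One training example for the restaurant problem: attribute values plus
# the target WillWait. All values here are invented for illustration.
example = {
    "Alternate": True, "Bar": False, "Fri/Sat": False, "Hungry": True,
    "Patrons": "Some", "Price": "$", "Raining": False,
    "Reservation": True, "Type": "Thai",
    "WillWait": True,   # target value f(x) for this example
}

# The input attributes are everything except the target.
attributes = sorted(k for k in example if k != "WillWait")
print(attributes)
```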
⇒ 2^(2^n) distinct decision trees over n Boolean attributes (one Boolean function per 2^n-row truth table)
⇒ 3^n distinct conjunctive hypotheses
⇒ a more expressive hypothesis space increases the number of hypotheses consistent with the training set, so it may get worse predictions
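These counts can be checked directly for small n (a sketch, assuming n Boolean attributes):

```python
# Number of distinct Boolean functions of n attributes: one output bit per
# row of a 2^n-row truth table, hence 2^(2^n) functions in total.
def num_boolean_functions(n):
    return 2 ** (2 ** n)

# Purely conjunctive hypotheses: each attribute appears positively,
# negatively, or not at all, giving 3^n distinct conjunctions.
def num_conjunctive(n):
    return 3 ** n

print(num_boolean_functions(2))  # 16
print(num_conjunctive(2))        # 9
```

Even for modest n the full function space dwarfs the conjunctive one, which is the expressiveness trade-off the slide points at.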
Aim: find a small tree consistent with the training examples.
Idea: (recursively) choose the "most significant" attribute as the root of each (sub)tree.
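The recursive idea can be sketched as a skeleton in Python (a simplification: the "most significant" attribute choice is a placeholder here, where the real algorithm would use the information gain defined below; the tiny training set is invented):

```python
from collections import Counter

def learn_tree(examples, attributes, target):
    """Recursive decision-tree skeleton: pick an attribute as the root,
    split the examples on its values, and recurse on each subset."""
    labels = [e[target] for e in examples]
    if len(set(labels)) == 1:                 # all examples agree: leaf
        return labels[0]
    if not attributes:                        # no attributes left: majority leaf
        return Counter(labels).most_common(1)[0][0]
    a = attributes[0]                         # placeholder for "most significant"
    subtree = {v: learn_tree([e for e in examples if e[a] == v],
                             attributes[1:], target)
               for v in set(e[a] for e in examples)}
    return (a, subtree)

# Tiny invented training set using the Patrons attribute from the slides.
examples = [
    {"Patrons": "None", "WillWait": False},
    {"Patrons": "Some", "WillWait": True},
    {"Patrons": "Full", "WillWait": False},
]
tree = learn_tree(examples, ["Patrons"], "WillWait")
print(tree)
```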
I(P(v_1), …, P(v_n)) = Σ_{i=1}^{n} −P(v_i) log2 P(v_i)
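The entropy formula translates directly into Python (a sketch; 0·log2(0) is taken as 0 by convention):

```python
from math import log2

def I(*ps):
    """Information content (entropy) of a distribution <P1, ..., Pn>:
    sum over i of -P_i * log2(P_i), skipping zero-probability terms."""
    return sum(-p * log2(p) for p in ps if p > 0)

print(I(0.5, 0.5))  # 1.0 bit for a fair coin flip
print(I(1.0, 0.0))  # 0.0 bits when the outcome is certain
```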
A chosen attribute A divides the training set E into subsets E_1, …, E_v according to their values for A, where A has v distinct values.

remainder(A) = Σ_{i=1}^{v} (p_i + n_i)/(p + n) · I(p_i/(p_i + n_i), n_i/(p_i + n_i))

Information gain (IG) = the reduction in entropy from the attribute test:

IG(A) = I(p/(p + n), n/(p + n)) − remainder(A)
IG(Patrons) = 1 − [2/12 · I(0, 1) + 4/12 · I(1, 0) + 6/12 · I(2/6, 4/6)] = 0.541 bits
IG(Type) = 1 − [2/12 · I(1/2, 1/2) + 2/12 · I(1/2, 1/2) + 4/12 · I(2/4, 2/4) + 4/12 · I(2/4, 2/4)] = 0 bits
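The two computations can be checked numerically (a self-contained sketch, with the entropy function I defined inline):

```python
from math import log2

def I(*ps):
    # Entropy of a distribution <P1, ..., Pn>, with 0*log2(0) taken as 0.
    return sum(-p * log2(p) for p in ps if p > 0)

# Patrons splits the 12 examples into None (0+/2-), Some (4+/0-),
# Full (2+/4-).
ig_patrons = 1 - (2/12 * I(0, 1) + 4/12 * I(1, 0) + 6/12 * I(2/6, 4/6))

# Type splits them into French (1+/1-), Italian (1+/1-), Thai (2+/2-),
# Burger (2+/2-): every subset is as mixed as the whole set.
ig_type = 1 - (2/12 * I(1/2, 1/2) + 2/12 * I(1/2, 1/2)
               + 4/12 * I(2/4, 2/4) + 4/12 * I(2/4, 2/4))

print(round(ig_patrons, 3))  # 0.541
print(round(ig_type, 3))     # 0.0
```

Since Patrons has the larger gain, the greedy tree learner would test it before Type.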
1. Use theorems of computational/statistical learning theory.
2. Try h on a new test set of examples (using the same distribution over the example space as the training set).
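Option 2 can be sketched as a holdout evaluation (the target f and hypothesis h below are invented stand-ins; the point is that train and test examples are drawn from the same distribution):

```python
import random

random.seed(0)

# Hypothetical target function and hypothesis, for illustration only.
f = lambda x: x % 3 == 0
h = lambda x: x % 3 == 0

# Draw examples from one distribution, then split: the test set must come
# from the same example distribution as the training set.
data = [(x, f(x)) for x in random.sample(range(100), 40)]
train, test = data[:30], data[30:]   # h would be fit on train; fixed here

# Estimate h's quality as its accuracy on the unseen test examples.
accuracy = sum(h(x) == y for x, y in test) / len(test)
print(accuracy)  # 1.0 here, since this h happens to equal f exactly
```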