Rule Learning (PowerPoint PPT Presentation)


SLIDE 1

For Monday

  • Read FOIL paper
  • No homework
SLIDE 2

Program 2

  • Questions?
SLIDE 3

Rule Learning

  • Why learn rules?
SLIDE 4

Proposition Rule Learning

  • Basic if-then rules
  • Condition is typically a conjunction of attribute tests

SLIDE 5

Basic Approaches

  • Decision tree -> rules
  • Neural network -> rules (TREPAN)
  • Sequential covering algorithms
    – Top-down
    – Bottom-up
    – Hybrid

SLIDE 6

Decision Tree Rules

  • Resulting rules may contain unnecessary antecedents, resulting in over-fitting.
  • Rules are post-pruned.
  • Resulting rules may lead to conflicting conclusions on some instances.
  • Sort rules by training (validation) accuracy to create an ordered decision list.
  • The first rule that applies is used to classify a test instance.

red ∧ circle → A (97% train accuracy)
red ∧ big → B (95% train accuracy)
  :
Test case: <big, red, circle> assigned to class A
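The ordered-decision-list behavior above can be sketched in a few lines of Python (the rule representation and attribute sets here are illustrative, not from the slides):

```python
# Sketch of an ordered decision list: rules are sorted by training
# accuracy, and the first rule whose condition matches classifies
# the instance.

def make_rule(condition, label, accuracy):
    """condition: set of attribute values that must all be present."""
    return {"cond": set(condition), "label": label, "acc": accuracy}

rules = [
    make_rule({"red", "circle"}, "A", 0.97),
    make_rule({"red", "big"}, "B", 0.95),
]

def classify(instance, rules, default="unknown"):
    # Try rules in descending order of training accuracy.
    for rule in sorted(rules, key=lambda r: -r["acc"]):
        if rule["cond"] <= instance:  # all condition attributes present
            return rule["label"]
    return default

# <big, red, circle> matches both rules; the 97%-accuracy rule fires first.
label = classify({"big", "red", "circle"}, rules)
```

Note the default class: a decision list usually ends with a catch-all for instances no rule covers.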

SLIDE 7

Sequential Covering
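The generic sequential-covering loop can be sketched as follows; the `learn_interval` learner below is a stand-in for illustration (real systems grow each rule top-down or bottom-up, as on the later slides):

```python
def sequential_covering(positives, negatives, learn_one_rule):
    """Generic sequential covering: repeatedly learn a rule that covers
    some positives (and ideally no negatives), remove the covered
    positives, and repeat until none remain."""
    rules = []
    remaining = set(positives)
    while remaining:
        rule = learn_one_rule(remaining, negatives)
        covered = {p for p in remaining if rule(p)}
        if not covered:  # no progress: stop rather than loop forever
            break
        rules.append(rule)
        remaining -= covered
    return rules

# Toy 1-D data: cover the positive points with interval rules.
positives = [1, 2, 3, 10, 11]
negatives = [5, 6, 7]

def learn_interval(pos, neg):
    # Stand-in learner: grow an interval around one seed positive,
    # absorbing positives only while no negative falls inside.
    seed = min(pos)
    lo = hi = seed
    for p in sorted(pos):
        if all(not (min(lo, p) <= n <= max(hi, p)) for n in neg):
            lo, hi = min(lo, p), max(hi, p)
    return lambda x, lo=lo, hi=hi: lo <= x <= hi

learned = sequential_covering(positives, negatives, learn_interval)
```

On this data the loop needs two interval rules: one for {1, 2, 3} and one for {10, 11}, since the negatives at 5-7 split the positives.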

SLIDE 8

Minimum Set Cover
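Choosing the smallest rule set that covers all positives is an instance of minimum set cover, which is NP-hard; the standard greedy approximation (always pick the set covering the most still-uncovered elements) looks like this (a sketch, not from the slides):

```python
def greedy_set_cover(universe, subsets):
    """Greedy approximation to minimum set cover: repeatedly pick the
    subset covering the most still-uncovered elements.  Gives an
    O(log n)-approximation but is not guaranteed optimal."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(subsets, key=lambda s: len(uncovered & set(s)))
        if not uncovered & set(best):
            break  # remaining elements appear in no subset
        chosen.append(best)
        uncovered -= set(best)
    return chosen

cover = greedy_set_cover({1, 2, 3, 4, 5},
                         [{1, 2, 3}, {3, 4}, {4, 5}, {1, 5}])
```

Here greedy happens to find a 2-set cover; the slides that follow show a configuration where the greedy choice is suboptimal.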

SLIDE 9–15

Greedy Sequential Covering Example

[Figure: animation over an X-Y scatter of positive examples. At each step a rule (an axis-parallel region) is learned to cover a cluster of positives, the covered positives are removed, and the process repeats until no positives remain.]
SLIDE 16–24

Non-optimal Covering Example

[Figure: the same style of X-Y scatter of positives, run frame by frame under greedy sequential covering. The greedy choice of covering regions removes fewer positives per step than the optimal cover would, so it ends up needing more rules: greedy set covering is not guaranteed to be optimal.]
SLIDE 25

Learning a Rule

  • Two basic approaches:
    – Top-down
    – Bottom-up

SLIDE 26–30

Top-Down Rule Learning Example

[Figure: X-Y scatter of positive and negative examples. The rule is specialized one literal at a time: Y>C1, then Y>C1 ∧ X>C2, then Y>C1 ∧ X>C2 ∧ Y<C3, then Y>C1 ∧ X>C2 ∧ Y<C3 ∧ X<C4, each added test excluding more negatives while keeping the positives covered.]

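The specialization sequence in the figure can be sketched as a top-down learner that greedily adds whichever axis-parallel test (X>c, X<c, Y>c, Y<c) excludes the most negatives without dropping any positive (the data and thresholds below are illustrative):

```python
def top_down_rule(positives, negatives):
    """Greedy top-down specialization: start with the always-true rule and
    repeatedly add the threshold test excluding the most negatives while
    keeping every positive covered."""
    tests = []  # conjunction of (dim, op, threshold) literals

    def covered(pt, ts):
        return all(pt[d] > c if op == ">" else pt[d] < c for d, op, c in ts)

    # Candidate thresholds: half-steps around every observed coordinate.
    pts = positives + negatives
    thresholds = {v + s for pt in pts for v in pt for s in (-0.5, 0.5)}

    while any(covered(n, tests) for n in negatives):
        # Require strict progress over the current rule, else stop.
        best_excluded = sum(not covered(n, tests) for n in negatives)
        best = None
        for d in (0, 1):            # dimension: 0 = X, 1 = Y
            for op in (">", "<"):
                for c in thresholds:
                    trial = tests + [(d, op, c)]
                    if not all(covered(p, trial) for p in positives):
                        continue    # must keep covering all positives
                    excluded = sum(not covered(n, trial) for n in negatives)
                    if excluded > best_excluded:
                        best, best_excluded = trial, excluded
        if best is None:
            break
        tests = best
    return tests

# Toy data: positives inside a box, negatives at the corners outside it.
pos = [(2, 2), (3, 2), (2, 3), (3, 3)]
neg = [(0, 0), (5, 0), (0, 5), (5, 5)]
rule = top_down_rule(pos, neg)
```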
SLIDE 31–41

Bottom-Up Rule Learning Example

[Figure: X-Y scatter of positive examples. Starting from a single positive seed example, the rule is generalized step by step to absorb nearby positives, growing the covered region one example at a time.]

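One classic bottom-up primitive, matching the generalization steps in the frames above, is the least general generalization of two examples: keep only the attribute tests on which both agree (the feature-dictionary representation here is mine, for illustration):

```python
def lgg(example_a, example_b):
    """Least general generalization of two attribute-value examples:
    retain only the attribute tests on which both examples agree."""
    return {k: v for k, v in example_a.items()
            if example_b.get(k) == v}

def matches(rule, example):
    """A rule covers an example if every retained test is satisfied."""
    return all(example.get(k) == v for k, v in rule.items())

# Generalizing two positive seeds drops the attribute they differ on.
rule = lgg({"color": "red", "size": "big", "shape": "circle"},
           {"color": "red", "size": "small", "shape": "circle"})
```

Each generalization step risks covering negatives, so bottom-up learners check the new rule against the negative examples before accepting it.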
SLIDE 42

Algorithm Specifics

  • Metrics
    – How do we pick literals to add to our rules?

  • Handling continuous features
  • Pruning
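For the metrics bullet: FOIL (the assigned paper) scores a candidate literal by an information-based gain over the positives it keeps. A sketch of that computation (variable names are mine):

```python
import math

def foil_gain(p0, n0, p1, n1):
    """FOIL's information gain for adding a literal to a rule.
    p0, n0: positives/negatives covered before the literal;
    p1, n1: positives/negatives covered after.  The per-example
    information improvement is weighted by the positives kept
    (here p1, assuming no new positives appear)."""
    if p1 == 0:
        return 0.0
    info_before = -math.log2(p0 / (p0 + n0))
    info_after = -math.log2(p1 / (p1 + n1))
    return p1 * (info_before - info_after)

# A literal that keeps 6 of 8 positives but cuts negatives from 8 to 1:
gain = foil_gain(8, 8, 6, 1)
```

The weighting by positives kept is what makes FOIL prefer literals that sharpen the rule without sacrificing too much coverage.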
SLIDE 43

Rules vs. Trees

SLIDE 44

Top-down vs Bottom-up

SLIDE 45

Rule Learning vs. Knowledge Engineering

  • An influential experiment with an early rule-learning method (AQ) by Michalski (1980) compared induced rules to knowledge engineering (acquiring rules by interviewing experts).
  • Experts are notorious for not being able to articulate their knowledge well.
  • Knowledge-engineered rules:
    – Weights associated with each feature in a rule
    – Method for summing evidence similar to certainty factors
    – No explicit disjunction
  • Data for induction:
    – Examples of 15 soybean plant diseases described using 35 nominal and discrete ordered features, 630 total examples
    – 290 "best" (diverse) examples selected for training; the remainder used for testing
  • What is wrong with this methodology?
SLIDE 46

“Soft” Interpretation of Learned Rules

  • Certainty of match calculated for each category.
  • Scoring method:
    – Literals: 1 if match, -1 if not
    – Terms (conjunctions in antecedent): average of literal scores
    – DNF (disjunction of rules): probabilistic sum: c1 + c2 - c1*c2
  • Sample score for instance A ∧ B ∧ ¬C ∧ D ∧ ¬E ∧ F:

A ∧ B ∧ C → P    (1 + 1 + -1)/3 = 0.333
D ∧ E ∧ F → P    (1 + -1 + 1)/3 = 0.333
Total score for P: 0.333 + 0.333 - 0.333*0.333 = 0.555

  • Threshold of 0.8 certainty to include in possible diagnosis set.
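The scoring scheme above can be checked with a short script (the representation is mine: literals are names optionally prefixed with "~" for negation, and an instance is the set of propositions that hold):

```python
def literal_score(literal, instance):
    """1 if the literal matches the instance, -1 otherwise."""
    negated = literal.startswith("~")
    holds = literal.lstrip("~") in instance
    return 1 if holds != negated else -1

def term_score(term, instance):
    """Conjunction score: average of its literal scores."""
    return sum(literal_score(l, instance) for l in term) / len(term)

def dnf_score(terms, instance):
    """Disjunction score via probabilistic sum: c1 + c2 - c1*c2."""
    total = 0.0
    for term in terms:
        c = term_score(term, instance)
        total = total + c - total * c
    return total

# Instance A ^ B ^ ~C ^ D ^ ~E ^ F, category P = (A^B^C) v (D^E^F):
instance = {"A", "B", "D", "F"}  # true propositions; C and E are false
score = dnf_score([["A", "B", "C"], ["D", "E", "F"]], instance)
# Each term averages to 1/3; the probabilistic sum gives 5/9 ≈ 0.556.
```

At the slide's 0.8 certainty threshold, a 0.556 score for P would not put P in the diagnosis set.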
SLIDE 47

Experimental Results

  • Rule construction time:
    – Human: 45 hours of expert consultation
    – AQ11: 4.5 minutes training on IBM 360/75
  • What doesn’t this account for?
  • Test accuracy:

             1st choice correct   Some choice correct   Number of diagnoses
AQ11               97.6%               100.0%                  2.64
Manual KE          71.8%                96.9%                  2.90