SLIDE 1 Outline
  • Why Machine Learning?
  • What is a well-defined learning problem?
  • An example: learning to play checkers
  • What questions should we ask about Machine Learning?
Lecture slides for textbook Machine Learning, T. Mitchell, McGraw Hill, 1997
SLIDE 2 Why Machine Learning
  • Recent progress in algorithms and theory
  • Growing flood of online data
  • Computational power is available
  • Budding industry
Three niches for machine learning:
  • Data mining: using historical data to improve decisions
    - medical records → medical knowledge
  • Software applications we can't program by hand
    - autonomous driving
    - speech recognition
  • Self-customizing programs
    - Newsreader that learns user interests
SLIDE 3 Typical Datamining Task
Data: Patient103 at successive times

                          time=1   time=2     time=n
Age:                      23       23         23
FirstPregnancy:           no       no         no
Anemia:                   no       no         no
Diabetes:                 no       YES        no
PreviousPrematureBirth:   no       no         no
Ultrasound:               ?        abnormal   ?
Elective C-Section:       ?        no         no
Emergency C-Section:      ?        ?          Yes
...

Given:
  • 9714 patient records, each describing a pregnancy and birth
  • Each patient record contains 215 features
Learn to predict:
  • Classes of future patients at high risk for Emergency Cesarean Section
SLIDE 4 Datamining Result
Data: the same Patient103 records as on Slide 3.
One of 18 learned rules:
If   No previous vaginal delivery, and
     Abnormal 2nd Trimester Ultrasound, and
     Malpresentation at admission
Then Probability of Emergency C-Section is 0.6
Over training data: 26/41 = .63
Over test data: 12/20 = .60
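The two fractions quoted for the learned rule can be checked with a few lines of Python. This is only an illustrative sketch: it assumes the reading that 41 training records (and 20 test records) match the rule's conditions, of which 26 (and 12) ended in an emergency C-section. The function name `rule_accuracy` is hypothetical.

```python
def rule_accuracy(n_emergency: int, n_matching: int) -> float:
    """Fraction of records matching the rule's conditions that had the predicted outcome.

    Assumption: the slide's "26/41" means 26 emergency C-sections among
    41 records that satisfy the rule's preconditions.
    """
    return n_emergency / n_matching


train = rule_accuracy(26, 41)
test = rule_accuracy(12, 20)
print(f"training: {train:.2f}, test: {test:.2f}")  # training: 0.63, test: 0.60
```

The closeness of the two numbers is the point of the slide: the rule's estimated probability holds up on records that were not used to learn it.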
SLIDE 5 Credit Risk Analysis
Data: Customer103 at successive times

                           time=t0   time=t1   time=tn
Years of credit:           9         9         9
Loan balance:              $2,400    $3,250    $4,500
Income:                    $52k      ?         ?
Own House:                 Yes       Yes       Yes
Other delinquent accts:    2         2         3
Max billing cycles late:   3         4         6
Profitable customer?:      ?         ?         No
...

Rules learned from synthesized data:
If   Other-Delinquent-Accounts > 2, and
     Number-Delinquent-Billing-Cycles > 1
Then Profitable-Customer? = No [Deny Credit Card application]
If   Other-Delinquent-Accounts = 0, and
     (Income > $30k) OR (Years-of-Credit > 3)
Then Profitable-Customer? = Yes [Accept Credit Card application]
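The two learned rules above can be read as a small decision procedure. The sketch below is one possible encoding, not the system that produced the rules. The function and parameter names are hypothetical, an unknown income ("?") is represented as `None`, and a customer that neither rule covers gets `None` because the slide shows no rule for that case.

```python
from typing import Optional


def profitable_customer(other_delinquent_accounts: int,
                        delinquent_billing_cycles: int,
                        income: Optional[float],
                        years_of_credit: int) -> Optional[str]:
    """Apply the two rules from the slide; return "Yes", "No", or None."""
    # Rule 1: deny the credit card application.
    if other_delinquent_accounts > 2 and delinquent_billing_cycles > 1:
        return "No"
    # Rule 2: accept the application. An unknown income cannot satisfy
    # the income test (assumption).
    income_ok = income is not None and income > 30_000
    if other_delinquent_accounts == 0 and (income_ok or years_of_credit > 3):
        return "Yes"
    # Neither rule fires.
    return None
```

For example, a record with 3 other delinquent accounts and 6 delinquent billing cycles falls under the first rule and is classified "No", regardless of income.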
SLIDE 6 Other Prediction Problems
Customer purchase behavior: Customer103

                    time=t0   time=t1   time=tn
Sex:                M         M         M
Age:                53        53        53
Income:             $50k      $50k      $50k
Own House:          Yes       Yes       Yes
MS Products:        Word      Word      Word
Computer:           386 PC    Pentium   Pentium
Purchase Excel?:    ?         ?         Yes
...

Customer retention: Customer103

                      time=t0   time=t1   time=tn
Sex:                  M         M         M
Age:                  53        53        53
Income:               $50k      $50k      $50k
Own House:            Yes       Yes       Yes
Checking:             $5k       $20k      $0
Savings:              $15k      $0        $0
Current-customer?:    yes       yes       No
...

Process optimization: Product72
[Figure: Product72 records at time=t0, t1, ..., tn. Stages: mix (Mixing-speed: 60rpm, Fat content: 15%), cook (Temperature: 325, Fat content: 12%), cool (Fan-speed: medium, Fat content: 12%). Other recorded features: Viscosity (1.3, 3.2), Density (1.1, 1.2, 2.8), Spectral peak (2800, 3100, 3200). Target: Product underweight?, unknown (??) at the earlier times and Yes at the end.]
SLIDE 7 Problems Too Difficult to Program by Hand
ALVINN [Pomerleau] drives 70 mph on highways
[Figure: ALVINN network. 30x32 Sensor Input Retina → 4 Hidden Units → 30 Output Units, ranging from Sharp Left through Straight Ahead to Sharp Right]
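The layer sizes in the ALVINN figure are enough to sketch the shape of a forward pass. The code below is a rough illustration only: the sigmoid activation, the random weights, and picking the most active output unit as the steering direction are all assumptions, and the function names are hypothetical. It is not ALVINN's actual implementation or training procedure.

```python
import math
import random

N_INPUTS = 30 * 32   # 30x32 sensor input retina
N_HIDDEN = 4         # hidden units
N_OUTPUTS = 30       # output units, index 0 = sharp left ... 29 = sharp right


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_in: int, n_out: int, rng: random.Random) -> list:
    """One weight row per unit, small random values (untrained)."""
    return [[rng.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(n_out)]


def layer_forward(weights: list, inputs: list) -> list:
    """Weighted sum followed by a sigmoid, for every unit in the layer."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def steer(image: list, hidden_w: list, output_w: list) -> int:
    """Index of the most active output unit for a flattened 30x32 image."""
    hidden = layer_forward(hidden_w, image)
    outputs = layer_forward(output_w, hidden)
    return max(range(N_OUTPUTS), key=lambda i: outputs[i])


rng = random.Random(0)
hidden_w = make_layer(N_INPUTS, N_HIDDEN, rng)
output_w = make_layer(N_HIDDEN, N_OUTPUTS, rng)
image = [rng.random() for _ in range(N_INPUTS)]  # stand-in for a camera frame
direction = steer(image, hidden_w, output_w)
```

With untrained weights the chosen direction is meaningless. The sketch only shows how 960 inputs are funneled through 4 hidden units into 30 steering outputs.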
SLIDE 8 Software that Customizes to User
http://www.wisewire.com
SLIDE 9 Where Is this Headed?
Today: tip of the iceberg
  • First-generation algorithms: neural nets, decision trees, regression ...
  • Applied to well-formatted databases
  • Budding industry
Opportunity for tomorrow: enormous impact
  • Learn across full mixed-media data
  • Learn across multiple internal databases, plus the web and newsfeeds
  • Learn by active experimentation
  • Learn decisions rather than predictions
  • Cumulative, lifelong learning
  • Programming languages with learning embedded?
SLIDE 10 Relevant Disciplines
  • Artificial intelligence
  • Bayesian methods
  • Computational complexity theory
  • Control theory
  • Information theory
  • Philosophy
  • Psychology and neurobiology
  • Statistics
  • ...
SLIDE 11 What is the Learning Problem?
Learning = Improving with experience at some task
  • Improve over task T,
  • with respect to performance measure P,
  • based on experience E.
E.g., Learn to play checkers
  • T: Play checkers
  • P: % of games won in world tournament
  • E: opportunity to play against self
SLIDE 12 Learning to Play Checkers
  • T: Play checkers
  • P: Percent of games won in world tournament
  • What experience?
  • What exactly should be learned?
  • How shall it be represented?
  • What specific algorithm to learn it?
SLIDE 13 Type of Training Experience
  • Direct or indirect?
  • Teacher or not?
A problem: is training experience representative of performance goal?
SLIDE 14 Choose the Target Function
  • $\mathit{ChooseMove} : \mathit{Board} \rightarrow \mathit{Move}$ ??
  • $V : \mathit{Board} \rightarrow \Re$ ??
  • ...
SLIDE 15 Possible Definition for Target Function V
  • if b is a final board state that is won, then $V(b) = 100$
  • if b is a final board state that is lost, then $V(b) = -100$
  • if b is a final board state that is drawn, then $V(b) = 0$
  • if b is not a final state in the game, then $V(b) = V(b')$, where $b'$ is the best final board state that can be achieved starting from b and playing optimally until the end of the game.
This gives correct values, but is not operational
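The definition of V can be written down directly as a recursion over the game tree, which also shows why it is not operational: every non-final board requires searching to the end of the game. The sketch below makes assumptions the slide does not state. "Playing optimally" is taken to mean that both players play optimally (minimax), and `Board` is a hypothetical toy structure that lists its successors explicitly instead of generating checkers moves.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Values for final board states, from the slide.
FINAL_VALUES = {"won": 100, "lost": -100, "drawn": 0}


@dataclass
class Board:
    """Toy board: either final (outcome set) or a list of successor boards."""
    outcome: Optional[str] = None          # "won", "lost", "drawn", or None
    our_turn: bool = True                  # whose move it is at this board
    successors: List["Board"] = field(default_factory=list)


def V(b: Board) -> int:
    """Value of the best final state reachable from b under optimal play."""
    if b.outcome is not None:
        return FINAL_VALUES[b.outcome]
    values = [V(s) for s in b.successors]
    # Assumption: we maximize on our turn, the opponent minimizes on theirs.
    return max(values) if b.our_turn else min(values)


# Two candidate moves; the opponent then picks the reply that is worst for us.
root = Board(our_turn=True, successors=[
    Board(our_turn=False, successors=[Board("won"), Board("lost")]),
    Board(our_turn=False, successors=[Board("drawn"), Board("won")]),
])
print(V(root))  # 0: the second move guarantees at least a draw
```

On a real checkers board the recursion visits an astronomically large tree, which is the sense in which the definition gives correct values but cannot be computed in practice.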
SLIDE 16 Choose Representation for Target Function
  • collection of rules?
  • neural network?
  • polynomial function of board features?
  • ...
SLIDE 17 A Representation for Learned Function
$w_0 + w_1 \cdot bp(b) + w_2 \cdot rp(b) + w_3 \cdot bk(b) + w_4 \cdot rk(b) + w_5 \cdot bt(b) + w_6 \cdot rt(b)$
  • $bp(b)$: number of black pieces on board b
  • $rp(b)$: number of red pieces on b
  • $bk(b)$: number of black kings on b
  • $rk(b)$: number of red kings on b
  • $bt(b)$: number of red pieces threatened by black (i.e., which can be taken on black's next turn)
  • $rt(b)$: number of black pieces threatened by red
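The linear representation above amounts to a constant term plus a weighted sum of the six board features. A minimal sketch follows, assuming the features have already been extracted from the board. The function name `v_hat` and the example weights are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def v_hat(weights: Sequence[float], features: Sequence[float]) -> float:
    """Evaluate w0 + w1*bp + w2*rp + w3*bk + w4*rk + w5*bt + w6*rt.

    weights:  (w0, w1, ..., w6)
    features: (bp, rp, bk, rk, bt, rt) for one board b
    """
    w0, rest = weights[0], weights[1:]
    if len(rest) != len(features):
        raise ValueError("need exactly one weight per feature, plus w0")
    return w0 + sum(w * f for w, f in zip(rest, features))


# Hypothetical board: 12 black and 12 red pieces, no kings,
# one red piece threatened by black.
features = (12, 12, 0, 0, 1, 0)
# Hypothetical weights, not learned values.
weights = (0.0, 1.0, -1.0, 3.0, -3.0, 0.5, -0.5)
print(v_hat(weights, features))  # 0.5
```

Choosing this representation reduces the learning problem to finding the seven numbers w0 through w6.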
SLIDE 18 Obtaining Training Examples
  • $V(b)$: the true target function
  • $\hat{V}(b)$: the learned function
  • $V_{train}(b)$: the training value
One rule for estimating training values:
  • $V_{train}(b) \leftarrow \hat{V}(\mathit{Successor}(b))$
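The estimation rule can be applied to a recorded game to turn it into training examples. The sketch below rests on assumptions not spelled out on the slide: the game is given as a list of board feature vectors at the learner's turns, Successor(b) is simply the next entry in that list, and the last board takes the final-state value from Slide 15 (100, -100, or 0). The name `training_values` is hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Features = Sequence[float]


def training_values(trace: List[Features],
                    final_value: float,
                    v_hat: Callable[[Features], float]
                    ) -> List[Tuple[Features, float]]:
    """Pair each board in a game trace with an estimated training value."""
    examples = []
    for i, b in enumerate(trace):
        if i + 1 < len(trace):
            # V_train(b) <- V_hat(Successor(b))
            target = v_hat(trace[i + 1])
        else:
            # Assumption: the final board uses the known game outcome.
            target = final_value
        examples.append((b, target))
    return examples


# Toy usage: two features per board and a stand-in learned function.
toy_v_hat = lambda f: f[0] - f[1]
trace = [(3, 3), (3, 2), (3, 0)]
examples = training_values(trace, final_value=100, v_hat=toy_v_hat)
print([target for _, target in examples])  # [1, 3, 100]
```

Each training value comes from the current estimate of a later board, so the examples change as the learned function changes.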
SLIDE 19 Choose Weight Tuning Rule
LMS Weight update rule:
Do repeatedly:
  • Select a training example b at random
    1. Compute $error(b)$: $error(b) = V_{train}(b) - \hat{V}(b)$
    2. For each board feature $f_i$, update weight $w_i$: $w_i \leftarrow w_i + c \cdot f_i \cdot error(b)$
c is some small constant, say 0.1, to moderate the rate of learning
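The LMS rule translates almost line for line into code. This is a sketch under stated assumptions: the learned function is the linear form from Slide 17, the constant weight w0 is paired with a fixed pseudo-feature f0 = 1 (the slide does not say how w0 is updated), and the names `lms_update` and `lms_train` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def lms_update(weights: Sequence[float], features: Sequence[float],
               v_train: float, c: float = 0.1) -> List[float]:
    """One LMS step for a single training example (features, v_train)."""
    x = [1.0, *features]  # f0 = 1 goes with w0 (assumption)
    v_hat = sum(w * f for w, f in zip(weights, x))
    error = v_train - v_hat                      # error(b) = V_train(b) - V_hat(b)
    return [w + c * f * error for w, f in zip(weights, x)]  # w_i <- w_i + c*f_i*error(b)


def lms_train(weights: Sequence[float],
              examples: Sequence[Tuple[Sequence[float], float]],
              steps: int, c: float = 0.1, seed: int = 0) -> List[float]:
    """Repeatedly pick a training example at random and apply one update."""
    rng = random.Random(seed)
    w = list(weights)
    for _ in range(steps):
        features, v_train = rng.choice(examples)
        w = lms_update(w, features, v_train, c)
    return w


# One step from zero weights: error is 100, so each weight moves by 10 * f_i.
w = lms_update([0.0] * 7, (3, 0, 1, 0, 0, 0), v_train=100.0)
```

For a single example, each step multiplies the error by 1 - c times the sum of squared feature values, so with raw piece counts a value of c as large as 0.1 can overshoot. A smaller c or rescaled features may be needed in practice.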
SLIDE 20 Design Choices
Determine Type of Training Experience
  • Games against experts
  • Games against self
  • Table of correct moves
  • ...
Determine Target Function
  • Board → move
  • Board → value
  • ...
Determine Representation of Learned Function
  • Polynomial
  • Linear function of six features
  • Artificial neural network
  • ...
Determine Learning Algorithm
  • Gradient descent
  • Linear programming
  • ...
Completed Design
SLIDE 21 Some Issues in Machine Learning
  • What algorithms can approximate functions well (and when)?
  • How does number of training examples influence accuracy?
  • How does complexity of hypothesis representation impact it?
  • How does noisy data influence accuracy?
  • What are the theoretical limits of learnability?
  • How can prior knowledge of learner help?
  • What clues can we get from biological learning systems?
  • How can systems alter their own representations?