Hidden Markov Models + Midterm Exam 2 Review
10-601 Introduction to Machine Learning
Matt Gormley, Lecture 19, Mar. 27, 2020
Machine Learning Department, School of Computer Science, Carnegie Mellon University
Reminders:
- Homework: Out Fri, Mar 20; Due Fri, Mar 27 at 11:59pm
- Midterm Exam 2: Thu, Apr 2, evening exam; details announced on Piazza
http://poll.mlcourse.org
Time: Evening Exam, Thu, Apr. 2 at 6:00pm - 9:00pm
Location: We will contact you with additional details about how to join the appropriate Zoom meeting.
Seats: There will be assigned Zoom rooms. Please arrive online early.
Please watch Piazza carefully for announcements.
Covered material: Lecture 9 - Lecture 18 (95%), Lecture 1 - 8 (5%)
Format of questions:
No electronic devices. You are allowed to bring one 8½ x 11 sheet of notes (front and back).
Solve the easy problems first (e.g. multiple choice before derivations):
- if a problem seems extremely complicated, you're likely missing something
Don't leave any answer blank!
If you make an assumption, write it down.
If you look at a question and don't know the answer:
- we probably haven't told you the answer, but we've told you enough to work it out
- imagine arguing for some answer and see if you like it
- Probability, Linear Algebra, Geometry, Calculus, Optimization
- Overfitting, Experimental Design
- Decision Tree, KNN, Perceptron
- Linear Regression
- Classification: Binary Logistic Regression, Multinomial Logistic Regression
- Important Concepts: Stochastic Gradient Descent, Regularization, Feature Engineering
- Feature Learning: Neural Networks, Basic NN Architectures, Backpropagation
- Learning Theory: PAC Learning
- Generative Models: Generative vs. Discriminative, MLE/MAP, Naïve Bayes
[courtesy of Roni Rosenfeld]
Example sequence (tunnel example):
travel times:  2m 3m 18m 9m 27m
tunnel states: O  S  S   O  C
We could treat each (tunnel state, travel time) pair as independent. This corresponds to a Naïve Bayes model with a single feature (travel time).
Prior P(Y):        O .8   S .1   C .1

Emission P(X | Y):
         1min  2min  3min  …
    O    .1    .2    .3
    S    .01   .02   .03
    C    0
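The Naïve Bayes model above can be sketched in a few lines: the joint probability of a sequence of (state, travel time) pairs is just the product of per-pair probabilities. The tables are taken from the slide; the dictionary layout and function name are my own choices for illustration.

```python
# Naive Bayes over (tunnel state, travel time) pairs: every pair is
# treated as independent of every other pair.
prior = {"O": 0.8, "S": 0.1, "C": 0.1}          # P(state)
emission = {                                     # P(travel time | state)
    "O": {1: 0.1, 2: 0.2, 3: 0.3},
    "S": {1: 0.01, 2: 0.02, 3: 0.03},
    "C": {1: 0.0},
}

def nb_joint(states, times):
    """P(states, times) = prod_t P(y_t) * P(x_t | y_t)."""
    p = 1.0
    for y, x in zip(states, times):
        p *= prior[y] * emission[y].get(x, 0.0)
    return p

print(nb_joint(["O", "S"], [2, 3]))  # 0.8*0.2 * 0.1*0.03 = 0.00048
```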
A Hidden Markov Model (HMM) provides a joint distribution over the tunnel states / travel times, with an assumption of dependence between adjacent tunnel states.
Transition P(Y_t | Y_{t-1}):
         O    S    C
    O    .9   .08  .02
    S    .2   .7   .1
    C    .9   0    .1

Emission P(X_t | Y_t):
         1min  2min  3min  …
    O    .1    .2    .3
    S    .01   .02   .03
    C    0

Initial P(Y_1):    O .8   S .1   C .1
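The HMM joint distribution chains the tables together: an initial probability, then one transition and one emission factor per step. A minimal sketch, using the tables from the slide (the code structure itself is my own illustration):

```python
# HMM joint: P(y_1..T, x_1..T)
#   = P(y_1) * P(x_1|y_1) * prod_{t>1} P(y_t|y_{t-1}) * P(x_t|y_t)
initial = {"O": 0.8, "S": 0.1, "C": 0.1}
transition = {  # P(y_t | y_{t-1})
    "O": {"O": 0.9, "S": 0.08, "C": 0.02},
    "S": {"O": 0.2, "S": 0.7, "C": 0.1},
    "C": {"O": 0.9, "S": 0.0, "C": 0.1},
}
emission = {  # P(x_t | y_t), travel time in minutes
    "O": {1: 0.1, 2: 0.2, 3: 0.3},
    "S": {1: 0.01, 2: 0.02, 3: 0.03},
    "C": {1: 0.0},
}

def hmm_joint(states, times):
    p = initial[states[0]] * emission[states[0]].get(times[0], 0.0)
    for prev, y, x in zip(states, states[1:], times[1:]):
        p *= transition[prev][y] * emission[y].get(x, 0.0)
    return p

print(hmm_joint(["O", "S"], [2, 3]))  # 0.8*0.2 * 0.08*0.03 = 0.000384
```

Note the contrast with the Naïve Bayes version: the second state is now scored by P(S | O) = .08 rather than the prior P(S) = .1.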
[Graphical model: hidden states Y1 … Y5 with emissions X1 … X5]
[Graphical model: initial state Y0 prepended to Y1 … Y5, each Yt emitting Xt]
1. Assume data was generated i.i.d. from some model (i.e. write the generative story):
   x(i) ~ p(x | θ)
2. Write the log-likelihood:
   ℓ(θ) = log p(x(1) | θ) + … + log p(x(N) | θ)
3. Compute partial derivatives (i.e. the gradient):
   ∂ℓ(θ)/∂θ1 = …, ∂ℓ(θ)/∂θ2 = …, …, ∂ℓ(θ)/∂θM = …
4. Set derivatives to zero and solve for θ:
   ∂ℓ(θ)/∂θm = 0 for all m ∈ {1, …, M}
   θ^MLE = solution to system of M equations and M variables
5. Compute the second derivative and check that ℓ(θ) is concave down at θ^MLE
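The recipe above can be walked through on the simplest possible model, a Bernoulli coin; this worked example is my own illustration, not from the slides. Step 2 gives ℓ(θ) = (#heads) log θ + (#tails) log(1 − θ); setting the derivative to zero (step 4) gives θ^MLE = (#heads)/N, and a grid search confirms that is where the log-likelihood peaks.

```python
import math

data = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]  # N = 10, 7 heads

def log_lik(theta):
    # step 2: sum of per-example log probabilities
    return sum(math.log(theta if x == 1 else 1 - theta) for x in data)

# steps 3-4 in closed form: d/dtheta = 0  =>  theta = (#heads) / N
theta_mle = sum(data) / len(data)

# sanity check: the closed-form solution beats every point on a fine grid
grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=log_lik)
print(theta_mle, best)  # 0.7 0.7
```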
[Graphical model Y1 … Y5, X1 … X5, annotated with the transition, emission, and initial probability tables above]
Learning an HMM decomposes into solving two (independent) Mixture Models.
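When the state sequences are fully observed (the supervised case), that decomposition is visible in code: the MLE reduces to two separate count-and-normalize problems, one for transitions and one for emissions. A sketch under that assumption, with a toy dataset of my own:

```python
from collections import Counter, defaultdict

# toy labeled data: (state sequence, observation sequence) pairs
data = [
    (["O", "S", "S"], [2, 3, 3]),
    (["O", "O", "C"], [1, 2, 1]),
]

trans_counts = defaultdict(Counter)
emit_counts = defaultdict(Counter)
for states, obs in data:
    for prev, curr in zip(states, states[1:]):
        trans_counts[prev][curr] += 1   # count y_{t-1} -> y_t
    for y, x in zip(states, obs):
        emit_counts[y][x] += 1          # count y_t -> x_t

def normalize(counter):
    total = sum(counter.values())
    return {k: v / total for k, v in counter.items()}

A = {y: normalize(c) for y, c in trans_counts.items()}  # transition MLE
B = {y: normalize(c) for y, c in emit_counts.items()}   # emission MLE
print(A["O"])  # O was followed once each by S, O, C: each gets 1/3
```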
[Diagrams: transition factor Yt → Yt+1 and emission factor Yt → Xt]
[Graphical model: initial state Y0 prepended to Y1 … Y5, each Yt emitting Xt]
For notational convenience, we fold the initial probabilities C into the transition matrix B by treating Y0 as a fixed <START> state.
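A minimal sketch of that folding, assuming the tables from the earlier slide: the initial distribution simply becomes the row of the transition matrix indexed by the special <START> state.

```python
initial = {"O": 0.8, "S": 0.1, "C": 0.1}
transition = {
    "O": {"O": 0.9, "S": 0.08, "C": 0.02},
    "S": {"O": 0.2, "S": 0.7, "C": 0.1},
    "C": {"O": 0.9, "S": 0.0, "C": 0.1},
}

# Fold: P(Y1 = y) becomes P(Y1 = y | Y0 = <START>)
transition["<START>"] = dict(initial)
print(transition["<START>"]["O"])  # 0.8
```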
In the unsupervised setting, we can maximize the likelihood of the data D = {x(1), x(2), …, x(N)} where we don't observe any y's: we seek the parameters that maximize the marginal likelihood.
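The marginal likelihood sums the joint over every possible hidden state sequence: p(x) = Σ_y p(x, y). A brute-force sketch of that sum, reusing the tables from the slides (the forward algorithm computes the same quantity efficiently; this enumeration is exponential in the sequence length and is only for illustration):

```python
from itertools import product

states = ["O", "S", "C"]
initial = {"O": 0.8, "S": 0.1, "C": 0.1}
transition = {
    "O": {"O": 0.9, "S": 0.08, "C": 0.02},
    "S": {"O": 0.2, "S": 0.7, "C": 0.1},
    "C": {"O": 0.9, "S": 0.0, "C": 0.1},
}
emission = {
    "O": {1: 0.1, 2: 0.2, 3: 0.3},
    "S": {1: 0.01, 2: 0.02, 3: 0.03},
    "C": {1: 0.0},
}

def joint(ys, xs):
    p = initial[ys[0]] * emission[ys[0]].get(xs[0], 0.0)
    for prev, y, x in zip(ys, ys[1:], xs[1:]):
        p *= transition[prev][y] * emission[y].get(x, 0.0)
    return p

def marginal(xs):
    # sum the joint over all |states|^T hidden sequences
    return sum(joint(ys, xs) for ys in product(states, repeat=len(xs)))

print(marginal([2, 3]))
```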
[Diagrams: the HMM unrolled with a <START> state preceding Y1 … Y5, each Yt emitting Xt]