Hidden Markov Models + Midterm Exam 2 Review (Matt Gormley)



SLIDE 1

Hidden Markov Models + Midterm Exam 2 Review

10-601 Introduction to Machine Learning

Matt Gormley, Lecture 19, Mar. 27, 2020

Machine Learning Department, School of Computer Science, Carnegie Mellon University

SLIDE 2

Reminders

Homework 6: Learning Theory / Generative Models

Out: Fri, Mar 20. Due: Fri, Mar 27 at 11:59pm

Practice Problems for Exam 2

Out: Fri, Mar 20

Midterm Exam 2

Thu, Apr 2: evening exam, details announced on Piazza

Today's In-Class Poll

http://poll.mlcourse.org

SLIDE 3

MIDTERM EXAM LOGISTICS

SLIDE 4

Midterm Exam

  • Time/Location

Time: Evening exam, Thu, Apr. 2 at 6:00pm to 9:00pm
Location: We will contact you with additional details about how to join the appropriate Zoom meeting.
Seats: There will be assigned Zoom rooms. Please arrive online early.
Please watch Piazza carefully for announcements.

  • Logistics

Covered material: Lecture 9 to Lecture 18 (95%), Lectures 1 to 8 (5%)
Format of questions:

  • Multiple choice
  • True/False (with justification)
  • Derivations
  • Short answers
  • Interpreting figures
  • Implementing algorithms on paper

No electronic devices. You are allowed to bring one 8½ x 11 sheet of notes (front and back).

SLIDE 5

Midterm Exam

How to Prepare

  • Attend the midterm review lecture (right now!)
  • Review prior year's exam and solutions (we'll post them)
  • Review this year's homework problems
  • Consider whether you have achieved the "learning objectives" for each lecture/section

SLIDE 6

Midterm Exam

Advice (for during the exam)

  • Solve the easy problems first (e.g. multiple choice before derivations)
  • If a problem seems extremely complicated, you're likely missing something
  • Don't leave any answer blank!
  • If you make an assumption, write it down
  • If you look at a question and don't know the answer: we probably haven't told you the answer, but we've told you enough to work it out; imagine arguing for some answer and see if you like it

SLIDE 7

Topics for Midterm 1

Foundations

  • Probability, Linear Algebra, Geometry, Calculus, Optimization

Important Concepts

  • Overfitting, Experimental Design

Classification

  • Decision Tree, KNN, Perceptron

Regression

  • Linear Regression

SLIDE 8

Topics for Midterm 2

Classification

  • Binary Logistic Regression, Multinomial Logistic Regression

Important Concepts

  • Stochastic Gradient Descent, Regularization, Feature Engineering

Feature Learning

  • Neural Networks, Basic NN Architectures, Backpropagation

Learning Theory

  • PAC Learning

Generative Models

  • Generative vs. Discriminative, MLE/MAP, Naïve Bayes

SLIDE 9

SAMPLE QUESTIONS

SLIDES 10–18

Sample Questions (the question images were not captured in this export)

SLIDE 19

HIDDEN MARKOV MODEL (HMM)

SLIDE 20

HMM Outline

  • Motivation
  • Time Series Data
  • Hidden Markov Model (HMM)
  • Example: Squirrel Hill Tunnel Closures [courtesy of Roni Rosenfeld]
  • Background: Markov Models
  • From Mixture Model to HMM
  • History of HMMs
  • Higher-order HMMs
  • Training HMMs
  • (Supervised) Likelihood for HMM
  • Maximum Likelihood Estimation (MLE) for HMM
  • EM for HMM (aka. Baum-Welch algorithm)
  • Forward-Backward Algorithm
  • Three Inference Problems for HMM
  • Great Ideas in ML: Message Passing
  • Example: Forward-Backward on a 3-word Sentence
  • Derivation of Forward Algorithm
  • Forward-Backward Algorithm
  • Viterbi algorithm

SLIDE 21

Markov Models

Whiteboard

  • Example: Tunnel Closures [courtesy of Roni Rosenfeld]
  • First-order Markov assumption
  • Conditional independence assumptions
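The first-order Markov assumption says each state depends only on the previous one, so a state sequence's probability factors into an initial probability times a product of one-step transitions. A minimal sketch using the tunnel states {O, S, C}; the initial and transition tables below are the ones that appear on later slides:

```python
# First-order Markov chain over tunnel states:
# p(y_1, ..., y_T) = p(y_1) * prod_{t>=2} p(y_t | y_{t-1})
init = {"O": 0.8, "S": 0.1, "C": 0.1}
trans = {
    "O": {"O": 0.9, "S": 0.08, "C": 0.02},
    "S": {"O": 0.2, "S": 0.7, "C": 0.1},
    "C": {"O": 0.9, "S": 0.0, "C": 0.1},
}

def chain_prob(states):
    """Probability of a state sequence under the first-order Markov assumption."""
    p = init[states[0]]
    for prev, curr in zip(states, states[1:]):
        p *= trans[prev][curr]
    return p
```

For example, `chain_prob(["O", "S", "S"])` multiplies p(O) = .8, p(S|O) = .08, and p(S|S) = .7.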

SLIDES 22–27

Whiteboard/figure slides (images not captured in this export); slide 26 is titled "Totoro's Tunnel".

SLIDE 28

Mixture Model for Time Series Data

We could treat each (tunnel state, travel time) pair as independent. This corresponds to a Naïve Bayes model with a single feature (travel time).

Observed travel times: 2m, 3m, 18m, 9m, 27m
Hidden states: O, S, S, O, C

p(O, S, S, O, C, 2m, 3m, 18m, 9m, 27m) = (.8 * .2 * .1 * .03 * …)

Prior p(Y):     Emission p(X|Y):
  O  .8              1min  2min  3min  …
  S  .1           O   .1    .2    .3
  C  .1           S   .01   .02   .03
                  C   0

SLIDE 29

Hidden Markov Model

A Hidden Markov Model (HMM) provides a joint distribution over the tunnel states / travel times with an assumption of dependence between adjacent tunnel states.

Observed travel times: 2m, 3m, 18m, 9m, 27m
Hidden states: O, S, S, O, C

p(O, S, S, O, C, 2m, 3m, 18m, 9m, 27m) = (.8 * .08 * .2 * .7 * .03 * …)

Initial p(Y1):   Transition p(Yt | Yt-1):   Emission p(Xt | Yt):
  O  .8              O    S    C                 1min  2min  3min  …
  S  .1           O  .9   .08  .02            O   .1    .2    .3
  C  .1           S  .2   .7   .1             S   .01   .02   .03
                  C  .9   0    .1             C   0
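The HMM joint replaces the independent priors with one-step transitions: p(y, x) = p(y1) p(x1|y1) times, for each later step, p(yt|yt-1) p(xt|yt). A sketch with the slide's tables (again, travel times not listed on the slide default to 0 here):

```python
# HMM joint probability:
# p(y, x) = p(y_1) p(x_1|y_1) * prod_{t>=2} p(y_t|y_{t-1}) p(x_t|y_t)
init = {"O": 0.8, "S": 0.1, "C": 0.1}
trans = {
    "O": {"O": 0.9, "S": 0.08, "C": 0.02},
    "S": {"O": 0.2, "S": 0.7, "C": 0.1},
    "C": {"O": 0.9, "S": 0.0, "C": 0.1},
}
emit = {
    "O": {"1min": 0.1, "2min": 0.2, "3min": 0.3},
    "S": {"1min": 0.01, "2min": 0.02, "3min": 0.03},
    "C": {"1min": 0.0},
}

def hmm_joint(states, times):
    """Joint probability with dependence between adjacent hidden states."""
    p = init[states[0]] * emit[states[0]].get(times[0], 0.0)
    for t in range(1, len(states)):
        p *= trans[states[t - 1]][states[t]] * emit[states[t]].get(times[t], 0.0)
    return p
```

Comparing with the previous slide: the factor .1 (prior of S) has become .08 (transition O to S), which is exactly the dependence the HMM adds.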

SLIDE 30

From Mixture Model to HMM

(Two graphical models over hidden states Y1..Y5 and observations X1..X5.)

"Naïve Bayes": each Yt independently emits its Xt, with no edges between the Y's.
HMM: the Y's form a chain Y1 → Y2 → … → Y5, and each Yt emits its Xt.

SLIDE 31

From Mixture Model to HMM

(Same two graphical models as the previous slide, with an extra start state Y0 added at the front of the HMM chain.)

SLIDE 32

SUPERVISED LEARNING FOR HMMS

SLIDE 33

Recipe for Closed-form MLE

1. Assume data was generated i.i.d. from some model (i.e. write the generative story):
   x(i) ~ p(x|θ)
2. Write the log-likelihood:
   ℓ(θ) = log p(x(1)|θ) + … + log p(x(N)|θ)
3. Compute partial derivatives (i.e. the gradient):
   ∂ℓ(θ)/∂θ1 = …
   ∂ℓ(θ)/∂θ2 = …
   …
   ∂ℓ(θ)/∂θM = …
4. Set the derivatives to zero and solve for θ:
   ∂ℓ(θ)/∂θm = 0 for all m ∈ {1, …, M}
   θMLE = solution to the system of M equations in M variables
5. Compute the second derivative and check that ℓ(θ) is concave down at θMLE

SLIDE 34

MLE of Categorical Distribution
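The slide's derivation is not captured in this export, but the well-known closed-form answer the recipe yields for a categorical distribution is normalized counts, p̂(k) = Nk / N. A minimal sketch:

```python
# MLE for a categorical distribution: the closed-form solution from the
# recipe above is just each outcome's count divided by the total count.
from collections import Counter

def categorical_mle(samples):
    """Return the maximum likelihood estimate p_hat(k) = N_k / N."""
    counts = Counter(samples)
    n = len(samples)
    return {k: c / n for k, c in counts.items()}
```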

SLIDE 35

Hidden Markov Model

HMM Parameters:

Initial p(Y1):   Transition p(Yt | Yt-1):   Emission p(Xt | Yt):
  O  .8              O    S    C                 1min  2min  3min  …
  S  .1           O  .9   .08  .02            O   .1    .2    .3
  C  .1           S  .2   .7   .1             S   .01   .02   .03
                  C  .9   0    .1             C   0

(Graphical model: hidden states Y1..Y5, observations X1..X5.)

SLIDE 36

Training HMMs

Whiteboard

  • (Supervised) Likelihood for an HMM
  • Maximum Likelihood Estimation (MLE) for HMM

SLIDE 37

Supervised Learning for HMMs

Learning an HMM decomposes into solving two (independent) mixture models: one over transition pairs (Yt, Yt+1) and one over emission pairs (Yt, Xt).
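The decomposition above means supervised HMM training is just two count-and-normalize problems: one categorical MLE per previous state (transitions) and one per current state (emissions). A hedged sketch of that idea:

```python
# Supervised MLE for an HMM: with fully observed state sequences, the
# transition and emission distributions are each estimated by normalized
# counts, exactly like two independent mixture/categorical models.
from collections import defaultdict

def hmm_mle(state_seqs, obs_seqs):
    """Return (transition, emission) tables estimated from labeled sequences."""
    trans_counts = defaultdict(lambda: defaultdict(int))
    emit_counts = defaultdict(lambda: defaultdict(int))
    for states, obs in zip(state_seqs, obs_seqs):
        for prev, curr in zip(states, states[1:]):
            trans_counts[prev][curr] += 1          # count (Y_t, Y_t+1) pairs
        for y, x in zip(states, obs):
            emit_counts[y][x] += 1                 # count (Y_t, X_t) pairs

    def normalize(table):
        return {y: {k: v / sum(row.values()) for k, v in row.items()}
                for y, row in table.items()}

    return normalize(trans_counts), normalize(emit_counts)
```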

SLIDE 38

Hidden Markov Model

HMM Parameters:
Assumption:
Generative Story:

(Graphical model: start state Y0, then hidden states Y1..Y5 emitting X1..X5.)

For notational convenience, we fold the initial probabilities C into the transition matrix B by our assumption.
SLIDE 39

Hidden Markov Model

Joint Distribution (with the initial probabilities folded into the transitions via Y0):

p(x, y) = ∏ t=1..T  p(yt | yt-1) p(xt | yt)

(Graphical model: start state Y0, then hidden states Y1..Y5 emitting X1..X5.)

SLIDE 40

Supervised Learning for HMMs

Learning an HMM decomposes into solving two (independent) mixture models: one over transition pairs (Yt, Yt+1) and one over emission pairs (Yt, Xt).

SLIDE 41

Unsupervised Learning for HMMs

  • Unlike discriminative models p(y|x), generative models p(x,y) can maximize the likelihood of the data D = {x(1), x(2), …, x(N)} where we don't observe any y's.
  • This unsupervised learning setting can be achieved by finding parameters that maximize the marginal likelihood.
  • We optimize using the Expectation-Maximization algorithm.
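The marginal likelihood here is p(x) = Σy p(x, y), and summing over all state sequences naively is exponential; the forward algorithm from the outline computes it in O(T·K²). A sketch assuming tables shaped like the tunnel example's (init, trans, emit dictionaries keyed by state):

```python
# Forward algorithm for the marginal likelihood p(x) = sum_y p(x, y).
# alpha[t][y] = p(x_1..x_t, Y_t = y); observations missing from the
# emission table are treated as probability 0 (an assumption of this sketch).
def forward(obs, init, trans, emit):
    states = list(init)
    alpha = [{y: init[y] * emit[y].get(obs[0], 0.0) for y in states}]
    for x in obs[1:]:
        prev = alpha[-1]
        alpha.append({
            y: emit[y].get(x, 0.0) * sum(prev[yp] * trans[yp][y] for yp in states)
            for y in states
        })
    return sum(alpha[-1].values())  # p(x_1..x_T)
```

This is the quantity EM (Baum-Welch) increases at each iteration when no y's are observed.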

SLIDE 42

  • Slide from William Cohen
SLIDE 43

Higher-order HMMs

  • 1st-order HMM (i.e. bigram HMM)
  • 2nd-order HMM (i.e. trigram HMM)
  • 3rd-order HMM

(Three chain diagrams over hidden states Y1..Y5 and observations X1..X5, each beginning from <START>.)

SLIDE 44

Higher-order HMMs

  • 1st-order HMM (i.e. bigram HMM)
  • 2nd-order HMM (i.e. trigram HMM)
  • 3rd-order HMM

(The same three chain diagrams beginning from <START>, now labeled: hidden states y, observations x.)