Probabilistic Modeling / Joint Distribution Model Haluk Madencioglu

Elements of Probability Theory: Introduction
Concerned with the analysis of random phenomena
Originated from gambling and games
Uses ideas of counting, combinatorics and measure
Uses mathematical abstractions of non-deterministic events

Continuous probability theory deals with events that occur in a continuous sample space
Discrete probability deals with events that occur in countable sample spaces
Event: a set of outcomes of an experiment, i.e. a subset of the sample space
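
Nonnegativity: 0 ≤ P(E) ≤ 1
Additivity: for mutually exclusive events $E_1, E_2, \dots, E_n$: $P(E_1 \cup E_2 \cup \dots \cup E_n) = \sum_{i=1}^{n} P(E_i)$
Normalization (unit measure): P(Ω) = 1, P(∅) = 0
Some consequences:
  P(Ω \ E) = 1 − P(E)
  P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
  P(A \ B) = P(A) − P(B) if B ⊆ A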

Bayes rule: P(A|B) = P(A,B) / P(B)
  or equivalently: P(A|B) = P(B|A) · P(A) / P(B)
Independence condition: P(A,B) = P(A) · P(B)
Mutually exclusive events: P(A,B) = 0
Mutually exclusive events: P(A ∪ B) = P(A) + P(B), or P(A \ B) = P(A)
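
The rules above can be checked by brute-force counting on a small sample space. The following is a minimal sketch, assuming one roll of a fair six-sided die; the events A, B, C, D and the helper names `prob` and `conditional` are illustrative choices, not part of the slides.

```python
from fractions import Fraction

# One roll of a fair six-sided die (illustrative sample space).
OMEGA = frozenset(range(1, 7))

def prob(event):
    """P(E) for an event E given as a set of equally likely outcomes."""
    return Fraction(len(OMEGA & set(event)), len(OMEGA))

def conditional(a, b):
    """P(A|B) = P(A,B) / P(B); undefined when P(B) = 0."""
    if prob(b) == 0:
        raise ValueError("P(B) must be positive")
    return prob(set(a) & set(b)) / prob(b)

A = {2, 4, 6}   # "the roll is even"
B = {4, 5, 6}   # "the roll is greater than 3"

p_a_given_b = conditional(A, B)
# Same value through the second form: P(B|A) * P(A) / P(B)
p_via_bayes = conditional(B, A) * prob(A) / prob(B)

# A and B are independent only if P(A,B) = P(A) * P(B)
independent = prob(A & B) == prob(A) * prob(B)

# {1} and {2} are mutually exclusive, so their probabilities add
C, D = {1}, {2}
exclusive_sum_holds = prob(C | D) == prob(C) + prob(D)
```

Exact fractions are used so that the two forms of the rule can be compared with `==` rather than with a floating-point tolerance.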

A random variable is a variable: a function mapping the sample space of a random experiment to values
Values can be discrete or continuous
Each outcome, as a value (or a range), is assigned a probability
Discrete example: fair coin toss, X = {1 if heads, 0 if tails}
Or fair dice roll: X = {"the number shown on the dice"}

Continuous example: spinner
Outcome can be any real number in [0, 2π)
Any specific value has zero probability
So we use ranges instead of single points
E.g. having a value in [0, π/2] has probability 1/4
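
In the case of discrete random variables we use a probability mass function: $p_X(x) = P(X = x)$
Notice the use of uppercase for the random variable and lowercase for its value
Cumulative distribution function (CDF): $F_X(x) = P(X \le x)$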

In the case of continuous variables, we use a probability density function $f_X$: $P(a \le X \le b) = \int_a^b f_X(x)\,dx$
So that the CDF becomes: $F_X(x) = \int_{-\infty}^{x} f_X(u)\,du$

Discrete uniform distribution
Binomial distribution
  Special case: n = 1 -> Bernoulli distribution
Poisson distribution: n events occur with a known average rate, independently of one another
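
The formulas for these distributions did not survive on the slides, so the sketch below uses the standard textbook probability mass functions. The function names and the parameter `lam` (the average rate) are assumptions made for illustration.

```python
import math

def uniform_pmf(k, n):
    """Discrete uniform on {1, ..., n}: every value has probability 1/n."""
    return 1 / n if 1 <= k <= n else 0.0

def binomial_pmf(k, n, p):
    """Probability of k successes in n independent trials with success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def bernoulli_pmf(k, p):
    """Special case of the binomial with n = 1 (k is 0 or 1)."""
    return binomial_pmf(k, 1, p)

def poisson_pmf(k, lam):
    """Probability of k events when events occur at average rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Two heads in four tosses of a fair coin.
two_heads_in_four = binomial_pmf(2, 4, 0.5)
```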

Expected value: a probability-weighted average of the possible values
Variance: expected value of the square of the deviation from the mean
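
As a small worked example of both definitions, the sketch below computes the expected value and variance of a fair die roll. It assumes the distribution is given as a dictionary from value to probability; the names `expected_value` and `variance` are illustrative.

```python
from fractions import Fraction

def expected_value(pmf):
    """Probability-weighted average: sum of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Expected squared deviation from the mean: sum of (x - mean)^2 * P(X = x)."""
    mean = expected_value(pmf)
    return sum((x - mean) ** 2 * p for x, p in pmf.items())

# Fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
die_mean = expected_value(die)
die_variance = variance(die)
```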

More than one random variable
On the same probability space (universe)
Events defined in terms of all variables
Called a multivariate distribution
Called bivariate if two variables are involved
Remembering Bayes rule, conditional distribution: $P(X = x \mid Y = y) = P(X = x, Y = y) / P(Y = y)$
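
Similar to probabilities, if variables are independent: $P(X = x, Y = y) = P(X = x)\,P(Y = y)$
Continuous distribution case: $f_{X,Y}(x, y) = f_X(x)\,f_Y(y)$
Marginal distributions: $P(X = x) = \sum_y P(X = x, Y = y)$
Reduces to a simple product summation if independent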
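
In general, a set of n random variables: $X_1, X_2, \dots, X_n$
With possible outcomes for each variable: $\{a_1, a_2, \dots, a_m\}$
A configuration is a vector $\mathbf{x} = (x_1, x_2, \dots, x_n)$ where each value is assigned to the corresponding variable
(CSCI 6509 Notes, Fall 2009, Faculty of Computer Science, Dalhousie University)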
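
In modeling we assume a sequence of configurations: $\mathbf{x}^{(1)}, \dots, \mathbf{x}^{(t)}$
Each $\mathbf{x}^{(i)}$ is a configuration, and the $x_{ij}$ are values from a finite set:
  $\mathbf{x}^{(1)} = (x_{11}, x_{12}, \dots, x_{1n})$
  $\mathbf{x}^{(2)} = (x_{21}, x_{22}, \dots, x_{2n})$
  ...
  $\mathbf{x}^{(t)} = (x_{t1}, x_{t2}, \dots, x_{tn})$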

NLP uses probabilistic modeling as a framework for solving computational problems
Computational tasks:
  Representation of models
  Simulation: generating random configurations
  Evaluation: computing the probability of a complete configuration
  Marginalization: computing the probability of a partial configuration
  Conditioning: computing the conditional probability of a completion given a partial observation
  Completion: finding the most probable completion of a partial observation
  Learning: parameter estimation
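
A joint probability distribution $P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n)$ gives the probability of every complete configuration
In general it takes $m^n$ parameters (less one, because the probabilities must sum to 1)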
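
This can be captured in a lookup table that lists each configuration $\mathbf{x}^{(k)}$, $k = 1, \dots, V$, with its probability $p(\mathbf{x}^{(k)})$
So $P(X_1 = x^{(k)}_1, \dots, X_n = x^{(k)}_n) = p(\mathbf{x}^{(k)})$
Satisfying $\sum_{k=1}^{V} p(\mathbf{x}^{(k)}) = 1$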

Simulation: given the lookup table representation, compute the cumulative value of the configurations and select the $\mathbf{x}^{(k)}$ whose cumulative probability interval contains a given p value
Evaluation: evaluate the probability of a complete configuration $\mathbf{x} = (x_1, x_2, \dots, x_n)$
  From the lookup table: $P(X_1 = x_1, \dots, X_n = x_n) = p(x_1, x_2, \dots, x_n)$
Marginalization: the probability of an incomplete configuration $(x_1, \dots, x_k)$:
  $P(X_1 = x_1, \dots, X_k = x_k) = \sum_{y_{k+1}} \cdots \sum_{y_n} P(X_1 = x_1, \dots, X_k = x_k, X_{k+1} = y_{k+1}, \dots, X_n = y_n)$
  From the lookup table: $\sum_{y_{k+1}, \dots, y_n} p(x_1, \dots, x_k, y_{k+1}, \dots, y_n)$
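
The three lookup-table operations above can be sketched in a few lines. This is an illustrative sketch only: the table `TABLE` over two variables and its probabilities are made up, and a partial configuration is represented (by assumption) as a tuple with `None` in the unobserved positions.

```python
import random

# Hypothetical joint lookup table over two variables (X1, X2).
TABLE = {
    ("a", "c"): 0.1,
    ("a", "d"): 0.4,
    ("b", "c"): 0.3,
    ("b", "d"): 0.2,
}

def evaluate(config):
    """Evaluation: look up the probability of a complete configuration."""
    return TABLE.get(config, 0.0)

def marginalize(partial):
    """Marginalization: sum over all table rows that agree with the observed values."""
    return sum(
        p for config, p in TABLE.items()
        if all(v is None or v == c for v, c in zip(partial, config))
    )

def simulate(p):
    """Simulation: return the configuration whose cumulative interval contains p in [0, 1)."""
    cumulative = 0.0
    for config, prob in TABLE.items():
        cumulative += prob
        if p < cumulative:
            return config
    return config  # guard against rounding when p is very close to 1

sample = simulate(random.random())
```

Marginalization touches every row of the table, which is why its cost grows with the number of configurations.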

Completion: compute the conditional probability of a possible completion $(y_{k+1}, \dots, y_n)$ given an incomplete configuration $(x_1, \dots, x_k)$
Need to evaluate a complete configuration and then divide by a marginal sum:
  $P(y_{k+1}, \dots, y_n \mid x_1, \dots, x_k) = \dfrac{p(x_1, \dots, x_k, y_{k+1}, \dots, y_n)}{\sum_{z_{k+1}, \dots, z_n} p(x_1, \dots, x_k, z_{k+1}, \dots, z_n)}$
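
Conditioning and completion follow the same pattern: evaluate one row, divide by a marginal sum, and (for completion) pick the best row. The sketch below assumes a made-up two-variable table and represents a partial observation as a tuple with `None` for unobserved positions; all names are illustrative.

```python
# Hypothetical joint lookup table over two variables (X1, X2).
TABLE = {
    ("a", "c"): 0.1,
    ("a", "d"): 0.4,
    ("b", "c"): 0.3,
    ("b", "d"): 0.2,
}

def matches(partial, config):
    """True if the complete configuration agrees with every observed value."""
    return all(v is None or v == c for v, c in zip(partial, config))

def marginal(partial):
    """Probability of the partial observation (sum over matching rows)."""
    return sum(p for config, p in TABLE.items() if matches(partial, config))

def conditional(config, partial):
    """P(completion | partial): one table entry divided by the marginal sum."""
    denominator = marginal(partial)
    if denominator == 0:
        raise ValueError("partial observation has zero probability")
    if not matches(partial, config):
        return 0.0
    return TABLE.get(config, 0.0) / denominator

def complete(partial):
    """Most probable complete configuration consistent with the observation."""
    candidates = [config for config in TABLE if matches(partial, config)]
    return max(candidates, key=TABLE.get)
```

Because the denominator is the same for every candidate, `complete` can compare raw table entries without dividing.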

Spam detection: an arbitrary e-mail message is ...
  ... ignored), 'N' otherwise, and ...
Randomly select 100 messages and count how many times each configuration appears
The relative frequencies estimate the probability of any configuration. For example:
  P(Free = Y; Caps = Y; Spam = Y) = 0.2
  P(Free = Y; Caps = N; Spam = N) = 0.0
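
Learning the table from counts is a matter of dividing each count by the sample size. The count table itself is not present in this text, so the counts below are hypothetical: they are chosen only so that they total 100 and agree with the two probabilities quoted above. Configurations are ordered (Free, Caps, Spam).

```python
# Hypothetical counts for 100 messages, keyed by (Free, Caps, Spam).
# Only the (Y, Y, Y) and (Y, N, N) rows are implied by the quoted probabilities.
COUNTS = {
    ("Y", "Y", "Y"): 20,
    ("Y", "Y", "N"): 1,
    ("Y", "N", "Y"): 5,
    ("Y", "N", "N"): 0,
    ("N", "Y", "Y"): 20,
    ("N", "Y", "N"): 3,
    ("N", "N", "Y"): 2,
    ("N", "N", "N"): 49,
}

total = sum(COUNTS.values())

# Learning: relative frequency of each configuration.
TABLE = {config: count / total for config, count in COUNTS.items()}

def marginal(free=None, caps=None, spam=None):
    """Sum the rows that agree with the observed values (None = unobserved)."""
    observed = (free, caps, spam)
    return sum(
        p for config, p in TABLE.items()
        if all(v is None or v == c for v, c in zip(observed, config))
    )

# Conditioning: P(Spam = Y | Free = Y, Caps = Y) under the assumed counts.
p_spam_given_free_caps = TABLE[("Y", "Y", "Y")] / marginal(free="Y", caps="Y")
```

A configuration that never occurs in the sample receives probability 0.0, even though it may be possible in practice.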

Drawbacks of the joint distribution model:
  memory cost to store the table
  running-time cost to do summations
  the sparse data problem in learning

Idea for a traditional generative model: what does the automaton below generate?
[Figure: automaton]
  ... know that sky is blue, …
  I know that he knows sky is blue

Idea for a probabilistic generative model:
If instead each node has a probability distribution over generating different terms, we have a language model

State Qi, with P(STOP|Qi) = 0.2 (Manning, Raghavan & Schutze, 2009):

string   assigned probability
the      0.2
a        0.1
frog     0.01
toad     0.01
said     0.03
likes    0.02
that     0.04
...      ...
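
A language model is a function that puts a probability measure over strings drawn from some vocabulary: $\sum_{s \in \Sigma^*} P(s) = 1$
Each table entry is a term emission probability in this model
Such a model places a probability distribution over any sequence of words
By construction, it also provides a model for generating text according to its distribution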
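
P(frog said that toad likes frog) = (0.01 × 0.03 × 0.04 × 0.01 × 0.02 × 0.01) × (0.8 × 0.8 × 0.8 × 0.8 × 0.8 × 0.2) ≈ 0.000000000001573
Usually continue/stop probabilities are omitted when comparing models
Based on the computed value, a model is more likely to have generated the string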

Compare this model to the previous model:

string   assigned probability
the      0.15
a        0.12
frog     0.0002
toad     0.0001
said     0.03
likes    0.04
that     0.04
...      ...
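
The comparison between the two models can be reproduced directly from the two probability tables. In this sketch the dictionary names `M1` and `M2`, the function `string_probability`, and the optional `p_stop` argument are illustrative; only the probabilities come from the slides.

```python
import math

# Term emission probabilities from the two tables (first model, then second).
M1 = {"the": 0.2, "a": 0.1, "frog": 0.01, "toad": 0.01,
      "said": 0.03, "likes": 0.02, "that": 0.04}
M2 = {"the": 0.15, "a": 0.12, "frog": 0.0002, "toad": 0.0001,
      "said": 0.03, "likes": 0.04, "that": 0.04}

def string_probability(model, text, p_stop=None):
    """Product of term probabilities; optionally include continue/stop factors."""
    terms = text.split()
    p = math.prod(model[t] for t in terms)
    if p_stop is not None:
        # Continue after every term except the last, then stop.
        p *= (1 - p_stop) ** (len(terms) - 1) * p_stop
    return p

s = "frog said that toad likes frog"
p_m1 = string_probability(M1, s)
p_m2 = string_probability(M2, s)
p_m1_with_stop = string_probability(M1, s, p_stop=0.2)

# The model assigning the higher probability is the more likely generator.
more_likely = "M1" if p_m1 > p_m2 else "M2"
```

Both strings have the same length, so the continue/stop factors would be identical for the two models and do not affect the comparison.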
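
In general, for a sequence of events, using earlier events as conditions (chain rule):
  $P(t_1 t_2 t_3 t_4) = P(t_1)\,P(t_2 \mid t_1)\,P(t_3 \mid t_1 t_2)\,P(t_4 \mid t_1 t_2 t_3)$
If total independence among events exists:
  $P_{uni}(t_1 t_2 t_3 t_4) = P(t_1)\,P(t_2)\,P(t_3)\,P(t_4)$
This is the unigram model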
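
If the only conditioning is on the previous term:
  $P_{bi}(t_1 t_2 t_3 t_4) = P(t_1)\,P(t_2 \mid t_1)\,P(t_3 \mid t_2)\,P(t_4 \mid t_3)$
This is the bigram model
Unigram models are frequently used when sentence structure is not needed
  E.g. in IR but not in speech recognition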
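
To make the difference between the unigram and bigram factorizations concrete, the sketch below estimates both from a tiny made-up corpus using relative frequencies. The corpus, the function names, and the choice of unsmoothed maximum-likelihood estimates are all assumptions for illustration.

```python
from collections import Counter

# Hypothetical toy corpus.
corpus = "the frog said that the toad likes the frog".split()

unigram_counts = Counter(corpus)
bigram_counts = Counter(zip(corpus, corpus[1:]))
history_counts = Counter(corpus[:-1])  # how often each term is followed by another
total_terms = len(corpus)

def p_unigram(term):
    """P(t): relative frequency of the term."""
    return unigram_counts[term] / total_terms

def p_bigram(term, previous):
    """P(t | previous): relative frequency of the pair among uses of `previous`."""
    if history_counts[previous] == 0:
        return 0.0
    return bigram_counts[(previous, term)] / history_counts[previous]

def p_unigram_seq(terms):
    """P_uni(t1 ... tn) = P(t1) * P(t2) * ... * P(tn)."""
    p = 1.0
    for t in terms:
        p *= p_unigram(t)
    return p

def p_bigram_seq(terms):
    """P_bi(t1 ... tn) = P(t1) * P(t2|t1) * ... * P(tn|tn-1)."""
    p = p_unigram(terms[0])
    for previous, t in zip(terms, terms[1:]):
        p *= p_bigram(t, previous)
    return p
```

The unigram model gives "the frog" and "frog the" the same probability, while the bigram model is sensitive to the order.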
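
Unigram models are of type 'bag of words'
Recalls a multinomial distribution of probabilities:
  $P(d) = \dfrac{L_d!}{\mathrm{tf}_{t_1,d}!\;\mathrm{tf}_{t_2,d}! \cdots \mathrm{tf}_{t_M,d}!}\; P(t_1)^{\mathrm{tf}_{t_1,d}}\, P(t_2)^{\mathrm{tf}_{t_2,d}} \cdots P(t_M)^{\mathrm{tf}_{t_M,d}}$
where $\mathrm{tf}_{t,d}$ is the frequency of term t in document d and $L_d = \sum_i \mathrm{tf}_{t_i,d}$ is the length of d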

Fundamental question: which model to use?
Speech recognition: the model has to be general enough to go beyond the observed data
IR: a document is finite and mostly fixed
  Get a representative sample
  Build a language model for the document
  Calculate generative probabilities of sequences from the model
  Rank documents by the probability ranking principle
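
The IR steps listed above can be sketched as a query-likelihood ranking: build a unigram model per document, score the query under each model, and sort. The two documents and the query are made up, and the models are unsmoothed relative frequencies, so a query term missing from a document gives that document a score of zero; this is an assumption of the sketch.

```python
from collections import Counter

# Hypothetical document collection.
DOCS = {
    "d1": "the frog said that the toad likes the frog",
    "d2": "the toad said that the frog likes the toad",
}

def build_model(text):
    """Unigram language model of a document: P(t | M_d) by relative frequency."""
    terms = text.split()
    counts = Counter(terms)
    return {t: c / len(terms) for t, c in counts.items()}

def query_likelihood(model, query):
    """P(q | M_d) as a product of term probabilities (0 for unseen terms)."""
    p = 1.0
    for t in query.split():
        p *= model.get(t, 0.0)
    return p

def rank(query, docs):
    """Return (doc_id, score) pairs sorted by decreasing query likelihood."""
    scores = {doc_id: query_likelihood(build_model(text), query)
              for doc_id, text in docs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

ranking = rank("frog likes", DOCS)
```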

Probability ranking principle: rank documents by their estimated probability of relevance
P(R=1|d,q) for document d, query q
Basic case: 1/0 loss
  Rank documents, return top k
Nonrestrictive case: Bayes optimal decision rule
  d is relevant iff P(R=1|d,q) > P(R=0|d,q)
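
If cost is involved, the next document to return is the d such that, for every document d′ not yet returned:
  C0 · P(R = 0|d) − C1 · P(R = 1|d) ≤ C0 · P(R = 0|d′) − C1 · P(R = 1|d′)
where
  C1 = cost of missing a relevant document
  C0 = cost of returning a nonrelevant document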
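
Rather than building a document model and checking the likelihood of generating the query:
Build a query model and check the likelihood of generating the document
OR: use both approaches together
  Needs a measure of divergence between document and query models
  Kullback-Leibler divergence: $\sum_{t \in V} P(t \mid M_q) \log \dfrac{P(t \mid M_q)}{P(t \mid M_d)}$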
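
A minimal sketch of the divergence computation follows, assuming both models are dictionaries from term to probability. The two toy models are made up, and returning infinity when the document model gives zero probability to a query term is a design choice of this sketch (in practice smoothing is commonly used to avoid it).

```python
import math

def kl_divergence(query_model, doc_model):
    """Sum over terms of P(t|Mq) * log(P(t|Mq) / P(t|Md)), using natural logs."""
    total = 0.0
    for term, p_q in query_model.items():
        if p_q == 0:
            continue  # a zero-probability query term contributes nothing
        p_d = doc_model.get(term, 0.0)
        if p_d == 0:
            return math.inf  # query term impossible under the document model
        total += p_q * math.log(p_q / p_d)
    return total

# Hypothetical models over a two-term vocabulary.
query_model = {"frog": 0.5, "toad": 0.5}
doc_model = {"frog": 0.25, "toad": 0.75}

divergence = kl_divergence(query_model, doc_model)
```

A smaller divergence means the document model is closer to the query model, so documents would be ranked by increasing divergence.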
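
Translational model generates query words not in a document by translating to alternative terms
Needs to know the conditional probability distribution between vocabulary terms:
  $P(q \mid M_d) = \prod_{t \in q} \sum_{v \in V} P(v \mid M_d)\, T(t \mid v)$
Where $P(q \mid M_d)$ is the query translation model, $P(v \mid M_d)$ is the document language model, and $T(t \mid v)$ is the conditional probability distribution between vocabulary terms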
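
The translation model is a product over query terms of a sum over vocabulary terms, which the following sketch spells out. The document model and the translation table `TRANSLATION` (holding T(t|v) as `TRANSLATION[v][t]`) are made-up values for illustration.

```python
# Hypothetical document language model P(v | M_d).
DOC_MODEL = {"frog": 0.5, "toad": 0.5}

# Hypothetical translation probabilities T(t | v), stored as TRANSLATION[v][t].
TRANSLATION = {
    "frog": {"frog": 0.8, "amphibian": 0.2},
    "toad": {"toad": 0.9, "amphibian": 0.1},
}

def term_probability(term, doc_model, translation):
    """Sum over vocabulary terms v of P(v | M_d) * T(term | v)."""
    return sum(p_v * translation.get(v, {}).get(term, 0.0)
               for v, p_v in doc_model.items())

def query_probability(query_terms, doc_model, translation):
    """P(q | M_d): product over query terms of the translated term probability."""
    p = 1.0
    for t in query_terms:
        p *= term_probability(t, doc_model, translation)
    return p

# "amphibian" never occurs in the document but can be reached by translation.
p_amphibian = query_probability(["amphibian"], DOC_MODEL, TRANSLATION)
p_frog_amphibian = query_probability(["frog", "amphibian"], DOC_MODEL, TRANSLATION)
```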

CSCI 6509 Notes, Fall 2009, Faculty of Computer Science, Dalhousie University, http://www.cs.dal.ca/~vlado/csci6509/coursecalendar.html
Manning, Raghavan & Schutze, 2009, An Introduction to Information Retrieval
Jurafsky, Martin, 2000, An Introduction to NLP, Computational Linguistics and Speech Recognition
Ghahramani, 2000, Fundamentals of Probability