Lecture 7:
−Probability Review (cont’d.) −Maximum Likelihood Estimation (MLE)
Aykut Erdem
November 2018 Hacettepe University
Administrative
− Assignment 2 will be out tonight
− It is due November 24 (i.e. in 2 weeks)
− You will implement
− problem to be investigated, − why it is interesting, − what data you will use, − related work.
slide by Barnabás Póczos & Alex Smola
[Figure: sample space, with the area in which A is true.]
P(A) is the volume of the area.
Example: What is the probability that the number on the die is 2 or 4?
The sample space splits into the outcomes {1, 3, 5, 6} and the event {2, 4}.
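The die example can be worked out by counting outcomes. The sketch below assumes a fair six-sided die (the slide does not state this), so P(A) is the share of equally likely outcomes that fall in A. The names `sample_space`, `event_a` and `probability` are illustrative only.

```python
from fractions import Fraction

# Assumption: a fair six-sided die, so every outcome is equally likely.
sample_space = {1, 2, 3, 4, 5, 6}
event_a = {2, 4}  # "the number on the die is 2 or 4"

def probability(event, space):
    """P(event) as the fraction of equally likely outcomes inside the event."""
    return Fraction(len(event & space), len(space))

p_a = probability(event_a, sample_space)
print(p_a)  # 1/3
```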
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
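The union rule can be checked numerically on a small example. This sketch assumes a fair die and two events chosen purely for illustration (`a` and `b` are not from the slides); it computes both sides of the identity with exact fractions.

```python
from fractions import Fraction

space = set(range(1, 7))  # assumed fair six-sided die
a = {2, 4}                # illustrative event A
b = {4, 5, 6}             # illustrative event B

def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(space))

lhs = prob(a | b)                       # P(A or B)
rhs = prob(a) + prob(b) - prob(a & b)   # P(A) + P(B) - P(A and B)
print(lhs, rhs)  # 2/3 2/3
```

Subtracting P(A ∩ B) removes the outcome 4, which would otherwise be counted twice.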
Def: A real-valued random variable is a function of the outcome of a randomized experiment.
Examples:
− whether a person (ω) drawn from our class (Ω) is female
Suppose a coin with head prob. p is tossed n times. What is the probability of getting k heads and n-k tails?
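One way to answer the question, assuming the tosses are independent and the order of heads and tails does not matter, is the binomial formula P(k heads) = C(n, k) p^k (1−p)^(n−k). The function name `binomial_pmf` and the values of `n` and `p` below are illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k heads in n independent tosses with P(head) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, 0.5  # illustrative values, not from the slides
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
print(pmf[3])    # 0.3125
print(sum(pmf))  # the probabilities over k = 0..n add up to 1
```

The factor C(n, k) counts the distinct orderings of k heads among n tosses; each single ordering has probability p^k (1−p)^(n−k).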
P(X|Y) = Fraction of worlds in which event X is true given that event Y is true.

              Flu     No Flu
Headache      1/80    7/80
No Headache   1/80    71/80
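A conditional probability can be read off the joint table as P(X|Y) = P(X, Y) / P(Y). The sketch below assumes the four values are laid out with rows Headache / No Headache and columns Flu / No Flu; that layout is inferred from the flattened table, and the helper names are illustrative.

```python
from fractions import Fraction

# Assumed layout: rows Headache / No Headache, columns Flu / No Flu.
joint = {
    ("headache", "flu"): Fraction(1, 80),
    ("headache", "no flu"): Fraction(7, 80),
    ("no headache", "flu"): Fraction(1, 80),
    ("no headache", "no flu"): Fraction(71, 80),
}

def p_flu(flu_value):
    """Marginal probability of the flu column, summing over headache values."""
    return sum(p for (_, f), p in joint.items() if f == flu_value)

def p_headache_given_flu(headache_value, flu_value):
    """P(headache_value | flu_value) = joint / marginal."""
    return joint[(headache_value, flu_value)] / p_flu(flu_value)

p_headache = joint[("headache", "flu")] + joint[("headache", "no flu")]
print(p_headache)                               # 1/10
print(p_headache_given_flu("headache", "flu"))  # 1/2
```

Under this layout, knowing that someone has the flu raises the probability of a headache from 1/10 to 1/2.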
Independent: Winning on roulette this week and next week. Dependent: Russian roulette
Independent random variables:
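The standard definition is that X and Y are independent when P(X = x, Y = y) = P(X = x) P(Y = y) for every pair of values. A minimal check of that condition on a joint table might look as follows; `is_independent` and both example tables are illustrative, and exact fractions are used so that the equality test is not affected by rounding.

```python
from fractions import Fraction
from itertools import product

def is_independent(joint):
    """Check P(X=x, Y=y) == P(X=x) * P(Y=y) for every pair (x, y)."""
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    p_x = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    p_y = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    return all(joint.get((x, y), 0) == p_x[x] * p_y[y]
               for x, y in product(xs, ys))

# Two separate fair coin tosses (assumed example): each pair has probability 1/4.
two_coins = {(a, b): Fraction(1, 4) for a in "HT" for b in "HT"}

# The same toss recorded twice: the second value is determined by the first.
same_coin = {("H", "H"): Fraction(1, 2), ("T", "T"): Fraction(1, 2)}

print(is_independent(two_coins))  # True
print(is_independent(same_coin))  # False
```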
[Figure: two panels with axes X and Y, titled “Independent X,Y” and “Dependent X,Y”.]
Dependent: shoe size of children and reading skills
Conditionally independent: shoe size of children and reading skills given age
Storks deliver babies: A highly statistically significant correlation exists between stork populations and human birth rates across Europe.
Conditionally independent: Knowing Z makes X and Y independent
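The shoe size and reading skills example can be imitated with synthetic data. The sketch below is a toy simulation under made-up assumptions: age Z is uniform on 6..12, and shoe size and reading score are each "age plus independent noise". Nothing here is real data. It compares the correlation over all children with the correlation inside each age group.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic model (assumption): both quantities depend on age only.
ages = [rng.randint(6, 12) for _ in range(20000)]
shoe = [a + rng.gauss(0, 1) for a in ages]
reading = [a + rng.gauss(0, 1) for a in ages]

overall = pearson(shoe, reading)

within = {}
for age in range(6, 13):
    idx = [i for i, a in enumerate(ages) if a == age]
    within[age] = pearson([shoe[i] for i in idx], [reading[i] for i in idx])

print("all children:", round(overall, 2))
print("per age group:", {age: round(r, 2) for age, r in within.items()})
```

In this toy model the overall correlation is strong, while inside any single age group it is close to zero: once Z (age) is known, X tells us nothing more about Y.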
A survey pointed out a positive and significant correlation between the number of accidents and wearing coats. They concluded that coats could hinder movements of drivers and be the cause of accidents. A new law was prepared to prohibit drivers from wearing coats when driving.
Finally, another study pointed out that people wear coats when it rains…
The number of people who drowned by falling into a swimming pool correlates with the number of films Nicolas Cage appeared in
Correlation: 0.666004
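X is conditionally independent of Y given Z: P(X | Y, Z) = P(X | Z)
Equivalent to: P(X, Y | Z) = P(X | Z) P(Y | Z)
Example: P(Thunder | Rain, Lightning) = P(Thunder | Lightning)
Note: this does NOT mean Thunder is independent of Rain. But given Lightning, knowing Rain doesn’t give more info about Thunder.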
independently) draw a conclusion about what the number was.
are purely determined by the noise in each phone. P(na = 1 | nb = 1, n = 2) = P(na = 1 | n = 2)
nb = 1) ?
[Figure: variables n, na, nb]
I have a coin. If I flip it, what’s the probability that it will fall with the head up?
Let us flip it a few times to estimate the probability.
The estimated probability is 3/5, the “frequency of heads”.
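The estimate is simply the number of heads divided by the number of flips. The flip sequence below is hypothetical: the slides only give the result 3/5, so any five flips with three heads would serve equally well.

```python
from fractions import Fraction

# Hypothetical record of five flips with three heads (the real sequence is not given).
flips = ["H", "H", "T", "H", "T"]

estimate = Fraction(flips.count("H"), len(flips))
print(estimate)  # 3/5
```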
We are going to answer these questions
P(Heads) = θ, P(Tails) = 1−θ
Flips are i.i.d.:
– Independent events
– Identically distributed according to Bernoulli distribution
Data, D: the observed sequence of flips
MLE: Choose θ that maximizes the probability of observed data
θ̂_MLE = argmax_θ P(D | θ)
      = argmax_θ ∏_{i=1}^{n} P(X_i | θ)      (independent draws)
      = argmax_θ θ^{α_H} (1−θ)^{α_T}         (identically distributed)
where α_H and α_T denote the number of heads and tails in D.
Taking the log and setting the derivative with respect to θ to zero:
α_H/θ − α_T/(1−θ) = 0, so θ̂_MLE = α_H / (α_H + α_T)
That’s exactly the “Frequency of heads”
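The claim that the maximizer equals the frequency of heads can be checked numerically. This sketch does a plain grid search over θ (a swapped-in brute-force technique, not the closed-form derivation) for the Bernoulli log-likelihood, using the counts 3 heads and 2 tails from the coin example; the function and variable names are illustrative.

```python
import math

def log_likelihood(theta, heads, tails):
    """Log of theta^heads * (1 - theta)^tails for i.i.d. Bernoulli flips."""
    return heads * math.log(theta) + tails * math.log(1 - theta)

heads, tails = 3, 2  # counts matching the 3/5 example

# Candidate values 0.001, 0.002, ..., 0.999 (endpoints excluded to avoid log(0)).
grid = [i / 1000 for i in range(1, 1000)]
theta_hat = max(grid, key=lambda t: log_likelihood(t, heads, tails))

print(theta_hat)                # 0.6
print(heads / (heads + tails))  # 0.6
```

Maximizing the log-likelihood gives the same θ as maximizing the likelihood, because the logarithm is strictly increasing.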
Hoeffding’s inequality:
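In its standard form for n i.i.d. flips, Hoeffding’s inequality states P(|θ̂ − θ*| ≥ ε) ≤ 2 exp(−2nε²), where θ* is the true head probability. The slide’s own formula did not survive extraction, so this is the textbook statement rather than a quote. The sketch below evaluates the bound and solves it for the number of flips needed to reach a target confidence; the function names and the values of ε and δ are illustrative.

```python
import math

def hoeffding_bound(n, eps):
    """Upper bound on P(|theta_hat - theta_true| >= eps) after n i.i.d. flips."""
    return min(1.0, 2 * math.exp(-2 * n * eps**2))

def flips_needed(eps, delta):
    """Smallest n for which the Hoeffding bound is at most delta."""
    return math.ceil(math.log(2 / delta) / (2 * eps**2))

n = flips_needed(eps=0.1, delta=0.05)
print(n)                        # 185
print(hoeffding_bound(n, 0.1))  # at most 0.05
```

The required n grows like 1/ε², so halving the tolerated error roughly quadruples the number of flips.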
[Figure: Gaussian distribution, with parameters µ (here µ = 0) and σ².]
Choose θ = (µ, σ²) that maximizes the probability of observed data
θ̂_MLE = argmax_θ P(D | θ) = argmax_θ ∏_{i=1}^{n} P(X_i | θ)      (independent draws, identically distributed)
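MLE for the variance of a Gaussian is biased [Expected result of estimation is not the true parameter!]
σ̂²_MLE = (1/n) ∑_{i=1}^{n} (x_i − µ̂)², with µ̂ = (1/n) ∑_{i=1}^{n} x_i
Unbiased variance estimator: σ̂²_unbiased = (1/(n−1)) ∑_{i=1}^{n} (x_i − µ̂)²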
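The bias can be seen in a small simulation. The sketch below assumes standard normal data (true variance 1) and a deliberately small sample size; it averages the 1/n estimator and the 1/(n−1) estimator over many repeated samples. The sample size, trial count and function names are illustrative choices.

```python
import random

def mle_variance(xs):
    """Variance estimate that divides by n (the Gaussian MLE)."""
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n

def unbiased_variance(xs):
    """Variance estimate that divides by n - 1."""
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / (n - 1)

rng = random.Random(0)  # fixed seed so the run is repeatable
n, trials = 5, 20000    # small n makes the bias easy to see

samples = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(trials)]
avg_mle = sum(mle_variance(s) for s in samples) / trials
avg_unbiased = sum(unbiased_variance(s) for s in samples) / trials

print("average of 1/n estimator:    ", round(avg_mle, 3))
print("average of 1/(n-1) estimator:", round(avg_unbiased, 3))
```

With true variance 1 and n = 5, the 1/n estimator averages close to (n−1)/n = 0.8, while the 1/(n−1) estimator averages close to 1.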