SLIDE 1 Calculating distributions
Chung-chieh Shan Indiana University 2018-09-21
SLIDE 2 Calculating distributions
meaningful executable
SLIDE 4
I’d also like to address this concept of being “fake” or “calculating.” If being “fake” means not thinking or feeling the same way in one moment as you thought or felt in a different moment,
SLIDE 5 Creative definitions and reasoning from first principles Symbolic representations
- of common definition
- of automation
SLIDE 7 Creative definitions and reasoning from first principles Symbolic representations
- of common definition
- of automation
- optimization
SLIDE 8 8
An unknown random process yields a stateless coin that can be flipped repeatedly to produce heads (H) or tails (T). We assume that the probability p that the coin produces H each time is distributed uniformly between 0 and 1 by the process. We flip the coin 3 times and observe THH. What is the probability that the next flip produces H versus T? (adapted from Eddy)
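Under the uniform prior, the answer has a closed form: Pr(next = H | THH) = ∫₀¹ p · p²(1 − p) dp / ∫₀¹ p²(1 − p) dp = (1/20)/(1/12) = 3/5. A minimal check in Python with exact rational arithmetic (a sketch, not part of the talk):

```python
from fractions import Fraction

def moment(k):
    # ∫_0^1 p^k · p^2 (1 - p) dp = 1/(k+3) - 1/(k+4)
    return Fraction(1, k + 3) - Fraction(1, k + 4)

# Posterior predictive Pr(next = H | THH) is the posterior mean E[p | THH].
pr_h = moment(1) / moment(0)
print(pr_h)      # 3/5
print(1 - pr_h)  # Pr(next = T | THH) = 2/5
```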
SLIDE 9 9
Pr(p) --bind--> Pr(p, x) --disintegrate--> Pr(p | x) --bind--> Pr(p, y | x) --integrate--> Pr(y | x)
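Operationally, this pipeline can be approximated by rejection sampling: `bind` builds the joint Pr(p, x), conditioning on x = THH plays the role of `disintegrate`, and averaging the retained samples integrates p out. A rough Monte Carlo sketch (my own helper names, not Hakaru code):

```python
import random

random.seed(0)

def flip(p):
    return 'H' if random.random() < p else 'T'

kept = []
while len(kept) < 10000:
    p = random.random()               # Pr(p): uniform prior on the bias
    x = [flip(p) for _ in range(3)]   # bind: the joint Pr(p, x)
    if x == ['T', 'H', 'H']:          # disintegrate: condition on x = THH
        kept.append(flip(p))          # bind: next flip y ~ Pr(y | p)

pr_h = kept.count('H') / len(kept)    # integrate p out of Pr(p, y | x)
print(pr_h)  # close to the exact answer 3/5
```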
SLIDE 10 10
[Diagram: graphical model relating the coin bias p to the observed flips and the next flip]
SLIDE 17 17
Approximations calculated exactly
[Screenshot of a paper on beam sampling for the infinite hidden Markov model: 1. Introduction; 2. The Infinite Hidden Markov Model; 3. The Gibbs Sampler; 4. The Beam Sampler; 5. Experiments]
SLIDE 19 19
Pr(p) --bind--> Pr(p, x) --disintegrate--> Pr(p | x) --bind--> Pr(p, y | x) --integrate--> Pr(y | x)
SLIDE 20 20
Pr(p) --bind--> Pr(p, x) --disintegrate--> Pr(p | x) --bind--> Pr(p, y | x) --integrate--> Pr(y | x), then simplify
[Plot: time in seconds vs. data size, for PSI]
SLIDE 21 21
Pr(p) --bind--> Pr(p, x) --disintegrate--> Pr(p | x) --bind--> Pr(p, y | x) --integrate--> Pr(y | x), then simplify
[Plot: time in seconds vs. data size, for PSI]
disintegrate, then simplify:
[Plots: accuracy in % vs. time in seconds, comparing Hakaru (Haskell backend), AugurV2, and JAGS]
bind, disintegrate, then simplify:

Inference method                 Run time (msecs)
                                 Mean    SD
WebPPL                           1078    16
Hakaru without simplifications   1321    93
Hakaru with simplifications       269    10
Handwritten                       207     4
Put approximations in the language! (FLOPS 2016, UAI 2017)
SLIDE 22 22
[Diagram: graphical model relating the coin bias p to the observed flips and the next flip]
SLIDE 23 23
Recognizing a density function
Programs denote measures:

∫₀¹ Σ_{x ∈ {H,T}³} [x = THH] · p^{#{i : x_i = H}} · (1 − p)^{#{i : x_i = T}} · f(p) dp
= ∫₀¹ p²(1 − p) · f(p) dp
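The integrand p²(1 − p) matches the beta pattern p^(a−1)(1 − p)^(b−1) with a = 3, b = 2, whose normalizer is 1/B(3, 2) = Γ(5)/(Γ(3)Γ(2)) = 12. A quick numeric confirmation of that recognition (a sketch, not the actual Hakaru recognizer):

```python
from math import gamma

a, b = 3, 2
norm = gamma(a + b) / (gamma(a) * gamma(b))   # 1/B(3, 2) = 12

def h(p):
    # unnormalized density read off the program
    return p**2 * (1 - p)

# The normalized density should integrate to 1 (midpoint rule on [0, 1]).
n = 100000
total = sum(norm * h((i + 0.5) / n) for i in range(n)) / n
print(norm)   # 12.0
print(total)  # ≈ 1.0
```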
SLIDE 33 33
Recognizing a density function
Goal: recognize h(p) = p²(1 − p) as the density of beta 3 2
Robustness challenge: many equivalent ways to write p²(1 − p) arise
Modularity challenge: many distribution families (beta, normal, …) known
SLIDE 34 34
Recognizing a density function
Goal: recognize h(p) = p²(1 − p) as the density of beta 3 2
Robustness challenge: many equivalent ways to write p²(1 − p) arise
Modularity challenge: many distribution families (beta, normal, …) known
Solution: characterize density functions by their holonomic representation, a homogeneous linear differential equation such as

p(1 − p) · h′(p) + (p − 2(1 − p)) · h(p) = 0
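That the equation holds for h(p) = p²(1 − p) is easy to confirm by hand: h′(p) = 2p − 3p², and p(1 − p)h′(p) + (3p − 2)h(p) cancels termwise. A numeric spot check in plain Python (no CAS assumed):

```python
def h(p):
    return p**2 * (1 - p)

def h_prime(p):
    # d/dp [p^2 - p^3] = 2p - 3p^2
    return 2*p - 3*p**2

# Holonomic representation: p(1-p) h'(p) + (p - 2(1-p)) h(p) = 0
for p in [0.1, 0.25, 0.5, 0.9]:
    residual = p*(1 - p)*h_prime(p) + (p - 2*(1 - p))*h(p)
    assert abs(residual) < 1e-12
print("ODE holds at all sample points")
```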
SLIDE 35 35
[Diagram: graphical model relating the coin bias p to the observed flips and the next flip]
SLIDE 36 36
Eliminating a random variable
Programs denote measures:

Σ_{y ∈ {H,T}} ( ∫₀¹ p²(1 − p) · p^{[y = H]} · (1 − p)^{[y = T]} dp ) · f(y)
= (1/20) · f(H) + (1/30) · f(T)
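Eliminating p amounts to two beta-function integrals: ∫₀¹ p³(1 − p) dp = 1/20 for y = H and ∫₀¹ p²(1 − p)² dp = 1/30 for y = T, which normalize to 3/5 and 2/5, matching the earlier prediction. An exact-arithmetic check (a sketch, not part of the talk):

```python
from fractions import Fraction
from math import factorial

def beta_int(a, b):
    # ∫_0^1 p^(a-1) (1-p)^(b-1) dp = (a-1)! (b-1)! / (a+b-1)!
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))

w_h = beta_int(4, 2)   # ∫ p^2 (1-p) · p dp     = 1/20
w_t = beta_int(3, 3)   # ∫ p^2 (1-p) · (1-p) dp = 1/30
total = w_h + w_t
print(w_h, w_t)                    # 1/20 1/30
print(w_h / total, w_t / total)    # 3/5 2/5
```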
SLIDE 46 46
[Diagram: graphical model relating the coin bias p to the observed flips and the next flip]
SLIDE 47 47
Pr(p) --bind--> Pr(p, x) --disintegrate--> Pr(p | x) --bind--> Pr(p, y | x) --integrate--> Pr(y | x)
SLIDE 48 48
An unknown random process yields a stateless particle whose one-dimensional position can be measured repeatedly to produce a real number. We assume that the position p of the particle is distributed normally with mean 3 and standard deviation 2. We measure the particle 3 times, each time drawing independently from the normal distribution with mean p and standard deviation 1, and observe
−1.4, +1.0, −0.2.
What is the distribution of the next measurement?
Pr(p) --bind--> Pr(p, x) --disintegrate--> Pr(p | x) --bind--> Pr(p, y | x) --integrate--> Pr(y | x)
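For this measurement problem, the normal prior is conjugate, so the pipeline's output has a closed form: posterior precision 1/2² + 3/1² = 13/4, posterior mean (3/4 + (−1.4 + 1.0 − 0.2))/(13/4) ≈ 0.046, and the next measurement is normal with that mean and variance 4/13 + 1 = 17/13. A sketch of the conjugate update using the standard formulas (not the talk's code):

```python
# Conjugate update for a normal mean with known noise variance.
mu0, sigma0 = 3.0, 2.0      # prior on the position p: N(3, 2^2)
sigma = 1.0                 # measurement noise: N(p, 1^2)
xs = [-1.4, 1.0, -0.2]      # the three observations

prec_post = 1/sigma0**2 + len(xs)/sigma**2                 # 13/4
mu_post = (mu0/sigma0**2 + sum(xs)/sigma**2) / prec_post   # ≈ 0.0462
var_post = 1/prec_post                                     # 4/13

# Next measurement y ~ N(mu_post, var_post + sigma^2)
var_pred = var_post + sigma**2                             # 17/13
print(mu_post, var_pred)
```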
SLIDE 49 49
[Diagram: graphical model with latent position p and observed measurements x]
SLIDE 52 52
Disintegrating a joint measure
[Diagram: splitting the joint measure on (p, x) into a measure on x and a kernel giving p given x]
SLIDE 55

7 8 9 ÷
4 5 6 ×
1 2 3 −
. = +
SLIDE 56

7 plate   8 lambda   9 apply      ÷ disintegrate
4 pair    5 fst      6 snd        × bind
1 dirac   2 beta     3 normal     − gradient
mzero     . factor   = simplify   + mplus
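The keys name the core combinators of a measure language. Their flavor can be sketched as weighted samplers, where a measure is a function from a random source to a (value, weight) pair; the names `dirac`, `bind`, `factor`, and `mplus` follow the slide, but this implementation is a toy of my own, not Hakaru:

```python
import random

# A measure is represented as: rng -> (value, weight)
def dirac(x):
    return lambda rng: (x, 1.0)

def factor(w):
    # weight-only measure on the unit value
    return lambda rng: (None, w)

def bind(m, k):
    def run(rng):
        x, w1 = m(rng)
        y, w2 = k(x)(rng)
        return y, w1 * w2
    return run

def mplus(m1, m2):
    # fair choice between two measures, compensated by weight 2
    def run(rng):
        m = m1 if rng.random() < 0.5 else m2
        x, w = m(rng)
        return x, 2.0 * w
    return run

# Example: start at 2, pay a factor of 1/2, return x + 1.
m = bind(dirac(2), lambda x: bind(factor(0.5), lambda _: dirac(x + 1)))
value, weight = m(random.Random(0))
print(value, weight)  # 3 0.5
```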
SLIDE 57 ÷ disintegrate × bind − gradient mzero . factor = simplify + mplus
Thanks! Jacques Carette Oleg Kiselyov Wazim Mohammed Ismail Praveen Narayanan Norman Ramsey Wren Romano Sam Tobin-Hochstadt Rajan Walia Robert Zinkov