Autoregressive Models
Stefano Ermon, Aditya Grover
Stanford University
Lecture 3
Stefano Ermon, Aditya Grover (AI Lab) Deep Generative Models Lecture 3 1 / 1
Learning a generative model
We are given a training set of examples, e.g., images
$\ldots \sum_{i=1} \alpha_i x_i$.
$\ldots \sum_{i=1} \alpha_i h_i)$
$\ldots \alpha^2_0 + \alpha^2_1 v_1)$
$\ldots \alpha^3_0 + \alpha^3_1 v_1 + \alpha^3_2 v_2)$
$\ldots \alpha^i_0 + \sum_{j=1}^{i-1} \alpha^i_j v_j)$
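The conditional above can be sketched in NumPy. This is a minimal sketch, assuming the lost outer function is the logistic sigmoid and that each $v_i$ is binary; the names `conditional_prob`, `log_likelihood`, `alpha0`, and `alpha` are hypothetical, and indices are 0-based in the code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conditional_prob(i, v, alpha0, alpha):
    """p(v_i = 1 | v_<i) = sigmoid(alpha0[i] + sum_{j<i} alpha[i, j] * v[j]).

    Assumption: sigmoid link and binary v; only entries j < i of row i are used.
    """
    return sigmoid(alpha0[i] + alpha[i, :i] @ v[:i])

def log_likelihood(v, alpha0, alpha):
    """Sum the log of each conditional, following the chain rule."""
    total = 0.0
    for i in range(len(v)):
        p = conditional_prob(i, v, alpha0, alpha)
        total += np.log(p if v[i] == 1 else 1.0 - p)
    return total
```

With all parameters set to zero, every conditional equals 0.5, which makes the sketch easy to sanity-check.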
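$$h_2 = \sigma\left( \begin{pmatrix} \vdots \\ w_1 \\ \vdots \end{pmatrix} v_1 \right)$$
$$h_3 = \sigma\left( \begin{pmatrix} \vdots & \vdots \\ w_1 & w_2 \\ \vdots & \vdots \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} \right)$$
$$h_4 = \sigma\left( \begin{pmatrix} \vdots & \vdots & \vdots \\ w_1 & w_2 & w_3 \\ \vdots & \vdots & \vdots \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \right)$$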
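Because each hidden vector reuses the same columns $w_1, w_2, \ldots$, the pre-activation can be accumulated one column at a time. The sketch below illustrates that idea under stated assumptions: no bias term (none appears in the expressions above), and the names `hidden_units`, `W`, and `v` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hidden_units(W, v):
    """Return [h_2, h_3, ...], where h_i = sigmoid(W[:, :i-1] @ v[:i-1]).

    W holds the shared columns w_1, w_2, ...; v holds v_1, v_2, ...
    """
    a = np.zeros(W.shape[0])       # running pre-activation
    hs = []
    for k in range(len(v)):
        a = a + W[:, k] * v[k]     # add the contribution of column w_{k+1}
        hs.append(sigmoid(a))      # this is h_{k+2}
    return hs
```

Each step costs one column update instead of a full matrix product, so all hidden vectors are obtained in a single pass over the columns.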
$\ldots (p^1_i, \cdots, p^K_i)$
$\ldots (p^1_i, \cdots, p^K_i) = \mathrm{softmax}(V_i h_i + b_i)$
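A short sketch of the softmax output above, assuming `V_i` is a $K \times d$ matrix, `h_i` a length-$d$ hidden vector, and `b_i` a length-$K$ bias; the function name `categorical_params` is hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)        # shift for numerical stability; result is unchanged
    e = np.exp(z)
    return e / e.sum()

def categorical_params(V_i, h_i, b_i):
    """Return (p_i^1, ..., p_i^K) = softmax(V_i h_i + b_i)."""
    return softmax(V_i @ h_i + b_i)
```

The returned vector is non-negative and sums to one, so it can parameterize a distribution over $K$ values.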
$\ldots \mu^j_i, \sigma^j_i)$
$\ldots (\mu^1_i, \cdots, \mu^K_i, \sigma^1_i, \cdots, \sigma^K_i) = f(h_i)$
$W^1, W^2, b^1, b^2, V, c$
1 Hidden layer $h_t$ is a summary of the inputs seen till time $t$
2 Output layer $o_{t-1}$ specifies parameters for conditional $p(x_t \mid x_{1:t-1})$
3 Parameterized by $b_0$ (initialization), and matrices $W_{hh}$, $W_{xh}$, $W_{hy}$.
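The three points above can be sketched as a single forward pass. This is an illustration only: the update rule $h_t = \tanh(W_{hh} h_{t-1} + W_{xh} x_t)$, the linear output $o_t = W_{hy} h_t$, and the choice $h_0 = b_0$ are assumptions, since the text names the parameters but not the equations. The function name `rnn_outputs` is hypothetical.

```python
import numpy as np

def rnn_outputs(xs, b0, W_hh, W_xh, W_hy):
    """Return [o_0, o_1, ..., o_T] for inputs xs = [x_1, ..., x_T].

    o_{t-1} holds the parameters for p(x_t | x_{1:t-1}).
    Assumed update: h_t = tanh(W_hh h_{t-1} + W_xh x_t), o_t = W_hy h_t.
    """
    h = b0                           # h_0: initialization
    outputs = [W_hy @ h]             # o_0: parameters for p(x_1)
    for x in xs:
        h = np.tanh(W_hh @ h + W_xh @ x)   # h_t summarizes x_1..x_t
        outputs.append(W_hy @ h)           # o_t: parameters for p(x_{t+1} | x_{1:t})
    return outputs
```

The same three matrices are reused at every step, so the parameter count does not depend on the sequence length.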
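1 Suppose $x_i \in \{h, e, l, o\}$. Use one-hot encoding:
2 Autoregressive: $p(x = \text{hello}) = p(x_1 = h)\, p(x_2 = e \mid x_1 = h)\, p(x_3 = l \mid x_1 = h, x_2 = e) \cdots p(x_5 = o \mid x_1 = h, x_2 = e, x_3 = l, x_4 = l)$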
3 For example,
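As a separate illustration of the one-hot encoding and the chain-rule factorization, here is a minimal sketch. The index order in `VOCAB` and the placeholder `uniform_conditional` are assumptions made for the example; a trained model would supply its own conditional distribution.

```python
import numpy as np

VOCAB = ["h", "e", "l", "o"]   # assumed index order for the one-hot vectors

def one_hot(ch):
    vec = np.zeros(len(VOCAB))
    vec[VOCAB.index(ch)] = 1.0
    return vec

def uniform_conditional(prefix):
    """Placeholder for p(x_t | x_{1:t-1}); ignores the prefix."""
    return np.full(len(VOCAB), 1.0 / len(VOCAB))

def sequence_log_prob(word, conditional):
    """log p(word) = sum_t log p(x_t | x_{1:t-1})."""
    total = 0.0
    for t, ch in enumerate(word):
        probs = conditional(word[:t])          # distribution over VOCAB
        total += np.log(probs @ one_hot(ch))   # pick out the observed symbol
    return total
```

Calling `sequence_log_prob("hello", uniform_conditional)` sums five conditional terms, one per character, matching the factorization in item 2.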
1 Can be applied to sequences of arbitrary length.
2 Very general: For every computable function, there exists a finite
1 Still requires an ordering
2 Sequential likelihood evaluation (very slow for training)
3 Sequential generation (unavoidable in an autoregressive model)
4 Can be difficult to train (vanishing/exploding gradients)