SLIDE 20 Introduction
Parametric Models
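A parametric model $\mathcal{M} : \mathcal{W} \to \mathcal{Y}^{\mathcal{X}}$ is a functional mapping from a weight $W \in \mathcal{W}$ to a hypothesis $\mathcal{M}(W) : \mathcal{X} \to \mathcal{Y}$.

Fully-connected (FC) Neural Networks: $\mathbb{R}^d \to \mathbb{R}$
$$\text{FC-NN}[W](x) = W_L\,\sigma\big(W_{L-1} \cdots \sigma(W_2\,\sigma(W_1 x + b_1) + b_2) \cdots + b_{L-1}\big) + b_L,$$
where $W = (\{W_i\}_{i=1}^{L}, \{b_i\}_{i=1}^{L})$, $W_i \in \mathbb{R}^{d_i \times d_{i-1}}$, $b_i \in \mathbb{R}^{d_i}$, $d_0 = d$, and $d_L = 1$. Here, $\sigma : \mathbb{R} \to \mathbb{R}$ is the activation function; with a slight abuse of notation, $\sigma$ is also applied to vector inputs element-wise, i.e., $[\sigma(x)]_i = \sigma(x_i)$.

Convolutional Neural Networks (CNN): $\mathbb{R}^d \to \mathbb{R}$
$$\text{CNN}[W](x) = \sum_{i=1}^{r} a_i\,\sigma\big([w * x]_{d'(i-1)+1 \,:\, d'i}\big) + b,$$
where $W = (w, a, b) \in \mathbb{R}^k \times \mathbb{R}^r \times \mathbb{R}$ and $d = d'r$. The convolution operator $* : \mathbb{R}^k \times \mathbb{R}^d \to \mathbb{R}^d$ is defined as
$$[w * x]_i = \sum_{j=1}^{k} w_j\, x_{[(i-j-1) \bmod d] + 1},$$
and $\sigma : \mathbb{R}^{d'} \to \mathbb{R}$ is the composition of pooling and element-wise non-linearity.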
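The two definitions above can be made concrete with a minimal pure-Python sketch. It is an illustration under stated assumptions, not the speaker's implementation: the slide does not fix a particular activation or pooling, so ReLU and max-pooling are used here as hypothetical choices, and the helper names (`relu`, `affine`, `fc_nn`, `circ_conv`, `max_pool_relu`, `cnn`) are invented for this example. Each weight matrix is stored as a list of rows, so that $W_i$ maps $\mathbb{R}^{d_{i-1}}$ to $\mathbb{R}^{d_i}$.

```python
def relu(z):
    """Element-wise ReLU; an assumed choice for the activation sigma."""
    return [max(0.0, v) for v in z]


def affine(W, x, b):
    """Return W x + b, where W is a list of rows (shape d_out x d_in)."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]


def fc_nn(weights, biases, x, sigma=relu):
    """FC-NN[W](x): sigma follows every layer except the last one."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = sigma(affine(W, h, b))
    (out,) = affine(weights[-1], h, biases[-1])  # d_L = 1, so one output
    return out


def circ_conv(w, x):
    """[w * x]_i = sum_j w_j x_{[(i-j-1) mod d]+1}, with i and j 1-indexed."""
    d, k = len(x), len(w)
    # The 1-indexed position [(i-j-1) mod d]+1 is 0-indexed (i-j-1) mod d.
    return [sum(w[j - 1] * x[(i - j - 1) % d] for j in range(1, k + 1))
            for i in range(1, d + 1)]


def max_pool_relu(block):
    """Assumed sigma: R^{d'} -> R, max-pooling composed with ReLU."""
    return max(relu(block))


def cnn(w, a, b, x, sigma=max_pool_relu):
    """CNN[W](x) = sum_i a_i * sigma(i-th length-d' block of w * x) + b."""
    r = len(a)
    d_prime, remainder = divmod(len(x), r)
    if remainder:
        raise ValueError("input dimension d must equal d' * r")
    z = circ_conv(w, x)
    return sum(a[i] * sigma(z[i * d_prime:(i + 1) * d_prime])
               for i in range(r)) + b


# Two-layer FC net on R^2: d_0 = 2, d_1 = 2, d_2 = 1.
fc_out = fc_nn(weights=[[[1.0, -1.0], [0.5, 0.5]], [[1.0, 2.0]]],
               biases=[[0.0, -1.0], [0.5]],
               x=[2.0, 1.0])            # 2.5

# CNN on R^4 with filter size k = 2 and r = 2 blocks of length d' = 2.
conv_out = circ_conv([1.0, -1.0], [1.0, 2.0, 3.0, 4.0])  # [1.0, -3.0, 1.0, 1.0]
cnn_out = cnn(w=[1.0, -1.0], a=[1.0, 2.0], b=0.5,
              x=[1.0, 2.0, 3.0, 4.0])   # 3.5
```

The parameter counts make the contrast visible: the CNN above has $k + r + 1 = 5$ parameters shared across all positions of the input, whereas the first FC layer alone already needs $d_1 \times d_0$ weights.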
Zhiyuan Li (Princeton University) Fully-Connected Nets vs Conv Nets August 19, 2020 @ IJTCS 7 / 30