

SLIDE 1

Linear Factor Models

Lecture slides for Chapter 13 of Deep Learning
www.deeplearningbook.org
Ian Goodfellow, 2016-09-27

SLIDE 2


Linear Factor Models

Figure 13.1: directed graphical model of the linear factor model family, with latent factors h_1, h_2, h_3 generating the observed variables x_1, x_2, x_3 via

    x = Wh + b + noise
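
The generative process is simple enough to sample directly. A minimal NumPy sketch of ancestral sampling (the dimensionalities, the standard-Gaussian prior, and the noise scale below are illustrative assumptions, not fixed by the slides):

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative sizes: 3 latent factors generating 3 observed variables.
    n_factors, n_obs = 3, 3
    W = rng.normal(size=(n_obs, n_factors))   # factor loadings (assumed values)
    b = np.zeros(n_obs)                       # bias

    def sample_linear_factor_model(n_samples, noise_std=0.1):
        # Ancestral sampling: draw h from its prior, then x = W h + b + noise.
        h = rng.normal(size=(n_samples, n_factors))
        noise = noise_std * rng.normal(size=(n_samples, n_obs))
        return h @ W.T + b + noise

    x = sample_linear_factor_model(1000)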

SLIDE 3


Probabilistic PCA and Factor Analysis

  • Linear factor model
  • Gaussian prior
  • Extends PCA
  • Given an input, yields a distribution over codes, rather than a single code
  • Estimates a probability density function
  • Can generate samples (see the sketch below)
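
A hedged illustration of these properties with scikit-learn's FactorAnalysis; the toy data and model sizes are assumptions made for the example:

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(0)

    # Toy data: 500 points in 10 dimensions generated from 3 latent factors.
    h_true = rng.normal(size=(500, 3))
    W_true = rng.normal(size=(10, 3))
    X = h_true @ W_true.T + 0.1 * rng.normal(size=(500, 10))

    fa = FactorAnalysis(n_components=3).fit(X)

    # Distribution over codes: transform() returns the posterior mean of h
    # given x (the full posterior is Gaussian), not a single deterministic
    # code as in plain PCA.
    codes = fa.transform(X)

    # Density estimation: per-example log-likelihood under the fitted model.
    log_density = fa.score_samples(X)

    # Sample generation: ancestral sampling from the fitted model.
    h_new = rng.normal(size=(5, 3))
    x_new = (h_new @ fa.components_ + fa.mean_
             + rng.normal(size=(5, 10)) * np.sqrt(fa.noise_variance_))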
SLIDE 4


Independent Components Analysis

  • Factorial but non-Gaussian prior
  • Learns components that are closer to statistically independent than the raw features
  • Can be used to separate the voices of n speakers recorded by n microphones, or to separate multiple EEG signals (see the sketch below)
  • Many variants, some more probabilistic than others
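
A sketch of blind source separation with scikit-learn's FastICA; the two synthetic "speakers" and the mixing matrix are assumptions made for the example:

    import numpy as np
    from sklearn.decomposition import FastICA

    t = np.linspace(0, 8, 2000)
    s1 = np.sin(2 * t)                    # source 1: sinusoid
    s2 = np.sign(np.sin(3 * t))           # source 2: square wave (non-Gaussian)
    S = np.c_[s1, s2]

    # Two "microphones", each recording an unknown linear mix of the sources.
    A = np.array([[1.0, 0.5],
                  [0.5, 1.0]])            # hypothetical mixing matrix
    X = S @ A.T

    # FastICA recovers the sources up to permutation and scaling.
    ica = FastICA(n_components=2, random_state=0)
    S_hat = ica.fit_transform(X)          # estimated sources
    A_hat = ica.mixing_                   # estimated mixing matrix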
SLIDE 5


Slow Feature Analysis

  • Learn features that change gradually over time
  • The SFA algorithm does so in closed form for a linear model (see the sketch below)
  • Deep SFA composes many such models with fixed feature expansions, such as quadratic feature expansion
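
A minimal NumPy sketch of the closed-form linear SFA solution: whiten the signal, then keep the directions whose temporal differences have the least variance. The interface is made up for illustration, and a full-rank input covariance is assumed:

    import numpy as np

    def linear_sfa(X, n_features):
        # X: (n_timesteps, n_dims) time series.
        # 1. Center and whiten (assumes full-rank covariance).
        X = X - X.mean(axis=0)
        eigval, eigvec = np.linalg.eigh(np.cov(X, rowvar=False))
        whiten = eigvec / np.sqrt(eigval)    # columns rescaled to unit variance
        Z = X @ whiten
        # 2. Covariance of the whitened signal's temporal differences.
        dcov = np.cov(np.diff(Z, axis=0), rowvar=False)
        # 3. Slowest features = eigenvectors with the smallest eigenvalues
        #    (eigh returns eigenvalues in ascending order).
        _, dvec = np.linalg.eigh(dcov)
        W = whiten @ dvec[:, :n_features]
        return X @ W, W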

SLIDE 6


Sparse Coding

The conditional likelihood is Gaussian:

    p(x \mid h) = \mathcal{N}\left(x;\, Wh + b,\, \tfrac{1}{\beta} I\right)    (13.12)

The prior over each code element is a sparsity-inducing Laplace distribution:

    p(h_i) = \text{Laplace}\left(h_i;\, 0,\, \tfrac{2}{\lambda}\right) = \frac{\lambda}{4}\, e^{-\frac{\lambda}{2} |h_i|}    (13.13)

MAP inference of the code then minimizes an L1-regularized reconstruction error:

    h^* = \arg\min_h\; \lambda \lVert h \rVert_1 + \beta \lVert x - Wh \rVert_2^2    (13.18)
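
Equation 13.18 has no closed-form solution, but it is a standard L1-regularized least-squares problem; one common solver is proximal gradient descent (ISTA). A minimal NumPy sketch, with the dictionary W assumed given and the bias b folded into x for brevity:

    import numpy as np

    def sparse_code_ista(x, W, lam=0.1, beta=1.0, n_steps=200):
        # MAP inference of h in eq. 13.18:
        #   minimize lam * ||h||_1 + beta * ||x - W h||_2^2
        # Step size from the Lipschitz constant of the smooth term's gradient.
        eta = 1.0 / (2 * beta * np.linalg.norm(W.T @ W, 2))
        h = np.zeros(W.shape[1])
        for _ in range(n_steps):
            grad = -2 * beta * W.T @ (x - W @ h)  # gradient of the squared error
            h = h - eta * grad
            # Soft-thresholding: the proximal operator of the L1 penalty.
            h = np.sign(h) * np.maximum(np.abs(h) - eta * lam, 0.0)
        return h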

SLIDE 7


Sparse Coding

Figure 13.2: samples (left panel) and weights (right panel) from a sparse coding model.

SLIDE 8


Manifold Interpretation of PCA

Figure 13.3: a flat Gaussian capturing probability concentration near a low-dimensional manifold.