SLIDE 1

Representation Learning

Lecture slides for Chapter 15 of Deep Learning www.deeplearningbook.org Ian Goodfellow 2017-10-03

SLIDE 2

Unsupervised Pretraining Usually Hurts but Sometimes Helps

Figure (Ma et al., 2015): histogram of the average advantage of pretraining over not pretraining across many different chemistry datasets. Datasets left of the break-even point are harmed by pretraining; those to the right benefit.
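
The "average advantage" on the histogram's axis is simply the mean difference in test error between training from scratch and training from a pretrained initialization, computed per dataset. A minimal bookkeeping sketch in numpy, with made-up error numbers standing in for the Ma et al. (2015) measurements:

    import numpy as np

    # Hypothetical test errors on five datasets (made-up numbers, not
    # the Ma et al. 2015 results), one entry per dataset.
    err_no_pretrain = np.array([0.31, 0.24, 0.40, 0.18, 0.27])
    err_pretrained  = np.array([0.28, 0.26, 0.33, 0.19, 0.22])

    # Positive advantage means pretraining helped on that dataset.
    advantage = err_no_pretrain - err_pretrained
    print("per-dataset advantage:", advantage)
    print("average advantage:", advantage.mean())
    # Datasets left of the break-even point (advantage < 0) were harmed.
    print("fraction harmed:", (advantage < 0).mean())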

SLIDE 3

Pretraining Changes Learning Trajectory

Figure 15.1: 2-D projection of learning trajectories, comparing networks trained with pretraining and without pretraining.

SLIDE 4

Representation Sharing for Multi-Task or Transfer Learning

Figure 15.2: a selection switch routes one of several input formats x(1), x(2), x(3) through its own encoder to h(1), h(2), or h(3), which feeds a shared representation h(shared) used to produce the output y. One representation can be used for many input formats or many tasks, as in the sketch below.
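
As a concrete reading of the diagram, each input format gets its own encoder, a selection switch routes the active format into the shared representation, and one head produces y. A minimal numpy forward-pass sketch, with random weights standing in for trained ones (all names here are hypothetical):

    import numpy as np

    rng = np.random.default_rng(0)
    d_shared = 16

    # One encoder per input format; random placeholder weights.
    encoders = {
        "x1": rng.normal(size=(8, d_shared)),    # format 1: 8-dim inputs
        "x2": rng.normal(size=(12, d_shared)),   # format 2: 12-dim inputs
        "x3": rng.normal(size=(5, d_shared)),    # format 3: 5-dim inputs
    }
    W_out = rng.normal(size=(d_shared, 1))       # shared head producing y

    def predict(x, fmt):
        # Selection switch: route x through its format's encoder,
        # then through the shared representation h(shared) to y.
        h_shared = np.tanh(x @ encoders[fmt])
        return h_shared @ W_out

    print(predict(rng.normal(size=8), "x1"))
    print(predict(rng.normal(size=12), "x2"))
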
SLIDE 5

Zero-Shot Learning

Figure 15.3: zero-shot learning. (x, y) pairs in the training set tie together an encoder fx for x-space (hx = fx(x)) and an encoder fy for y-space (hy = fy(y)). Relationships between embedded points within one of the domains, combined with the maps between the two representation spaces, allow a test input xtest to be matched to a label ytest.
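
Read operationally: once fx and fy embed inputs and labels into a common space, a test input can be matched to a label that never appeared paired with any x by nearest-neighbor search among embedded candidates. A hedged sketch with random linear projections standing in for the trained encoders:

    import numpy as np

    rng = np.random.default_rng(1)
    d = 10  # dimensionality of the shared representation space

    # Stand-ins for the learned encoders fx and fy; in practice both
    # are trained so that paired (x, y) points land close together.
    W_x = rng.normal(size=(20, d))
    W_y = rng.normal(size=(30, d))
    f_x = lambda x: x @ W_x
    f_y = lambda y: y @ W_y

    # Candidate label descriptions, including unseen ones.
    candidate_ys = rng.normal(size=(7, 30))

    x_test = rng.normal(size=20)
    h_x = f_x(x_test)
    h_y = f_y(candidate_ys)                 # embed every candidate label

    # Predict the label whose embedding is nearest to the test input's.
    dists = np.linalg.norm(h_y - h_x, axis=1)
    print("predicted label index:", dists.argmin())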

SLIDE 6

Mixture Modeling Discovers Separate Classes

Figure 15.4: a mixture density p(x) over x whose three components correspond to the classes y = 1, y = 2, and y = 3.
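
A mixture model fit to unlabeled x can recover this class structure on its own. A minimal sketch using scikit-learn's GaussianMixture on synthetic 1-D data drawn from three clusters (the data are made up; the library call is standard):

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(2)
    # Unlabeled samples secretly drawn from three classes y = 1, 2, 3.
    x = np.concatenate([rng.normal(-4, 0.8, 300),
                        rng.normal(0, 0.8, 300),
                        rng.normal(4, 0.8, 300)]).reshape(-1, 1)

    gm = GaussianMixture(n_components=3, random_state=0).fit(x)
    print("recovered component means:", sorted(gm.means_.ravel()))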

SLIDE 7

Mean Squared Error Can Ignore Small but Task-Relevant Features

Figure 15.5 (input vs. reconstruction): the ping-pong ball vanishes because it is not large enough to significantly affect the mean squared error.
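
The failure is easy to quantify: a small, salient object occupies so few pixels that deleting it barely moves the per-pixel average. A toy numpy calculation on a synthetic image (not the figure's data):

    import numpy as np

    img = np.zeros((64, 64))
    img[30:33, 30:33] = 1.0      # a tiny bright "ball" covering 9 pixels

    recon = np.zeros_like(img)   # a reconstruction that drops the ball

    # MSE averages over all 4096 pixels, so losing the 9-pixel ball
    # costs almost nothing.
    mse = ((img - recon) ** 2).mean()
    print(mse)                   # 9 / 4096 ≈ 0.0022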

SLIDE 8

Adversarial Losses Preserve Any Features with Highly Structured Patterns

Figure 15.6 (panels: ground truth, MSE, adversarial): mean squared error loses the ear because it causes only a small change in a few pixels, while the adversarial loss preserves the ear because its absence is easy to notice.
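
Schematically, an adversarial reconstruction loss augments the per-pixel average with a learned critic that scores whether the reconstruction looks real, so a missing structured detail changes the score even when it barely changes the MSE. A loose sketch with a fixed stand-in critic (a real discriminator is a trained network; everything below is hypothetical):

    import numpy as np

    rng = np.random.default_rng(3)
    w_critic = rng.normal(size=64 * 64)

    def discriminator(img):
        # Stand-in critic mapping an image to a "realness" probability.
        return 1.0 / (1.0 + np.exp(-1e-2 * (img.ravel() @ w_critic)))

    def loss(recon, target, lam=0.1):
        mse = ((recon - target) ** 2).mean()
        adv = -np.log(discriminator(recon) + 1e-8)  # penalize "fake-looking" output
        return mse + lam * adv

    target = rng.random((64, 64))
    print(loss(rng.random((64, 64)), target))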

SLIDE 9

Binary Distributed Representations Divide Space Into Many Uniquely Identifiable Regions

Figure 15.7: three binary features h1, h2, h3 each split the space with a hyperplane, carving out uniquely identifiable regions h = [1, 1, 1]^T, [0, 1, 1]^T, [1, 0, 1]^T, [1, 1, 0]^T, [0, 1, 0]^T, [0, 0, 1]^T, and [1, 0, 0]^T.
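
The region count is easy to check empirically: sample points in the plane, threshold three hyperplanes, and collect the distinct binary codes that occur. A small numpy sketch with random hyperplanes (not the figure's):

    import numpy as np

    rng = np.random.default_rng(4)
    W = rng.normal(size=(3, 2))   # three hyperplanes (lines) in 2-D
    b = rng.normal(size=3)

    pts = rng.uniform(-5, 5, size=(100_000, 2))
    h = (pts @ W.T + b > 0).astype(int)   # binary code per point

    codes = {tuple(row) for row in h}
    print(len(codes), "regions:", sorted(codes))  # up to 7 for 3 lines in 2-D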

SLIDE 11

Nearest Neighbor Divides Space Into One Region Per Centroid

Figure 15.8
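
By contrast, a purely local representation assigns each point only the identity of its closest template, so k centroids carve the space into exactly k Voronoi cells. A minimal numpy sketch:

    import numpy as np

    rng = np.random.default_rng(5)
    centroids = rng.uniform(-5, 5, size=(6, 2))   # k = 6 template points

    pts = rng.uniform(-5, 5, size=(100_000, 2))
    # Assign every point to its nearest centroid: one region per centroid.
    d = np.linalg.norm(pts[:, None, :] - centroids[None, :, :], axis=2)
    regions = d.argmin(axis=1)

    print(len(np.unique(regions)), "regions for", len(centroids), "centroids")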

SLIDE 12

GANs learn vector spaces that support semantic arithmetic

Figure 15.9: vector arithmetic in the learned latent space (man with glasses − man without glasses + woman without glasses = woman with glasses).
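
In code, the arithmetic happens in the generator's latent space: subtract and add averaged latent vectors, then decode the result. A hedged sketch with a stub generator G standing in for a trained GAN (all weights and vectors below are hypothetical placeholders):

    import numpy as np

    rng = np.random.default_rng(6)
    W_gen = rng.normal(size=(100, 64 * 64))

    def G(z):
        # Stub generator: a trained GAN maps latent z to an image;
        # here a fixed random projection stands in for it.
        return np.tanh(z @ W_gen).reshape(64, 64)

    # Averaged latent codes for each concept; in practice these come from
    # averaging z vectors whose generated samples show that attribute.
    z_man_glasses = rng.normal(size=100)
    z_man = rng.normal(size=100)
    z_woman = rng.normal(size=100)

    z_result = z_man_glasses - z_man + z_woman
    print(G(z_result).shape)   # ideally decodes to a woman with glasses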