Weakly Supervised Disentanglement with Guarantees



SLIDE 1

Weakly Supervised Disentanglement with Guarantees

Rui Shu

Joint work with Yining Chen, Abhishek Kumar, Stefano Ermon, Ben Poole

SLIDE 2

Why

Decompose data into a set of underlying human-interpretable factors of variation

Example factors: blue sky, pink wall, small purple ball, green floor

  • Explainable models: What is in the scene?
  • Controllable generation: Generate a red ball instead

SLIDE 3

How: Fully-Supervised

Strategy: Label everything

Example labels:

  • {dark blue wall, green floor, green oval}
  • {green wall, red floor, green cylinder}
  • {red wall, green floor, pink ball}

Controllable generation as label-conditional generative modeling, e.g. conditioning on {green wall, red floor, blue cylinder}

SLIDE 4

How: Fully-Supervised

Problem: Some things are hard to label

  • What kind of glasses?
  • What kind of hairstyle?
  • Generate this guy with this hair

SLIDE 5

How: Unsupervised?

Strategy: Exploit statistical independence assumption + neural net magic

Methods: Beta-VAE, TC-VAE, FactorVAE

Example traversal: swivel the chair

SLIDE 6

How: Unsupervised?

Problem: Is statistical independence assumption + neural net magic enough?

[Figure: latents labeled Z1: Shape and Z2: Shading, compared against an alternative set of latents]

Locatello et al. Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations, ICML 2019.

SLIDE 7

How: Weakly Supervised

Strategy: Leverage “weak” supervision when possible

SLIDE 8

How: Weakly Supervised

Restricted Labeling: Label what we can

  • Pink wall
  • Purple ball
  • Green floor
  • Size: ¯\_(ツ)_/¯

SLIDE 9

How: Weakly Supervised

Match Pairing: Find pairs with known similarities

Example: same ground color

Real-world data: direct intervention to share / change certain factors

SLIDE 10

How: Weakly Supervised

Rank Pairing: Compare pairs

Which is bigger?

SLIDE 11

The Plan

1. Definitions: Decompose disentanglement into:

   a. Consistency
   b. Restrictiveness

2. Guarantees: Prove whether weak supervision guarantees consistency, restrictiveness, or both

Departure from existing literature, which has no end-to-end theoretical framework of disentanglement

SLIDE 12

Definitions

Disentangle: What does it mean when I say Z1 disentangles size?

1. When Z1 is fixed, is size fixed?
2. When we only change Z1, does only size change?

SLIDE 13

Definitions

Disentangle: What does it mean when I say Z1 disentangles size?

1. When Z1 is fixed, is size fixed? (Consistency)
2. When we only change Z1, does only size change? (Restrictiveness)

SLIDE 14

Definitions: Consistency

When Z_I is fixed, S_I is fixed

[Equation labels: oracle encoder, generative model, perturbation-based generation]
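One way to formalize this statement from the three ingredients named on the slide is sketched below. The symbols are assumptions made here for illustration: g is the generative model, e* is the oracle encoder that returns the ground-truth factors of an observation, and e*_I is its restriction to the factors indexed by I.

```latex
% Perturbation-based generation: keep z_I, redraw the remaining latents.
z_I \sim p(z_I), \qquad
z_{\setminus I},\, z'_{\setminus I} \overset{\text{iid}}{\sim} p(z_{\setminus I} \mid z_I)

% Consistency of Z_I with S_I: the oracle encoder reads the same s_I from both samples.
\mathbb{E}\,\bigl\| e^*_I\bigl(g(z_I, z_{\setminus I})\bigr)
                  - e^*_I\bigl(g(z_I, z'_{\setminus I})\bigr) \bigr\|^2 = 0
```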

SLIDE 15

Definitions: Restrictiveness

When only Z_I is changed, only S_I is changed

Equivalently: when Z_\I is fixed, S_\I is fixed

[Equation labels: oracle encoder, generative model, perturbation-based generation]
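A possible formalization of restrictiveness swaps the roles of I and its complement: the latents outside I are held fixed, and the factors outside I must not move. The symbols are assumptions made here for illustration: g is the generative model, e* is the oracle encoder, and e*_{\I} returns the ground-truth factors outside I.

```latex
% Perturbation-based generation: keep z_{\setminus I}, redraw z_I.
z_{\setminus I} \sim p(z_{\setminus I}), \qquad
z_I,\, z'_I \overset{\text{iid}}{\sim} p(z_I \mid z_{\setminus I})

% Restrictiveness of Z_I to S_I: the factors outside I are unchanged.
\mathbb{E}\,\bigl\| e^*_{\setminus I}\bigl(g(z_I, z_{\setminus I})\bigr)
                  - e^*_{\setminus I}\bigl(g(z'_I, z_{\setminus I})\bigr) \bigr\|^2 = 0
```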

SLIDE 16

Definitions: Disentanglement

Z_I is consistent and restricted to S_I
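Both properties can be probed empirically by perturbing latents and asking an oracle what changed. The following is a minimal sketch on a hand-written toy model, not the authors' evaluation code: `generate`, `oracle_encode`, `perturbation_score`, `consistency`, and `restrictiveness` are hypothetical names, and the two latents are assumed independent so that resampling one does not depend on the other.

```python
import random

def generate(z):
    """Toy generative model g: latents (z0, z1) -> observation x (hypothetical)."""
    z0, z1 = z
    return (2.0 * z0, z0 + z1)  # x[0] depends on z0 only, x[1] on both latents

def oracle_encode(x):
    """Toy oracle encoder: observation x -> ground-truth factors (s0, s1)."""
    return (x[0], x[1])

def perturbation_score(resample, measure, n_trials=1000, seed=0):
    """Mean squared change of the factors in `measure` when the latents in
    `resample` are redrawn and all other latents are held fixed."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        z = [rng.gauss(0.0, 1.0) for _ in range(2)]
        z_new = [rng.gauss(0.0, 1.0) if j in resample else z[j] for j in range(2)]
        s = oracle_encode(generate(z))
        s_new = oracle_encode(generate(z_new))
        total += sum((s[i] - s_new[i]) ** 2 for i in measure)
    return total / n_trials

ALL = {0, 1}

def consistency(I):
    # Fix z_I, resample the other latents, and watch s_I (0 means consistent).
    return perturbation_score(resample=ALL - I, measure=I)

def restrictiveness(I):
    # Resample z_I, fix the other latents, and watch the other factors
    # (0 means restricted).
    return perturbation_score(resample=I, measure=ALL - I)

if __name__ == "__main__":
    for I in ({0}, {1}):
        print(sorted(I), consistency(I), restrictiveness(I))
```

In this toy model z0 is consistent with s0 but not restricted to it (changing z0 also moves s1), while z1 is restricted to s1 but not consistent with it, so neither latent disentangles its factor.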

SLIDE 17

Consistency versus Restrictiveness

When only Z_I is changed, only S_I is changed

Equivalently: when Z_\I is fixed, S_\I is fixed

SLIDE 18

Consistency versus Restrictiveness

SLIDE 19

Union Rules

Consistency Union: If fixing Z_I fixes S_I and fixing Z_J fixes S_J, then fixing (Z_I, Z_J) fixes (S_I, S_J)

Restrictiveness Union: If changing Z_I changes only S_I and changing Z_J changes only S_J, then changing (Z_I, Z_J) changes only (S_I, S_J)

SLIDE 20

Intersection Rules

Let V = I ∩ J.

Consistency Intersection: If fixing Z_I fixes S_I and fixing Z_J fixes S_J, then fixing Z_V fixes S_V

Restrictiveness Intersection: If changing Z_I changes only S_I and changing Z_J changes only S_J, then changing Z_V changes only S_V
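The union and intersection rules can be applied mechanically: starting from the index sets for which a property is known, keep combining pairs until nothing new appears. The sketch below is illustrative only; `close_under_union_and_intersection` is a hypothetical helper, and empty intersections are dropped on the assumption that a statement about no factors is uninformative.

```python
from itertools import combinations

def close_under_union_and_intersection(index_sets):
    """Return every non-empty index set derivable from `index_sets` by
    repeatedly applying the union and intersection rules."""
    known = {frozenset(s) for s in index_sets}
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so `known` can grow inside the loop.
        for a, b in combinations(list(known), 2):
            for derived in (a | b, a & b):
                if derived and derived not in known:
                    known.add(derived)
                    changed = True
    return known

# Suppose consistency is known for factors {0, 1} and for factors {1, 2}.
consistent = close_under_union_and_intersection([{0, 1}, {1, 2}])
```

Here the rules add {0, 1, 2} by union and {1} by intersection. The same closure applies unchanged to a collection of restrictiveness statements.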

SLIDE 21

Disentanglement Rule

Disentanglement via Consistency: Consistency on all factors implies disentanglement on all factors

Disentanglement via Restrictiveness: Restrictiveness on all factors implies disentanglement on all factors
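A short derivation of the consistency version, using only the consistency union rule and the fact that restrictiveness of Z_i is the same as consistency of all the other latents. The shorthand is an assumption made here: C(·), R(·), and D(·) denote consistency, restrictiveness, and disentanglement, n is the number of factors, and \i denotes every index except i.

```latex
% Step 1 (consistency union over every j != i):
\bigwedge_{j=1}^{n} C(j) \;\Longrightarrow\; C(\setminus i)

% Step 2 (fixing Z_{\setminus i} fixes S_{\setminus i} is restrictiveness of Z_i):
C(\setminus i) \;\Longleftrightarrow\; R(i)

% Step 3 (C(i) is part of the premise, so both properties hold for every i):
\bigwedge_{j=1}^{n} C(j) \;\Longrightarrow\; C(i) \wedge R(i) \;=:\; D(i)
```

The restrictiveness version is symmetric: the restrictiveness union rule gives R(\i), which is equivalent to C(i).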

SLIDE 22

Summary of Rules

SLIDE 23

Summary of Rules

SLIDE 24

Strategy for Disentanglement

Dataset 1 → C(1)
Dataset 2 → C(2)
…
Dataset n → C(n)

Using datasets together (+ right algorithm) guarantees full disentanglement

SLIDE 25

Restricted Labeling Guarantees Consistency

Ground-truth graph: (s_I, s_\I) → x

Distribution Match

Model graph: (z_I, z_\I) → x

Z_I will be consistent with S_I

SLIDE 26

Match Pairing Guarantees Consistency

Ground-truth graph: (s_I, s_\I) → x and (s_I, s'_\I) → x'

Distribution Match

Model graph: (z_I, z_\I) → x and (z_I, z'_\I) → x'

Z_I will be consistent with S_I

SLIDE 27

Rank Pairing Guarantees Consistency

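Ground-truth graph: (s_i, s_\i) → x, (s'_i, s'_\i) → x', and (s_i, s'_i) → y

Distribution Match

Model graph: (z_i, z_\i) → x, (z'_i, z'_\i) → x', and (z_i, z'_i) → y

Z_I will be consistent with S_I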
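The three weak-supervision schemes (restricted labeling, match pairing, rank pairing) differ only in what each dataset exposes. The sketch below writes them as toy samplers; it is an illustration under stated assumptions, not the authors' data pipeline. `render` is a hypothetical stand-in for the true generative process, factors are drawn uniformly and independently, and the direction of the rank label (y = 1 when s_i >= s'_i) is an assumption.

```python
import random

N_FACTORS = 3  # toy ground truth: s = (s0, s1, s2), each uniform on [0, 1)

def render(s):
    """Stand-in for the true process s -> x (hypothetical; here x simply
    copies the factors so the samplers can be checked)."""
    return tuple(s)

def sample_factors(rng):
    return [rng.random() for _ in range(N_FACTORS)]

def restricted_labeling(I, rng):
    """Return (s_I, x): the observation plus labels for the factors in I only."""
    s = sample_factors(rng)
    return tuple(s[i] for i in sorted(I)), render(s)

def match_pairing(I, rng):
    """Return (x, x'): two observations that share s_I; other factors are
    drawn independently for each observation."""
    s, s_other = sample_factors(rng), sample_factors(rng)
    for i in I:
        s_other[i] = s[i]
    return render(s), render(s_other)

def rank_pairing(i, rng):
    """Return (x, x', y): two independent observations and a binary label
    y = 1 if s_i >= s'_i else 0 (comparison direction assumed)."""
    s, s_other = sample_factors(rng), sample_factors(rng)
    return render(s), render(s_other), int(s[i] >= s_other[i])
```

None of the samplers reveals the factors outside I (or i) as labels; a model trained to match these distributions only ever sees x, pairs of x, or pairs with a comparison bit.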

SLIDE 28

Summary of Guarantees

SLIDE 29

Targeted Consistency / Restrictiveness

Generative model trained via restricted labeling at S_5

Model evaluated on consistency of Z_0 vs. S_0

SLIDE 30

Targeted Consistency / Restrictiveness

Figure panels:

  • Consistency: Restricted Labeling
  • Consistency: Match Pairing (Share 1 factor)
  • Restrictiveness: Match Pairing (Change 1 factor)
  • Consistency: Rank Pairing
  • Restrictiveness: Intersection

SLIDE 31

Consistency versus Restrictiveness

  • Models trained to guarantee only consistency or restrictiveness of one factor
  • Strong correlation of consistency vs. restrictiveness

SLIDE 32

Digression: Style-Content Disentanglement

Graph: (y, z) → x, where y is the observed class label (content) and z is the unobserved style

  • Only content-consistency is guaranteed
  • Style-content disentanglement is not guaranteed (but can still emerge due to neural net magic)

SLIDE 33

Full Disentanglement

SLIDE 34

Full Disentanglement: Visualizations

  • Visualize multiple rows of single-factor ablation
  • Check for consistency and restrictiveness

Factors shown: elevation, azimuth
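One reading of "multiple rows of single-factor ablation" is a grid of latent vectors in which each row starts from a different base sample and sweeps a single coordinate. The sketch below builds only the latent grid; `single_factor_traversals` is a hypothetical helper, and decoding each vector into an image is left to whatever generative model is being inspected.

```python
def single_factor_traversals(base_latents, factor, values):
    """For each base latent vector, return one row of latent vectors in which
    only coordinate `factor` sweeps over `values`; all others stay fixed."""
    rows = []
    for base in base_latents:
        row = []
        for v in values:
            z = list(base)   # copy so the base vector is not modified
            z[factor] = v
            row.append(z)
        rows.append(row)
    return rows

bases = [[0.0, 0.5, -1.0], [1.0, -0.5, 2.0]]
grid = single_factor_traversals(bases, factor=1, values=[-2.0, 0.0, 2.0])
```

Because every row sweeps the same values, a column shares z_i across different settings of the other latents, which is where consistency shows up; within a row only z_i changes, which is where restrictiveness shows up.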

SLIDE 35

Full Disentanglement: Visualizations

  • Visualize multiple rows of single-factor ablation
  • Check for consistency and restrictiveness

Ground truth factors: floor color, wall color, object color, object size, object type, and azimuth.

SLIDE 36

Full Disentanglement: Visualizations

  • Visualize multiple rows of single-factor ablation
  • Check for consistency and restrictiveness

Ground truth factor: object size
Ground truth factor: wall color

SLIDE 37

Conclusions


  • Definitions for disentanglement
  • A calculus of disentanglement
  • Analyzed weak supervision methods
  • Demonstrated guarantees empirically
SLIDE 38

Conclusions

  • Definitions for disentanglement
  • A calculus of disentanglement
  • Analyzed weak supervision methods
  • Demonstrated guarantees empirically

  • Better definitions?
  • Do new definitions preserve calculus?
  • Analyze other weak supervision methods?
  • Cost of weak supervision in real world?
SLIDE 39

Assumption: X → S is deterministic

Example factors: blue sky, pink wall, small purple ball, green floor

SLIDE 40

Questions?

[Figure panels: Entangled, Disentangled]

ruishu@stanford.edu
@_smileyball
@smiley._.ball