Neurally-Guided Structure Inference
http://ngsi.csail.mit.edu

SLIDE 1

Neurally-Guided Structure Inference

Sidi Lu*, Jiayuan Mao*, Josh Tenenbaum, and Jiajun Wu (* indicates equal contributions) http://ngsi.csail.mit.edu

SLIDE 2

Structure Inference

SLIDE 3

Structure Inference

[Kemp et al. 2008]

Data

SLIDE 4

Structure Inference

[Kemp et al. 2008]

Data Structure

SLIDE 5

Structure Inference

[Kemp et al. 2008]

Data Structure

[LeCun et al. 1998]

SLIDE 6

Structure Inference

[Kemp et al. 2008]

Data Structure

[LeCun et al. 1998] [Chen et al. 2016]

Structure Inference

SLIDE 7

Matrix Decomposition Models

  • Clustering: MG+G
  • Low-Rank Approximation: GG+G
  • Binary Features: BG+G
  • Random Walk: CG+G
  • Co-Clustering: M(GMT+G)+G
  • Clustered Matrix Decomp.: (MG+G)(GMT+G)+G
  • Binary Matrix Factorization: (BG+G)(GBT+G)+G
  • Dependent GSM: (exp(GG+G)◦G)+G
  • ……
SLIDE 8

Structure Inference with Matrix Decomposition Models

[Grosse et al. 2012]

SLIDE 9

Structure Inference with Matrix Decomposition Models

Input Matrix (20×3)

[Grosse et al. 2012]

SLIDE 10

Structure Inference with Matrix Decomposition Models

Structure: G

Input Matrix (20×3)

[Grosse et al. 2012]

SLIDE 11

Structure Inference with Matrix Decomposition Models

Structure: G

Input Matrix (20×3)

[Grosse et al. 2012]

SLIDE 12

Structure Inference with Matrix Decomposition Models

Structure: G

Input Matrix (20×3)

Cluster

[Grosse et al. 2012]

SLIDE 13

Structure Inference with Matrix Decomposition Models

Structure: G

Input Matrix (20×3)

Cluster

Cluster Label (20×5)

[Grosse et al. 2012]

SLIDE 14

Structure Inference with Matrix Decomposition Models

Structure: G

Input Matrix (20×3)

Cluster

Cluster Label (20×5) · Cluster Center (5×3)

[Grosse et al. 2012]

SLIDE 15

Structure Inference with Matrix Decomposition Models

Structure: G

Input Matrix (20×3)

Cluster

Cluster Label (20×5) · Cluster Center (5×3) + Cluster Noise (20×3)

Structure: MG + G

[Grosse et al. 2012]
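The clustering decomposition on this slide (structure MG+G) can be sketched generatively in NumPy, with the slide's sizes: a 20×3 input built from a 20×5 one-hot label matrix, 5×3 cluster centers, and per-entry Gaussian noise. All numbers here are synthetic stand-ins for the slide's figures:

```python
# Generative view of MG+G: input = one-hot labels @ centers + noise.
import numpy as np

rng = np.random.default_rng(0)
labels = rng.integers(0, 5, size=20)      # which cluster each row belongs to
M = np.eye(5)[labels]                     # one-hot cluster-label matrix, 20×5
centers = rng.normal(size=(5, 3))         # cluster centers, 5×3
noise = 0.05 * rng.normal(size=(20, 3))   # small per-entry Gaussian noise

X = M @ centers + noise                   # the observed 20×3 input matrix

# Sanity check: each row should sit near its own cluster's center.
dists = ((X[:, None, :] - centers[None]) ** 2).sum(-1)  # 20×5
nearest = np.argmin(dists, axis=1)
```

With small noise, nearest-center assignment recovers almost all of the planted labels, which is exactly the structure the inference procedure is asked to discover from X alone.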

SLIDE 16

Structure Inference with Matrix Decomposition Models

Structure: G

Input Matrix (20×3)

Cluster

Cluster Label (20×5) · Cluster Center (5×3) + Cluster Noise (20×3)

Structure: MG + G

[Grosse et al. 2012]

SLIDE 17

Structure Inference with Matrix Decomposition Models

Structure: G

Input Matrix (20×3)

Cluster

Cluster Label (20×5) · Cluster Center (5×3) + Cluster Noise (20×3)

Structure: MG + G

[Grosse et al. 2012]

SLIDE 18

Structure Inference with Matrix Decomposition Models

Structure: G

Input Matrix (20×3)

Cluster

Cluster Label (20×5) · Cluster Center (5×3) + Cluster Noise (20×3)

Structure: MG + G → LowRank

[Grosse et al. 2012]

SLIDE 19

Structure Inference with Matrix Decomposition Models

Structure: G

Input Matrix (20×3)

Cluster

Cluster Label (20×5) · Cluster Center (5×3) + Cluster Noise (20×3)

Structure: MG + G → LowRank

Cluster Label (20×5) · [Cluster Center (5×2) @ (2×3)] + Cluster Noise (20×3)

[Grosse et al. 2012]

SLIDE 20

Structure Inference with Matrix Decomposition Models

Structure: G

Input Matrix (20×3)

Cluster

Cluster Label (20×5) · Cluster Center (5×3) + Cluster Noise (20×3)

Structure: MG + G → LowRank

Cluster Label (20×5) · [Cluster Center (5×2) @ (2×3)] + Cluster Noise (20×3)

Structure: M(GG + G) + G

[Grosse et al. 2012]
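The refinement on this slide replaces the 5×3 center matrix with a rank-2 product (5×2 @ 2×3), turning MG+G into M(GG+G)+G. A minimal NumPy sketch of that step, using SVD truncation purely as one illustrative way to obtain the rank-2 factors (not necessarily how the model is actually fit):

```python
# Rank-2 refinement of the cluster-center matrix: 5×3 → (5×2) @ (2×3).
import numpy as np

rng = np.random.default_rng(1)
left = rng.normal(size=(5, 2))        # 5×2 factor
right = rng.normal(size=(2, 3))       # 2×3 factor
centers = left @ right                # a 5×3 center matrix of true rank 2

U, s, Vt = np.linalg.svd(centers, full_matrices=False)
rank2 = (U[:, :2] * s[:2]) @ Vt[:2]   # best rank-2 approximation

# Because centers genuinely has rank 2, the truncation is numerically exact.
```

The payoff is that a structure discovered at one layer (here, clustering) can itself be decomposed further, which is what makes the search space compositional.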

SLIDE 21

Structure Inference with Matrix Decomposition Models

Input Matrix (20×3)

Structure: G

Cluster

Cluster Label (20×5) · Cluster Center (5×3) + Cluster Noise (20×3)

Structure: MG + G

LowRank

Cluster Label (20×5) · [Cluster Center (5×2) @ (2×3)] + Cluster Noise (20×3)

Structure: M(GG + G) + G

[Grosse et al. 2012]

SLIDE 22

Naïve Exhaustive Search

  • Clustering: MG+G
  • Low-Rank Approximation: GG+G
  • Binary Features: BG+G
  • Random Walk: CG+G
  • Co-Clustering: M(GMT+G)+G
  • Clustered Matrix Decomp.: (MG+G)(GMT+G)+G
  • Binary Matrix Factorization: (BG+G)(GBT+G)+G
  • Dependent GSM: (exp(GG+G)◦G)+G
  • ……
SLIDE 23

Naïve Exhaustive Search

  • Clustering: MG+G
  • Low-Rank Approximation: GG+G
  • Binary Features: BG+G
  • Random Walk: CG+G
  • Co-Clustering: M(GMT+G)+G
  • Clustered Matrix Decomp.: (MG+G)(GMT+G)+G
  • Binary Matrix Factorization: (BG+G)(GBT+G)+G
  • Dependent GSM: (exp(GG+G)◦G)+G
  • ……

Enumerate → Rank → Select
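The enumerate/rank/select loop above can be sketched as a generic search skeleton. Here `fit_and_score` and the toy score table are hypothetical stand-ins for actually fitting each decomposition to the data and scoring the fit (e.g. by a likelihood-based criterion):

```python
# Exhaustive structure search: fit every candidate, rank by score, select.

def exhaustive_search(data, candidates, fit_and_score):
    """Return the best (score, structure) pair over all candidates."""
    scored = [(fit_and_score(data, s), s) for s in candidates]  # rank
    return max(scored)                                          # select

# Toy usage: "fitting" is replaced by a fixed per-structure score.
toy_scores = {"MG+G": -3.0, "GG+G": -5.0, "BG+G": -7.5}
best = exhaustive_search(None, toy_scores, lambda _, s: toy_scores[s])
```

The cost is obvious: the candidate set grows combinatorially with grammar depth, and every candidate must be fit, which is what motivates the layer-wise and neurally guided variants on the next slides.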

SLIDE 24

Layer-wise Exhaustive Search

G

Structure: G

Input Matrix (20×3)

GG + G   MG + G

……

Cluster Label (20×5) · Cluster Center (5×3)

Structure: MG + G

M(exp(G) ∘ G) + G   M(GG+G) + G   M(GMT+G) + G

……

Cluster Center (5×2) @ (2×3)

Structure: M(GG + G) + G

Cluster Label (20×5) · [Cluster Center (5×2) @ (2×3)]

Enumerate → Rank → Select

SLIDE 25

G

Structure: G

Input Matrix (20×3)

GG + G   MG + G

……

Cluster Label (20×5) · Cluster Center (5×3)

Structure: MG + G

M(exp(G) ∘ G) + G   M(GG+G) + G   M(GMT+G) + G

……

Cluster Center (5×2) @ (2×3)

Structure: M(GG + G) + G

Cluster Label (20×5) · [Cluster Center (5×2) @ (2×3)]

Enumerate → Rank → Select

Key Observation: Each step involves the same sub-problem.

Layer-wise Exhaustive Search
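Because each step poses the same sub-problem, the search can be made greedy and layer-wise: expand only the current best structure by one production, keep the best-scoring refinement, and stop when nothing improves. A minimal sketch, with `expand` and `fit_and_score` as hypothetical stand-ins for grammar expansion and model fitting:

```python
# Greedy layer-wise search: repeatedly apply the best one-step refinement.

def layerwise_search(data, expand, fit_and_score, max_depth=3):
    best = "G"
    best_score = fit_and_score(data, best)
    for _ in range(max_depth):
        refinements = [(fit_and_score(data, s), s) for s in expand(best)]
        if not refinements or max(refinements)[0] <= best_score:
            break  # no one-step refinement improves the fit
        best_score, best = max(refinements)
    return best

# Toy usage: hand-written scores that reward exactly two refinements.
scores = {"G": -10.0, "MG+G": -6.0, "M(GG+G)+G": -4.0}
children = {"G": ["MG+G"], "MG+G": ["M(GG+G)+G"], "M(GG+G)+G": []}
best = layerwise_search(None, lambda s: children[s], lambda _, s: scores[s])
```

This cuts the search from exponential in depth to (candidates per layer) × (depth) model fits, at the price of greediness.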

SLIDE 26

G

Structure: G

Input Matrix (20×3)

GG + G   MG + G

……

Cluster Label (20×5) · Cluster Center (5×3)

Structure: MG + G

M(GG+G) + G

Cluster Center (5×2) @ (2×3)

Structure: M(GG + G) + G

Cluster Label (20×5) · [Cluster Center (5×2) @ (2×3)]

Use a Neural Network for Layer-wise Amortized Inference.

Neurally Guided Search

Cluster Center (5×3)

Neural Net Prediction: GG+G
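Amortization means the per-layer choice is made by a classifier rather than by fitting every candidate: featurize the current (sub-)matrix and score the productions directly, as in the "Neural Net Prediction: GG+G" box above. A minimal sketch with untrained weights and hand-rolled features standing in for the real encoder (everything here is illustrative, not the paper's architecture):

```python
# One linear layer + argmax as a stand-in for the guiding network.
import numpy as np

RULES = ["MG+G", "GG+G", "BG+G", "CG+G"]

def featurize(X):
    """Tiny hand-rolled features standing in for the real encoder."""
    return np.array([X.mean(), X.std(), float(np.linalg.matrix_rank(X))])

def predict_rule(X, W, b):
    """Score every production rule and return the argmax."""
    logits = W @ featurize(X) + b
    return RULES[int(np.argmax(logits))]

rng = np.random.default_rng(2)
W = rng.normal(size=(len(RULES), 3))
b = rng.normal(size=len(RULES))
rule = predict_rule(rng.normal(size=(20, 3)), W, b)
```

Trained on synthetically generated (matrix, structure) pairs, such a predictor replaces the inner enumerate/rank loop while the outer layer-wise recursion stays exactly as before.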

SLIDE 27

Experiment – Accuracy

SLIDE 28

Experiment – Accuracy

Train

SLIDE 29

Experiment – Accuracy

Train / Test

SLIDE 30

Experiment – Running Time Comparison

SLIDE 31

Experiment – Running Time Comparison

Speed-up: 3∼39×.

SLIDE 32

Conclusion

SLIDE 33

Conclusion

  • Combine search-based algorithms and data-driven models.
SLIDE 34

Conclusion

  • Combine search-based algorithms and data-driven models.
  • Use neural networks to guide layer-wise search.
SLIDE 35

Conclusion

  • Combine search-based algorithms and data-driven models.
  • Use neural networks to guide layer-wise search.
  • Advantages:
      • Accurate (cf. search-based algorithms).
SLIDE 36

Conclusion

  • Combine search-based algorithms and data-driven models.
  • Use neural networks to guide layer-wise search.
  • Advantages:
      • Accurate (cf. search-based algorithms).
      • Efficient (cf. data-driven models).
SLIDE 37

Conclusion

  • Combine search-based algorithms and data-driven models.
  • Use neural networks to guide layer-wise search.
  • Advantages:
      • Accurate (cf. search-based algorithms).
      • Efficient (cf. data-driven models).
      • Combinatorial generalization.
SLIDE 38

Conclusion

  • Combine search-based algorithms and data-driven models.
  • Use neural networks to guide layer-wise search.
  • Advantages:
      • Accurate (cf. search-based algorithms).
      • Efficient (cf. data-driven models).
      • Combinatorial generalization.

Poster #233

Project Page