Learning Discrete Graphical Models with Neural Networks


Slide 1

Managed by Triad National Security, LLC for the U.S. Department of Energy’s NNSA

UNCLASSIFIED

Learning Discrete Graphical Models with Neural Networks

Andrey Lokhov, joint work with Abhijith Jayakumar, Sidhant Misra, and Marc Vuffray

Slide 2

Graphical Models

The probability distribution $\mu(\sigma)$ has a conditional dependency structure given by a graph over the variables $\sigma_1, \dots, \sigma_6$.

Separation property: $\sigma_6 \mid (\sigma_5, \sigma_1)$ is independent of $(\sigma_3, \sigma_4, \sigma_2)$.

Factorization property: $\mu(\sigma) \propto \exp\left( \sum_{c \in \text{cliques}} f_c(\sigma_c) \right)$
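The separation property can be checked numerically on a toy model. The sketch below is an illustration that is not taken from the slides: it assumes a three-variable binary chain 1 - 2 - 3 with hypothetical coupling values, builds the factorized distribution by enumeration, and checks that variable 3 is independent of variable 1 once variable 2 is given.

```python
import itertools
import math

# Hypothetical chain 1 - 2 - 3 with pairwise clique functions (illustrative values).
J12, J23 = 0.8, -0.5

def weight(s):
    """Unnormalized probability: exp of the sum of clique functions."""
    s1, s2, s3 = s
    return math.exp(J12 * s1 * s2 + J23 * s2 * s3)

states = list(itertools.product([-1, 1], repeat=3))
Z = sum(weight(s) for s in states)
prob = {s: weight(s) / Z for s in states}

def cond_s3_given(s1, s2):
    """P(s3 = +1 | s1, s2), computed from the joint distribution."""
    num = prob[(s1, s2, 1)]
    den = prob[(s1, s2, 1)] + prob[(s1, s2, -1)]
    return num / den

# Node 2 separates node 1 from node 3, so the conditional must not depend on s1.
max_gap = max(
    abs(cond_s3_given(1, s2) - cond_s3_given(-1, s2)) for s2 in (-1, 1)
)
```

A value of `max_gap` at the level of floating-point rounding indicates that conditioning on the separating node removes the dependence on the far node.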

Slide 3

Graphical Model Learning Informally

Unsupervised learning task

  • Observe draws of random vectors $\sigma$
  • Learn structure and parameters of a positive distribution $\mu(\sigma) > 0$

Dimensions of the problem

  • Number of samples: $n$
  • Number of variables: $p$
  • Alphabet size: $q$ ($\sigma_i \in \{1, \dots, q\}$)

Prior work in computationally efficient learning

  • Mutual information based greedy methods: Bresler (2015); Hamilton, Koehler, Moitra (2017)
  • Convex optimization based methods: Klivans, Meka (2017); Wu, Sanghavi, Dimakis (2019); Vuffray, Misra, Lokhov (2016, 2018)

Slide 4

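Setting of Graphical Model Learning

The model has a parametric form:

$\mu(\sigma) \propto \exp\left( \sum_{k \in K} \theta^*_k f_k(\sigma_k) \right)$

  • Basis functions are centered: $\sum_{\sigma_i} f_k(\sigma_k) = 0$
  • Prior $\ell_1$-bound on parameters: $\|\theta^*_i\|_1 = \sum_{k \ni i} |\theta^*_k| \le \gamma$
  • Observe random draws of $\sigma$
  • Recover parameters $\hat{\theta}$ such that $\|\hat{\theta} - \theta^*\| \le \alpha/2$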
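The two assumptions of this setting can be made concrete on a small example. The sketch below assumes a hypothetical binary model with monomial basis functions, with parameters stored in a dictionary keyed by the variables each monomial acts on; the names and values are illustrative only. It computes the per-variable $\ell_1$ norm of the parameters and checks the centering condition for one monomial.

```python
def l1_norm_per_variable(theta, p):
    """Sum of |theta_k| over the basis functions k that contain variable i."""
    return [sum(abs(v) for k, v in theta.items() if i in k) for i in range(p)]

# Hypothetical parameters keyed by the tuple of variables each monomial acts on.
theta = {(0,): 0.3, (0, 1): 0.5, (0, 2): -0.4, (1, 2): 0.6}
norms = l1_norm_per_variable(theta, 3)

# Any gamma at least as large as the largest per-variable norm is a valid bound.
gamma = max(norms)

# Monomials over {-1, +1} are centered: summing over one of their variables
# gives zero. Checked here for sigma_0 * sigma_1, summing over sigma_0.
centered = all(sum(s0 * s1 for s0 in (-1, 1)) == 0 for s1 in (-1, 1))
```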

Slide 5

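Method for Solving the Inverse Problem: GRISE

$\mu(\sigma) \propto \exp\left( \sum_{k \in K} \theta^*_k f_k(\sigma_k) \right)$ (arbitrary parametric form)

Generalized Regularized Interaction Screening Estimator (GRISE):

$\hat{\theta}_i = \arg\min_{\theta_i} \frac{1}{n} \sum_{t=1}^{n} \exp\left( -\sum_{k \in K_i} \theta_k f_k(\sigma_k^{(t)}) \right) \quad \text{s.t.} \quad \|\theta_i\|_1 \le \hat{\gamma}$

  • Local reconstruction (one neighborhood at a time)
  • Convex function (with low-complexity minimization using entropic descent)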
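A minimal sketch of the local reconstruction for one variable is given below. It is not the authors' implementation: it assumes a hypothetical three-spin binary model with monomial basis functions, drops the $\ell_1$ constraint, and replaces entropic descent with plain gradient descent for brevity. All parameter values, the sample size, and the step size are illustrative choices.

```python
import itertools
import math
import random
from collections import Counter

# Hypothetical 3-spin model:
# mu(s) proportional to exp(h0*s0 + J01*s0*s1 + J02*s0*s2 + J12*s1*s2).
TRUE = {"h0": 0.3, "J01": 0.5, "J02": -0.4}
J12 = 0.6

def log_weight(s):
    s0, s1, s2 = s
    return (TRUE["h0"] * s0 + TRUE["J01"] * s0 * s1
            + TRUE["J02"] * s0 * s2 + J12 * s1 * s2)

states = list(itertools.product([-1, 1], repeat=3))
weights = [math.exp(log_weight(s)) for s in states]

# Draw samples exactly (feasible only because the model is tiny).
random.seed(0)
n = 20000
samples = random.choices(states, weights=weights, k=n)
freq = {s: c / n for s, c in Counter(samples).items()}

def features(s):
    """Basis functions that contain spin 0: s0, s0*s1, s0*s2."""
    s0, s1, s2 = s
    return (s0, s0 * s1, s0 * s2)

def loss_and_grad(theta):
    """Empirical interaction screening objective for spin 0 and its gradient."""
    loss, grad = 0.0, [0.0, 0.0, 0.0]
    for s, f in freq.items():
        phi = features(s)
        w = f * math.exp(-sum(t * x for t, x in zip(theta, phi)))
        loss += w
        for k in range(3):
            grad[k] -= w * phi[k]
    return loss, grad

# Plain gradient descent (assumption: swapped in for entropic descent).
theta = [0.0, 0.0, 0.0]
for _ in range(2000):
    _, g = loss_and_grad(theta)
    theta = [t - 0.1 * gk for t, gk in zip(theta, g)]

estimate = dict(zip(["h0", "J01", "J02"], theta))
```

Only the parameters of terms containing spin 0 enter this objective, which is what makes the reconstruction local: the coupling `J12` is never estimated here and would be recovered from the objective of spin 1 or spin 2.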

Slide 6

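Intuition Behind GRISE: Infinite Sample Size Limit

$\mu(\sigma) \propto \exp\left( \sum_{k \in K} \theta^*_k f_k(\sigma_k) \right)$

$S_i(\theta_i) \xrightarrow{n \to \infty} S^*_i(\theta_i) = \mathbb{E}\left[ \exp\left( -\sum_{k \in K_i} \theta_k f_k(\sigma_k) \right) \right]$

$\nabla_{\theta_i} S^*_i(\theta^*_i) = 0$

[Figure: $S^*_i(\theta_i)$ plotted over the axes $\theta_i^{(1)}$ and $\theta_i^{(2)}$, with the point $S^*_i(\theta^*_i)$ marked. Further labels: $\langle S^*_i(J_i, H_i) \rangle$, $\langle S^*_i(J^*_i, H^*_i) \rangle$, $\langle S^*_i(J_i = 0, H_i = 0) \rangle$, axes $J_i$ and $H_i$.]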
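The zero-gradient condition can be verified exactly on a model small enough to enumerate. The sketch below assumes a hypothetical three-spin binary model with monomial basis functions (the parameter values are illustrative) and evaluates the gradient of the infinite-sample objective for spin 0 at the true parameters and at the origin.

```python
import itertools
import math

# Hypothetical true parameters of a 3-spin binary model (illustrative values).
h0, J01, J02, J12 = 0.3, 0.5, -0.4, 0.6

def log_weight(s):
    s0, s1, s2 = s
    return h0 * s0 + J01 * s0 * s1 + J02 * s0 * s2 + J12 * s1 * s2

states = list(itertools.product([-1, 1], repeat=3))
Z = sum(math.exp(log_weight(s)) for s in states)

def exact_gradient(theta):
    """Gradient of S*_0(theta) = E[exp(-theta . phi)] under the true model."""
    grad = [0.0, 0.0, 0.0]
    for s in states:
        s0, s1, s2 = s
        phi = (s0, s0 * s1, s0 * s2)  # basis functions containing spin 0
        prob = math.exp(log_weight(s)) / Z
        w = prob * math.exp(-sum(t * f for t, f in zip(theta, phi)))
        for k in range(3):
            grad[k] -= w * phi[k]
    return grad

grad_at_truth = exact_gradient([h0, J01, J02])
grad_at_origin = exact_gradient([0.0, 0.0, 0.0])
```

At the true parameters the exponential factor cancels every term of the model that contains spin 0, so the remaining weight no longer depends on that spin and the sum over its two values vanishes. This cancellation is the "screening" in interaction screening.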

Slide 7

Theorem for Learning Gibbs Distributions with GRISE

(Informal) With high probability, GRISE estimates satisfy $\|\hat{\theta} - \theta^*\| \le \alpha/2$, with a number of samples $n$ that scales as $\log p / \alpha^4$ up to a model-dependent prefactor, and computational complexity $O(p^L)$.

Precise finite sample analysis with proofs: arXiv:1902.00600

Slide 8

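Complete Basis Function Hierarchies: Monomial Basis Example

Binary alphabet: $\sigma \in \{-1, +1\}^p$

Monomial basis functions: $f_k(\sigma_k) \in \{\sigma_i,\ \sigma_i \sigma_j,\ \sigma_i \sigma_j \sigma_k,\ \dots\}$

$\mu(\sigma) \propto \exp\left( \sum_{i} \theta^*_i \sigma_i + \sum_{(i,j)} \theta^*_{ij} \sigma_i \sigma_j + \sum_{(i,j,k)} \theta^*_{ijk} \sigma_i \sigma_j \sigma_k + \cdots \right)$

[Figure: graph over five variables, including $\sigma_i$, $\sigma_j$, $\sigma_k$.]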

Slide 9

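Complete Basis Function Hierarchies: Monomial Basis Example

$\mu(\sigma) \propto \exp\left( \sum_{i} \theta^*_i \sigma_i + \sum_{(i,j)} \theta^*_{ij} \sigma_i \sigma_j + \sum_{(i,j,k)} \theta^*_{ijk} \sigma_i \sigma_j \sigma_k + \cdots \right)$

[Figure: graph over five variables, including $\sigma_i$, $\sigma_j$, $\sigma_k$.]

For $L$-wise models, the computational complexity of GRISE is $O(p^L)$.

Interaction Screening Loss:

$\hat{\theta}_i = \arg\min_{\theta_i} \frac{1}{n} \sum_{t=1}^{n} \exp\left( -\sigma_i^{(t)} \left( \theta_i + \sum_{j} \theta_{ij} \sigma_j^{(t)} + \sum_{j,k} \theta_{ijk} \sigma_j^{(t)} \sigma_k^{(t)} + \cdots \right) \right)$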

Slide 10

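Neural Net Parametrization of the Partial Energy Function

Interaction Screening Loss:

$\hat{\theta}_i = \arg\min_{\theta_i} \frac{1}{n} \sum_{t=1}^{n} \exp\left( -\sigma_i^{(t)} \left( \theta_i + \sum_{j} \theta_{ij} \sigma_j^{(t)} + \sum_{j,k} \theta_{ijk} \sigma_j^{(t)} \sigma_k^{(t)} + \cdots \right) \right)$

Neural Net Interaction Screening Loss:

$\hat{w}_i = \arg\min_{w_i} \frac{1}{n} \sum_{t=1}^{n} \exp\left( -\sigma_i^{(t)} \, \mathrm{NN}(\sigma^{(t)} \setminus \sigma_i^{(t)}; w_i) \right)$

If the neural net is expressive enough, the global minima of the NN-GRISE loss are interaction screening minima corresponding to the recovered local energy.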
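The neural net loss only changes what multiplies the central variable inside the exponential. The sketch below evaluates it for binary variables under stated assumptions: the network is a hypothetical one-hidden-layer tanh network written in plain Python, the samples are uniform random placeholders rather than draws from a graphical model, and no training step is shown.

```python
import math
import random

def nn_output(inputs, params):
    """One-hidden-layer tanh network mapping the other spins to a scalar."""
    W1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

def nn_grise_loss(samples, i, params):
    """Average of exp(-s_i * NN(s without s_i; w_i)) over the samples."""
    total = 0.0
    for s in samples:
        rest = s[:i] + s[i + 1:]
        total += math.exp(-s[i] * nn_output(rest, params))
    return total / len(samples)

random.seed(1)
p, width = 5, 4
# Placeholder data: uniform random spins (assumption, for illustration only).
samples = [tuple(random.choice([-1, 1]) for _ in range(p)) for _ in range(200)]

zero_params = ([[0.0] * (p - 1) for _ in range(width)], [0.0] * width,
               [0.0] * width, 0.0)
random_params = (
    [[random.gauss(0, 0.5) for _ in range(p - 1)] for _ in range(width)],
    [0.0] * width,
    [random.gauss(0, 0.5) for _ in range(width)],
    0.0,
)

loss_zero = nn_grise_loss(samples, 0, zero_params)      # network outputs 0
loss_random = nn_grise_loss(samples, 0, random_params)
```

With all weights at zero the network outputs zero and every term of the loss equals one, which gives a convenient sanity check before training.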

Slide 11

Illustration on a small ($p = \dots$) tractable model of order $L = \dots$

The NN-GRISE hierarchy contains higher-order polynomials in its hypothesis space.

NN-GRISE explores a different basis function hierarchy and gets close to the true model with fewer parameters.

Slide 12

Comparison of conditional distributions for a larger problem

For the $p = 15$, $L = 6$ problem, the monomial basis contains 3472 terms, and GRISE becomes intractable.

Only order $L = 4$ is practically feasible with GRISE.

The NN basis has fewer parameters (349) and uses fewer training samples.
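The growth of the monomial basis can be reproduced with a short count. The counting convention below is an assumption, since the slide does not state it: it counts the monomials of order 2 through $L$ that contain one fixed variable, that is $\sum_{k=1}^{L-1} \binom{p-1}{k}$. Under that convention the figure of 3472 for $p = 15$, $L = 6$ is recovered.

```python
from math import comb

def monomials_per_variable(p, L):
    """Monomials of order 2..L that contain one fixed variable out of p."""
    return sum(comb(p - 1, k) for k in range(1, L))

count_L6 = monomials_per_variable(15, 6)
count_L4 = monomials_per_variable(15, 4)
```

Under the same convention the order $L = 4$ basis has 469 terms per variable, roughly seven times fewer than at $L = 6$.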

Slide 13

Structure Learning with NN-GRISE

$\hat{w}_i = \arg\min_{w_i} \frac{1}{n} \sum_{t=1}^{n} \exp\left( -\sigma_i^{(t)} \, \mathrm{NN}(\sigma^{(t)} \setminus \sigma_i^{(t)}; w_i) \right) + \lambda \|w_i^{(1)}\|_1$

  • Regularization through a penalty on first-layer weights
  • Variables $j$ outside of the neighborhood of $i$ do not influence the output at the interaction screening minima
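One way to read a neighborhood off a trained network is to look at the first-layer weights attached to each input. The sketch below is an assumed post-processing step, not a procedure given on the slide: the weight matrix, the input names, the use of a column norm, and the threshold value are all hypothetical.

```python
import math

def neighborhood_from_first_layer(W1, input_names, threshold):
    """Keep inputs whose first-layer weight column has norm above threshold."""
    selected = []
    for j, name in enumerate(input_names):
        col_norm = math.sqrt(sum(row[j] ** 2 for row in W1))
        if col_norm > threshold:
            selected.append(name)
    return selected

# Hypothetical trained first-layer weights (rows: hidden units, columns: inputs).
W1 = [[0.9, 0.0, -0.7, 0.01],
      [-0.4, 0.02, 0.5, 0.0]]
neighbors = neighborhood_from_first_layer(W1, ["s2", "s3", "s4", "s5"], 0.1)
```

An input whose first-layer weights are all close to zero cannot affect the network output, which matches the statement that variables outside the neighborhood do not influence the output at the minimum.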

Slide 14

Summary

  • GRISE is a convex estimator for learning arbitrary discrete graphical models with rigorous guarantees, improving upon the sample complexity of previous methods.
    Efficient Learning of Discrete Graphical Models, M. Vuffray, S. Misra, A. Y. Lokhov (2020)
  • NN-GRISE is a computationally efficient non-convex estimator that uses the non-linear representation power of neural nets to exploit sparse basis hierarchies.
    Learning of Discrete Graphical Models with Neural Networks, Abhijith J., A. Y. Lokhov, S. Misra, M. Vuffray (2020)
  • NN-GRISE can still learn the MRF structure, the full energy function representation, and conditional distributions that can be used for re-sampling from the learned model.
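Re-sampling from learned conditionals can be sketched with a Gibbs sampler. For binary variables in $\{-1, +1\}$ whose conditional is proportional to $\exp(\sigma_i E_i)$, with $E_i$ the local energy given the other variables, the probability of $\sigma_i = +1$ is $1 / (1 + e^{-2 E_i})$. In the sketch below the local energy comes from a hypothetical three-spin chain that stands in for a learned model; with NN-GRISE it would presumably be the trained network output, which is an assumption here.

```python
import math
import random

def gibbs_sweep(state, local_energy, rng):
    """Resample each spin from P(s_i = +1 | rest) = 1 / (1 + exp(-2 * E_i))."""
    state = list(state)
    for i in range(len(state)):
        e_i = local_energy(i, state)
        p_plus = 1.0 / (1.0 + math.exp(-2.0 * e_i))
        state[i] = 1 if rng.random() < p_plus else -1
    return state

# Stand-in for a learned local energy: a hypothetical 3-spin chain 0 - 1 - 2.
J = {(0, 1): 0.8, (1, 2): 0.8}

def local_energy(i, state):
    """Sum of coupling times neighboring spin over the edges touching i."""
    return sum(c * state[b if a == i else a]
               for (a, b), c in J.items() if i in (a, b))

rng = random.Random(0)
state = [1, 1, 1]
chain = []
for _ in range(5000):
    state = gibbs_sweep(state, local_energy, rng)
    chain.append(tuple(state))

# Empirical correlation between spins 0 and 1 along the chain.
corr_01 = sum(s[0] * s[1] for s in chain) / len(chain)
```

For this open chain the exact nearest-neighbor correlation is $\tanh(0.8) \approx 0.66$, so `corr_01` should land near that value up to sampling noise.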

Slide 15

Questions?