
SLIDE 1

Einsum Networks

Fast and Scalable Learning of Tractable Probabilistic Circuits

Robert Peharz

Eindhoven University of Technology

Steven Lang

Technical University of Darmstadt

Antonio Vergari

University of California, Los Angeles

Karl Stelzner

Technical University of Darmstadt

Alejandro Molina

Technical University of Darmstadt

Martin Trapp

Graz University of Technology

Guy Van den Broeck

University of California, Los Angeles

Kristian Kersting

Technical University of Darmstadt

Zoubin Ghahramani

University of Cambridge; Uber AI Labs

International Conference on Machine Learning (ICML), July 2020

SLIDE 2

In This Paper

Probabilistic Circuits (PCs)

— Just a special type of neural network

Yet, they are slow:

— Computational graphs are highly sparse and cluttered
— Operations are implemented in the log-domain
— ∼50 times slower than a neural net of comparable size

We propose Einsum Networks (EiNets):

— A PC architecture using a few monolithic einsum operations
— Run and train PCs up to two orders of magnitude faster
— Scale PCs to datasets previously out of reach (CelebA, SVHN)
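The log-domain evaluation mentioned above is a key source of overhead: each weighted sum becomes a log-sum-exp. A minimal sketch (the numbers are illustrative, not from the paper):

```python
import numpy as np

def log_weighted_sum(log_children, log_weights):
    """Compute log(sum_k w_k * p_k) from log p_k and log w_k without underflow."""
    a = log_weights + log_children
    m = np.max(a)                          # shift by the max for stability
    return m + np.log(np.sum(np.exp(a - m)))

log_p = np.log(np.array([0.25, 0.5]))      # child densities, in the log-domain
log_w = np.log(np.array([0.4, 0.6]))       # sum-node weights, in the log-domain
result = np.exp(log_weighted_sum(log_p, log_w))   # 0.4*0.25 + 0.6*0.5 = 0.4
```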

SLIDE 3

Probabilistic Circuits

SLIDE 4

Probabilistic Circuits

Computational graph containing 3 types of operations: distributions (leaves), products, and weighted sums.
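The three operation types can be combined into a tiny circuit; the structure and parameters below are made up for illustration:

```python
# A minimal PC over two binary variables X1, X2:
# p(x1, x2) = w1 * p1(x1) p2(x2) + w2 * p3(x1) p4(x2)
def leaf(theta, x):
    """Bernoulli leaf distribution."""
    return theta if x == 1 else 1.0 - theta

def pc(x1, x2, w=(0.3, 0.7)):
    prod_a = leaf(0.9, x1) * leaf(0.2, x2)  # product node over disjoint scopes {X1}, {X2}
    prod_b = leaf(0.1, x1) * leaf(0.8, x2)  # product node
    return w[0] * prod_a + w[1] * prod_b    # weighted sum node over the same scope {X1, X2}

# The circuit defines a normalized distribution:
total = sum(pc(a, b) for a in (0, 1) for b in (0, 1))  # 1.0
```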


SLIDE 14

Probabilistic Circuits — Leaf Distributions

Arbitrary probability function (pdf, pmf, mixed) over some set of random variables X. Should facilitate tractable inference routines, e.g. marginalization, conditioning, MAP, …

p(x) = h(x) exp(θᵀ T(x) − A(θ))

Gaussian, Bernoulli, Dirichlet, Poisson, Gamma, …
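As a worked example (not code from the paper), the exponential-family form can be instantiated for a univariate Gaussian and checked against the standard pdf:

```python
import math

def gaussian_expfam(x, mu, sigma):
    """N(x; mu, sigma^2) written as h(x) exp(theta^T T(x) - A(theta))."""
    theta1 = mu / sigma**2             # natural parameters
    theta2 = -1.0 / (2 * sigma**2)
    h = 1.0 / math.sqrt(2 * math.pi)   # base measure h(x)
    T1, T2 = x, x * x                  # sufficient statistics T(x) = (x, x^2)
    A = -theta1**2 / (4 * theta2) - 0.5 * math.log(-2 * theta2)  # log-partition
    return h * math.exp(theta1 * T1 + theta2 * T2 - A)

def gaussian_pdf(x, mu, sigma):
    """Standard Gaussian pdf, for comparison."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
```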

SLIDE 18

Probabilistic Circuits — Products

Simply product units: two children with values 0.5 and 1.4 yield 0.5 × 1.4 = 0.7.

SLIDE 25

Probabilistic Circuits — Sums

Weighted sums: children with values 3.14 and 42 yield w1 · 3.14 + w2 · 42, with wk ≥ 0 and ∑k wk = 1.
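To keep these constraints satisfied during gradient-based training, one common choice (an assumption here, not stated on the slide) is a softmax over unconstrained parameters:

```python
import numpy as np

def softmax(logits):
    """Map unconstrained parameters to weights with w_k >= 0 and sum_k w_k = 1."""
    z = np.exp(logits - logits.max())
    return z / z.sum()

logits = np.array([0.2, -1.0, 3.0])      # unconstrained parameters
w = softmax(logits)                      # valid sum-node weights
children = np.array([3.14, 42.0, 0.5])   # child node outputs
out = float(np.dot(w, children))         # weighted sum node output
```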

SLIDE 31

Probabilistic Circuits

Computational graph containing distributions, products, and weighted sums. Plus: Structural properties!

The circuit over X1, X2, X3 defines p(X1, X2, X3).

— Smoothness: sum children have the same scope
— Decomposability: product children have disjoint scopes
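Both properties are mechanical checks on node scopes; a sketch using a hypothetical dict-based node format (for illustration only):

```python
def scope(node):
    """The set of random variables a node depends on."""
    if node["type"] == "leaf":
        return frozenset(node["vars"])
    return frozenset().union(*(scope(c) for c in node["children"]))

def is_smooth(node):
    """Smoothness: all children of every sum node share the same scope."""
    if node["type"] == "leaf":
        return True
    ok = all(is_smooth(c) for c in node["children"])
    if node["type"] == "sum":
        scopes = [scope(c) for c in node["children"]]
        ok = ok and all(s == scopes[0] for s in scopes)
    return ok

def is_decomposable(node):
    """Decomposability: children of every product node have pairwise disjoint scopes."""
    if node["type"] == "leaf":
        return True
    ok = all(is_decomposable(c) for c in node["children"])
    if node["type"] == "product":
        scopes = [scope(c) for c in node["children"]]
        ok = ok and len(frozenset().union(*scopes)) == sum(len(s) for s in scopes)
    return ok
```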

SLIDE 34

Probabilistic Circuits — Inference

Example: Marginalization and Conditioning

X = Xq ∪ Xm ∪ Xe

p(Xq | xe) = ∫ p(Xq, x′m, xe) dx′m / ∫∫ p(x′q, x′m, xe) dx′q dx′m

Smoothness and decomposability ⇒ Single bottom-up pass! Check out our AAAI tutorial on Probabilistic Circuits! Upcoming tutorials at ECAI, ECML/PKDD, IJCAI!
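The bottom-up marginalization pass can be sketched as follows: every leaf over a marginalized variable is replaced by its integral, i.e. the constant 1. The dict-based circuit format is hypothetical, for illustration:

```python
def evaluate(node, x, marginalized=frozenset()):
    """One bottom-up pass; leaves over marginalized variables output 1."""
    if node["type"] == "leaf":
        v = node["var"]
        return 1.0 if v in marginalized else node["dist"](x.get(v))
    vals = [evaluate(c, x, marginalized) for c in node["children"]]
    if node["type"] == "product":
        out = 1.0
        for v in vals:
            out *= v
        return out
    return sum(w * v for w, v in zip(node["weights"], vals))  # sum node

def bern(theta):
    return lambda x: theta if x == 1 else 1.0 - theta

def l(var, th):
    return {"type": "leaf", "var": var, "dist": bern(th)}

# p(X1, X2) = 0.3 * B(0.9)(X1) B(0.2)(X2) + 0.7 * B(0.1)(X1) B(0.8)(X2)
circuit = {"type": "sum", "weights": [0.3, 0.7], "children": [
    {"type": "product", "children": [l("X1", 0.9), l("X2", 0.2)]},
    {"type": "product", "children": [l("X1", 0.1), l("X2", 0.8)]},
]}

# Marginal p(X1=1), with X2 integrated out in a single pass:
marg = evaluate(circuit, {"X1": 1}, marginalized={"X2"})   # 0.3*0.9 + 0.7*0.1 = 0.34
```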

SLIDE 35

The Problem

SLIDE 36

Einsum Networks

SLIDE 37

Step I – Vectorize Nodes

SLIDE 39

Step II – The Basic Einsum Operation

Sk = Wkij Ni N′j   (a single einsum operation; Einstein summation over i and j)
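In NumPy notation this is a single einsum call: K sum nodes over all K·K pairwise products of two child vectors. Sizes and weights below are illustrative:

```python
import numpy as np

K = 4
rng = np.random.default_rng(0)
W = rng.random((K, K, K))
W /= W.sum(axis=(1, 2), keepdims=True)  # each sum node's weights sum to 1
N = rng.random(K)                       # left child vector
Np = rng.random(K)                      # right child vector

# S_k = sum_{i,j} W_kij * N_i * N'_j, as one fused operation:
S = np.einsum("kij,i,j->k", W, N, Np)

# Reference: explicit products and sums, node by node
S_ref = np.array([(W[k] * np.outer(N, Np)).sum() for k in range(K)])
```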

SLIDE 41

Step III – Einsum Layers

Slk = Wlkij Nli N′lj   (a whole layer as a single einsum operation; summation over i and j)
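Adding the layer index l fuses L of the basic operations into one call (in practice one would also carry a batch dimension); a sketch with illustrative sizes:

```python
import numpy as np

L, K = 8, 4
rng = np.random.default_rng(1)
W = rng.random((L, K, K, K))
W /= W.sum(axis=(2, 3), keepdims=True)  # normalize each sum node's weights
N = rng.random((L, K))                  # left child vectors, one per layer slot
Np = rng.random((L, K))                 # right child vectors

# S_lk = sum_{i,j} W_lkij * N_li * N'_lj, the whole layer in one einsum:
S = np.einsum("lkij,li,lj->lk", W, N, Np)

# Same result as running the basic einsum operation L times:
S_ref = np.stack([np.einsum("kij,i,j->k", W[l], N[l], Np[l]) for l in range(L)])
```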

SLIDE 42

Results

SLIDE 43

Runtime and Memory Comparison

[Figure: training time (sec/epoch) and GPU memory (GB), log-scale axes, for EiNets (×), SPFlow (+), and LibSPN (∗), plotted against K, D (depth), and R (# replicas).]

SLIDE 44

Generative Image Models

SLIDE 45

Conclusion

PCs sit at the intersection of classical graphical models and neural networks. Their crucial advantage: many exact inference routines. But they used to be painful to scale. In this paper, we take a big step towards closing that gap. More to come!

https://github.com/cambridge-mlg/EinsumNetworks
https://github.com/SPFlow/SPFlow