
Computationally efficient probabilistic inference with noisy threshold models based on a CP tensor decomposition

Jirka Vomlel and Petr Tichavský

Institute of Information Theory and Automation (ÚTIA), Academy of Sciences of the Czech Republic

Contents

  • Motivation
  • Noisy threshold models
  • CP-decomposition of conditional probability tables
  • Experiments
  • Conclusions
Quick Medical Reference - Decision Theoretic (QMR-DT) Miller et al. (1986) and Shwe et al. (1991).

  • 570 diseases in the first level
  • 4075 observations in the second level
  • all variables are binary
  • conditional probability tables are noisy-or models

[Figure: a two-level Bayesian network with disease nodes X1, . . . , X6 and observation nodes Y1, Y2]

Definition (The inference task)

Given a subset of observations (e.g. Y1 and Y2), compute the probabilities of diseases, e.g. P(Xi | Y1 = y1, Y2 = y2) for i = 1, . . . , 6.
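The inference task above can be sketched by brute-force enumeration over disease states. The sketch below is purely illustrative: it uses a toy two-disease, two-observation network with made-up priors, noisy-or link strengths, and leak probabilities (none of these numbers come from QMR-DT, and the helper names are ours).

```python
from itertools import product

# Toy QMR-like network: 2 diseases, 2 observations (all parameters hypothetical).
prior = [0.01, 0.02]               # P(Xi = 1)
# pi[j][i] = probability that disease i, when present, activates observation Yj
pi = [[0.8, 0.3],                  # link strengths into Y1
      [0.1, 0.9]]                  # link strengths into Y2
leak = [0.05, 0.05]                # leak probability for each Yj

def p_obs(y, j, x):
    """Noisy-or CPT: P(Yj = y | diseases x)."""
    p0 = 1.0 - leak[j]
    for i, xi in enumerate(x):
        if xi:
            p0 *= 1.0 - pi[j][i]
    return p0 if y == 0 else 1.0 - p0

def posterior(i, evidence):
    """P(Xi = 1 | evidence) by summing over all joint disease states."""
    num = den = 0.0
    for x in product([0, 1], repeat=len(prior)):
        p = 1.0
        for k, xk in enumerate(x):
            p *= prior[k] if xk else 1.0 - prior[k]
        for j, yj in evidence.items():
            p *= p_obs(yj, j, x)
        den += p
        if x[i] == 1:
            num += p
    return num / den

print(posterior(0, {0: 1, 1: 0}))  # P(X1 = 1 | Y1 = 1, Y2 = 0)
```

Enumeration is exponential in the number of diseases, which is exactly why the decomposition techniques in the rest of the talk matter for the full 570-disease network.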

Noisy threshold - a generalization of noisy-or

[Figure: node Y with parents X′1, X′2, . . . , X′k, where each X′i is a noisy copy of Xi]

Y takes value 1 if at least ℓ out of the k parents take value 1:

P(Y = 1 | X′1 = x′1, . . . , X′k = x′k) = 1 if x′1 + . . . + x′k ≥ ℓ, and 0 otherwise.

Noise: for i = 1, . . . , k

P(X′i = 1 | Xi = xi) = 0 if xi = 0, and πi otherwise.
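A minimal sketch of the noisy threshold CPT, computed directly from the definition above by summing over the hidden noisy copies X′i (the function name and parameter layout are ours, for illustration):

```python
from itertools import product

def noisy_threshold(x, pi, ell):
    """P(Y = 1 | X = x) for the noisy threshold model: each Xi = 1 is passed
    through as X'i = 1 with probability pi[i] (X'i = 0 whenever Xi = 0),
    and Y = 1 iff at least ell of the X'i equal 1."""
    k = len(x)
    total = 0.0
    for xp in product([0, 1], repeat=k):   # sum over hidden noisy copies
        p = 1.0
        for i in range(k):
            if x[i] == 0:
                p *= 1.0 if xp[i] == 0 else 0.0   # absent parent never fires
            else:
                p *= pi[i] if xp[i] == 1 else 1.0 - pi[i]
        if sum(xp) >= ell:
            total += p
    return total

# With pi = 1 everywhere and ell = 1 this reduces to the deterministic OR:
print(noisy_threshold((0, 0, 0, 0), [1, 1, 1, 1], 1))  # 0.0
print(noisy_threshold((0, 1, 0, 0), [1, 1, 1, 1], 1))  # 1.0
```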
An example for k = 4, ℓ = 1, and πi = 1, i = 1, . . . , k

  • i.e., for the deterministic OR function

P(Y = 1 | X1 = x1, . . . , X4 = x4) is the 2 × 2 × 2 × 2 tensor with every entry equal to 1 except the entry at (x1, x2, x3, x4) = (0, 0, 0, 0), which is 0. Writing it as the all-ones tensor minus the indicator of the all-zeros cell gives a rank-2 decomposition:

P(Y = 1 | X1 = x1, . . . , X4 = x4) = (1, 1) ⊗ (1, 1) ⊗ (1, 1) ⊗ (1, 1) − (1, 0) ⊗ (1, 0) ⊗ (1, 0) ⊗ (1, 0) = (1, 1)^⊗k − (1, 0)^⊗k (here k = 4).

Compilation of the threshold model for ℓ = 1

  • the standard approach

Lauritzen and Spiegelhalter (1988), Jensen et al. (1990), Shafer and Shenoy (1990)

[Figure: the network X1, . . . , X4 → Y and its triangulated graph, a single clique over X1, . . . , X4, Y]

The total table size is 2^5 = 32.

Compilation of the threshold model for ℓ = 1

  • after the suggested decomposition

Díez and Galán (2002), Vomlel (2002), Savický and Vomlel (2007)

[Figure: the network X1, . . . , X4 → Y transformed by adding an auxiliary variable B between X1, . . . , X4 and Y]

The total table size is 5 · 2^2 = 20.
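The table-size accounting of the last two slides can be sketched in code. The generalization is our assumption: the standard junction tree keeps one clique over all k + 1 binary variables (size 2^(k+1)), while the decomposition keeps k + 1 small tables, each pairing one binary variable with an auxiliary variable B whose number of states equals the rank r (r = 2 for ℓ = 1, matching 5 · 2^2 = 20 at k = 4):

```python
def jt_table_size(k):
    """Standard junction tree: one clique over X1..Xk and Y, all binary."""
    return 2 ** (k + 1)

def cp_table_size(k, r):
    """After the CP decomposition: k + 1 tables, each over one binary
    variable and the rank-r auxiliary variable B (our reading of the
    accounting on the slide; r = 2 for l = 1)."""
    return (k + 1) * 2 * r

print(jt_table_size(4), cp_table_size(4, 2))   # 32 20
for k in (4, 10, 20):                          # exponential vs linear growth
    print(k, jt_table_size(k), cp_table_size(k, 2))
```

The gap is the whole point: the junction-tree size grows exponentially in k, the decomposed size only linearly.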

Decomposition of T(ℓ, k) into a sum of tensor products

  • P(Y = 1 | X = x) can be viewed as a tensor T(ℓ, k).
  • All dimensions of T(ℓ, k) are equal to 2.
  • T(ℓ, k) is symmetric.

Definition (Symmetric rank)

The symmetric rank (srank) is the minimum number r such that

T(ℓ, k) = Σ_{i=1}^{r} b_i · a_i^⊗k

where for i = 1, . . . , r:

  • b_i ∈ R and
  • a_i are real-valued vectors of length 2.
  • This decomposition is called the Canonical Polyadic (CP), CANDECOMP/PARAFAC, or tensor rank-one decomposition.

Theoretical results

Results in the proceedings:

  • srank(T(0, k)) = 1.
  • srank(T(k, k)) = 1.
  • srank(T(1, k)) = 2.
  • srank(T(k − 1, k)) = k.
  • srank(T(ℓ, k)) ≤ k for ℓ = 3, . . . , k − 2.
  • An algorithm for the CP-decomposition into k factors.
  • For the noisy threshold the above values represent upper bounds.

New results (not in the proceedings):

  • srank(T(ℓ, k)) ≤ k − 1 for ℓ = 3, . . . , k − 2.
  • An algorithm for the CP-decomposition into k − 1 factors. But we do not yet know whether complex numbers can be avoided for some values of ℓ.
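The rank-1 cases and the ℓ = 1 case can be verified directly for small k. A NumPy sketch; the explicit rank-one witnesses are our choice (T(0, k) is the all-ones tensor, hence (1,1)^⊗k; T(k, k) is the indicator of the all-ones index, hence (0,1)^⊗k; T(1, k) = (1,1)^⊗k − (1,0)^⊗k as on the earlier example slide):

```python
import numpy as np
from itertools import product

def threshold_tensor(ell, k):
    """T(ell, k): entry is 1 iff at least ell of the k binary indices equal 1."""
    T = np.zeros((2,) * k)
    for x in product([0, 1], repeat=k):
        T[x] = 1.0 if sum(x) >= ell else 0.0
    return T

def rank1(a, k):
    """The symmetric rank-one tensor a^⊗k."""
    t = np.array(a, dtype=float)
    for _ in range(k - 1):
        t = np.multiply.outer(t, np.array(a, dtype=float))
    return t

k = 6
assert np.array_equal(threshold_tensor(0, k), rank1([1, 1], k))   # srank = 1
assert np.array_equal(threshold_tensor(k, k), rank1([0, 1], k))   # srank = 1
assert np.array_equal(threshold_tensor(1, k),                     # srank <= 2
                      rank1([1, 1], k) - rank1([1, 0], k))
print("rank-1 and rank-2 cases verified for k =", k)
```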

Experimental results

Comparisons of the total table size:

  • the standard junction tree method versus the CP tensor decomposition
  • using QMR subnetworks after 14 observations and after 28 observations.

Relation to arithmetic circuits (ACs) of Darwiche et al.

  • The model after the CP decomposition can be used as an input for Ace (Chavira and Darwiche).
  • Ace supports parent divorcing for noisy-or (i.e., ℓ = 1).
  • In our PGM’08 paper we reported comparisons of the ACs’ size for random QMR-like networks:

[Figure: scatter plot of log10 AC size, original model (x-axis) versus transformed model (y-axis), both axes ranging from 3 to 9]

Conclusions

  • Theoretical results that give upper bounds on the symmetric rank of tensors corresponding to threshold functions.
  • An algorithm for the CP decomposition of these tensors.
  • The CP tensor decomposition led to computational gains of several orders of magnitude and made many intractable models manageable.

Acknowledgments

Thanks to:

  • Frank Jensen from Hugin for providing the Hugin optimal triangulation method and
  • Gregory F. Cooper from the University of Pittsburgh for the structural part of the QMR-DT model.