SLIDE 1

Querying Advanced Probabilistic Models: From Relational Embeddings to Probabilistic Programs

Guy Van den Broeck

StarAI Workshop @ AAAI, Feb 7, 2020
Computer Science

SLIDE 2

The AI Dilemma

Pure Learning Pure Logic

SLIDE 3

The AI Dilemma

Pure Learning Pure Logic

  • Slow thinking: deliberative, cognitive, model-based, extrapolation
  • Amazing achievements until this day
  • “Pure logic is brittle”: noise, uncertainty, incomplete knowledge, …

SLIDE 4

The AI Dilemma

Pure Learning Pure Logic

  • Fast thinking: instinctive, perceptive, model-free, interpolation
  • Amazing achievements recently
  • “Pure learning is brittle”: fails to incorporate a sensible model of the world

bias, algorithmic fairness, interpretability, explainability, adversarial attacks, unknown unknowns, calibration, verification, missing features, missing labels, data efficiency, shift in distribution, general robustness and safety

SLIDE 5

So all hope is lost?

The FALSE AI Dilemma: Probabilistic World Models

  • Joint distribution P(X)
  • Wealth of representations: can be causal, relational, etc.
  • Knowledge + data
  • Reasoning + learning

SLIDE 6

Pure Learning Pure Logic Probabilistic World Models

A New Synthesis of Learning and Reasoning

Tutorial on Probabilistic Circuits
This afternoon: 2pm-6pm, Sutton Center, 2nd floor

SLIDE 7

Pure Learning Pure Logic Probabilistic World Models

High-Level Probabilistic Representations

1. Probabilistic Databases Meets Relational Embeddings: Symbolic Querying of Vector Spaces
2. Modular Exact Inference for Discrete Probabilistic Programs

SLIDE 8

What we’d like to do…

SLIDE 9

What we’d like to do…

∃x Coauthor(Einstein,x) ∧ Coauthor(Erdos,x)

SLIDE 10

Einstein is in the Knowledge Graph

SLIDE 11

Erdős is in the Knowledge Graph

SLIDE 12

This guy is in the Knowledge Graph

… and he published with both Einstein and Erdos!

SLIDE 13

Desired Query Answer

Ernst Straus; Barack Obama, …; Justin Bieber, …

  • 1. Fuse uncertain information from the web ⇒ Embrace probability!
  • 2. Cannot come from labeled data ⇒ Embrace query eval!

SLIDE 14

Cartoon Motivation

Relational Embedding Vectors → Curate Knowledge Graph → Query in a DBMS

∃x Coauthor(Einstein,x) ∧ Coauthor(Erdos,x)

Many exceptions in StarAI and PDB communities, but we need to embed…

SLIDE 15
Probabilistic Databases

  • Probabilistic database
  • Learned from the web, large text corpora, ontologies, etc., using statistical machine learning.

Coauthor:
  x         y      P
  Erdos     Renyi  0.6
  Einstein  Pauli  0.7
  Obama     Erdos  0.1

Scientist:
  x         P
  Erdos     0.9
  Einstein  0.8
  Pauli     0.6

[VdB&Suciu’17]

SLIDE 16

Probabilistic Databases Semantics

[VdB&Suciu’17]

  • All possible databases: Ω = {D1, …, Dn}
  • Probabilistic database P assigns a probability to each: P : Ω → [0, 1]
  • Probabilities sum to 1: Σ_{D ∈ Ω} P(D) = 1

[Figure: the possible worlds, one small table per subset of the tuples {(A,B), (A,C), (B,C)}]

SLIDE 17

Commercial Break

  • Survey book: http://www.nowpublishers.com/article/Details/DBS-052
  • IJCAI 2016 tutorial: http://web.cs.ucla.edu/~guyvdb/talks/IJCAI16-tutorial/

SLIDE 18

How to specify all these numbers?

[VdB&Suciu’17]

  • Only specify marginals: P(Coauthor(Alice, Bob)) = 0.23
  • Assume tuple-independence

Coauthor:
  x  y  P
  A  B  p1
  A  C  p2
  B  C  p3

Each possible world then gets a product probability, e.g.:
  {(A,B), (A,C), (B,C)}   p1·p2·p3
  {(A,C), (B,C)}          (1−p1)·p2·p3
  …
  {}                      (1−p1)·(1−p2)·(1−p3)
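To make tuple-independence concrete, here is a minimal Python sketch (my illustration, not code from the talk) that enumerates the possible worlds of a three-tuple Coauthor table and their probabilities; the marginal values are hypothetical:

    from itertools import product

    # Hypothetical marginal probabilities of the three Coauthor tuples.
    tuples = {("A", "B"): 0.9, ("A", "C"): 0.1, ("B", "C"): 0.5}

    def worlds(tuples):
        """Enumerate all 2^n possible worlds with their probabilities."""
        items = list(tuples.items())
        for bits in product([True, False], repeat=len(items)):
            world = {t for (t, _), b in zip(items, bits) if b}
            prob = 1.0
            for (_, p), b in zip(items, bits):
                prob *= p if b else (1 - p)
            yield world, prob

    # Sanity check: the probabilities over all worlds sum to 1.
    assert abs(sum(p for _, p in worlds(tuples)) - 1.0) < 1e-9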

SLIDE 19

Probabilistic Query Evaluation

Q = ∃x∃y Scientist(x) ∧ Coauthor(x,y)

Scientist:
  x  P     (X1..X3)
  A  p1    X1
  B  p2    X2
  C  p3    X3

Coauthor:
  x  y  P    (Y1..Y5)
  A  D  q1   Y1
  A  E  q2   Y2
  B  F  q3   Y3
  B  G  q4   Y4
  B  H  q5   Y5

P(Q) = 1 − {1 − p1·[1 − (1−q1)·(1−q2)]} · {1 − p2·[1 − (1−q3)·(1−q4)·(1−q5)]}
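As a sanity check, this lifted computation is a few lines of Python (a sketch with made-up probabilities, not code from the talk):

    # P(Scientist(x)) per entity, and the P(Coauthor(x,y)) values grouped by x
    # (hypothetical numbers standing in for p1..p3 and q1..q5).
    scientist = {"A": 0.9, "B": 0.8, "C": 0.6}
    coauthor = {"A": [0.6, 0.7], "B": [0.1, 0.5, 0.4], "C": []}

    # Q = ∃x∃y Scientist(x) ∧ Coauthor(x,y): an independent-OR over x of
    # (Scientist(x) AND an independent-OR over y of Coauthor(x,y)).
    p_not_q = 1.0
    for x, p in scientist.items():
        p_no_coauthor = 1.0
        for q in coauthor[x]:
            p_no_coauthor *= 1 - q
        p_not_q *= 1 - p * (1 - p_no_coauthor)
    print("P(Q) =", 1 - p_not_q)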

SLIDE 20

Lifted Inference Rules

Preprocess Q (omitted), then apply rules (some have preconditions):

Decomposable ∧, ∨:
  P(Q1 ∧ Q2) = P(Q1) · P(Q2)
  P(Q1 ∨ Q2) = 1 − (1 − P(Q1)) · (1 − P(Q2))

Decomposable ∃, ∀:
  P(∀z Q) = Π_{A ∈ Domain} P(Q[A/z])
  P(∃z Q) = 1 − Π_{A ∈ Domain} (1 − P(Q[A/z]))

Inclusion/exclusion:
  P(Q1 ∧ Q2) = P(Q1) + P(Q2) − P(Q1 ∨ Q2)
  P(Q1 ∨ Q2) = P(Q1) + P(Q2) − P(Q1 ∧ Q2)

Negation:
  P(¬Q) = 1 − P(Q)
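These rules compose mechanically; a minimal sketch of the decomposable rules as Python combinators (my illustration, not from the talk):

    def indep_and(*ps):
        # Decomposable ∧: P(Q1 ∧ Q2) = P(Q1) · P(Q2)
        out = 1.0
        for p in ps:
            out *= p
        return out

    def indep_or(*ps):
        # Decomposable ∨: 1 − (1 − P(Q1)) · (1 − P(Q2))
        out = 1.0
        for p in ps:
            out *= 1 - p
        return 1 - out

    # Decomposable ∀ and ∃ are the same products taken over the domain:
    def forall(ps):
        return indep_and(*ps)

    def exists(ps):
        return indep_or(*ps)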

SLIDE 21

Example Query Evaluation

Q = ∃x∃y Scientist(x) ∧ Coauthor(x,y)

Decomposable ∃-rule:
P(Q) = 1 − Π_{A ∈ Domain} (1 − P(Scientist(A) ∧ ∃y Coauthor(A,y)))

Check independence: Scientist(A) ∧ ∃y Coauthor(A,y) vs. Scientist(B) ∧ ∃y Coauthor(B,y)

P(Q) = 1 − (1 − P(Scientist(A) ∧ ∃y Coauthor(A,y)))
         × (1 − P(Scientist(B) ∧ ∃y Coauthor(B,y)))
         × (1 − P(Scientist(C) ∧ ∃y Coauthor(C,y)))
         × (1 − P(Scientist(D) ∧ ∃y Coauthor(D,y)))
         × (1 − P(Scientist(E) ∧ ∃y Coauthor(E,y)))
         × (1 − P(Scientist(F) ∧ ∃y Coauthor(F,y)))
         × …

Complexity: PTIME

SLIDE 22

Limitations

H0 = ∀x∀y Smoker(x) ∨ Friend(x,y) ∨ Jogger(y)

The decomposable ∀-rule P(∀z Q) = Π_{A ∈ Domain} P(Q[A/z]) does not apply:
H0[Alice/x] and H0[Bob/x] are dependent:
  ∀y (Smoker(Alice) ∨ Friend(Alice,y) ∨ Jogger(y))
  ∀y (Smoker(Bob) ∨ Friend(Bob,y) ∨ Jogger(y))
both mention the same Jogger(y) atoms.

Lifted inference sometimes fails.

SLIDE 23

Are the Lifted Rules Complete?

Dichotomy Theorem for Unions of Conjunctive Queries / Monotone CNF

  • If lifted rules succeed, then PTIME query
  • If lifted rules fail, then query is #P-hard

Lifted rules are complete for UCQ!

[Dalvi and Suciu; JACM’11]

SLIDE 24

The Good, Bad, Ugly

  • We understand querying very well! 
    – and it is often efficient (a rare property!)
    – but often also highly intractable 
  • Tuple-independence is limiting unless reducing from a more expressive model 
    – Can reduce from MLNs, but then intractable…
  • Where do probabilities come from?  
    – An unspecified “statistical model”

SLIDE 25

Throwing Relational Embedding Models Over the Wall

  • Associate a vector with
    – each relation R
    – each entity A, B, …
  • Score S(head, relation, tail) (based on Euclidean distance, cosine similarity, …)

Coauthor scores:
  x  y  S
  A  B  .6
  A  C  .1
  B  C  .4

SLIDE 26

Interpret scores as probabilities

High score ~ prob 1; low score ~ prob 0

Coauthor scores:        Coauthor probabilities:
  x  y  S                 x  y  P
  A  B  .6                A  B  0.9
  A  C  .1                A  C  0.1
  B  C  .4                B  C  0.5

Throwing Relational Embedding Models Over the Wall

SLIDE 27

The Good, Bad, Ugly

  • Where do probabilities come from?
    We finally know the “statistical model”!  Both capture marginals: a good match
  • We still understand querying very well! 
    but it is often highly intractable 
  • Tuple-independence is limiting  
    Relational embedding models do not attempt to capture dependencies in link prediction

SLIDE 28

A Second Attempt

  • Let’s simplify drastically!
  • Assume each relation has the form R(x,y) ⇔ T_R ∧ E(x) ∧ E(y)
  • That is, there are latent relations:
    – T_R to decide which relations can be true
    – E to decide which entities participate

Coauthor:          ~   T:       E:
  x  y  P                P        x  P
  A  B  0.9              0.2      A  0.2
  A  C  0.1                       B  0.5
  B  C  0.5                       C  0.3

SLIDE 29

Can this do link prediction?

  • Predict Coauthor(Alice,Bob)
  • Rewrite the query using R(x,y) ⇔ T_R ∧ E(x) ∧ E(y)
  • Apply standard lifted inference rules
  • P(Coauthor(Alice,Bob)) = P(T_Coauthor) · P(E(Alice)) · P(E(Bob)) = 0.3 ⋅ 0.2 ⋅ 0.5

Coauthor:          ~   T:       E:
  x  y  P                P        x  P
  A  B  ?                0.3      A  0.2
                                  B  0.5
                                  C  0.3
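In code this is a single product; a hedged Python sketch (mapping the table's entities A and B to Alice and Bob is my assumption):

    # Latent parameters from the slide: P(T_Coauthor) and P(E(entity)).
    t_coauthor = 0.3
    e = {"Alice": 0.2, "Bob": 0.5}

    # R(x,y) ⇔ T_R ∧ E(x) ∧ E(y), with independent latent tuples:
    def p_link(x, y):
        return t_coauthor * e[x] * e[y]

    print(p_link("Alice", "Bob"))  # 0.3 * 0.2 * 0.5 = 0.03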

SLIDE 30

The Good, Bad, Ugly

  • Where do probabilities come from?
    We finally know the “statistical model”! 
  • We still understand querying very well! 
    By rewriting R into E and T_R, every UCQ query becomes tractable!     
  • Tuples sharing entities or relation symbols depend on each other 
  • The model is not very expressive 

SLIDE 31

A Third Attempt

  • Mixture models of the second attempt: R(x,y) ⇔ T_R ∧ E(x) ∧ E(y)
    Now there are latent relations T_R and E for each mixture component
  • The Good: 
    – Still a clear statistical model
    – Every UCQ query is still tractable
    – Still captures tuple dependencies
    – Mixture can approximate any distribution

SLIDE 32

Can this do link prediction?

  • Predict Coauthor(Alice,Bob) in each mixture component:
    – P1(Coauthor(Alice,Bob)) = 0.3 ⋅ 0.2 ⋅ 0.5
    – P2(Coauthor(Alice,Bob)) = 0.9 ⋅ 0.1 ⋅ 0.6
    – Etc.
  • Probability in a mixture of d components:
    P(Coauthor(Alice,Bob)) = (1/d) ⋅ 0.3 ⋅ 0.2 ⋅ 0.5 + (1/d) ⋅ 0.9 ⋅ 0.1 ⋅ 0.6 + ⋯
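In code, the mixture probability is just an average of the per-component products (a sketch using the slide's numbers for the first two components; any further components are made up):

    # Per-component latent parameters (T_R, E(Alice), E(Bob)).
    components = [(0.3, 0.2, 0.5), (0.9, 0.1, 0.6)]

    def p_link_mixture(components):
        d = len(components)
        return sum(t * ex * ey for t, ex, ey in components) / d

    print(p_link_mixture(components))  # (0.03 + 0.054) / 2 = 0.042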

SLIDE 33

How good is this?

Does it look familiar?
P(Coauthor(Alice,Bob)) = (1/d) ⋅ 0.3 ⋅ 0.2 ⋅ 0.5 + (1/d) ⋅ 0.9 ⋅ 0.1 ⋅ 0.6 + ⋯

SLIDE 34

How good is this?

  • At link prediction: same as DistMult
  • At queries on a bio dataset [Hamilton]:
    competitive, while having a consistent underlying distribution. Ask Tal at his poster!

SLIDE 35

How expressive is this?

The GQE baseline is graph queries translated to linear algebra, by Hamilton et al. [2018]

SLIDE 36

First Conclusions

  • We can give probabilistic database semantics to relational embedding models
    – Gives more meaningful query results
  • In doing so, we solve some annoyances of the theoretical PDB framework:
    – Tuple dependence
    – Clear connection to learning
    – While everything stays tractable
    – And the intractable becomes tractable
  • Enables much more (train on Q, consistency)

SLIDE 37

What are probabilistic programs?

x ∼ flip(0.5) means “flip a coin, and output true with probability ½”

x ∼ flip(0.5);
y ∼ flip(0.7);
z := x || y;
if(z) { … }
observe(z);

observe(z) means “reject this execution if z is not true”

Standard programming language constructs
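A minimal Python rejection sampler for this exact program (my sketch of the semantics, not how Dice implements inference):

    import random

    def sample():
        """One run of: x ~ flip(0.5); y ~ flip(0.7); z := x || y; observe(z)."""
        x = random.random() < 0.5
        y = random.random() < 0.7
        z = x or y
        if not z:
            return None  # observe(z) rejects this execution
        return x, y, z

    # Estimate P(x | z) from the accepted runs.
    runs = [s for s in (sample() for _ in range(100_000)) if s is not None]
    print(sum(x for x, _, _ in runs) / len(runs))  # ≈ 0.5 / (1 - 0.5*0.3) ≈ 0.588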

SLIDE 38

Why Probabilistic Programming?

  • PPLs are proliferating
  • They have many compelling benefits
  • Specify a probability model in a familiar language
  • Expressive and concise
  • Cleanly separates model from inference

Pyro, Venture, Church, Stan, Figaro, ProbLog, PRISM, LPADs, CP-logic, ICL, PHA, HackPPL, etc.

SLIDE 39

The Challenge of PPL Inference

Most popular inference algorithms are black box:
  – Treat the program as a map from inputs to outputs (black-box variational, Hamiltonian MC)
  – Simplifying assumptions: differentiability, continuity
  – Little to no effort to exploit program structure (automatic differentiation aside)
  – Approximate inference 

Stan, Pyro

SLIDE 40

Why Discrete Models?

  • 1. Real programs have inherently discrete structure (e.g. if-statements)
  • 2. Discrete structure is inherent in many domains (graphs, text/topic models, ranking, etc.)
  • 3. Many existing PPLs assume smooth and differentiable densities and do not handle these programs correctly.

Discrete probabilistic programming is the important unsolved open problem!

SLIDE 41
Prob. Logic Programming vs. PPL

  • What is easy for PLP is hard for PPL at large (discrete inference, semantics)
  • What is easy for PPL at large is hard for PLP (continuous densities, scaling up)
  • This community has a lot to contribute.
  • What I will present is heavily inspired by the StarAI community’s work

SLIDE 42

Frequency Analyzer for a Caesar cipher in Dice

SLIDE 43

Example Dice Program in Network Verification

SLIDE 44

Semantics

  • The program state is a map from variables to values, denoted 𝜏
  • The goal of our semantics is to associate
    – statements in the syntax with
    – a probability distribution on states
  • Notation: semantic brackets [[s]]

SLIDE 45

Sampling Semantics

  • The simplest way to give a semantics to our language is to run the program infinitely many times
  • The probability distribution of the program is defined as the long-run average of how often it ends in a particular state

x ∼ flip(0.5);

Draw samples 𝝉: x=true, x=false, x=true, x=false, …

SLIDE 46

Semantics of x ∼ flip(0.5); y ∼ flip(0.7);

  𝜏1: x = true,  y = true    0.5 · 0.7 = 0.35
  𝜏2: x = false, y = true    0.5 · 0.7 = 0.35
  𝜏3: x = true,  y = false   0.5 · 0.3 = 0.15
  𝜏4: x = false, y = false   0.5 · 0.3 = 0.15

SLIDE 47

Semantics of x ∼ flip(0.5); y ∼ flip(0.7); observe(x || y);

  𝜏1: x = true,  y = true    0.5 · 0.7 = 0.35
  𝜏2: x = false, y = true    0.5 · 0.7 = 0.35
  𝜏3: x = true,  y = false   0.5 · 0.3 = 0.15
  𝜏4: x = false, y = false   0.5 · 0.3 = 0.15   ← rejected by observe(x || y)

Semantics: Throw away all executions that do not satisfy the condition x || y.
REJECTION SAMPLING SEMANTICS

SLIDE 48

Rejection Sampling Semantics

  • Extremely general: you only need to be able to run the program to implement a rejection-sampling semantics
  • This is how most AI researchers think about the meaning of their programs (?)
  • “Procedural”: the meaning of the program is whatever it executes to… not entirely satisfying…
  • A sample is a full execution: a global property that makes it harder to think modularly about the local meaning of code

Next: the gold standard in programming languages, denotational semantics

SLIDE 49

Denotational Semantics

  • Idea: We don’t have to run a flip statement to know what its distribution is
  • For some input state 𝜏 and output state 𝜏′, we can directly compute the probability of transitioning from 𝜏 to 𝜏′ upon executing a flip statement:

Run x ∼ flip(0.4) on 𝝉 (x=true):
  𝝉′ (x=true):   Pr = 0.4
  𝝉′ (x=false):  Pr = 0.6

We can avoid having to think about sampling!
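A minimal Python sketch of this idea (my illustration, not the paper's formalization): the denotation of x ∼ flip(0.4) is a function from an input state to a distribution over output states.

    def denote_flip(var, p):
        """[[var ~ flip(p)]]: map an input state to a distribution over output states."""
        def transition(tau):
            # The output state agrees with tau everywhere except (possibly) on var.
            t_true = {**tau, var: True}
            t_false = {**tau, var: False}
            return {frozenset(t_true.items()): p, frozenset(t_false.items()): 1 - p}
        return transition

    dist = denote_flip("x", 0.4)({"x": True})
    for state, pr in dist.items():
        print(dict(state), pr)  # {'x': True} 0.4, then {'x': False} 0.6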

SLIDE 50

Denotational Semantics of Flip

Idea: Directly define the probability of transitioning upon executing each statement. Call this its denotation, written [[s]].

The semantic bracket associates semantics with syntax: [[s]] maps an input state 𝜏 and an output state 𝜏′ to a probability (e.g., assign x to false in the state 𝜏).

SLIDE 51

Formal Denotational Semantics

SLIDE 52

The Challenge of PPL Inference

  • Probabilistic inference is #P-hard
    – Implies there is likely no universal solution
  • In practice inference is often feasible
    – Often relies on conditional independence
    – Manifests as graph properties
  • Why exact?
    1. No error propagation
    2. Approximations are intractable in theory as well
    3. Approximations are known to mislead learners
    4. Core of effective approximation techniques
    5. Unaffected by low-probability observations

SLIDE 53

Techniques for exact inference

                                         Keeps program    Exploits independence to
                                         structure?       decompose inference?
Path Enumeration (WebPPL, Psi)           Yes              No
Graphical Model Compilation
  (Figaro, Infer.Net)                    No               Yes
Symbolic Compilation (our work)          Yes              Yes

SLIDE 54

Our Approach: Symbolic Compilation & WMC

Probabilistic Program → (Symbolic Compilation) → Weighted Boolean Formula → Binary Decision Diagram → (WMC) → Query Result

Exploits independence; retains program structure

SLIDE 55

Our Approach: Symbolic Compilation & WMC

Probabilistic Program → (Symbolic Compilation) → Weighted Boolean Formula → (WMC) → Query Result

x := flip(0.4);

compiles to the weighted Boolean formula x′ ⇔ f1, with weights
  w(f1) = 0.4,  w(¬f1) = 0.6

Weighted model counting:
  WMC(φ, w) = Σ_{m ⊨ φ} Π_{ℓ ∈ m} w(ℓ)

WMC((x′ ⇔ f1) ∧ x ∧ x′, w) = ?
  • A single model: m = x′ ∧ x ∧ f1
  • w(x′) · w(x) · w(f1) = 1 · 1 · 0.4 = 0.4   (literals over program variables have weight 1)
SLIDE 56

Provably Correct Compilation

SLIDE 57

Benchmarks

SLIDE 58

Benchmarks

SLIDE 59

Second Conclusions

  • New state-of-the-art system for discrete probabilistic programs
  • Exact inference, yet very scalable
  • Provably correct
  • Modular compilation-based inference
  • Try Dice out: https://github.com/SHoltzen/dice

SLIDE 60

Third Conclusions

Programming Languages × Artificial Intelligence

Probabilistic Predicate Abstraction · Knowledge Compilation · Fun with Discrete Structure

SLIDE 61

Final Conclusions

Pure Learning Pure Logic Probabilistic World Models

Bring high-level representations, general knowledge, and efficient high-level reasoning to probabilistic models

SLIDE 62

References

…with slides stolen from Steven Holtzen and Tal Friedman.

  • Tal Friedman and Guy Van den Broeck. Probabilistic Databases Meets Relational Embeddings: Symbolic Querying of Vector Spaces (coming soon).
  • Steven Holtzen, Todd Millstein and Guy Van den Broeck. Symbolic Exact Inference for Discrete Probabilistic Programs. In Proceedings of the ICML Workshop on Tractable Probabilistic Modeling (TPM), 2019.
  • Steven Holtzen, Guy Van den Broeck and Todd Millstein. Sound Abstraction and Decomposition of Probabilistic Programs. In Proceedings of the 35th International Conference on Machine Learning (ICML), 2018.
  • Steven Holtzen, Todd Millstein and Guy Van den Broeck. Probabilistic Program Abstractions. In Proceedings of the 33rd Conference on Uncertainty in Artificial Intelligence (UAI), 2017.
  • https://github.com/SHoltzen/dice

SLIDE 63

Thanks