Approximate Counting in SMT and Value Estimation for Probabilistic Programs
Dmitry Chistikov, Rayna Dimitrova, Rupak Majumdar
Max Planck Institute for Software Systems (MPI-SWS), Kaiserslautern and Saarbrücken, Germany
TACAS 2015, April 15,
#SMT: Quickstart
SAT + counting = #SAT
SAT + theories = SMT
SAT + counting + theories = #SMT
Complexity: SAT ∈ NP, #SAT ∈ #P; PH ⊆ P^#P [Toda, FOCS’89]
2/37
Our contributions
- 1. Simple logical framework for #SMT problems
- 2. Approximate #SMT via reduction to black-box SMT (extend from #SAT)
◮ Bounded integer arithmetic
◮ Linear real arithmetic
- 3. An application: value estimation for small probabilistic programs with nondeterminism
3/37
Approximate #SMT via reduction to SMT
Idea: Use known SMT techniques as a black box and reduce approximate counting to the decision problem
Running time: Polynomial (randomized), but with queries to an SMT oracle
Output: Approximate, but with user-given precision
4/37
Outline
- 1. Example: Probabilistic programs
- 2. #SMT: Logical framework
- 3. Approximate #SMT via reduction to SMT
- 4. Discussion and further directions
6/37
What are probabilistic programs?
Probabilistic programs are a way to express probability distributions.
This talk: imperative, loop-free programs with:
◮ Coin-flipping (uniform distributions)
◮ Bayesian reasoning
◮ Nondeterminism
7/37
Value estimation problem
Input: probabilistic program
Output: value of the program
Value of program = Pr(Accept | Accept or Reject) = Pr(Accept) / Pr(Accept or Reject)
8/37
Example: The Monty Hall problem
[Selvin, American Statistician (1975)]
[Figure: three doors, numbered 1, 2, 3]
- I. Player chooses door i ∈ {1, 2, 3}.
- II. Host opens door j ≠ i with goat.
- III. Should the player change her choice?
9/37
Example: A probabilistic program for Monty Hall
c ∼ Uniform({1, 2, 3})   /* position of car */
i := 1   /* initial choice of player */
choice:   /* host opens door j with goat */
  case: j := 2; assume(j ≠ c);
  case: j := 3; assume(j ≠ c);
if i ≠ c then accept; else reject;   /* player switches from door i */

Outcomes by car position (each with Pr = 1/3); × marks an execution blocked by assume:
         | c = 1  | c = 2  | c = 3
j := 2   | reject | ×      | accept
j := 3   | reject | accept | ×
Overall  | Reject | Accept | Accept

Value of program = Pr(Accept) / Pr(Accept or Reject) = 2/3
10/37
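The value 2/3 can be checked by enumerating the executions of the program. The following is a minimal Python sketch, not the authors' tool. It assumes the host's guard is assume(j ≠ c) and the final test is i ≠ c (which is what the stated value 2/3 requires), and that a car position counts towards Accept when some nondeterministic choice reaches accept. The names `outcomes` and `value` are illustrative.

```python
from fractions import Fraction

DOORS = (1, 2, 3)

def outcomes(c):
    """Set of outcomes reachable for car position c over all host choices."""
    i = 1                        # initial choice of player
    reached = set()
    for j in (2, 3):             # nondeterministic choice of the host
        if j == c:               # assume(j != c) fails: this execution is blocked
            continue
        # the player switches away from door i, so she wins iff i != c
        reached.add("accept" if i != c else "reject")
    return reached

def value():
    """Pr(Accept) / Pr(Accept or Reject) with c uniform over the doors."""
    p = Fraction(1, len(DOORS))
    pr_accept = sum(p for c in DOORS if "accept" in outcomes(c))
    pr_any = sum(p for c in DOORS if outcomes(c))
    return pr_accept / pr_any

print(value())
```

Exact rational arithmetic is used so that the result can be compared directly with the fraction on the slide.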
Value estimation reduces to #SMT
[Sankaranarayanan et al., PLDI’13]
Program → Formulae ϕ_acc, ϕ_{acc∨rej}
Accept = ϕ_acc
Accept or Reject = ϕ_{acc∨rej}
val(Program) = mc(ϕ_acc) / mc(ϕ_{acc∨rej})
Our work: 2 calls to #SMT
12/37
Logical theories
Logical theory T with fixed interpretation:
◮ formula: ϕ(x_1, …, x_k), variables with domain D
◮ model: (a_1, …, a_k) ∈ D^k such that ϕ(a_1, …, a_k) is true
◮ satisfiability problem: {models of ϕ} ≠ ∅ ?
◮ model counting problem: size of {models of ϕ} = ?
14/37
Measures
The domain D is a measure space: it comes with
◮ a σ-algebra F ⊆ 2^D: ∅ ∈ F, and F is closed under complement and countable union
◮ a measure µ : F → R: µ is non-negative, µ(∅) = 0, and µ is σ-additive
Lift µ to D^k: µ(A_1 × … × A_k) = µ(A_1) · … · µ(A_k)
15/37
Measured theories
T is measured iff every ϕ is measurable. The model count of ϕ is mc(ϕ) = µ(ϕ).
16/37
Measured theories: Examples
Theory | Domain | Connectives | Quantifiers | mc(ϕ)
Boolean satisfiability | {0, 1} | ∧, ∨, ¬ | None | Number of satisfying assignments
Integer arithmetic | Z ∩ [a, b] | ∧, ∨, ¬ | ∃ | Number of models
Linear programming | R ∩ [a, b] | ∧ | None | Volume of polytope
Linear real arithmetic | R ∩ [a, b] | ∧, ∨, ¬ | ∃ | Volume
17/37
Monty Hall in formulas
◮ Random variable c ∈ {1, 2, 3}
◮ Nondeterministic variables
  ◮ i, j ∈ {1, 2, 3}
  ◮ b_1, …, b_5, b_init, b_acc, b_rej ∈ {0, 1} (for program locations)
◮ Formula ϕ_acc(c): there exists an execution of the program that reaches accept
◮ Formula ϕ_{acc∨rej}(c): there exists an execution of the program that reaches accept or reject
18/37
Monty Hall in formulas
◮ Formula ϕ_acc(c):
∃ i ∃ j ∃ b_1 … ∃ b_5 ∃ b_init ∃ b_acc ∃ b_rej
  (b_acc → (i ≠ c ∧ b_5)) ∧
  (b_rej → (i = c ∧ b_5)) ∧
  (b_5 → ((j ≠ c ∧ b_3) ∨ (j ≠ c ∧ b_4))) ∧
  (b_4 → (j = 3 ∧ b_2)) ∧
  (b_3 → (j = 2 ∧ b_2)) ∧
  (b_2 → (i = 1 ∧ b_1)) ∧
  (b_1 → b_init) ∧ b_init ∧ b_acc
◮ Formula ϕ_{acc∨rej}(c): replace last b_acc with (b_acc ∨ b_rej)
18/37
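For a formula this small, the two model counts can be obtained by brute force, which makes the meaning of the existential quantifiers concrete. The sketch below is an illustration in Python rather than the SMT-based procedure of the talk. It assumes the disequalities i ≠ c and j ≠ c in the first and third conjuncts (consistent with the program and its value 2/3). The helper names `implies`, `body` and `model_count` are hypothetical.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def body(c, i, j, b1, b2, b3, b4, b5, binit, bacc, brej, target):
    """Quantifier-free body of the formula; `target` selects the final conjunct."""
    return (
        implies(bacc, i != c and b5)
        and implies(brej, i == c and b5)
        and implies(b5, (j != c and b3) or (j != c and b4))
        and implies(b4, j == 3 and b2)
        and implies(b3, j == 2 and b2)
        and implies(b2, i == 1 and b1)
        and implies(b1, binit)
        and binit
        and target(bacc, brej)
    )

def model_count(target):
    """Count the values of c for which some witness (i, j, b's) satisfies the body."""
    count = 0
    for c in (1, 2, 3):
        # i, j range over the doors; the 8 location bits range over {False, True}
        witnesses = product((1, 2, 3), (1, 2, 3), *([(False, True)] * 8))
        if any(body(c, *w, target) for w in witnesses):
            count += 1
    return count

def acc(bacc, brej):
    return bacc

def acc_or_rej(bacc, brej):
    return bacc or brej

print(model_count(acc), model_count(acc_or_rej))
```

Only c is counted; the other variables are existentially quantified, so each value of c contributes at most one model regardless of how many witnesses it has.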
#SMT: Model counting
Input: formula ϕ in a measured theory T
Output: mc(ϕ)
19/37
Randomized approximate algorithms
◮ With multiplicative error:
estimate is within (1 + ε)-factor of true value with probability at least 3/4
◮ With additive error:
estimate is within ±γ · Scale of true value with probability at least 3/4
21/37
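The two error notions can be written as simple predicates on an estimate. This is a hedged Python sketch: it uses one common reading of "within a (1 + ε)-factor" (the estimate lies between true/(1 + ε) and true·(1 + ε)), which may differ in detail from the definition used in the paper, and the function names are illustrative.

```python
def within_multiplicative(estimate, true_value, eps):
    """True if estimate is within a (1 + eps)-factor of true_value."""
    return true_value / (1 + eps) <= estimate <= true_value * (1 + eps)

def within_additive(estimate, true_value, gamma, scale):
    """True if estimate is within +/- gamma * scale of true_value."""
    return abs(estimate - true_value) <= gamma * scale

# A count of 2 estimated as 2.2 is acceptable for eps = 0.2, but 2.5 is not.
print(within_multiplicative(2.2, 2, 0.2), within_multiplicative(2.5, 2, 0.2))
```

The randomized algorithms only promise that such a predicate holds with probability at least 3/4; the predicate itself is deterministic.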
Hashing approach for approximate #SAT
[Jerrum, Valiant, Vazirani 1986]
ϕ(x) = ϕ(x_1, …, x_k) Boolean formula
mc(ϕ) = ?
Idea:
- 1. Take an appropriate hash function h: {0, 1}^k → {0, 1}^m.
- 2. Take ψ(x) = ϕ(x) ∧ (h(x) = 0).
- 3. In expectation, mc(ψ) = mc(ϕ)/2^m.
- 4. ψ is satisfiable with high probability if mc(ϕ) ≫ 2^m.
22/37
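The four steps above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the algorithm of Jerrum, Valiant and Vazirani as published: the hash is a random affine map over GF(2), brute-force enumeration stands in for the SAT oracle, and the estimate is simply the largest m for which a majority of random hashes keeps ψ satisfiable. All function names are hypothetical.

```python
import random
from itertools import product

def random_xor_hash(k, m, rng):
    """Return (A, b) for h(x) = A·x + b over GF(2), hashing k bits into m bits."""
    A = [[rng.randint(0, 1) for _ in range(k)] for _ in range(m)]
    b = [rng.randint(0, 1) for _ in range(m)]
    return A, b

def hashes_to_zero(A, b, x):
    """Check h(x) = 0, i.e. every row of A·x + b is 0 modulo 2."""
    return all((sum(a * xi for a, xi in zip(row, x)) + bi) % 2 == 0
               for row, bi in zip(A, b))

def is_satisfiable(phi, k, A, b):
    """Stand-in for the SAT oracle: brute force over all 2^k assignments."""
    return any(phi(x) and hashes_to_zero(A, b, x)
               for x in product((0, 1), repeat=k))

def estimate_log2_count(phi, k, trials=31, seed=0):
    """Largest m such that psi = phi and (h(x) = 0) is satisfiable in a majority
    of trials; 2^m is then a rough estimate of mc(phi). None if phi is unsatisfiable."""
    rng = random.Random(seed)
    best = None
    for m in range(k + 1):
        sat_votes = sum(is_satisfiable(phi, k, *random_xor_hash(k, m, rng))
                        for _ in range(trials))
        if 2 * sat_votes <= trials:
            break
        best = m
    return best

def phi(x):
    # Example formula over k = 8 bits with 2^7 = 128 models.
    return x[0] == 1

print(estimate_log2_count(phi, k=8))
```

The oracle is only ever asked a yes/no question, which is the point of the reduction: the counting logic never inspects the models themselves.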
Hashing approach for #P problems
Theorem [Jerrum, Valiant, Vazirani 1986]
approximate #P ⊆ BPP^NP
23/37
Approximate #SMT for integer arithmetic
Example: Monty Hall
c ∈ {1, 2, 3} probabilistic variable; ϕ_acc(c) ≡ (c ≠ 1)
Hash function: h(x) = A · x + b, coefficients from {0, 1} chosen uniformly at random
Queries to SMT solver:
  ϕ_acc(c)   (in integer variables)
  ∧ (x = bin(c))   (binary encoding)
  ∧ (A · x + b = 0)   (hashing into m bits)
24/37
Approximate #SMT for integer arithmetic
Example: Monty Hall (typical run)
Error reduction: repeat 62 times, take majority vote

Dimension of hash | Satisfiable | Unsatisfiable | Majority vote
0…6 | 62 | 0 | Sat
7…9 | 61 | 1 | Sat
10 | 55 | 7 | Sat
11 | 50 | 12 | Sat
12 | 48 | 14 | Sat
13 | 21 | 41 | Unsat

With probability ≥ 0.99, mc(ϕ_acc) ∈ [1.73, 2.45]. Since the count is an integer, mc(ϕ_acc) = 2.
Similarly, mc(ϕ_{acc∨rej}) = 3 and val(Switch) = 2/3.
25/37
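The majority vote and the effect of repetition can be reproduced with a short Python sketch. The run counts are copied from the table above. The error bound is computed under an assumption that is not taken from the slides: each of the 62 runs is treated as independent and correct with probability 3/4. The authors' own analysis may use a different bound, and the function names are illustrative.

```python
from math import comb

# (hash dimension, satisfiable runs, unsatisfiable runs) from the typical run;
# the range rows 0...6 and 7...9 are represented by their last dimension.
RUNS = [(6, 62, 0), (9, 61, 1), (10, 55, 7), (11, 50, 12), (12, 48, 14), (13, 21, 41)]

def first_unsat_dimension(rows):
    """Smallest listed hash dimension at which the majority vote is Unsat."""
    for dim, sat, unsat in rows:
        if unsat > sat:
            return dim
    return None

def majority_error(runs, p_correct):
    """Probability that at most half of `runs` independent trials are correct,
    so that the majority vote can be wrong (ties count as errors)."""
    q = 1 - p_correct
    return sum(comb(runs, i) * p_correct**i * q**(runs - i)
               for i in range(runs // 2 + 1))

print(first_unsat_dimension(RUNS))
print(majority_error(62, 0.75))
```

Under the stated independence assumption, a single run errs with probability 1/4, while the majority of 62 runs errs with probability well below 0.01.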
Approximate #SMT: from integers to reals
Discretization:
◮ Partition the domain [a, b]^k into cubes
◮ Overapproximate the body with the cubes it intersects
Complexity-theoretic point of view:
◮ Reduce to a #P problem
26/37
Approximate #SMT: from integers to reals
Approximation error: total volume of cut cubes
Formula size: log(number of all cubes)
Example: variables x, y ∈ [0, 4] ⊆ R
Constraints (as extracted): x ≤ 4 y ≥ 1 x − y ≥
[Figure: the body drawn on a 4 × 4 grid with axes x and y: 16 cubes, 4 cut cubes]
27/37
Approximate #SMT: from integers to reals
Approximation error: total volume of cut cubes
Formula size: log(number of all cubes)
Theorem [Dyer, Frieze 1988]
Approximate volume computation (#SMT) for polytopes reduces to #P.
Limitation: applicable only to quantifier-free formulas
Value estimation: formulas contain existential quantifiers
27/37
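The discretization can be illustrated on a single half-plane, where a cube's relation to the body is decided by its corners (a linear function attains its extrema over a cube at corners). The Python sketch below uses a hypothetical body, x + 2y ≤ 4 inside [0, 4]², which is not the example from the slide (whose constraints are incomplete in this transcript). The function name `classify_cubes` is illustrative.

```python
from itertools import product

def classify_cubes(a, c, lo=0, hi=4):
    """Classify unit cubes of [lo, hi]^2 against the half-plane a[0]*x + a[1]*y <= c.

    Returns (inside, cut): cubes entirely in the body, and cubes crossed by its boundary.
    """
    inside = cut = 0
    for i, j in product(range(lo, hi), repeat=2):
        corners = [a[0] * x + a[1] * y for x in (i, i + 1) for y in (j, j + 1)]
        if max(corners) <= c:
            inside += 1          # every corner satisfies the constraint
        elif min(corners) < c:
            cut += 1             # boundary passes through the cube's interior
    return inside, cut

inside, cut = classify_cubes((1, 2), 4)
print("overapproximation:", inside + cut, "cut cubes:", cut)
```

For this body the true area is 4 (a triangle with base 4 and height 2), the overapproximation counts 6 unit cubes, and the difference 2 is bounded by the total volume 4 of the cut cubes, as the error bound on the slide states.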
Approximate #SMT for linear real arithmetic
Input: ϕ(x) = ∃ z. Φ(x, z)
Output: approximation of mc(ϕ)
Example: variables x, y ∈ [0, 4] ⊆ R, z ∈ R
Constraints (as extracted): x ≤ 4 y ≥ 1 x − y ≥ x + y − z ≥ z ≥ 4
[Figure: projection on (x, y), drawn on a 4 × 4 grid with axes x and y: 16 cubes, 8 cut cubes]
28/37
Approximate #SMT for linear real arithmetic
Input: ϕ(x) = ∃ z. Φ(x, z)
Output: approximation of mc(ϕ)
Lemma
Number of cutting hyperplanes is at most 2^l, where l is the number of atomic predicates in Φ.
Corollary
Number of cubes increases by an exponential factor, number of bit variables increases by a polynomial.
28/37
Summary: Approximate #SMT via reduction to SMT
Theorem
#SMT for bounded integer arithmetic (IA) can be approximated with a multiplicative error by a polynomial-time randomized algorithm that has oracle access to satisfiability of formulas in IA.
Theorem
#SMT for linear real arithmetic (RA) can be approximated with an additive error by a polynomial-time randomized algorithm that has oracle access to satisfiability of formulas in IA + RA.
29/37
Evaluation: probabilistic programs
Solve value estimation for programs over integers: compute approximations of mc(ϕ_acc) and mc(ϕ_{acc∨rej})
- 1. Probabilistic programs with nondeterminism
◮ the Monty Hall problem (1)
◮ the three prisoners problem (2)
- 2. Classical Bayesian network examples
◮ Pearl’s burglar alarm (3)
◮ Wet grass model (4)
- 3. Approximated probabilistic distributions
◮ A medical diagnosis system with simplified input distributions and manually discretized domains (5)
30/37
Evaluation: results
program | formula | ε | p | m | time (s)
(1) | ϕ_acc | 0.2 | 0.01 | 13 | 3.37
(1) | ϕ_{acc∨rej} | 0.2 | 0.01 | 20 | 4.11
(2) | ϕ_acc | 0.5 | 0.1 | | 0.04
(2) | ϕ_{acc∨rej} | 0.5 | 0.1 | 20 | 19.84
(3) | ϕ_acc | 0.5 | 0.1 | 36 | 196.54
(3) | ϕ_{acc∨rej} | 0.5 | 0.1 | 49 | 132.53
(4) | ϕ_acc | 0.5 | 0.1 | 34 | 85.71
(4) | ϕ_{acc∨rej} | 0.5 | 0.1 | 35 | 89.37
(5) | ϕ_acc | 0.5 | 0.1 | 56 | 295.09
(5) | ϕ_{acc∨rej} | 0.5 | 0.1 | 57 | 241.55

1 + ε: multiplicative approximation factor
p: error probability
m: maximal hash size
Further direction: Improved SMT reasoning for XOR (hashing)
31/37
Discussion: Monte Carlo simulation
Monte Carlo simulation
+ handles complex probabilistic distributions
− requires specialized heuristics
− sacrifices theoretical guarantees
+ performs well when probability is non-vanishing
Our approach
− handles only uniform distributions
+ uses off-the-shelf SMT solver
+ provides theoretical guarantees
+ performs well when probability is vanishing
33/37
Discussion: Monte Carlo bottleneck
Probability p small
Error threshold δ a multiple of p
Must sample Ω(1/p) times!
More formally: tightness of Chernoff bounds.
34/37
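A quick calculation shows why far fewer than 1/p samples cannot work: with n independent samples, the probability of never observing the event is (1 − p)^n, and an all-miss run cannot distinguish p from 0. The Python sketch below is only an illustration of this point, with hypothetical values for p and n. It is not a proof of the Ω(1/p) bound.

```python
def prob_no_hit(p, n):
    """Probability that n independent samples all miss an event of probability p."""
    return (1 - p) ** n

p = 1e-6
for n in (10**5, 10**6, 10**7):   # 0.1/p, 1/p and 10/p samples
    print(n, prob_no_hit(p, n))
```

With n = 0.1/p samples, more than 90% of simulation runs see no accepting execution at all, whereas with n = 10/p this becomes very unlikely.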
Further directions
- 1. Improved XOR reasoning and scalability
[CryptoMiniSat / Soos et al., SAT’09]
- 2. Other logical theories
[Luu et al., PLDI’14]
- 3. Non-uniform distributions
[Sankaranarayanan et al., PLDI’13]
- 4. Uniform generation of program behaviors
[Chakraborty et al., CAV’13]
- 5. Other applications
[Fredrikson, Jha, LICS’14]
36/37
Our contributions
- 1. Simple framework for #SMT problems
- 2. Approximate #SMT via reduction to black-box SMT (extend from #SAT)
◮ Bounded integer arithmetic: preserving the formula structure
◮ Linear real arithmetic: with a new projection lemma
- 3. An application: value estimation for small probabilistic programs with nondeterminism
Thank you!
37/37