Path Finding under Uncertainty through Probabilistic Inference
David Tolpin, Jan Willem van de Meent, Brooks Paige, Frank Wood
University of Oxford, June 8th, 2015
Paper: http://arxiv.org/abs/1502.07314
Slides:
Outline
◮ Probabilistic Programming
◮ Inference
◮ Path Finding and Probabilistic Inference
◮ Stochastic Policy Learning
◮ Case Study: Canadian Traveller Problem
◮ Summary
Intuition
Probabilistic program:
◮ A program with random computations.
◮ Distributions are conditioned by ‘observations’.
◮ Values of certain expressions are ‘predicted’: they form the output.
Can be written in any language (extended with sample and observe).
Example: Model Selection
(let [;; Model
      dist (sample (categorical [[normal 1/4] [gamma 1/4]
                                 [uniform-discrete 1/4]
                                 [uniform-continuous 1/4]]))
      a (sample (gamma 1 1))
      b (sample (gamma 1 1))
      d (dist a b)]

  ;; Observations
  (observe d 1)
  (observe d 2)
  (observe d 4)
  (observe d 7)

  ;; Explanation
  (predict :d (type d))
  (predict :a a)
  (predict :b b))
Definition
A probabilistic program is a stateful deterministic computation P:
◮ Initially, P expects no arguments.
◮ On every call, P returns
  ◮ a distribution F,
  ◮ a distribution and a value (G, y),
  ◮ a value z,
  ◮ or ⊥.
◮ Upon returning F, P expects x ∼ F.
◮ Upon returning ⊥, P terminates.
A program is run by calling P repeatedly until termination. The probability of each trace is

$p_{\mathcal{P}}(\mathbf{x}) \propto \prod_{i=1}^{|\mathbf{x}|} p_{F_i}(x_i) \prod_{j=1}^{|\mathbf{y}|} p_{G_j}(y_j)$
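As a plain-Clojure illustration of the formula above (a minimal sketch, not the Anglican internals; trace-log-prob and the pair encoding are introduced here):

;; Score a trace as in the formula above: sum the log densities of all
;; sampled values x_i ~ F_i and all observed values y_j under G_j,
;; up to the (dropped) normalizing constant. Each draw or observation
;; is encoded as a [log-pdf-fn value] pair.
(defn trace-log-prob [samples observes]
  (reduce + (map (fn [[log-pdf v]] (log-pdf v))
                 (concat samples observes))))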
Outline
◮ Probabilistic Programming
◮ Inference
◮ Path Finding and Probabilistic Inference
◮ Stochastic Policy Learning
◮ Case Study: Canadian Traveller Problem
◮ Summary
Inference Objective
◮ Continuously and infinitely generate a sequence of samples drawn from the distribution of the output expression, so that someone else can put them to good use (vague but common).
◮ Approximately compute integrals of the form
  $\Phi = \int_{-\infty}^{\infty} \varphi(x)\, p(x)\, dx$
◮ Suggest the most probable explanation (MPE): the most likely assignment of all non-evidence variables given the evidence.
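For the second objective the samples are all that is needed: with $x_k \sim p(x)$, $\Phi \approx \frac{1}{N}\sum_k \varphi(x_k)$. A plain-Clojure sketch (mc-estimate is a name introduced here; predicts is the inference output, as in the example that follows):

;; Monte Carlo estimate of Phi = ∫ phi(x) p(x) dx from samples xs ~ p(x).
(defn mc-estimate [phi xs]
  (/ (reduce + (map phi xs)) (count xs)))

;; e.g. the posterior mean of a in the model-selection example:
;; (mc-estimate identity (map :a predicts))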
Example: Inference Results
[Figure: bar chart of the posterior over the distribution type d (gamma, normal, uniform-discrete, uniform-continuous) and histograms of a and b, with sample counts on the y-axes.]
[(let [dfreqs (frequencies (map :d predicts))]
   (plot/bar-chart
     (map (comp #(str/replace % #"class embang.runtime.(.*)-distribution" "$1")
                str first)
          dfreqs)
     (map second dfreqs)
     :plot-size 600 :aspect-ratio 4 :y-title "sample count"))
 (plot/histogram (map :a predicts)
   :x-title "a" :bins 30 :plot-size 250 :aspect-ratio 1.5
   :y-title "sample count")
 (plot/histogram (map :b predicts)
   :x-title "b" :bins 30 :plot-size 250 :aspect-ratio 1.5)]
Outline
◮ Probabilistic Programming
◮ Inference
◮ Path Finding and Probabilistic Inference
◮ Stochastic Policy Learning
◮ Case Study: Canadian Traveller Problem
◮ Summary
Connection between MAP and Shortest Path
Maximizing the (logarithm of the) trace probability

$\log p_{\mathcal{P}}(\mathbf{x}) = \sum_{i=1}^{|\mathbf{x}|} \log p_{F_i}(x_i) + \sum_{j=1}^{|\mathbf{y}|} \log p_{G_j}(y_j) + C$

corresponds to finding the shortest path in a graph G = (V, E):

◮ V = {(F_i, x_i)} ∪ {(G_j, y_j)}.
◮ Edge costs are −log p_{F_i}(x_i) or −log p_{G_j}(y_j).
[Diagram: a trace as a path (F1, x1) → (G1, y1) → (F2, x2) with edge costs −log p_F1(x1), −log p_G1(y1), −log p_F2(x2).]
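In code the correspondence is only a sign flip; a minimal sketch (edge-cost and path-cost are illustrative names, not from the paper):

;; The edge that assigns value v under a distribution with the given
;; log density costs -log p(v); summing edge costs along a trace gives
;; -log p(trace) up to the constant C, so the MAP trace is the shortest path.
(defn edge-cost [log-pdf v] (- (log-pdf v)))
(defn path-cost [edges]            ; edges: seq of [log-pdf-fn value] pairs
  (reduce + (map (fn [[log-pdf v]] (edge-cost log-pdf v)) edges)))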
Marginal MAP as Policy Learning
In marginal MAP, an assignment of a part of the trace $\mathbf{x}_\theta$ is inferred. In a probabilistic program:

◮ $\mathbf{x}_\theta$ becomes the program output $\mathbf{z}$.
◮ $\mathbf{z}$ is marginalized over $\mathbf{x} \setminus \mathbf{x}_\theta$.
◮ $\mathbf{x}_\theta^{\mathrm{MAP}} = \arg\max p_{\mathcal{P}}(\mathbf{z})$.

Determining $\mathbf{x}_\theta^{\mathrm{MAP}}$ corresponds to learning a policy $\mathbf{x}_\theta$ which minimizes the expected path length

$\mathbb{E}_{\mathbf{x} \setminus \mathbf{x}_\theta}\left[-\sum_{i=1}^{|\mathbf{x}_\theta|} \log p_{F_i^\theta}(x_i^\theta) - \sum_{j=1}^{|\mathbf{y}|} \log p_{G_j}(y_j)\right]$
Outline
◮ Probabilistic Programming
◮ Inference
◮ Path Finding and Probabilistic Inference
◮ Stochastic Policy Learning
◮ Case Study: Canadian Traveller Problem
◮ Summary
Policy Learning through Probabilistic Inference
Require: agent, Instances, Policies
1: instance ← Draw(Instances)
2: policy ← Draw(Policies)
3: cost ← Run(agent, instance, policy)
4: Observe(1, Bernoulli(e^{−cost}))
5: Print(policy)

The log probability of the output policy is

$\log p_{\mathcal{P}}(\text{policy}) = -\text{cost}(\text{policy}) + \log p_{\text{Policies}}(\text{policy}) + C$

When policies are drawn uniformly,

$\log p_{\mathcal{P}}(\text{policy}) = -\text{cost}(\text{policy}) + C'$
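A sketch of the same algorithm as an Anglican query, in the style of the earlier examples. Here instances and policies are assumed to be distribution objects, and run and agent are hypothetical stand-ins for a concrete domain:

(defquery learn-policy
  (let [instance (sample instances)            ; 1: Draw(Instances)
        policy   (sample policies)             ; 2: Draw(Policies)
        cost     (run agent instance policy)]  ; 3: deterministic rollout cost
    ;; 4: Observe(1, Bernoulli(e^{-cost})) weights this trace by e^{-cost}.
    (observe (flip (exp (- cost))) true)
    ;; 5: the policy is the predicted output.
    (predict :policy policy)))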
Outline
◮ Probabilistic Programming
◮ Inference
◮ Path Finding and Probabilistic Inference
◮ Stochastic Policy Learning
◮ Case Study: Canadian Traveller Problem
◮ Summary
Canadian Traveller Problem
The CTP is the problem of finding the shortest travel distance in a graph in which some edges may be blocked. Given

◮ an undirected weighted graph G = (V, E),
◮ the initial and final location nodes s and t,
◮ edge weights w : E → ℝ,
◮ traversability probabilities p_o : E → (0, 1],

find the shortest travel distance from s to t, i.e. the sum of the weights of all traversed edges.
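For concreteness, one possible Clojure encoding of a CTP instance (the keys and layout here are illustrative, not from the paper):

;; A small CTP instance: undirected edges keyed by node pairs, each
;; carrying a weight w and a probability p-open of being traversable.
(def ctp-instance
  {:s :a, :t :d,
   :edges {#{:a :b} {:w 1.0, :p-open 0.8}
           #{:b :d} {:w 2.0, :p-open 0.9}
           #{:a :c} {:w 1.5, :p-open 1.0}
           #{:c :d} {:w 1.0, :p-open 0.7}}})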
The Simplest CTP Instance — Two Roads
Given
◮ two roads with probabilities p1 and p2 of being open,
◮ road costs c1 and c2,
◮ a cost cb of bumping into a blocked road,

learn the optimum policy q.
(defquery tworoads
  (loop []
    (let [o1 (sample (flip p1))
          o2 (sample (flip p2))]
      (if (not (or o1 o2)) (recur)
          (let [q (sample (uniform-continuous 0. 1.))
                s (sample (flip (- 1 q)))]
            (let [distance (if s (if o1 c1 (+ c2 cb))
                                 (if o2 c2 (+ c1 cb)))]
              (observe +factor+ (- distance))
              (predict :q q)))))))
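Assuming the free parameters p1, p2, c1, c2 and cb are bound before the query is compiled, it could be run along the following lines (a usage sketch with 2015-era Anglican API names, not code from the talk):

(def p1 0.5) (def p2 0.5)
(def c1 1.0) (def c2 2.0) (def cb 1.0)

(require '[anglican.core :refer [doquery]]
         '[anglican.state :refer [get-predicts]])

;; Lightweight Metropolis-Hastings samples of the policy parameter q;
;; get-predicts returns the [label value] pairs predicted by each sample.
(def q-samples
  (->> (doquery :lmh tworoads nil)
       (map (comp :q (partial into {}) get-predicts))
       (take 10000)))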
Learning Stochastic Policy for CTP
Depth-first search based policy:

◮ the agent traverses G in depth-first order;
◮ the policy specifies the probabilities of selecting each adjacent edge in every node.

Require: CTP(G, s, t, w, p)
1: for v ∈ V do
2:   policy(v) ← Draw(Dirichlet(1_{deg(v)}))
3: end for
4: repeat
5:   instance ← Draw(CTP(G, w, p))
6:   (reached, distance) ← StDFS(instance, policy)
7: until reached
8: Observe(1, Bernoulli(e^{−distance}))
9: Print(policy)
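A sketch of this algorithm as an Anglican query. Here graph maps each node to a vector of its neighbours, and sample-instance (draws open/blocked edge states) and stdfs (the stochastic depth-first search, returning [reached distance]) are hypothetical helpers, assumed to be written with Anglican's defm so they may sample inside the query:

(defquery ctp-policy [graph s t]
  ;; Steps 1-3: one Dirichlet(1,...,1)-distributed vector of edge-choice
  ;; probabilities per node, built with loop/recur over the nodes.
  (let [policy (loop [vs (seq graph), pol {}]
                 (if vs
                   (let [[v adj] (first vs)]
                     (recur (next vs)
                            (assoc pol v (sample (dirichlet
                                                  (repeat (count adj) 1.0))))))
                   pol))]
    ;; Steps 4-7: retry until the agent reaches t.
    (loop []
      (let [instance (sample-instance graph)
            [reached distance] (stdfs instance s t policy)]
        (if reached
          (do ;; 8: Observe(1, Bernoulli(e^{-distance}))
              (observe (flip (exp (- distance))) true)
              ;; 9: Print(policy)
              (predict :policy policy))
          (recur))))))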
Inference Results — CTP Travel Graphs
Learned policies:

[Figure: learned stochastic policies on CTP travel graphs, one panel per open fraction: 1.0, 0.9, 0.8, 0.7, 0.6.]