Planning and Optimization
G1. Heuristic Search: AO∗ & LAO∗ Part I


Slide 1

Planning and Optimization

G1. Heuristic Search: AO∗ & LAO∗ Part I

Gabriele Röger and Thomas Keller

Universität Basel

December 3, 2018

Slide 2

Sections: Heuristic Search · Motivation · A∗ with Backward Induction · Summary

Content of this Course

[Figure: course overview. Planning splits into Classical (Tasks, Progression/Regression, Complexity, Heuristics) and Probabilistic (MDPs, Blind Methods, Heuristic Search, Monte-Carlo Methods)]

Slide 3

Heuristic Search

Slide 4

Heuristic Search: Recap

Heuristic Search Algorithms: Heuristic search algorithms use heuristic functions to (partially or fully) determine the order of node expansion.
(From Lecture 15 of the AI course last semester)

Slide 5

Best-first Search: Recap

Best-first Search: A best-first search is a heuristic search algorithm that evaluates search nodes with an evaluation function f and always expands a node n with minimal f(n) value.
(From Lecture 15 of the AI course last semester)

Slide 6

A∗ Search: Recap

A∗ Search: A∗ is the best-first search algorithm with evaluation function f(n) = g(n) + h(n.state).
(From Lecture 15 of the AI course last semester)
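This evaluation function is easy to sketch in code. Below is a minimal A∗ with reopening over an explicit successor function; the graph encoding and all names are illustrative, not from the slides:

```python
import heapq

def astar(succ, h, s0, is_goal):
    """Minimal A* with reopening. succ(s) yields (cost, s2) pairs,
    h is the heuristic. Returns the cheapest goal cost, or None."""
    g = {s0: 0}                   # best known cost from the root
    queue = [(h(s0), s0)]         # priority queue ordered by f = g + h
    while queue:
        f, s = heapq.heappop(queue)
        if f > g[s] + h(s):       # stale entry: s was re-pushed with lower g
            continue
        if is_goal(s):
            return g[s]
        for cost, s2 in succ(s):
            if g[s] + cost < g.get(s2, float("inf")):  # improves g: (re)open
                g[s2] = g[s] + cost
                heapq.heappush(queue, (g[s2] + h(s2), s2))
    return None

# Toy graph (not the slides' example): s0 -> s2 directly costs 5,
# but the detour via s1 costs only 1 + 1 = 2.
graph = {"s0": [(1, "s1"), (5, "s2")], "s1": [(1, "s2")], "s2": []}
cost = astar(lambda s: graph[s], lambda s: 0, "s0", lambda s: s == "s2")
```

Stale queue entries are skipped rather than deleted, which is the usual way to implement reopening with a binary heap.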

Slide 7

A∗ Search (With Reopening): Example

[Figure: example graph with states s0–s6 and heuristic values h(s0) = 18, h(s1) = 12, h(s2) = 14, h(s3) = 12, h(s4) = 6, h(s5) = 4, h(s6) = 0]

Slide 8

A∗ Search (With Reopening): Example

[Figure: A∗ run on the example graph; nodes annotated with f = g + h: s0: 0 + 18, s2: 5 + 14, s1: 8 + 12, s5: 15 + 4, later reopened as 12 + 4, s4: 16 + 6, s3: 18 + 12, s6: 23 + 0, improved to 20 + 0]

Slide 9

Motivation

Slide 10

From A∗ to AO∗

The equivalent of A∗ in (acyclic) probabilistic planning is AO∗. Even though we know A∗ and the foundations of probabilistic planning, the generalization is far from straightforward:

• In A∗, g(n) is the cost from the root n0 to n.
• The equivalent in AO∗ is the expected cost from n0 to n.

Slide 11

Expected Cost to Reach State

Consider the following expansion of state s0:

[Figure: state s0 with actions a0 and a1 (each of cost 1); a0 reaches s1 (h = 100) with probability .99 and s2 (h = 1) with probability .01; a1 reaches s3 (h = 2) and s4 (h = 2) with probability .5 each]

The expected cost to reach any of the leaves is infinite or undefined (no leaf is reached with probability 1).

Slide 12

From A∗ to AO∗

The equivalent of A∗ in (acyclic) probabilistic planning is AO∗. Even though we know A∗ and the foundations of probabilistic planning, the generalization is far from straightforward:

• In A∗, g(n) is the cost from the root n0 to n.
• The equivalent in AO∗ is the expected cost from n0 to n.
• An alternative could be the expected cost from n0 to n given that n is reached.

Slide 13

Expected Cost to Reach State Given It Is Reached

Consider the following expansion of state s0:

[Figure: state s0 with actions a0 and a1 (each of cost 1); a0 reaches s1 (h = 100) with probability .99 and s2 (h = 1) with probability .01; a1 reaches s3 (h = 2) and s4 (h = 2) with probability .5 each]

The conditional probability is misleading: s2 would be expanded, even though it is not part of the best-looking option.

Slide 14

The Best Looking Action

Consider the following expansion of state s0:

[Figure: state s0 with actions a0 and a1 (each of cost 1); a0 reaches s1 (h = 100) with probability .99 and s2 (h = 1) with probability .01; a1 reaches s3 (h = 2) and s4 (h = 2) with probability .5 each]

The conditional probability is misleading: s2 would be expanded, even though it is not part of the best-looking option: with state-value estimate V̂(s) := h(s), the greedy action is aV̂(s0) = a1.
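The greedy choice of a1 can be checked numerically. The sketch below assumes, as read off the figure, that both actions cost 1 and that the leaves are valued by V̂(s) := h(s); the encoding is illustrative:

```python
# Hypothetical encoding of the example: action -> (cost, [(prob, h_of_outcome), ...])
actions = {
    "a0": (1, [(0.99, 100), (0.01, 1)]),  # outcomes s1 (h = 100) and s2 (h = 1)
    "a1": (1, [(0.5, 2), (0.5, 2)]),      # outcomes s3 (h = 2) and s4 (h = 2)
}

def q_value(cost, outcomes):
    # Q(s0, a) = c(a) + sum over outcomes s' of P(s') * h(s')
    return cost + sum(p * h for p, h in outcomes)

greedy = min(actions, key=lambda a: q_value(*actions[a]))
# a0 looks terrible overall (Q = 1 + 0.99*100 + 0.01*1 = 100.01) even though
# its outcome s2 (h = 1) is the single best-looking leaf; a1 has Q = 3.
```

This is exactly why the conditional-probability criterion is misleading: it would pick the leaf s2, which only the bad-looking action a0 can reach.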

Slide 15

Expansion in Best Solution Graph

AO∗ uses a different idea:

• AO∗ keeps track of the best solution graph.
• AO∗ expands a state that can be reached from s0 by applying only greedy actions.

⇒ no g-value equivalent required

Slide 16

Expansion in Best Solution Graph

AO∗ uses a different idea:

• AO∗ keeps track of the best solution graph.
• AO∗ expands a state that can be reached from s0 by applying only greedy actions.

⇒ no g-value equivalent required

An equivalent version of A∗ built on this idea can be derived ⇒ A∗ with backward induction. Since the change is non-trivial, we focus on this A∗ variant now and generalize later to acyclic probabilistic tasks (AO∗) and to probabilistic tasks in general (LAO∗).

Slide 17

A∗ with Backward Induction

Slide 18

Transition Systems

A∗ with backward induction distinguishes three transition systems:

• The transition system T = ⟨S, L, c, T, s0, S⋆⟩ ⇒ given implicitly
• The explicated graph T̂t = ⟨Ŝt, L, c, T̂t, s0, S⋆⟩ ⇒ the part of T explicitly considered during search
• The partial solution graph T̂⋆t = ⟨Ŝ⋆t, L, c, T̂⋆t, s0, S⋆⟩ ⇒ the part of T̂t that contains the best solution

[Figure: the three transition systems nested around s0: T ⊇ T̂t ⊇ T̂⋆t]

Slide 19

Explicated Graph

Expanding a state s at time step t explicates all successors s′ ∈ succ(s) by adding them to the explicated graph:

T̂t = ⟨Ŝt−1 ∪ succ(s), L, c, T̂t−1 ∪ {⟨s, l, s′⟩ ∈ T}, s0, S⋆⟩

Each explicated state is annotated with a state-value estimate V̂t(s) that describes the estimated cost to a goal at time step t. When a state s′ is explicated and s′ ∉ Ŝt−1, its state-value estimate is initialized to V̂t(s′) := h(s′). We call the leaf states of T̂t fringe states.

Slide 20

Partial Solution Graph

The partial solution graph T̂⋆t is the subgraph of T̂t that is spanned by the smallest set of states Ŝ⋆t that satisfies:

• s0 ∈ Ŝ⋆t
• if s ∈ Ŝ⋆t, s′ ∈ Ŝt and ⟨s, aV̂t(s), s′⟩ ∈ T̂t, then s′ ∈ Ŝ⋆t

The partial solution graph forms a sequence of states s0, …, sn, starting with the initial state s0 and ending in the greedy fringe state sn.

Slide 21

Backward Induction

A∗ with backward induction does not maintain a static open list:

• State-value estimates determine the partial solution graph.
• The partial solution graph determines which state is expanded.
• (Some) state-value estimates are updated in time step t by backward induction:

V̂t(s) = min over ⟨s, l, s′⟩ ∈ T̂t of c(l) + V̂t(s′)
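The update can be written down directly. A minimal deterministic sketch over an explicit transition list (the encoding is illustrative); the numbers reproduce the example graph's first backup, where expanding s0 turns V̂(s0) = 18 into min(8 + 12, 5 + 14) = 19:

```python
def backward_induction_update(V, transitions, s):
    """Deterministic backup: V[s] = min over (cost, s2) in transitions[s]
    of cost + V[s2]."""
    V[s] = min(cost + V[s2] for cost, s2 in transitions[s])

# First backup of the lecture's example: s0 reaches s1 (cost 8, V = 12)
# and s2 (cost 5, V = 14); the initial h(s0) = 18 is replaced by 19.
V = {"s0": 18, "s1": 12, "s2": 14}
transitions = {"s0": [(8, "s1"), (5, "s2")]}
backward_induction_update(V, transitions, "s0")
```

In the probabilistic generalization the same backup becomes a Bellman update, with the successor value replaced by an expectation over outcomes.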

Slide 22

A∗ with backward induction

A∗ with backward induction for classical planning task T:

    explicate s0
    while the greedy fringe state s ∉ S⋆:
        expand s
        perform backward induction of the states in T̂⋆t−1 in reverse order
    return T̂⋆t
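This loop can be fleshed out as a compact, runnable sketch for the deterministic case, where the partial solution graph is simply the greedy path from s0. It assumes every non-goal state has at least one successor; the helper names and the toy task are illustrative, not from the slides:

```python
def astar_backward_induction(succ, h, s0, is_goal):
    """Sketch of A* with backward induction for deterministic tasks.
    succ(s) -> list of (cost, s2) pairs; returns the greedy solution path."""
    V = {s0: h(s0)}                # value estimates of explicated states
    expanded = set()
    while True:
        # Trace greedy actions from s0 down to the greedy fringe state.
        path, s = [s0], s0
        while s in expanded:
            s = min(succ(s), key=lambda e: e[0] + V[e[1]])[1]
            path.append(s)
        if is_goal(s):
            return path            # greedy fringe state is a goal: done
        expanded.add(s)            # expand the greedy fringe state
        for cost, t in succ(s):
            V.setdefault(t, h(t))  # explicate successors, initialized with h
        for p in reversed(path):   # backward induction in reverse order
            V[p] = min(cost + V[t] for cost, t in succ(p))

# Toy task (not the slides' graph): two routes to a goal state "g".
graph = {"s0": [(8, "s1"), (5, "s2")], "s1": [(4, "g")], "s2": [(10, "g")], "g": []}
hvals = {"s0": 12, "s1": 4, "s2": 4, "g": 0}
plan = astar_backward_induction(lambda s: graph[s], lambda s: hvals[s],
                                "s0", lambda s: s == "g")
```

Note that there is no open list: which state gets expanded next is determined entirely by the value estimates, via the greedy trace from s0.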

Slide 23

A∗ with backward induction

[Figure: the example graph (states s0–s6, h-values 18, 12, 14, 12, 6, 4, 0); the explicated graph after explicating s0 contains only s0, with V̂(s0) = 18]

Slide 24

A∗ with backward induction

[Figure: after expanding s0, the successors s1 (V̂ = 12, cost 8) and s2 (V̂ = 14, cost 5) are explicated; backward induction updates V̂(s0) = min(8 + 12, 5 + 14) = 19]

Slide 25

A∗ with backward induction

[Figure: after expanding the greedy fringe state s2, its successor s5 (V̂ = 4, cost 10) is explicated; V̂(s2) = 10 + 4 = 14 and V̂(s0) = 19 remain unchanged]

Slide 26

A∗ with backward induction

[Figure: after expanding s5, the goal state s6 (cost 8) is explicated; backward induction updates V̂(s5) = 8, V̂(s2) = 18 and V̂(s0) = min(8 + 12, 5 + 18) = 20, so the greedy action at s0 now leads to s1]

Slide 27

A∗ with backward induction

[Figure: after expanding s1, its successors s3 (V̂ = 12) and s4 (V̂ = 6) are explicated; V̂(s1) = 12 and V̂(s0) = 20 remain unchanged]

Slide 28

A∗ with backward induction

[Figure: the greedy fringe state is now the goal state s6, so the search terminates with a solution of cost V̂(s0) = 20]

Slide 29

Equivalence of A∗ and A∗ with Backward Induction

Theorem: A∗ and A∗ with backward induction expand the same set of states if run with an identical admissible heuristic h and an identical tie-breaking criterion.

Proof sketch: The proof shows that there is always a unique state s in the greedy fringe of A∗ with backward induction such that f(s) = g(s) + h(s) is minimal among all fringe states:

• g(s) of the fringe node is encoded in the greedy action choices
• h(s) of the fringe node is equal to V̂t(s)

Slide 30

Summary

Slide 31

Summary

• It is non-trivial to generalize A∗ to probabilistic planning.
• For a better understanding of AO∗, we changed A∗ towards AO∗.
• We derived A∗ with backward induction, which is similar to AO∗ and expands the same states as A∗.