
Planning and Optimization G1. Heuristic Search: AO∗ & LAO∗ Part I



1. Planning and Optimization, G1. Heuristic Search: AO∗ & LAO∗ Part I. Gabriele Röger and Thomas Keller, Universität Basel, December 3, 2018.

2. Content of this Course. [Course overview diagram: classical planning covers tasks, progression/regression, complexity, and heuristics; probabilistic planning covers MDPs, blind methods, heuristic search, and Monte-Carlo methods.]

3. Heuristic Search.

4. Heuristic Search: Recap. Heuristic search algorithms use heuristic functions to (partially or fully) determine the order of node expansion. (From Lecture 15 of the AI course last semester.)

5. Best-first Search: Recap. A best-first search is a heuristic search algorithm that evaluates search nodes with an evaluation function f and always expands a node n with minimal f(n) value. (From Lecture 15 of the AI course last semester.)

6. A∗ Search: Recap. A∗ is the best-first search algorithm with evaluation function f(n) = g(n) + h(n.state). (From Lecture 15 of the AI course last semester.)
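To make the recap concrete, here is a minimal Python sketch of A∗ as a best-first search with evaluation function f(n) = g(n) + h(n.state); it assumes the task is given as an explicit successor dictionary, and the names succ, h, s0 and goals are illustrative rather than taken from the lecture.

    import heapq

    def astar(succ, h, s0, goals):
        # succ[s]: list of (cost, s') pairs; h[s]: heuristic estimate; goals: set of goal states
        open_list = [(h[s0], s0)]            # priority queue ordered by f = g + h
        g = {s0: 0}                          # cheapest known cost from s0 to each state
        while open_list:
            f, s = heapq.heappop(open_list)
            if f > g[s] + h[s]:              # stale entry: s was re-inserted with a cheaper g
                continue
            if s in goals:
                return g[s]                  # cost of the cheapest plan found
            for cost, t in succ.get(s, []):
                new_g = g[s] + cost
                if t not in g or new_g < g[t]:
                    g[t] = new_g             # (re)open t with its improved g-value
                    heapq.heappush(open_list, (new_g + h[t], t))
        return None                          # no goal state is reachable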

7. A∗ Search (With Reopening): Example. [Search graph over states s0 to s6 with edge costs and heuristic values at the states, e.g. h(s0) = 18 and h(s6) = 0 for the goal state s6.]

8. A∗ Search (With Reopening): Example. [The same graph annotated with f = g + h values as A∗ proceeds, e.g. s0: 0 + 18, s1: 8 + 12, s2: 5 + 14; the goal s6 receives both 23 + 0 and 20 + 0, illustrating reopening when a cheaper path is found.]

9. Motivation.

10. From A∗ to AO∗. The equivalent of A∗ in (acyclic) probabilistic planning is AO∗. Even though we know A∗ and the foundations of probabilistic planning, the generalization is far from straightforward: e.g., in A∗, g(n) is the cost from the root n0 to n, and the equivalent in AO∗ would be the expected cost from n0 to n.

11. Expected Cost to Reach a State. Consider the following expansion of state s0: action a0 (cost 1) leads to s1 with probability 0.99 and to s2 with probability 0.01, action a1 (cost 1) leads to s3 and s4 with probability 0.5 each, and the heuristic values are h(s1) = 100, h(s2) = 1, h(s3) = 2, h(s4) = 2. The expected cost to reach any particular leaf is infinite or undefined, since none of the leaves is reached with probability 1.

12. From A∗ to AO∗ (continued). In A∗, g(n) is the cost from the root n0 to n; the equivalent in AO∗ would be the expected cost from n0 to n, and an alternative could be the expected cost from n0 to n given that n is reached.

13. Expected Cost to Reach a State Given It Is Reached. In the same expansion of s0, the conditional probability is misleading: s2 would be expanded, even though it is not part of the best-looking option.

14. The Best Looking Action. In the same expansion of s0, the conditional criterion is misleading: s2 would be expanded, even though it is not part of the best-looking option, since with the state-value estimate V̂(s) := h(s) the greedy action in s0 is a_{V̂}(s0) = a1.
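A quick check with the numbers from the example and V̂(s) := h(s): action a0 has estimated cost 1 + 0.99 · 100 + 0.01 · 1 = 100.01, while a1 has 1 + 0.5 · 2 + 0.5 · 2 = 3, so a1 is the greedy action; under the "cost given the state is reached" criterion, however, s2 would look best (cost 1 to reach it plus h(s2) = 1), even though s2 is only reachable via the non-greedy action a0.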

15. Expansion in Best Solution Graph. AO∗ uses a different idea: it keeps track of the best solution graph and expands a state that can be reached from s0 by only applying greedy actions ⇒ no g-value equivalent is required.

16. Expansion in Best Solution Graph (continued). An equivalent version of A∗ built on this idea can be derived ⇒ A∗ with backward induction. Since the change is non-trivial, we focus on this A∗ variant now and generalize later to acyclic probabilistic tasks (AO∗) and to probabilistic tasks in general (LAO∗).

17. A∗ with Backward Induction.

18. Transition Systems. A∗ with backward induction distinguishes three transition systems:
    The transition system T = ⟨S, L, c, T, s0, S⋆⟩ ⇒ given implicitly.
    The explicated graph T̂_t = ⟨Ŝ_t, L, c, T̂_t, s0, S⋆⟩ ⇒ the part of T explicitly considered during search.
    The partial solution graph T̂⋆_t = ⟨Ŝ⋆_t, L, c, T̂⋆_t, s0, S⋆⟩ ⇒ the part of T̂_t that contains the best solution.
[Diagram: the three systems are nested, with T̂⋆_t inside T̂_t inside T, all containing s0.]

19. Explicated Graph. Expanding a state s at time step t explicates all successors s′ ∈ succ(s) by adding them to the explicated graph:
    T̂_t = ⟨Ŝ_{t−1} ∪ succ(s), L, c, T̂_{t−1} ∪ {⟨s, l, s′⟩ ∈ T}, s0, S⋆⟩.
Each explicated state is annotated with a state-value estimate V̂_t(s) that describes the estimated cost to a goal at time step t. When a state s′ is explicated and s′ ∉ Ŝ_{t−1}, its state-value estimate is initialized to V̂_t(s′) := h(s′). We call the leaf states of T̂_t fringe states.
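As a minimal sketch of this bookkeeping (the dictionary names are illustrative, not the lecture's notation), expanding a state adds its successors to the explicated graph and initializes the estimates of newly explicated states with the heuristic:

    def expand(s, succ, h, explicated, value_est):
        # explicated[s]: outgoing (cost, s') transitions of every already expanded state
        # value_est[s']: state-value estimate, set to h(s') when s' is explicated for the first time
        explicated[s] = list(succ(s))
        for cost, t in explicated[s]:
            value_est.setdefault(t, h(t))
        # fringe states are the explicated states that do not (yet) appear as keys of 'explicated'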

20. Partial Solution Graph. The partial solution graph T̂⋆_t is the subgraph of T̂_t that is spanned by the smallest set of states Ŝ⋆_t satisfying:
    s0 ∈ Ŝ⋆_t, and
    if s ∈ Ŝ⋆_t and ⟨s, a_{V̂_t}(s), s′⟩ ∈ T̂_t, then s′ ∈ Ŝ⋆_t.
The partial solution graph forms a sequence of states ⟨s0, . . . , sn⟩, starting with the initial state s0 and ending in the greedy fringe state sn.
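In the deterministic case this can be sketched as following greedy actions from s0 until an unexpanded state, assuming an acyclic state space and reusing the illustrative dictionaries from the expansion sketch above; the names are hypothetical:

    def partial_solution_path(s0, expanded, value_est):
        # follow the greedy transition (minimizing cost + V̂(s')) until a fringe state is reached
        path = [s0]
        while path[-1] in expanded:
            cost, t = min(expanded[path[-1]], key=lambda tr: tr[0] + value_est[tr[1]])
            path.append(t)
        return path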

21. Backward Induction. A∗ with backward induction does not maintain a static open list. The state-value estimates determine the partial solution graph, and the partial solution graph determines which state is expanded. (Some) state-value estimates are updated in time step t by backward induction:
    V̂_t(s) = min_{⟨s, l, s′⟩ ∈ T̂_t} ( c(l) + V̂_t(s′) ).
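Following the formula directly, a single backward induction update could look like this sketch (again using the illustrative dictionaries introduced above):

    def backward_induction_update(s, explicated, value_est):
        # V̂_t(s) = min over outgoing transitions (cost, s') of cost + V̂_t(s')
        value_est[s] = min(cost + value_est[t] for cost, t in explicated[s])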

22. A∗ with Backward Induction.
    A∗ with backward induction for classical planning task T:
        explicate s0
        while the greedy fringe state s ∉ S⋆:
            expand s
            perform backward induction of the states in T̂⋆_{t−1} in reverse order
        return T̂⋆_t
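Putting the pieces together, here is a runnable Python sketch of A∗ with backward induction for a deterministic task. It assumes an acyclic state space in which every expanded state has at least one successor, that succ(s) returns (cost, s') pairs, and that h is admissible with h(s) = 0 for goal states; explicit time indices are dropped and all identifiers are illustrative.

    def astar_backward_induction(s0, succ, h, goals):
        value_est = {s0: h(s0)}     # state-value estimates, initialized with the heuristic
        expanded = {}               # expanded states mapped to their outgoing (cost, s') transitions

        def greedy_fringe_path():
            # follow greedy actions from s0 until an unexpanded (fringe) state is reached;
            # in the deterministic case the partial solution graph is exactly this path
            path = [s0]
            while path[-1] in expanded:
                s = path[-1]
                cost, t = min(expanded[s], key=lambda tr: tr[0] + value_est[tr[1]])
                path.append(t)
            return path

        while True:
            path = greedy_fringe_path()
            s = path[-1]                        # greedy fringe state
            if s in goals:
                return value_est[s0], path      # estimated plan cost and plan as a state sequence
            # expand s: explicate its successors and initialize new estimates with h
            expanded[s] = list(succ(s))
            for cost, t in expanded[s]:
                value_est.setdefault(t, h(t))
            # backward induction over the partial solution graph in reverse order
            for state in reversed(path):
                if state in expanded:
                    value_est[state] = min(c + value_est[t] for c, t in expanded[state])

    # Tiny hypothetical example (not the graph from the slides):
    succ = {"s0": [(2, "s1"), (5, "s2")], "s1": [(3, "g")], "s2": [(1, "g")], "g": []}
    h = {"s0": 4, "s1": 3, "s2": 1, "g": 0}
    print(astar_backward_induction("s0", succ.__getitem__, h.__getitem__, {"g"}))
    # prints (5, ['s0', 's1', 'g']): the cheapest plan s0 -> s1 -> g with cost 2 + 3 = 5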

23.-28. A∗ with Backward Induction: Example. [Six snapshots of the example graph from slides 7-8: states are expanded along the greedy path and state-value estimates are updated by backward induction; the estimate of s0 rises from its initial heuristic value 18 to 19 and then to 20, until the greedy fringe state is the goal s6 with value 0.]

29. Equivalence of A∗ and A∗ with Backward Induction. Theorem: A∗ and A∗ with backward induction expand the same set of states if run with the same admissible heuristic h and the same tie-breaking criterion. Proof sketch: one shows that there is always a unique state s in the greedy fringe of A∗ with backward induction for which f(s) = g(s) + h(s) is minimal among all fringe states; the g(s) of a fringe state is encoded in the greedy action choices, and its h(s) equals V̂_t(s).

30. Summary.
