Sokoban: Enhancing general single-agent search methods using domain knowledge

SLIDE 1

Sokoban: Enhancing general single-agent search methods using domain knowledge

Andreas Junghanns, Jonathan Schaeffer
Presented by Pascal Düblin

SLIDE 2

What is Sokoban?

  • Example:
  • Computer game
  • Goal:
  • Push stones with the man (smiley) onto goal squares (red shaded squares)
  • Rules:
  • No pull moves, only push moves
  • If a stone can no longer be pushed and is not on a goal square, the game is lost

SLIDE 3

Sokoban's search space

Property                     Specifics     24-Puzzle    Rubik's Cube   Sokoban
Branching factor             Average       2.37         13.35          12
                             Range         1-3          12-15          0-136
Solution length              Average       100+         18             260
                             Range         1-unknown    1-20           97-674
Search-space size            Upper bound   10^35        10^19          10^98
Calculation of lower bound   Full          O(n)         O(n)           O(n^3)
                             Incremental   O(1)         O(1)           O(n^2)
Underlying graph                           Undirected   Undirected     Directed

SLIDE 4

Application-dependent techniques

  • Sokoban solver: Rolling Stone
  • Node limit: 20 million nodes
  • Basis for Rolling Stone: IDA*
  • 3 years of work
  • 90 Sokoban test problems

SLIDE 5

Basic implementation: IDA*

  • IDA* = iterative-deepening A*
  • Similar in approach to iterative-deepening depth-first search
  • The f-value of A* is bounded by a limit
  • This limit increases with each iteration of the depth-first search
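
The iteration scheme above can be sketched as follows. This is a minimal illustration, not Rolling Stone's implementation: the interface (`is_goal`, `successors` yielding `(child, step_cost)` pairs, heuristic `h`) and the toy number-line problem in the usage note are assumptions made for the example.

```python
import math

def ida_star(start, is_goal, successors, h):
    """Return (cost, path) of a solution, or None if no solution exists."""

    def dfs(path, g, limit):
        node = path[-1]
        f = g + h(node)
        if f > limit:
            return f, None              # smallest f-value that exceeded the limit
        if is_goal(node):
            return g, list(path)
        next_limit = math.inf
        for child, step_cost in successors(node):
            if child in path:           # avoid cycles along the current path
                continue
            path.append(child)
            value, found = dfs(path, g + step_cost, limit)
            path.pop()
            if found is not None:
                return value, found
            next_limit = min(next_limit, value)
        return next_limit, None

    limit = h(start)                    # first f-limit: heuristic value of the start
    while limit < math.inf:
        limit, found = dfs([start], 0, limit)
        if found is not None:
            return limit, found         # here `limit` holds the solution cost
    return None
```

Usage on the toy problem (walk from 0 to 5 on a number line): `ida_star(0, lambda n: n == 5, lambda n: [(n + 1, 1), (n - 1, 1)], lambda n: abs(5 - n))` returns the cost 5 and the path 0, 1, 2, 3, 4, 5.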

SLIDE 6

Simple lower bound

  • Example:
  • Heuristic for IDA*
  • Manhattan distance from each stone to its nearest goal square
  • Sum of all distances
  • In example: 5
  • Problems solved: 0
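
A minimal sketch of this heuristic, assuming stones and goals are given as `(row, column)` tuples. The coordinates in the usage note are hypothetical and do not reproduce the example maze on the slide.

```python
def manhattan(a, b):
    """Manhattan distance between two (row, column) squares."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def simple_lower_bound(stones, goals):
    """Sum, over all stones, of the distance to the nearest goal square."""
    return sum(min(manhattan(stone, goal) for goal in goals) for stone in stones)
```

For stones at `(0, 0)` and `(2, 3)` and goals at `(0, 2)` and `(3, 3)`, the nearest-goal distances are 2 and 1, so the bound is 3. Note that two stones may be counted against the same goal square, which is the weakness the minimum matching bound addresses.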

SLIDE 7

Minimum matching lower bound (R0) (1)

  • Example:
  • Improved heuristic for IDA*
  • Each goal square can only be taken by one stone
  • Manhattan distance to nearest and reachable goal squares

[Figure: maze with stones 1, 2, 3 and goal squares A, B, C]

SLIDE 8

Minimum matching lower bound (R0) (2)

  • Example:
  • The algorithm recognizes that the goal square assigned to a stone is not always the nearest one

[Figure: maze with stones 1, 2, 3 and goal squares A, B, C, with distance labels 2, 6, 9, ∞, ∞, ∞, 6, 2, 1]

SLIDE 9

Minimum matching lower bound (R0) (3)

  • Example:
  • The algorithm recognizes that the goal square assigned to a stone is not always the nearest one

[Figure: maze with stones 1, 2, 3 and goal squares A, B, C, with distance labels 2, 6, 9, ∞, ∞, ∞, 6, 2, 1]

Goals:   A   B   C
1        2   6   9
2        6   2   2
3        2   1   2

SLIDE 10

Minimum matching lower bound (R0) (4)

  • Example:
  • Heuristic value: 14
  • Better heuristic value: h is closer to h*
  • The result is an enormous reduction of the search space
  • Problems solved: 0

[Figure: maze with stones 1, 2, 3 and goal squares A, B, C]

Goals:   A   B   C
1        2   6   9
2        6   2   2
3        2   1   2
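
The matching idea can be illustrated with a brute-force sketch that tries every one-to-one assignment of stones to goals. This is an assumption-laden toy: the cost matrix in the usage note is hypothetical (it is not the slide's example), and enumerating permutations is exponential, whereas the O(n^3) entry on slide 3 suggests a polynomial assignment algorithm in the real solver.

```python
import math
from itertools import permutations

def minimum_matching_bound(cost):
    """cost[i][j] is the distance from stone i to goal j (math.inf if unreachable).

    Returns the cheapest total over all assignments in which every goal is
    taken by exactly one stone, or math.inf if no such assignment exists.
    """
    n = len(cost)
    best = math.inf
    for assignment in permutations(range(n)):
        total = sum(cost[i][assignment[i]] for i in range(n))
        best = min(best, total)
    return best
```

With the hypothetical matrix `[[1, 4, inf], [2, inf, inf], [3, 2, 6]]`, the sum of row minima is 5, but the second stone can only reach the first goal, so the other stones must take farther goals and the matching bound is 4 + 2 + 6 = 12. A result of `math.inf` means some stone can never reach a free goal, which is a deadlock.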

SLIDE 11

Transposition table (R1)

  • All visited states are stored in the transposition table
  • Avoids visiting duplicate states/nodes
  • Duplicate elimination before expanding the next node
  • Similar to a closed list
  • Problems solved: 5
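
A minimal sketch of such a table, assuming a state is identified by the set of stone squares plus the man's square. The class and method names are hypothetical, and the sketch is meant to be cleared between IDA* iterations; how Rolling Stone keys and retains entries is not stated on the slide.

```python
def make_key(stones, man):
    """Hashable state key: stone order does not matter, so use a frozenset."""
    return (frozenset(stones), man)

class TranspositionTable:
    def __init__(self):
        self._best_g = {}   # state key -> cheapest cost at which it was reached

    def should_expand(self, key, g):
        """True if the state is new or was reached more cheaply than before."""
        previous = self._best_g.get(key)
        if previous is not None and previous <= g:
            return False    # duplicate: already expanded at equal or lower cost
        self._best_g[key] = g
        return True

    def clear(self):
        """Forget all entries, e.g. when the f-limit increases."""
        self._best_g.clear()
```

The check runs before a node is expanded, so a state reached a second time through a different move order is skipped.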

SLIDE 12

Move ordering (R2)

  • Determines the order in which nodes are expanded
  • Actions (moves) are sorted with the most promising actions first
  • Sorting criteria:
  • 1. Moves that push the same stone as the previous move
  • 2. Moves that minimize the lower bound (optimal moves)
  • 3. The stone nearest to its goal square first
  • 4. Non-optimal moves, sorted by the same criteria as above
  • Problems solved: 4
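
One possible reading of these criteria is a lexicographic sort key, sketched below. The `Move` fields are hypothetical names introduced for the example, and treating a move as "optimal" when it lowers the lower bound is an assumption.

```python
from typing import NamedTuple

class Move(NamedTuple):
    stone: int           # hypothetical identifier of the pushed stone
    bound_change: int    # change of the lower bound caused by the push
    goal_distance: int   # distance of the pushed stone to its goal square

def order_moves(moves, last_stone):
    """Sort moves so that the most promising ones are expanded first."""
    def key(move):
        return (
            move.stone != last_stone,   # 1. same stone as the previous move first
            move.bound_change >= 0,     # 2. moves that lower the bound first
            move.goal_distance,         # 3. stone nearest to its goal first
        )
    # 4. non-optimal moves end up last, ordered by the same remaining criteria
    return sorted(moves, key=key)
```

Because `False` sorts before `True`, a move that continues pushing the previous stone precedes all others, and among the rest the bound-lowering moves come first.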

SLIDE 13

Deadlock table (R3)

  • Example:
  • 4 x 5 region
  • Find all arrangements of stones, wall squares and the man
  • Store all deadlocks
  • During IDA* search: check the state against the deadlock table
  • Problems solved: 5
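
The precompute-then-look-up idea can be sketched on a much smaller scale. As a stated simplification, the sketch below uses a 2 x 2 window instead of the 4 x 5 region, ignores the man's position, and knows only one kind of deadlock: a 2 x 2 block filled with walls and stones in which at least one stone is not on a goal. The level is assumed to be a rectangular list of strings in the common Sokoban text notation (`#` wall, `$` stone, `*` stone on goal, space for floor).

```python
from itertools import product

CELLS = " #$*"   # floor, wall, stone, stone on goal

def build_deadlock_table():
    """Enumerate all 2 x 2 arrangements once and store the deadlocked ones."""
    table = set()
    for pattern in product(CELLS, repeat=4):
        fully_blocked = all(cell != " " for cell in pattern)
        if fully_blocked and "$" in pattern:   # a frozen stone that is off goal
            table.add(pattern)
    return table

def is_deadlocked(grid, table):
    """Slide a 2 x 2 window over the grid and look each window up in the table."""
    for r in range(len(grid) - 1):
        for c in range(len(grid[r]) - 1):
            window = (grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
            if window in table:
                return True
    return False
```

The table is built once, so the check during search costs only one set lookup per window; a stone pushed into a corner, for instance, is recognised immediately.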

SLIDE 14

Tunnel macros (R4)

  • Macros: combine a group of moves
  • All tunnel moves are made at once
  • Problems solved: 6
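
A minimal sketch of a tunnel macro, under several assumptions that are not from the slides: the level is a wall-bordered list of strings with `#` for walls, a tunnel square is one with walls on both sides perpendicular to the push direction, and other stones and the man's position are ignored.

```python
def push_through_tunnel(grid, stone, direction):
    """Push a stone along `direction`, continuing while it sits in a tunnel.

    Returns the final stone square and the number of single pushes that the
    macro combined into one move.
    """
    dr, dc = direction
    side_r, side_c = dc, dr          # vector perpendicular to the push direction

    def is_wall(r, c):
        return grid[r][c] == "#"

    def in_tunnel(r, c):
        return is_wall(r + side_r, c + side_c) and is_wall(r - side_r, c - side_c)

    r, c = stone
    pushes = 0
    while not is_wall(r + dr, c + dc):
        r, c = r + dr, c + dc
        pushes += 1
        if not in_tunnel(r, c):      # left the tunnel: the macro ends here
            break
    return (r, c), pushes
```

In a level whose middle row is open and whose columns 3 and 4 have walls above and below, pushing a stone from column 2 to the right carries it through both tunnel squares and stops on the first square beyond them, as a single move of three pushes.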

SLIDE 21

Goal macros (R5) (1)

  • Example:
  • In Sokoban, goal squares are often grouped in a goal area
  • In this case, the problem can be split into two subproblems:
  • Push a stone onto an entrance square
  • Push the stone onto a goal square

SLIDE 22

Goal macros (R5) (2)

  • Example:
  • For each problem and room, the order in which the man has to push the stones onto their goal squares is precomputed

SLIDE 23

Goal macros (R5) (3)

  • Example:
  • If a stone is on an entrance square, the goal macro is executed
  • Problems solved: 17

SLIDE 24

Goal macros (R5) (4)

  • Example:
  • The moves of a goal macro are grouped into one move
  • All other possible moves are ignored
  • Problems solved: 17

SLIDE 25

Goal cuts (R6) (1)

  • If push "b" starts a goal macro (the stone reaches an entrance square), all other children of the parent of "b" are pruned
  • Problems solved: 24
  • Example:

[Figure: push "b"]

SLIDE 26

Goal cuts (R6) (2)

[Figure: search tree with pushes a, b, c, d and the goal macro]

  • If push "b" starts a goal macro (the stone reaches an entrance square), all other children of the parent of "b" are pruned
  • In the example, the branch with moves "c" and "d" is pruned after the goal macro is reached
  • Problems solved: 24

SLIDE 28

Pattern search (R7) (1)

  • Goal: find deadlocks or increase the lower-bound estimate (see slide: Overestimation)
  • PIDA*: IDA* version for pattern search
  • Example:

SLIDE 32

Pattern search (R7) (2)

  • Test maze:
  • Original maze:
  • => Deadlock occurs

SLIDE 33

Pattern search (R7) (3)

  • Minimum pattern:
  • Final steps:
  • Calculate the minimum deadlock pattern
  • Add the pattern to the pattern table
  • During IDA* search:
  • Check the state against the pattern table
  • Problems solved: 48
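
The table check in the last step can be sketched as a subset test. This assumes, for illustration only, that a deadlock pattern is stored as the set of stone squares that together cause the deadlock; the slides do not specify the representation.

```python
def matches_any_pattern(stones, pattern_table):
    """True if every stone square of some stored pattern is occupied in the state."""
    occupied = frozenset(stones)
    return any(pattern <= occupied for pattern in pattern_table)
```

A state matches as soon as it contains all stones of one stored pattern, regardless of where the remaining stones are, which is why storing the minimum pattern makes each table entry apply to as many states as possible.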

SLIDE 34

Relevance cuts (R8)

  • Goal: find independent subproblems
  • Consider a move only if it is influenced by the last few moves
  • Properties of influence/relevance:
  • 1. Alternatives
  • 2. Goal-skew
  • 3. Connection
  • 4. Tunnel
  • Problems solved: 50

SLIDE 35

Overestimation (R9)

  • A* is optimal if h() is admissible
  • Admissible: for all n, h(n) ≤ C(n), where C(n) is the actual cost to reach the goal from n
  • Non-admissible h => search is often more accurate, but no longer optimal
  • The sum of the maximum penalty per stone is added to h (penalties calculated from pattern search)
  • => optimality is no longer guaranteed
  • Problems solved: 54
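
The penalty rule can be sketched as follows, assuming (hypothetically) that pattern search yields `(pattern, penalty)` pairs in which a pattern is a set of stone squares, and that a pattern applies when all of its squares are occupied.

```python
def overestimated_h(h_value, stones, penalties):
    """Add, for every stone, the largest penalty of any matching pattern it is part of.

    penalties: list of (pattern, penalty) pairs; pattern is a frozenset of squares.
    """
    occupied = frozenset(stones)
    extra = 0
    for stone in occupied:
        matching = [
            penalty
            for pattern, penalty in penalties
            if stone in pattern and pattern <= occupied
        ]
        extra += max(matching, default=0)
    return h_value + extra
```

Because a pattern's penalty is counted once for each stone it contains, the result can exceed the true remaining cost, which is exactly why the bound stops being admissible.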

SLIDE 36

Rapid random restarts (R10)

  • Restarts with more randomization in move ordering (less strictness)
  • A certain number of restarts with the same f-limit
  • If the f-limit increases, the randomization of move ordering falls back to zero
  • Problems solved: 57
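
The restart schedule described above might look like the sketch below. The parameter names, the linear growth of randomness per restart, and the adjacent-swap perturbation are all assumptions made for illustration; the slides state only that randomization grows with restarts and resets when the f-limit increases.

```python
import random

def randomized_order(moves, randomness, rng):
    """Perturb an ordered move list; randomness 0 keeps the order unchanged."""
    moves = list(moves)
    for i in range(len(moves) - 1):
        if rng.random() < randomness:
            moves[i], moves[i + 1] = moves[i + 1], moves[i]
    return moves

def restart_schedule(f_limits, restarts_per_limit, step):
    """Yield (f_limit, randomness) pairs for successive search attempts.

    Randomness grows with every restart at the same f-limit and falls back
    to zero as soon as the f-limit increases.
    """
    for limit in f_limits:
        for attempt in range(restarts_per_limit):
            yield limit, min(1.0, attempt * step)
```

For example, `list(restart_schedule([10, 12], 3, 0.25))` yields randomness 0.0, 0.25, 0.5 at f-limit 10 and then again 0.0, 0.25, 0.5 at f-limit 12.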

SLIDE 38

Comparison table

Enhancement           # solved   In prop. to 90   In prop. to 57   # less solved without this approach   In prop. to 90   In prop. to 57
Minimum matching      0          0%               0%               0-1                                   0-1.1%           0-1.8%
Transposition table   5          5.6%             8.8%             19                                    21.1%            33.3%
Move ordering         4          4.4%             7.0%             0-1                                   0-1.1%           0-1.8%
Deadlock table        5          5.6%             8.8%             0-1                                   0-1.1%           0-1.8%
Tunnel macros         6          6.7%             10.5%            0-1                                   0-1.1%           0-1.8%
Goal macros           17         18.9%            29.8%            33                                    36.7%            57.9%
Goal cuts             24         26.7%            42.1%            0-1                                   0-1.1%           0-1.8%
Pattern search        48         53.3%            84.2%            22                                    24.4%            38.6%
Relevance cuts        50         55.6%            87.7%            0-1                                   0-1.1%           0-1.8%
Overestimation        54         60%              94.7%            0-1                                   0-1.1%           0-1.8%
RR restart            57         63.3%            100%             0-1                                   0-1.1%           0-1.8%
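
The two proportion columns follow directly from the counts: each count is divided by the 90 test problems and by the 57 problems solved with all enhancements. A small sketch of that arithmetic (the function and constant names are introduced here for illustration):

```python
TOTAL_PROBLEMS = 90   # size of the test set
BEST_SOLVED = 57      # problems solved with all enhancements enabled

def proportions(count):
    """Return (percent of 90, percent of 57), rounded to one decimal place."""
    return (
        round(100 * count / TOTAL_PROBLEMS, 1),
        round(100 * count / BEST_SOLVED, 1),
    )
```

For the transposition table, `proportions(5)` gives 5.6 and 8.8 for the problems solved, and `proportions(19)` gives 21.1 and 33.3 for the problems lost without it, matching the table.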

SLIDE 41

Independent enhancement test

[Figure: chart with a scale from 0 to 60 and one entry per enhancement (minimum matching, transposition table, move ordering, deadlock table, tunnel macros, goal macros, goal cuts, pattern search, relevance cuts, overestimation, RRR); series: "# of solved without enhancement" and "# of lost solved problems"]

SLIDE 42

Conclusions

  • In a search space as vast as Sokoban's, domain-dependent approaches can help solve the problems
  • In some cases, the way to find domain-dependent knowledge is to study the problem solutions
  • IDA* is simple to implement and solves problems in a domain-independent way
  • Combining it with domain-dependent knowledge can greatly improve search performance

SLIDE 43

Discussion

  • Questions?