An Analytical Approach to the BFS vs. DFS Algorithm Selection Problem

SLIDE 1

An Analytical Approach to the BFS vs. DFS Algorithm Selection Problem¹

Tom Everitt Marcus Hutter

Australian National University

September 3, 2015

Everitt, T. and Hutter, M. (2015a). Analytical Results on the BFS vs. DFS Algorithm Selection Problem. Part I: Tree Search. In 28th Australian Joint Conference on Artificial Intelligence.

Everitt, T. and Hutter, M. (2015b). Analytical Results on the BFS vs. DFS Algorithm Selection Problem. Part II: Graph Search. In 28th Australian Joint Conference on Artificial Intelligence.

¹ BFS = Breadth-first search, DFS = Depth-first search

Tom Everitt, Marcus Hutter (ANU) BFS vs. DFS September 3, 2015 1 / 21

SLIDE 2

Outline

1. Motivation and Background
2. Simple model
   • Expected Runtimes
   • Decision Boundary
3. More General Models
4. Experimental Results
5. Conclusions

SLIDE 3

Motivation

• (Graph) search is a fundamental AI problem: planning, learning, problem solving.
• Hundreds of algorithms have been developed, including metaheuristics such as simulated annealing and genetic algorithms.
• These are often heuristically motivated and lack a solid theoretical footing.
• For a theoretical approach, return to basics: BFS and DFS.
• So far, mainly worst-case results have been available (we focus on average/expected runtime).

SLIDE 4

Breadth-first Search (BFS)

• Korf et al. (2001) found a clever way to analyse IDA*, which is essentially a generalisation of BFS.
• Their analysis was later generalised by Zahavi et al. (2010).
• Both are essentially worst-case results.

SLIDE 5

Depth-first Search (DFS)

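Knuth (1975) developed a way to estimate search tree size and DFS worst-case performance:
• Follow a single branch from the start node s0 down to a leaf, recording the number of children at each node.
• Assume the same number of children in other branches.
• Example: branching factors 2, 3, 3, 2 along the sampled branch give an estimate of ≈ 2 · 3 · 3 · 2 = 36 leaves.

Refinements and applications:
• Purdom (1978): use several branches instead of one.
• Chen (1992): use stratified sampling.
• Kilby et al. (2006): the estimates can be used to select the best SAT algorithm.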
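The single-branch estimate described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from Knuth (1975): the function names, the `children` callback and the toy tree are all hypothetical, and the probe is assumed to pick each child uniformly at random.

```python
import random

def knuth_leaf_estimate(children, root, rng=random):
    """Estimate the number of leaves from one random root-to-leaf probe."""
    estimate = 1
    node = root
    while True:
        kids = children(node)
        if not kids:               # reached a leaf: stop probing
            return estimate
        estimate *= len(kids)      # assume sibling branches look the same
        node = rng.choice(kids)    # follow one child uniformly at random

def average_estimate(children, root, probes=1000, rng=random):
    """Average several independent probes to reduce the variance."""
    total = sum(knuth_leaf_estimate(children, root, rng) for _ in range(probes))
    return total / probes

# Hypothetical toy tree (adjacency dict) with 4 leaves: b, c, d, e.
TREE = {"s0": ["a", "b"], "a": ["c", "d", "e"],
        "b": [], "c": [], "d": [], "e": []}

def toy_children(node):
    return TREE[node]
```

A probe through `a` returns 2 · 3 = 6 and a probe through `b` returns 2. Each happens with probability 1/2, so the average of many probes tends to the true leaf count of 4.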

SLIDE 6

Potential gains

We focus on the average or expected runtime of BFS and DFS rather than the worst case. Selling points:
• Good to have an idea of how long a search might take
• Useful for algorithm selection (Rice, 1975)
• May be used for constructing meta-heuristics
• Precise understanding of the basics is often useful

SLIDE 7

BFS and DFS

BFS and DFS are opposites.
• BFS focuses near the start node.
• DFS focuses far from the start node.

[Figure: two complete binary trees with 15 nodes each, numbered by visit order. Under BFS the numbers run level by level; under DFS they run branch by branch.]
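The contrast can be made concrete with a short Python sketch that lists the visit order of both algorithms on a complete binary tree. The heap-style node numbering (root 1, children 2i and 2i + 1) and the function names are assumptions made for illustration only.

```python
from collections import deque

def children(node, depth):
    """Children of a heap-indexed node in a complete binary tree of the given depth."""
    last_internal = 2 ** depth - 1
    return [2 * node, 2 * node + 1] if node <= last_internal else []

def bfs_order(depth):
    order, queue = [], deque([1])
    while queue:
        node = queue.popleft()          # FIFO: oldest discovered node first
        order.append(node)
        queue.extend(children(node, depth))
    return order

def dfs_order(depth):
    order, stack = [], [1]
    while stack:
        node = stack.pop()              # LIFO: newest discovered node first
        order.append(node)
        stack.extend(reversed(children(node, depth)))  # left child is popped first
    return order
```

For depth 3, `bfs_order(3)` visits the nodes level by level (1, 2, 3, ..., 15), while `dfs_order(3)` dives to a leaf before backtracking (1, 2, 4, 8, 9, 5, ...).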

SLIDE 8

Formal setups

We analyse BFS and DFS expected runtime in a sequence of increasingly general models.

1. Tree with a single level of goals
2. Tree with multiple levels of goals
3. General graph

Increasingly coarse approximations are required

SLIDE 12

Simplest model - Tree with Single Goal Level

Our simplest model assumes a complete tree with:
• A max search depth D ∈ N,
• A goal level g ∈ {0, . . . , D},
• Nodes on level g are goals with goal probability p ∈ [0, 1] (iid).

Example (figure): D = 3, g = 2, p = 1/3.

SLIDE 13

BFS Runtime

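[Figure: complete binary tree with nodes numbered in BFS visit order.]

Expected BFS search time is $E[t_{BFS}] = 2^g - 1 + 1/p$.

Proof. BFS first visits the $2^g - 1$ nodes above level g. The position Y of the first goal on level g is geometrically distributed with $E[Y] = 1/p$.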
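As a sanity check, the formula can be compared against a small simulation. The sketch below assumes a complete binary tree (as the $2^g$ term suggests) and that the whole tree is searched when level g contains no goal. The function names are hypothetical.

```python
import random

def expected_bfs_time(g, p):
    """Slide formula: 2^g - 1 nodes above level g, plus 1/p expected probes on level g."""
    return 2 ** g - 1 + 1 / p

def bfs_time(goal_flags, g, D):
    """BFS runtime on one tree; goal_flags[i] marks the i-th node of level g."""
    assert len(goal_flags) == 2 ** g
    for position, is_goal in enumerate(goal_flags, start=1):
        if is_goal:
            return 2 ** g - 1 + position
    return 2 ** (D + 1) - 1   # assumption: no goal means the whole tree is searched

def simulate_bfs(g, D, p, trials=2000, rng=random):
    """Monte Carlo average of bfs_time over randomly sampled goal levels."""
    total = 0
    for _ in range(trials):
        flags = [rng.random() < p for _ in range(2 ** g)]
        total += bfs_time(flags, g, D)
    return total / trials
```

With the slide's example values g = 2 and p = 1/3, `expected_bfs_time` gives 4 − 1 + 3 = 6. For larger g, where a goal almost surely exists on level g, `simulate_bfs` should land close to the formula.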

SLIDE 14

DFS Runtime

[Figure: complete binary tree with nodes numbered in DFS visit order.]

Expected DFS search time is

$E[t_{DFS}] \approx \underbrace{(1/p - 1)}_{\text{number of subtrees}} \cdot \underbrace{2^{D-g+1}}_{\text{size of subtrees}}$

Proof. On average there are (1/p − 1) red mini-trees of size ≈ $2^{D-g+1}$. It turns out that the blue nodes do not substantially affect the count in most cases.
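The approximation can be compared with the exact DFS runtime on a concrete tree. The sketch below assumes a heap-indexed complete binary tree, a DFS that expands the left child first, and a runtime that counts every visited node including the goal. All names are hypothetical.

```python
def approx_dfs_time(g, D, p):
    """Slide approximation: (1/p - 1) failed subtrees of size about 2^(D-g+1)."""
    return (1 / p - 1) * 2 ** (D - g + 1)

def dfs_time(goal_flags, g, D):
    """Exact DFS runtime on one tree; goal_flags[i] marks the i-th node of level g."""
    first_on_level = 2 ** g            # heap index of the leftmost node on level g
    visited, stack = 0, [1]
    while stack:
        node = stack.pop()
        visited += 1
        level = node.bit_length() - 1  # root (index 1) is on level 0
        if level == g and goal_flags[node - first_on_level]:
            return visited
        if level < D:
            stack.append(2 * node + 1) # right child, popped later
            stack.append(2 * node)     # left child, popped first
    return visited                     # no goal: the whole tree was visited
```

For D = 3 and g = 2, a goal at the second node of level g costs 6 visits (1, 2, 4, 8, 9, 5): one fully explored non-goal subtree of 3 nodes plus the nodes on the path to the goal.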

SLIDE 15

[Figure: expected search time (log scale, 10^2 to 10^4) against goal level g, with one curve for BFS and one for DFS.]

Expected BFS and DFS search time as a function of goal depth in a tree of depth D = 15 and goal probability p = 0.07.

The initially high expectation of BFS arises because, for small g, it is likely that no goal exists, so the whole tree is searched (an artefact of the model).
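The artefact can be reproduced by computing the BFS expectation without dropping the no-goal case. This is a hedged sketch derived from the model description, not taken from the slides or papers: it assumes a complete binary tree and a cost of $2^{D+1} - 1$ (the whole tree) when level g holds no goal.

```python
def exact_expected_bfs_time(g, D, p):
    """Expected BFS time when the no-goal case (whole tree searched) is included."""
    n = 2 ** g                      # number of nodes on the goal level
    above = n - 1                   # nodes visited before reaching level g
    whole_tree = 2 ** (D + 1) - 1
    expectation = 0.0
    none_so_far = 1.0               # probability that earlier positions held no goal
    for position in range(1, n + 1):
        expectation += none_so_far * p * (above + position)
        none_so_far *= 1 - p
    return expectation + none_so_far * whole_tree
```

With D = 15 and p = 0.07, the value at g = 1 is far larger than at g = 6, because two nodes rarely contain a goal. At g = 10 the result is numerically indistinguishable from $2^g - 1 + 1/p$.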

SLIDE 16

BFS vs. DFS

Combining the runtime estimates yields an elegant decision boundary for when BFS is better:

$\underbrace{E[t_{BFS}] - E[t_{DFS}] < 0}_{\text{BFS better}} \iff g < D/2 + \gamma$

where $\gamma = \log_2\left(\frac{1-p}{p}\right)/2$ is inversely related to p (γ is small when p is not very close to 0 or 1).

Observations:
• BFS is better when the goal is near the start node (as expected).
• DFS benefits when p is large.
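The boundary is easy to evaluate directly. The sketch below simply encodes the inequality as stated on the slide. The function names are hypothetical, and p is assumed to lie strictly between 0 and 1 so that the logarithm is defined.

```python
import math

def gamma(p):
    """Offset of the decision boundary; positive for p < 1/2, negative for p > 1/2."""
    return math.log2((1 - p) / p) / 2

def bfs_preferred(g, D, p):
    """True when the slide's boundary g < D/2 + gamma predicts that BFS is faster."""
    return g < D / 2 + gamma(p)
```

For D = 15 and p = 0.07 the boundary sits at roughly g ≈ 9.4, so `bfs_preferred(9, 15, 0.07)` is true and `bfs_preferred(10, 15, 0.07)` is false. Increasing p lowers γ and shrinks the range of goal levels where BFS is preferred.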

SLIDE 18

BFS vs. DFS

[Figure: decision boundary E[t_BFS] = E[t_DFS] plotted over tree depth D and goal level g; points are marked "DFS wins", "BFS wins" or "BFS=DFS".]

Plot of the BFS vs. DFS decision boundary with goal level g and goal probability p = 0.07. The decision boundary gets 79% of the winners correct.

Time to generalise.

SLIDE 21

Tree with Multiple Goal Levels

As before, assume a complete tree with:
• A maximum search depth D.
• Instead of goal level g and goal probability p: use a goal probability vector p = [p0, . . . , pD]. Nodes on level k are goals with iid probability pk.

Example (figure): D = 3, p = [0, 1/3, 1/3, 1/3].

This is arguably much more realistic :) Finding ways to estimate the goal probabilities is an important future question.

Both the BFS and the DFS analysis can be carried back to the single goal level case with some hacks:
• The BFS analysis is fairly straightforward.
• The DFS analysis requires approximating the geometric distribution with an exponential distribution.
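The multi-level model is straightforward to simulate, which gives a way to check analytical predictions empirically. The following sketch is an assumption-laden illustration rather than the authors' experimental code: it uses a heap-indexed complete binary tree, counts visited nodes including the goal, and all names are hypothetical.

```python
import random
from collections import deque

def sample_goals(D, probs, rng=random):
    """Mark each node on level k as a goal with probability probs[k]."""
    return {node: rng.random() < probs[node.bit_length() - 1]
            for node in range(1, 2 ** (D + 1))}

def search_time(D, goals, depth_first):
    """Nodes visited (including the goal) by DFS or BFS; whole tree if no goal exists."""
    frontier = deque([1])
    visited = 0
    while frontier:
        node = frontier.pop() if depth_first else frontier.popleft()
        visited += 1
        if goals[node]:
            return visited
        if node.bit_length() - 1 < D:          # node is not on the deepest level
            kids = [2 * node, 2 * node + 1]
            frontier.extend(reversed(kids) if depth_first else kids)
    return visited
```

Calling `sample_goals(3, [0, 1/3, 1/3, 1/3])` draws one tree from the slide's example. Running `search_time` on it with `depth_first=False` and `depth_first=True` shows which algorithm wins on that instance, and averaging over many samples approximates the expected runtimes.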

SLIDE 22

Decision Boundary

[Figure: decision boundary $t^{BFS}_{MGL} = \tilde{t}^{DFS}_{MGL}$ plotted over σ² (log scale, 10^−2 to 10^2) and µ (6 to 14), with regions marked "DFS wins" and "BFS wins".]

The goal probabilities are highest at a peak level µ and decay around it at a rate that depends on σ². Some takeaways:
• BFS still likes goals close to the root.
• BFS likes a larger spread more than DFS does (it increases the probability of a really easy goal).

SLIDE 23

General graphs

[Figure: an example graph shown twice, with nodes numbered in BFS visit order and in DFS visit order.]

• We capture the various topological properties of graphs in a collection of parameters called the descendant counter.
• As before, we get approximate expressions for BFS and DFS expected runtime given a goal probability vector.
• We analytically derive the descendant counter for two concrete grammar problems (it could potentially be inferred empirically in other cases).

SLIDE 24

One observation is that DFS can spend an even greater fraction of the initial search time far away from the root.

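[Figure: two panels plotting search time (log scale, 10^2 to 10^6) against goal level g (5 to 20). Panel "Complete Binary Tree": curves $t^{BFS}_{SGL}$ and $\tilde{t}^{DFS}_{SGL}$. Panel "Binary Grammar": curves $t^{BFS}_{BG}$, $\tilde{t}^{DFS}_{BG}$, $t^{DFS}_{BGL}$ and $t^{DFS}_{BGU}$.]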

So BFS will be better for a wider range of goal levels in graph search than in tree search.

SLIDE 25

Experimental results

We randomly generate graphs according to a wide range of parameter settings and compare the predicted runtimes with the observed ones.
• BFS: the predictions are always accurate.
• DFS in trees: usually within 10% error; in some corner cases up to 50% error.
• DFS in the binary grammar problem (a non-tree graph): mostly within 20% error; 35% at worst.

More detailed results are in the papers.

SLIDE 26

Conclusions

With our model of goal distribution, we can predict the expected search time of BFS and DFS (instead of only the worst case), given goal probabilities for all distances. Further work is needed to infer the parameters automatically.

This theoretical understanding can hopefully be useful when:
• Choosing a search method
• Constructing meta-heuristics
• Analysing the performance of more complex search algorithms (for example, A* is a generalisation of BFS, and Beam Search is a generalisation of DFS)
• Choosing a graph representation of the search problem

SLIDE 27

References

Chen, P. C. (1992). Heuristic Sampling: A Method for Predicting the Performance of Tree Searching Programs. SIAM Journal on Computing, 21(2):295–315.

Everitt, T. and Hutter, M. (2015a). Analytical Results on the BFS vs. DFS Algorithm Selection Problem. Part I: Tree Search. In 28th Australian Joint Conference on Artificial Intelligence.

Everitt, T. and Hutter, M. (2015b). Analytical Results on the BFS vs. DFS Algorithm Selection Problem. Part II: Graph Search. In 28th Australian Joint Conference on Artificial Intelligence.

Kilby, P., Slaney, J., Thiébaux, S., and Walsh, T. (2006). Estimating Search Tree Size. In Proc. of the 21st National Conf. of Artificial Intelligence, AAAI, Menlo Park.

Knuth, D. E. (1975). Estimating the efficiency of backtrack programs. Mathematics of Computation, 29(129):122–122.

Korf, R. E., Reid, M., and Edelkamp, S. (2001). Time complexity of iterative-deepening-A*. Artificial Intelligence, 129(1-2):199–218.

Purdom, P. W. (1978). Tree Size by Partial Backtracking. SIAM Journal on Computing, 7(4):481–491.

Rice, J. R. (1975). The algorithm selection problem. Advances in Computers, 15:65–117.

Zahavi, U., Felner, A., Burch, N., and Holte, R. C. (2010). Predicting the performance of IDA* using conditional distributions. Journal of Artificial Intelligence Research, 37:41–83.
