 
Beyond Classical Search
Chapter 4, Sections 4.1-4.2

Outline
♦ Hill-climbing
♦ Simulated annealing
♦ Genetic algorithms (briefly)
♦ Local search in continuous spaces (briefly)

Iterative improvement algorithms
In many optimization problems, the path is irrelevant; the goal state itself is the solution.
Then state space = set of “complete” configurations:
    find the optimal configuration, e.g., TSP,
    or find a configuration satisfying constraints, e.g., a timetable.
In such cases, can use iterative improvement algorithms: keep a single “current” state, try to improve it.
Constant space, suitable for online as well as offline search.

Example: Traveling Salesperson Problem
Start with any complete tour, perform pairwise exchanges.
Variants of this approach get within 1% of optimal very quickly with thousands of cities.
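The "pairwise exchanges" idea for the TSP can be illustrated with a small sketch. It assumes that an exchange means the common 2-opt move (reverse a segment of the tour, which swaps two edges); the slide does not name a specific move, so that choice, the random city coordinates, and the function names `tour_length` and `two_opt` are all illustrative assumptions.

```python
import math
import random

def tour_length(tour, cities):
    """Length of the closed tour that visits `cities` in the order given by `tour`."""
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(tour, cities):
    """Keep reversing segments (2-opt exchanges) until no reversal shortens the tour."""
    best = list(tour)
    best_len = tour_length(best, cities)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                # Reversing best[i..j] replaces two edges of the tour with two others.
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                cand_len = tour_length(candidate, cities)
                if cand_len < best_len - 1e-12:
                    best, best_len = candidate, cand_len
                    improved = True
    return best

# Hypothetical data: 30 random cities in the unit square.
random.seed(0)
cities = [(random.random(), random.random()) for _ in range(30)]
start = list(range(len(cities)))
random.shuffle(start)
result = two_opt(start, cities)
print(tour_length(start, cities), "->", tour_length(result, cities))
```

The search keeps only one complete tour at a time, which is the "single current state" property described above. It stops at a local optimum, so the result is not guaranteed to be the shortest tour.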
Example: n-queens
Put n queens on an n × n board with no two queens on the same row, column, or diagonal.
Move a queen to reduce the number of conflicts.
[Figure: successive board states with h = 5, h = 2, h = 0]
Almost always solves n-queens problems almost instantaneously for very large n, e.g., n = 1 million.

Hill-climbing (or gradient ascent/descent)
“Like climbing Everest in thick fog with amnesia”

function Hill-Climbing(problem) returns a state that is a local maximum
    inputs: problem, a problem
    local variables: current, a node
                     neighbor, a node
    current ← Make-Node(Initial-State[problem])
    loop do
        neighbor ← a highest-valued successor of current
        if Value[neighbor] ≤ Value[current] then return State[current]
        current ← neighbor
    end

Hill-climbing contd.
Useful to consider the state space landscape.
[Figure: objective function plotted over the state space, marking the current state, global maximum, shoulder, local maximum, and “flat” local maximum]
Random-restart hill climbing overcomes local maxima (eventually a good initial state).
Random sideways moves escape from shoulders but loop on flat maxima.

Ridges
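The Hill-Climbing pseudocode on this page, combined with random restarts, can be sketched for n-queens as follows. This is a minimal illustration, not the slides' own implementation: the state encoding (one queen per column, `state[c]` is its row), the choice Value = minus the number of conflicts, and the names `conflicts`, `hill_climb`, and `random_restart` are assumptions.

```python
import random

def conflicts(state):
    """Number of attacking queen pairs; state[c] is the row of the queen in column c."""
    n = len(state)
    return sum(1 for a in range(n) for b in range(a + 1, n)
               if state[a] == state[b] or abs(state[a] - state[b]) == b - a)

def hill_climb(state):
    """Steepest-ascent hill climbing on Value = -conflicts; returns a local maximum."""
    current = list(state)
    while True:
        best, best_h = None, conflicts(current)
        # Successors: move one queen to another row within its column.
        for col in range(len(current)):
            for row in range(len(current)):
                if row != current[col]:
                    neighbor = current[:col] + [row] + current[col + 1:]
                    h = conflicts(neighbor)
                    if h < best_h:
                        best, best_h = neighbor, h
        if best is None:          # no successor is strictly better: local maximum
            return current
        current = best

def random_restart(n, rng):
    """Restart from random initial states until hill climbing reaches h = 0."""
    while True:
        result = hill_climb([rng.randrange(n) for _ in range(n)])
        if conflicts(result) == 0:
            return result

solution = random_restart(8, random.Random(0))
print(solution, conflicts(solution))
```

A single call to `hill_climb` may stop with h > 0 at a local maximum; `random_restart` simply retries from a fresh random state, which is the "eventually a good initial state" argument from the slide.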
Simulated annealing
Idea: escape local maxima by allowing some “bad” moves but gradually decrease their size and frequency

function Simulated-Annealing(problem, schedule) returns a solution state
    inputs: problem, a problem
            schedule, a mapping from time to “temperature”
    local variables: current, a node
                     next, a node
                     T, a “temperature” controlling prob. of downward steps
    current ← Make-Node(Initial-State[problem])
    for t ← 1 to ∞ do
        T ← schedule[t]
        if T = 0 then return current
        next ← a randomly selected successor of current
        ∆E ← Value[next] − Value[current]
        if ∆E > 0 then current ← next
        else current ← next only with probability e^(∆E/T)

Properties of simulated annealing
At fixed “temperature” T, state occupation probability reaches the Boltzmann distribution
    $p(x) = \alpha e^{E(x)/kT}$
T decreased slowly enough ⇒ always reach best state $x^*$, because
    $e^{E(x^*)/kT} / e^{E(x)/kT} = e^{(E(x^*) - E(x))/kT} \gg 1$ for small T
Is this necessarily an interesting guarantee??
Devised by Metropolis et al., 1953, for physical process modelling
Widely used in VLSI layout, airline scheduling, etc.

Local beam search
Idea: k random initial states; choose and keep top k of all their successors
♦ Not the same as k hill climbing searches run in parallel!
♦ Searches that find good states recruit other searches to join them
♦ However, if the successors from an initial state are not selected, the search starting from that state is effectively abandoned.
Problem: quite often, all k states end up on same local hill
Idea: choose k successors randomly, biased towards good ones (Stochastic Beam Search)
Observe the close analogy to natural selection!
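The Simulated-Annealing pseudocode on this page translates almost line for line into Python. The sketch below is illustrative only: the toy problem (maximize −(x − 3)² over the integers with ±1 moves), the linear cooling schedule, and all names are assumptions chosen to keep the example small.

```python
import math
import random

def simulated_annealing(initial, value, random_successor, schedule, rng):
    """Follows the pseudocode: accept uphill moves, downhill ones with prob. e^(dE/T)."""
    current = initial
    t = 1
    while True:
        T = schedule(t)
        if T <= 0:
            return current
        nxt = random_successor(current, rng)
        delta_e = value(nxt) - value(current)
        if delta_e > 0 or rng.random() < math.exp(delta_e / T):
            current = nxt
        t += 1

# Hypothetical toy problem: a single peak at x = 3.
def value(x):
    return -(x - 3) ** 2

def random_successor(x, rng):
    return x + rng.choice((-1, 1))

def schedule(t):
    """Assumed linear cooling from T = 2 down to T = 0 at t = 5000."""
    return max(0.0, 2.0 * (1 - t / 5000))

result = simulated_annealing(20, value, random_successor, schedule, random.Random(0))
print(result, value(result))
```

Early on, T is large enough that some downhill moves are accepted; as T approaches 0 the acceptance probability for downhill moves vanishes and the search behaves like hill climbing.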
Genetic algorithms
= stochastic beam search + generate successors from pairs of states

Population   Fitness      Selected pairs   After cross-over   After mutation
24748552     24  (31%)    32752411         32748552           32748152
32752411     23  (29%)    24748552         24752411           24752411
24415124     20  (26%)    32752411         32752124           32252124
32543213     11  (14%)    24415124         24415411           24415417

Genetic algorithms contd.
GAs require states encoded as strings (GPs use programs)
Crossover helps iff substrings are meaningful components
[Figure: two parent boards combined (+) into one child board (=)]
GAs ≠ evolution: e.g., real genes encode replication machinery!

Continuous state spaces
♦ Suppose we want to site three airports in Romania:
    – 6-D state space defined by (x1, y1), (x2, y2), (x3, y3)
    – objective function f(x1, y1, x2, y2, x3, y3) = sum of squared distances from each city to nearest airport
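The cross-over and mutation steps from the genetic-algorithm table on this page can be reproduced with a few string operations. This is a sketch under assumptions: the strings are read as 8-queens states (digit i is the row of the queen in column i), fitness is taken to be the number of non-attacking pairs, and the function names and the explicit `point`, `position`, and `digit` arguments are illustrative. In a real GA the cross-over point and the mutation would be chosen at random.

```python
import random

def fitness(state):
    """Non-attacking queen pairs for an 8-queens string such as '24748552' (assumed fitness)."""
    rows = [int(ch) for ch in state]
    n = len(rows)
    attacking = sum(1 for a in range(n) for b in range(a + 1, n)
                    if rows[a] == rows[b] or abs(rows[a] - rows[b]) == b - a)
    return n * (n - 1) // 2 - attacking

def select(population, k, rng):
    """Pick k parents at random, with probability proportional to fitness."""
    return rng.choices(population, weights=[fitness(s) for s in population], k=k)

def crossover(parent_a, parent_b, point):
    """Swap the tails of two parents after `point` characters."""
    return parent_a[:point] + parent_b[point:], parent_b[:point] + parent_a[point:]

def mutate(state, position, digit):
    """Replace the character at `position` with `digit`."""
    return state[:position] + digit + state[position + 1:]

population = ["24748552", "32752411", "24415124", "32543213"]
print([fitness(s) for s in population])
print(crossover("32752411", "24748552", 3))   # first selected pair in the table
print(mutate("32748552", 5, "1"))             # first child after mutation
print(select(population, 4, random.Random(0)))
```

With the cross-over point after the third digit, the first pair yields the two children shown in the table, and mutating the sixth digit of the first child gives the string in the last column.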
Continuous state spaces–Discretization
♦ Airport-siting problem: 6-D state space (x1, y1), (x2, y2), (x3, y3); f = sum of squared distances from each city to nearest airport
♦ Discretization methods turn continuous space into discrete space
♦ each state has six discrete variables (e.g. ± δ miles, where δ is a constant) [or grid cells]
♦ each state has how many possible successors?
    • 12 [book] (action: change only one variable, x or (“xor”) y of one airport)
    • $3^6 - 1$ (action: change at least one variable)
♦ what is the algorithm?

Continuous state spaces–No Discretization
♦ Gradient (of the objective function) methods compute
    $\nabla f = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial y_1}, \frac{\partial f}{\partial x_2}, \frac{\partial f}{\partial y_2}, \frac{\partial f}{\partial x_3}, \frac{\partial f}{\partial y_3} \right)$
♦ To increase/reduce f, e.g., by $x \leftarrow x + \alpha \nabla f(x)$ (α > 0 to increase f, α < 0 to reduce it)
♦ Sometimes can solve for $\nabla f(x) = 0$ exactly (e.g., only one airport).
♦ Otherwise, Newton–Raphson (1664, 1690) iterates $x \leftarrow x - H_f^{-1}(x) \nabla f(x)$
  to solve $\nabla f(x) = 0$, where $H_{ij} = \partial^2 f / \partial x_i \partial x_j$

Contrast and Summary
♦ Ch. 3 versus Ch. 4.1-2: what is the key difference?
♦ Ch. 3: “It is the journey, not the destination.” (optimize the path)
♦ Ch. 4.1-2: “It is the destination, not the journey.” (optimize the goal)
♦ Different problem formulation; do we still need:
    • Initial state (state space): ?
    • Successor function (actions): ?
    • Step (path) cost: ?
    • Goal test: ?
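The gradient update from the "No Discretization" slide can be sketched for the airport-siting objective. This is an illustration under assumptions: the city coordinates are made up (they are not real Romanian cities), the step size and iteration count are arbitrary, and the names `objective`, `gradient`, and `descend` are hypothetical. Because f is to be reduced, the step subtracts α∇f. The gradient is computed with each city assigned to its currently nearest airport.

```python
def nearest(city, airports):
    """Index of the airport closest to `city`."""
    return min(range(len(airports)),
               key=lambda i: (airports[i][0] - city[0]) ** 2 + (airports[i][1] - city[1]) ** 2)

def objective(airports, cities):
    """f = sum over cities of the squared distance to the nearest airport."""
    return sum(min((ax - cx) ** 2 + (ay - cy) ** 2 for ax, ay in airports)
               for cx, cy in cities)

def gradient(airports, cities):
    """Partial derivatives of f with respect to each airport's (x, y)."""
    grad = [[0.0, 0.0] for _ in airports]
    for cx, cy in cities:
        i = nearest((cx, cy), airports)
        grad[i][0] += 2 * (airports[i][0] - cx)
        grad[i][1] += 2 * (airports[i][1] - cy)
    return grad

def descend(airports, cities, alpha=0.05, steps=200):
    """Repeat x <- x - alpha * grad f(x) to reduce f."""
    airports = [list(a) for a in airports]
    for _ in range(steps):
        g = gradient(airports, cities)
        airports = [[a[0] - alpha * gi[0], a[1] - alpha * gi[1]]
                    for a, gi in zip(airports, g)]
    return airports

# Hypothetical city coordinates and starting airport positions.
cities = [(0, 0), (2, 0), (0, 2), (2, 2), (9, 9), (10, 8)]
start = [(1.0, 5.0), (5.0, 5.0), (8.0, 1.0)]
end = descend(start, cities)
print(objective(start, cities), "->", objective(end, cities))
```

For a single airport the condition ∇f = 0 has the closed-form solution mentioned on the slide: the airport sits at the mean of the city coordinates, and gradient descent converges to that same point.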
Contrast and Summary contd.
♦ Different problem formulation; do we still need:
    • Initial state (state space): yes [but different kind of states]
    • Successor function (actions): yes [but different kind of actions]
    • Step (path) cost: no [not the journey]
    • Goal test: no [optimize objective function]
♦ The n-queens and TSP problems can be formulated in either way; how?

Skipping the rest

Searching with Non-deterministic Actions
♦ performing an action might not yield the expected successor state
♦ Suck can clean one dirty square, but sometimes an adjacent dirty square as well
♦ Suck on a clean square can sometimes make it dirty

Erratic Vacuum World
[Figure: the eight vacuum-world states, numbered 1 to 8]
♦ not just a sequence of actions, but backup/contingency plans
♦ from State 1: [Suck, if State = 5 then [Right, Suck] else []]
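The contingency plan for the erratic vacuum world can be checked mechanically. The sketch below assumes the usual textbook numbering of the eight states (odd numbers have the agent on the left; 1 and 2 have both squares dirty, 7 and 8 have both clean), which the figure residue does not spell out. The plan encoding and the names `results` and `outcomes` are likewise illustrative.

```python
# Assumed numbering: (agent location, left square dirty?, right square dirty?)
STATES = {
    1: ("L", True, True),   2: ("R", True, True),
    3: ("L", True, False),  4: ("R", True, False),
    5: ("L", False, True),  6: ("R", False, True),
    7: ("L", False, False), 8: ("R", False, False),
}
NUMBER = {v: k for k, v in STATES.items()}
GOALS = {7, 8}

def results(state, action):
    """Set of states the erratic vacuum world may reach from `state` via `action`."""
    loc, left, right = STATES[state]
    if action == "Left":
        return {NUMBER[("L", left, right)]}
    if action == "Right":
        return {NUMBER[("R", left, right)]}
    here_dirty = left if loc == "L" else right
    if here_dirty:
        # Suck cleans this square and sometimes the adjacent one as well.
        cleaned_here = ("L", False, right) if loc == "L" else ("R", left, False)
        return {NUMBER[cleaned_here], NUMBER[(loc, False, False)]}
    # Suck on a clean square sometimes deposits dirt on it.
    dirtied = ("L", True, right) if loc == "L" else ("R", left, True)
    return {state, NUMBER[dirtied]}

def outcomes(state, plan):
    """All states in which the contingency plan can finish when started in `state`."""
    if not plan:
        return {state}
    step, rest = plan[0], plan[1:]
    if isinstance(step, tuple):       # ("if", test_state, then_plan, else_plan)
        _, test_state, then_plan, else_plan = step
        branch = then_plan if state == test_state else else_plan
        return outcomes(state, branch + rest)
    return set().union(*(outcomes(s, rest) for s in results(state, step)))

plan = ["Suck", ("if", 5, ["Right", "Suck"], [])]
print(results(1, "Suck"), outcomes(1, plan))
```

Under these assumptions, Suck in state 1 leads to state 5 or state 7, and every way the plan can unfold ends in a goal state, which is what makes it a valid contingency plan rather than a fixed action sequence.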
And-Or Search Tree
[Figure: AND-OR search tree rooted at state 1, with Suck, Right, and Left branches and leaves marked GOAL or LOOP]
♦ every path reaches a goal, a repeated state, or a dead end

Slippery floor
[Figure: search tree from state 1 with Suck and Right branches through states 5, 2, and 6]

Sensorless problems
♦ No sensor: the agent does not know which state it is in
♦ Is it hopeless?

Belief States
♦ Each “belief” state is a collection of possible “physical” states.
[Figure: belief-state space, with belief states connected by L, R, and S actions]
♦ 12 “reachable” belief states (out of 255 possible belief states)
♦ If the actions have uncertain outcomes, how many belief states are there?
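The count of 12 reachable belief states out of 255 can be reproduced with a short enumeration. The sketch assumes the deterministic sensorless vacuum world (Left, Right, and Suck always have their intended effect) and an initial belief state containing all eight physical states; the state encoding and function names are illustrative.

```python
from itertools import product

# Physical state: (agent location, left square dirty?, right square dirty?)
STATES = frozenset(product("LR", (True, False), (True, False)))

def step(state, action):
    """Deterministic outcome of an action in one physical state (assumed model)."""
    loc, left, right = state
    if action == "Left":
        return ("L", left, right)
    if action == "Right":
        return ("R", left, right)
    # Suck cleans the square the agent is on.
    return (loc, False, right) if loc == "L" else (loc, left, False)

def predict(belief, action):
    """Belief state after an action: the image of every possible physical state."""
    return frozenset(step(s, action) for s in belief)

def reachable(initial):
    """All belief states reachable from `initial` by any action sequence."""
    seen, frontier = {initial}, [initial]
    while frontier:
        belief = frontier.pop()
        for action in ("Left", "Right", "Suck"):
            nxt = predict(belief, action)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

print(len(reachable(STATES)), "of", 2 ** len(STATES) - 1)   # prints: 12 of 255
```

The 255 counts every non-empty subset of the 8 physical states. Only 12 of them can actually arise because each action maps the current belief state to exactly one successor.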