X Example? In such cases, we can use local search algorithms Keep - PDF document

Local Search
AI Class 6 (Ch. 4.1-4.2)
Cynthia Matuszek – CMSC 671

“If the path to the goal does not matter… [we can use] a single current node and move to neighbors of that node.” – R&N pg. 121

Based on slides by Dr. Marie desJardin. Some material also adapted from slides by Dr. Matuszek @ Villanova University, which are based on Hwee Tou Ng at Berkeley, which are based on Russell at Berkeley. Some diagrams are based on AIMA.

Today’s Class
• Iterative improvement methods
• Hill climbing
• Simulated annealing
• Local beam search
• Genetic algorithms
• Online search

Admissibility
• Admissibility is a property of heuristics
  • They are optimistic – think goal is closer than it is
  • (Or, exactly right)
• Admissible algorithms can be pretty bad!
  • Is h(n): “1 kilometer” admissible?
• Using admissible heuristics guarantees that the first solution found will be optimal, for some algorithms (A*).

Admissibility and Optimality
• Intuitively:
  • When A* finds a path of length k, it has already tried every other path which can have length ≤ k
  • Because all frontier nodes have been sorted in ascending order of f(n) = g(n) + h(n)
• Does an admissible heuristic guarantee optimality for greedy search?
  • Reminder: f(n) = h(n), always choose node “nearest” goal
  • No sorting beyond that

Local Search Algorithms
• Sometimes the path to the goal is irrelevant
  • Goal state itself is the solution
  • There is an objective function to evaluate states
• State space = set of “complete” configurations
  • That is, all elements of a solution are present
  • Find configuration satisfying constraints
  • Example?
• In such cases, we can use local search algorithms
  • Keep a single “current” state, try to improve it
  • Very efficient! Why?
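One possible answer to the “Example?” prompt, as a sketch only: n-queens is an assumption here (the slides do not name a problem), and the function names are hypothetical. Each state is a complete configuration with every queen on the board, and the objective function scores how close the state is to satisfying all constraints.

```python
from itertools import combinations


def attacking_pairs(queens):
    """Count pairs of queens that attack each other.

    queens[c] is the row of the queen in column c, so every state is a
    complete configuration: all n queens are always present.
    """
    return sum(
        1
        for (c1, r1), (c2, r2) in combinations(enumerate(queens), 2)
        if r1 == r2 or abs(r1 - r2) == abs(c1 - c2)
    )


def objective(queens):
    """Higher is better; 0 means every constraint is satisfied."""
    return -attacking_pairs(queens)


def neighbors(queens):
    """All states reachable by moving one queen within its own column."""
    n = len(queens)
    for col in range(n):
        for row in range(n):
            if row != queens[col]:
                yield queens[:col] + (row,) + queens[col + 1:]
```

A local search keeps one such tuple as the “current” state and repeatedly replaces it with a neighbor that scores better, without ever storing a path.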

State Space (Landscape)
[Figure: state-space landscape with states S, A, and B]

Iterative Improvement Search
• Start with an initial guess
• Gradually improve it until it is legal or optimal
• Some examples:
  • Hill climbing
  • Simulated annealing
  • Constraint satisfaction

Hill Climbing on State Surface
• Concept: trying to reach the “highest” (most desirable) point (state)
• “Height” is defined by the evaluation function

Hill Climbing Search
• If there exists a successor s for the current state n such that
  • h(s) > h(n)
  • h(s) >= h(t) for all the successors t of n,
  then move from n to s. Otherwise, halt at n.
• Look one step ahead to determine if any successor is “better” than current state
  • If so, move to the best successor
• A kind of Greedy search in that it uses h
  • But, does not allow backtracking or jumping to an alternative path
  • Doesn’t “remember” where it has been.
• Not complete
  • Search will terminate at local maxima, plateaux, ridges.

Hill Climbing Example
[Figure: 8-puzzle boards from the start state (2 8 3 / 1 6 4 / 7 _ 5, h = -4) to the goal state (1 2 3 / 8 _ 4 / 7 6 5, h = 0), each labeled with its h value]
f(n) = -(number of tiles out of place)
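The Hill Climbing Search rule (move to the best successor s only if h(s) > h(n), otherwise halt) can be sketched as follows. This is a minimal illustration, not the course's code: the tuple encoding with 0 for the blank and all function names are assumptions, while the start board, goal board, and evaluation function are the ones from the 8-puzzle example.

```python
GOAL = (1, 2, 3,
        8, 0, 4,
        7, 6, 5)  # 0 marks the blank (encoding is an assumption)


def h(state):
    """-(number of tiles out of place); the blank is not counted."""
    return -sum(1 for tile, want in zip(state, GOAL)
                if tile != 0 and tile != want)


def successors(state):
    """States reachable by sliding one adjacent tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:
            board = list(state)
            target = 3 * r + c
            board[blank], board[target] = board[target], board[blank]
            yield tuple(board)


def hill_climb(state):
    """Move to the best successor while it is strictly better; else halt."""
    while True:
        best = max(successors(state), key=h)
        if h(best) <= h(state):
            return state  # no successor improves on the current state
        state = best


START = (2, 8, 3,
         1, 6, 4,
         7, 0, 5)
result = hill_climb(START)
```

Under this strict rule the search from START takes one step (h = -4 to h = -3) and then halts: the best successors of the new state only tie at h = -3, so no move is strictly better. Allowing sideways moves is one way past such a tie, at the cost of possibly wandering on a plateau.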

Exploring the Landscape
• Local Maxima:
  • Peaks that aren’t the highest point in the space
• Plateaus:
  • A broad flat region that gives the search algorithm no direction (random walk)
• Ridges:
  • Flat like a plateau, but with drop-offs to the sides; steps to the North, East, South and West may go down, but a step to the NW may go up.
[Figure: landscape with a local maximum, a plateau, and a ridge labeled]
Image from: http://classes.yale.edu/fractals/CA/GA/Fitness/Fitness.html

Drawbacks of Hill Climbing
• Problems: local maxima, plateaus, ridges
• Remedies:
  • Random restart: keep restarting the search from random locations until a goal is found.
  • Problem reformulation: reformulate the search space to eliminate these problematic features
• Some problem spaces are great for hill climbing; others are terrible

Example of a Local Optimum
[Figure: 8-puzzle start state 1 2 5 / 8 7 4 / _ 6 3 with f = -6; “move up” gives 1 2 5 / _ 7 4 / 8 6 3 with f = -7 and “move right” gives 1 2 5 / 8 7 4 / 6 _ 3 with f = -7; the goal 1 2 3 / 8 _ 4 / 7 6 5 has f = 0]
f = -(manhattan distance)

Some Extensions of Hill Climbing
• Simulated Annealing
  • Escape local maxima by allowing some “bad” moves but gradually decreasing their frequency
• Local Beam Search
  • Keep track of k states rather than just one
  • At each iteration:
    • All successors of the k states are generated and evaluated
    • Best k are chosen for the next iteration
• Stochastic Beam Search
  • Chooses semi-randomly from “uphill” possibilities
  • “Steeper” moves have a higher probability of being chosen
• Random-Restart Climbing
  • Can actually be applied to any form of search
  • Pick random starting points until one leads to a solution
• Genetic Algorithms
  • Each successor is generated from two predecessor (parent) states

Gradient Ascent / Descent
• Take downward “steps” proportional to the negative of the gradient of the function at current state.
  • “Steepest descent”
• Gradient descent procedure for finding $\arg\min_x f(x)$
  • choose initial $x_0$ randomly
  • repeat
    • $x_{i+1} \leftarrow x_i - \eta f'(x_i)$
  • until the sequence $x_0, x_1, \ldots, x_i, x_{i+1}$ converges
• Step size $\eta$ (eta) is small (~0.1–0.05)
• Good for differentiable, continuous spaces
Images from http://en.wikipedia.org/wiki/Gradient_descent
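The gradient descent procedure (repeat x_{i+1} ← x_i - η f'(x_i) until the sequence converges) might look like this in one dimension. The convergence tolerance, the iteration cap, and the test function f(x) = (x - 3)^2 are assumptions made for illustration. The starting point is passed in rather than drawn at random so that the run is repeatable.

```python
def gradient_descent(f_prime, x0, eta=0.1, tol=1e-8, max_iters=10_000):
    """Follow the negative derivative until successive iterates agree.

    f_prime: derivative of the function being minimized
    x0:      initial guess (the slides choose it randomly)
    eta:     step size, small as the slides suggest
    tol:     assumed stopping threshold on |x_{i+1} - x_i|
    """
    x = x0
    for _ in range(max_iters):
        x_next = x - eta * f_prime(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x  # iteration cap reached without meeting the tolerance


# Assumed example: f(x) = (x - 3)^2 has f'(x) = 2 (x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

With η = 0.1 each step shrinks the distance to the minimum by a constant factor on this quadratic, which is why a small fixed step converges steadily but not quickly.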

Gradient Ascent / Descent
[Figure: gradient descent steps on a function]
Images from http://en.wikipedia.org/wiki/Gradient_descent

Gradient Methods vs. Newton’s Method
• A reminder of Newton’s method from Calculus:
  $x_{i+1} \leftarrow x_i - \eta f'(x_i) / f''(x_i)$
• Newton’s method uses 2nd order information (the second derivative, or, curvature) to take a more direct route to the minimum.
• The second-order information is more expensive to compute, but the method converges more quickly.
[Figure: contour lines of a function (blue), with the paths taken by gradient descent (green) and Newton’s method (red)]
Images from http://en.wikipedia.org/wiki/Newton's_method_in_optimization

Simulated Annealing
• Simulated annealing (SA): analogy between the way metal cools into a minimum-energy crystalline structure and the search for a minimum generally
  • In very hot metal, molecules can move fairly freely
  • But, they are slightly less likely to move out of a stable structure
  • As you slowly cool the metal, more molecules are “trapped” in place
• Conceptually: Escape local maxima by allowing some “bad” (locally counterproductive) moves but gradually decreasing their frequency

Simulated Annealing (II)
• Can avoid becoming trapped at local minima.
• Uses a random local search that:
  • Accepts changes that increase objective function f
  • As well as some that decrease it
• Uses a control parameter T (freedom to make “bad” moves)
  • By analogy with the original application
  • Is known as the system “temperature”
• T starts out high and gradually decreases toward 0

Simulated Annealing (IV)
• f(s) represents the quality of state s (high is good)
• A “bad” move from A to B is accepted with a probability
  $P(\text{move } A \to B) \approx e^{(f(B) - f(A)) / T}$
  • (Note that f(B) – f(A) will be negative, so bad moves always have a relative probability less than one. Good moves, for which f(B) – f(A) is positive, have a relative probability greater than one.)
• Temperature
  • Higher temperature = more likely to make a “bad” move
  • As T tends to zero, this probability tends to zero
  • SA becomes more like hill climbing
  • If T is lowered slowly enough, SA is complete and admissible.
    • domain-specific
    • sometimes hard to determine

Local Beam Search
• Begin with k random states
  • k, instead of one, current state(s)
• Generate all successors of these states
• Keep the k best states
• Stochastic beam search
  • Probability of keeping a state is a function of its heuristic value
  • More likely to keep “better” successors
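The simulated annealing acceptance rule, P(move A → B) ≈ e^((f(B) – f(A)) / T), can be sketched as below. Several details are assumptions rather than anything the slides specify: good moves are capped at probability 1, the temperature follows a geometric cooling schedule, and the integer test problem and all names are made up for illustration.

```python
import math
import random


def accept_probability(f_a, f_b, temperature):
    """1 for good moves, e^((f(B) - f(A)) / T) for bad ones (high f is good)."""
    if f_b >= f_a:
        return 1.0  # assumption: relative probabilities above one are capped
    return math.exp((f_b - f_a) / temperature)


def simulated_annealing(f, start, neighbor, t0=10.0, cooling=0.99,
                        t_min=1e-3, rng=None):
    """Random local search that accepts some bad moves while T is high."""
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    current = best = start
    t = t0
    while t > t_min:
        candidate = neighbor(current, rng)
        if rng.random() < accept_probability(f(current), f(candidate), t):
            current = candidate
            if f(current) > f(best):
                best = current
        t *= cooling  # assumed geometric schedule: T decreases toward 0
    return best


# Assumed toy problem: maximize f(x) = -(x - 3)^2 over the integers.
f = lambda x: -(x - 3) ** 2
step = lambda x, rng: x + rng.choice((-1, 1))
best = simulated_annealing(f, start=-10, neighbor=step)
```

As T shrinks, `accept_probability` for a bad move falls toward zero, so late in the run the loop behaves like hill climbing. The cooling schedule is the domain-specific part and usually needs tuning.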
