Stochastic Simulation: Simulated annealing
Bo Friis Nielsen
Institute of Mathematical Modelling Technical University of Denmark 2800 Kgs. Lyngby – Denmark Email: bfni@dtu.dk
02443 – lecture 9, DTU

A general optimisation problem

We want to minimise f over a finite state space S, i.e. find f⋆ = min_{x∈S} f(x). Let M denote the set of minimising points, M = {x ∈ S : f(x) = f⋆}. The cardinality of M (the number of elements in M) is finite, as is that of S: |S| < ∞.
We introduce a probability distribution over S:

P_T(x) = e^(−f(x)/T) / Σ_{y∈S} e^(−f(y)/T)
       = e^(−f(x)/T) / ( |M| e^(−f⋆/T) + Σ_{y∈S\M} e^(−f(y)/T) )
       = e^((f⋆−f(x))/T) / ( |M| + Σ_{y∈S\M} e^((f⋆−f(y))/T) )

To simulate from P_T we only need the expression e^(−f(x)/T), i.e. the target multiplied with a difficult-to-calculate (normalising) constant. As T decreases, states with low values of f(x) will be more frequent/likely. In the limit T → 0, every term e^((f⋆−f(y))/T) with y ∉ M vanishes (since f⋆ − f(y) < 0), so P_T(x) → 1/|M| for x ∈ M and P_T(x) → 0 otherwise: the distribution concentrates on the minimum-energy states.
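As a quick numerical illustration (not from the slides), the sketch below evaluates P_T on a small hypothetical state space; the objective values in `f` are made up, with minimisers 1 and 3, so M = {1, 3} and f⋆ = 0.5.

```python
import math

# Hypothetical finite state space with objective values f(x);
# the minimisers are states 1 and 3, so M = {1, 3} and f* = 0.5.
f = {0: 2.0, 1: 0.5, 2: 1.0, 3: 0.5, 4: 3.0}

def p_T(T):
    """The distribution P_T(x) proportional to exp(-f(x)/T)."""
    w = {x: math.exp(-fx / T) for x, fx in f.items()}
    Z = sum(w.values())
    return {x: wx / Z for x, wx in w.items()}

for T in (10.0, 1.0, 0.1, 0.01):
    print(T, {x: round(p, 3) for x, p in p_T(T).items()})
# As T -> 0 the mass concentrates on M, approaching 1/|M| = 0.5 per minimiser.
```

At T = 10 the distribution is nearly uniform; by T = 0.01 essentially all mass sits on the two minimising states.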
The problem min_x f(x) may have many local optima, in which simple descent methods can get stuck. Instead we sample from P_T using the Metropolis-Hastings algorithm, following the Kirkpatrick paper (Science, 1983) that introduced simulated annealing.
Steel and other materials can exist in several crystalline structures. One, the ground state, has the lowest energy. The material may be "caught" in other states which are only locally stable; this is likely to happen when welding, machining, etc. By heating the material and then slowly cooling it, we ensure that the material ends in the ground state. This process is called annealing.
Use X ∈ S to denote the state of the system (e.g., the positions of the atoms), and let U(x) denote the energy of state x ∈ S. According to statistical physics, if the temperature is T, the p.d.f. of X is

f(x, T) = c_T · exp(−U(x)/T)

Note the normalization constant c_T is unknown; it can be found by integration, but our algorithms will not require it.
[Figure: the potential U(x) plotted against the state x]

[Figure: the p.d.f. f(x, T) plotted against the state x]
Let the temperature be a decreasing function of time, or of the iteration number i. At each time step, update the state according to the random walk Metropolis-Hastings algorithm for MCMC, where the target p.d.f. is f(x, T_i). That is, perturb the state X_i randomly to generate a candidate Y_i. If the candidate has lower energy than the old state, accept it; otherwise, accept it only with probability

exp(−(U(Y_i) − U(X_i))/T_i)

for a symmetric proposal distribution (to keep the probabilistic interpretation).
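The update rule above can be sketched as follows (a minimal illustration: the function names, the 1/√(1+i) cooling schedule, and the toy potential are my choices, not prescribed by the slides):

```python
import math, random

def simulated_annealing(U, x0, propose, n_iter=10_000):
    """Minimise U by random-walk Metropolis with decreasing temperature T_i."""
    x = best = x0
    for i in range(n_iter):
        T = 1.0 / math.sqrt(1 + i)             # illustrative cooling scheme
        y = propose(x)                         # symmetric random perturbation
        dU = U(y) - U(x)
        # Accept downhill moves always, uphill moves with prob exp(-dU/T).
        if dU <= 0 or random.random() < math.exp(-dU / T):
            x = y
        if U(x) < U(best):
            best = x
    return best

# Toy example: a tilted double well with its global minimum near x = -1.
U = lambda x: (x**2 - 1.0)**2 + 0.2 * x
propose = lambda x: x + random.gauss(0.0, 0.1)
x_min = simulated_annealing(U, x0=2.0, propose=propose)
```

At high temperature the chain accepts most uphill moves and explores freely; as T_i shrinks it behaves more and more like a pure descent method.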
[Figure: traces of the state x and of the energy U(x) over 10000 iterations]
A basic problem in combinatorial optimisation: given n stations, and an n-by-n matrix A giving the cost of going from station i to station j, find a route S (a permutation of 1, . . . , n) which minimises

Σ_{i=1}^{n−1} A(S_i, S_{i+1})

(for a closed tour, add the cost A(S_n, S_1) of returning to the starting station).
Cost of going from town i (rows) to town j (columns):

from\to    1    2    3    4    5    6
   1       –    3    1    4   12    2
   2       2    –   11   13   30    3
   3       6    8    –   12    5    4
   4      33    9    5    –   17    5
   5       1   15    6   10    –    6
   6      24    6    8    9   40    –
An example route has total cost 5 + 22 + 13 + 60 + 14 + 24 = 138.
[Figure: an example route drawn through the towns with its leg costs; COST = 138]
Exercise: solve the travelling salesman problem with simulated annealing. Start from a random permutation of the stations. As proposal, permute two random stations on the route. As cooling scheme you can use e.g. T_k = 1/√(1 + k) or T_k = 1/log(k + 1); feel free to experiment with different schemes.

(a) Have the input be positions in the plane of the n stations. Let the cost of going i → j be the Euclidean distance between stations i and j. Plot the resulting route in the plane. Debug with stations on a circle.

(b) Then modify your programme to work with costs directly, and apply it to the cost matrix from the course homepage.
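A possible sketch of part (a), using the swap proposal and the circle-debug instance described above (the function names, the iteration count, and the cooling schedule details are illustrative choices):

```python
import math, random

def tour_cost(route, A):
    """Total cost of a closed route under cost matrix A."""
    n = len(route)
    return sum(A[route[i]][route[(i + 1) % n]] for i in range(n))

def anneal_tsp(A, n_iter=50_000):
    """Simulated annealing for the TSP: swap two random stations as proposal."""
    n = len(A)
    route = list(range(n))
    random.shuffle(route)                 # start from a random permutation
    cost = tour_cost(route, A)
    for k in range(1, n_iter + 1):
        T = 1.0 / math.sqrt(1 + k)        # one of the suggested schemes
        i, j = random.sample(range(n), 2)
        route[i], route[j] = route[j], route[i]
        new_cost = tour_cost(route, A)
        if new_cost <= cost or random.random() < math.exp(-(new_cost - cost) / T):
            cost = new_cost
        else:                             # reject: undo the swap
            route[i], route[j] = route[j], route[i]
    return route, cost

# Debug instance: stations on a circle; the optimal tour visits them in order.
n = 10
pts = [(math.cos(2 * math.pi * t / n), math.sin(2 * math.pi * t / n))
       for t in range(n)]
A = [[math.dist(p, q) for q in pts] for p in pts]
route, cost = anneal_tsp(A)
print(route, cost)
```

For part (b), only the construction of A changes: read the cost matrix from the course homepage instead of computing Euclidean distances.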