
DM812 Metaheuristics, Lecture 12: Model Based Metaheuristics and Continuous Optimization



  1. Outline

     DM812 Metaheuristics, Lecture 12
     Marco Chiarandini
     Numerical Analysis, Department of Mathematics and Computer Science
     University of Southern Denmark, Odense, Denmark
     <marco@imada.sdu.dk>

     1. Model Based Metaheuristics
        Cross Entropy Method
     2. Continuous Optimization
        Numerical Analysis

     Model Based Metaheuristics

     Key idea: use rare-event simulation and importance sampling to proceed
     towards good solutions:
     - generate random solution samples according to a specified mechanism;
     - update the parameters of the random mechanism so as to produce a
       better "sample" in the next round (a minimal sketch of this
       generate/update loop follows).
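     The following is a minimal sketch of that loop, assuming a toy OneMax
     objective (maximize the number of ones in a bit string) and a vector of
     independent Bernoulli parameters as the random mechanism; all names and
     constants here are illustrative, not from the lecture.

```python
# Minimal cross-entropy sketch on OneMax: the "specified mechanism" is a
# vector p of independent Bernoulli sampling probabilities, updated from
# the elite (top rho) fraction of each sample. Constants are illustrative.
import numpy as np

def cem_onemax(n=20, N=100, rho=0.1, alpha=0.7, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    p = np.full(n, 0.5)                                  # mechanism parameters
    for _ in range(iters):
        samples = (rng.random((N, n)) < p).astype(int)   # generate N samples
        scores = samples.sum(axis=1)                     # f(s) = number of ones
        gamma = np.quantile(scores, 1 - rho)             # elite threshold
        elite = samples[scores >= gamma]
        p = alpha * elite.mean(axis=0) + (1 - alpha) * p # smoothed update
    return p

print(np.round(cem_onemax(), 2))   # parameters drift towards all ones
```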

  2. Cross Entropy Method: Estimation

     Notation:
     - S: a finite set of states;
     - f: a real-valued performance function on S;
     - max_{s∈S} f(s) = γ* = f(s*)  (our problem);
     - {p(s, θ) | θ ∈ Θ}: a family of discrete probability mass functions on S;
     - E_θ[f(s)] = Σ_{s∈S} f(s) p(s, θ).

     We are interested in the probability that f(s) is greater than some
     threshold γ under the distribution p(·, θ'):

         ℓ = Pr_{θ'}(f(s) ≥ γ) = Σ_{s∈S} I{f(s) ≥ γ} p(s, θ') = E_{θ'}[I{f(s) ≥ γ}]

     If this probability is very small, then we call {f(s) ≥ γ} a rare event.

     Monte Carlo simulation: draw a random sample s_1, ..., s_N from p(·, θ')
     and compute the unbiased estimator of ℓ:

         ℓ̂ = (1/N) Σ_{i=1}^{N} I{f(s_i) ≥ γ}

     If the probability of sampling a solution with f(s_i) ≥ γ is small, the
     estimate is not accurate.

     Importance sampling: use a different probability function g on S to
     sample the solutions:

         ℓ = Σ_{s∈S} I{f(s) ≥ γ} (p(s, θ') / g(s)) g(s) = E_g[I{f(s) ≥ γ} p(s, θ') / g(s)]

     and compute the unbiased estimator of ℓ:

         ℓ̂ = (1/N) Σ_{i=1}^{N} I{f(s_i) ≥ γ} p(s_i, θ') / g(s_i)

     How to determine g? The best choice would be

         g*(s) := I{f(s) ≥ γ} p(s, θ') / ℓ,

     since substituting g* yields ℓ̂ = ℓ exactly. But ℓ is unknown. It is
     convenient to choose g from the family {p(·, θ)}: choose the parameter θ
     such that the difference between g = p(·, θ) and g* is minimal. The
     cross entropy, or Kullback-Leibler distance, is a measure of the
     distance between two probability distribution functions:

         D(g*, g) = E_{g*}[ln(g*(s) / g(s))]

     Generalizing to probability density functions and Lebesgue integrals:

         min_θ D(g*, p(·, θ)) = min_θ ( ∫ g*(s) ln g*(s) ds − ∫ g*(s) ln p(s, θ) ds )

     Minimizing this distance by means of sampling estimation leads to

         θ̂ = argmax_θ E_{θ''}[ I{f(s) ≥ γ} (p(s, θ') / p(s, θ'')) ln p(s, θ) ],

     a (convex) stochastic program. In some cases it can be solved in closed
     form (e.g., exponential, Bernoulli). The same result can be obtained by
     maximum likelihood estimation over the solutions s_i with performance ≥ γ:

         L = max_θ Π_{i: f(s_i) ≥ γ} p(s_i, θ)
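     A small numeric check of the importance-sampling estimator above, under
     assumptions chosen purely for illustration (X ~ Exp(1) in place of the
     discrete model, g an exponential with mean γ, so that the exact value
     e^(−γ) is available for comparison):

```python
# Rare-event estimation: ℓ = Pr(X ≥ γ) for X ~ Exp(1), exact value e^(-γ).
# Crude Monte Carlo almost never observes the event; sampling from
# g = Exp(mean γ) and reweighting by the likelihood ratio p/g recovers ℓ.
# The choice of g here is an illustrative assumption, not from the slides.
import numpy as np

rng = np.random.default_rng(1)
gamma, N = 20.0, 100_000

x = rng.exponential(1.0, N)                       # crude MC under p = Exp(1)
print("crude MC:   ", (x >= gamma).mean())        # almost always 0.0

y = rng.exponential(gamma, N)                     # sample under g
w = np.exp(-y) / (np.exp(-y / gamma) / gamma)     # likelihood ratio p(y)/g(y)
print("IS estimate:", ((y >= gamma) * w).mean())
print("exact:      ", np.exp(-gamma))             # ≈ 2.06e-09
```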

  3. Cross Entropy Method: The Algorithm

     Estimation via the stochastic counterpart:

         θ̂ = argmax_θ (1/N) Σ_{i=1}^{N} I{f(s_i) ≥ γ} (p(s_i, θ') / p(s_i, θ'')) ln p(s_i, θ)

     where s_1, ..., s_N is a random sample from p(·, θ'').

     But there are still problems with sampling due to rare events.
     Solution: a two-phase iterative approach:
     - construct a sequence of levels γ̂_1, γ̂_2, ..., γ̂_t, and
     - construct a sequence of parameters θ̂_1, θ̂_2, ..., θ̂_t,
     such that γ̂_t is close to the optimal value γ* and θ̂_t assigns maximal
     probability to sampling high-quality solutions.

     Cross Entropy Method (CEM):
         define θ̂_0; set t = 1
         while termination criterion is not satisfied do
             generate a sample (s_1, s_2, ..., s_N) from the pdf p(·; θ̂_{t−1})
             set γ̂_t equal to the (1 − ρ)-quantile of the sample with respect
             to f, i.e., γ̂_t = f_(⌈(1−ρ)N⌉), the ⌈(1−ρ)N⌉-th order statistic
             of the sample performances
             use the same sample (s_1, s_2, ..., s_N) to solve the stochastic
             program
                 θ̂_t = argmax_θ (1/N) Σ_{i=1}^{N} I{f(s_i) ≥ γ̂_t} ln p(s_i; θ)

     Termination criterion: stop if for some t ≥ d (with, e.g., d = 5)

         γ̂_t = γ̂_{t−1} = ... = γ̂_{t−d}

     Smoothed updating: θ̂_t = α θ̂'_t + (1 − α) θ̂_{t−1}, with 0.4 ≤ α ≤ 0.9,
     where θ̂'_t is the solution of the stochastic counterpart.

     Parameters:
     - N = c·n, where n is the size of the problem (the number of choices
       available for each solution component) and c > 1 (5 ≤ c ≤ 10);
     - ρ ≈ 0.01 for n ≥ 100, and ρ ≈ ln(n)/n for n < 100.

     Example: TSP
     - Solution representation: permutation representation.
     - Probabilistic model: matrix P where p_ij represents the probability of
       vertex j following vertex i.
     - Tour construction (specific for tours; a sketch in code follows):
           define P^(1) = P and X_1 = 1; let k = 1
           while k < n − 1 do
               obtain P^(k+1) from P^(k) by setting the X_k-th column of
               P^(k) to zero and normalizing the rows to sum up to 1
               generate X_{k+1} from the distribution formed by the X_k-th
               row of P^(k+1)
               set k = k + 1
     - Update: take p_ij to be the fraction of times the transition from i
       to j occurred in those cycles that have f(s) ≤ γ̂ (the TSP is a
       minimization problem).
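     A sketch of the tour-construction step described above, assuming
     0-indexed vertices, a uniform initial matrix P, and hypothetical helper
     names; renormalizing only the row being sampled from is equivalent, for
     sampling purposes, to renormalizing all rows.

```python
# Sample one cycle from the transition matrix P by repeatedly zeroing the
# column of the vertex just visited, renormalizing the current row, and
# drawing the next vertex from it. Assumes P is n x n with zero diagonal.
import numpy as np

def sample_tour(P, rng):
    n = P.shape[0]
    Q = P.astype(float).copy()
    tour = [0]                                    # X_1: start at vertex 0
    for _ in range(n - 1):
        Q[:, tour[-1]] = 0.0                      # forbid revisiting X_k
        row = Q[tour[-1]] / Q[tour[-1]].sum()     # X_k-th row, renormalized
        tour.append(int(rng.choice(n, p=row)))    # draw X_{k+1}
    return tour

rng = np.random.default_rng(0)
n = 6
P = np.full((n, n), 1.0 / (n - 1))                # uniform model to start
np.fill_diagonal(P, 0.0)
print(sample_tour(P, rng))                        # a random Hamiltonian cycle
```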

  4. Continuous Optimization

     We look at unconstrained optimization of continuous, non-linear,
     non-convex, possibly non-differentiable functions. There are many
     applications, above all in statistical estimation (e.g., likelihood
     estimation). Typically few variables are involved (curse of
     dimensionality).

     Gradient Descent (differentiable functions)

     f(x) decreases fastest when moving in the direction of the negative
     gradient of f. Hence the iteration

         x_{n+1} = x_n − γ_n ∇f(x_n)

     converges for an appropriate x_0 and for γ_n > 0 small enough. The
     problem is choosing γ_n (see the sketch after this slide).

     Secant Method

     If the problem is one-dimensional and f is hard to differentiate:

         x_{n+1} = x_n − f(x_n) (x_n − x_{n−1}) / (f(x_n) − f(x_{n−1}))

     Standard Test Functions

     Rosenbrock's banana function:

         f(x, y) = (1 − x)² + 100(y − x²)²

     with global minimum at (x, y) = (1, 1), where f(x, y) = 0. Its
     multidimensional extension is

         f(x) = Σ_{i=1}^{N−1} [ (1 − x_i)² + 100(x_{i+1} − x_i²)² ],  x ∈ R^N,

     with global minimum at (x_1, ..., x_N) = (1, ..., 1). Other standard
     test functions include Rastrigin's, Schwefel's, and the Sphere function.
     Continue at: http://www.cs.bham.ac.uk/research/projects/ecb/
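     As a hedged illustration of both the iteration and the difficulty of
     choosing γ, a fixed-step gradient descent on Rosenbrock's banana
     function; the step size, starting point, and iteration budget are
     illustrative assumptions, not prescriptions from the slides.

```python
# Fixed-step gradient descent on Rosenbrock's f(x,y) = (1-x)^2 + 100(y-x^2)^2.
# A fixed gamma must be tiny to stay stable, so progress along the curved
# valley towards the minimum at (1, 1) is very slow.
import numpy as np

def rosenbrock(v):
    x, y = v
    return (1 - x)**2 + 100 * (y - x**2)**2

def rosenbrock_grad(v):
    x, y = v
    return np.array([-2 * (1 - x) - 400 * x * (y - x**2),
                     200 * (y - x**2)])

v = np.array([-1.2, 1.0])            # classical starting point
gamma = 1e-3                         # fixed step size (the hard choice)
for _ in range(50_000):
    v = v - gamma * rosenbrock_grad(v)
print(v, rosenbrock(v))              # slowly approaches (1, 1), f -> 0
```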

  5. Smooth Functions

     Newton's method in one dimension (f twice differentiable): the Taylor
     expansion of f at x,

         f(x + Δx) ≈ f(x) + f'(x) Δx + (1/2) f''(x) Δx²,

     attains its extremum when Δx solves the linear equation

         f'(x) + f''(x) Δx = 0   (a minimum when f''(x) > 0).

     Hence, if x_0 is chosen appropriately, the sequence

         x_{n+1} = x_n − f'(x_n) / f''(x_n),   n ≥ 0,

     converges to x*.

     Newton's method generalized to several dimensions: the first derivative
     becomes the gradient ∇f(x), and the reciprocal of the second derivative
     becomes the inverse of the Hessian matrix Hf(x):

         x_{n+1} = x_n − [Hf(x_n)]^{−1} ∇f(x_n),   n ≥ 0.

     Newton's method converges much faster towards a local maximum or
     minimum than gradient descent. However, computing the inverse of the
     Hessian may be an expensive operation, so approximations may be used
     instead:

     Quasi-Newton methods
     - Conjugate Gradient [Fletcher and Reeves (1964)]
     - BFGS (variable metric algorithm) [Broyden, Fletcher, Goldfarb and
       Shanno (1970)]
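     A sketch of the multidimensional Newton iteration above, applied to the
     Rosenbrock function from the previous slide for contrast with gradient
     descent; the starting point and iteration count are illustrative, and
     the Hessian system is solved rather than explicitly inverted.

```python
# Newton's method on Rosenbrock: x_{n+1} = x_n - [Hf(x_n)]^{-1} grad f(x_n),
# implemented with a linear solve instead of an explicit matrix inverse.
import numpy as np

def grad(v):
    x, y = v
    return np.array([-2 * (1 - x) - 400 * x * (y - x**2),
                     200 * (y - x**2)])

def hess(v):
    x, y = v
    return np.array([[2 + 1200 * x**2 - 400 * y, -400 * x],
                     [-400 * x, 200.0]])

v = np.array([-1.2, 1.0])
for _ in range(10):
    v = v - np.linalg.solve(hess(v), grad(v))   # Newton step
print(v)   # reaches (1, 1) in a handful of steps from this starting point
```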

