Adaptive Algorithmic Behavior for solving Mixed Integer Programs using Bandit Algorithms


  1. Adaptive Algorithmic Behavior for solving Mixed Integer Programs using Bandit Algorithms
     Gregor Hendel, Matthias Miltenberger, Jakob Witzig
     International Conference on Operations Research 2018, Sep 12, Brussels, Belgium

  2. Overview
     • Introduction
     • Adaptive Large Neighborhood Search
     • Adaptive LP Pricing
     • Adaptive Diving

  3. Introduction

  4. Mixed Integer Programs

     min  c^T x
     s.t. Ax ≥ b                                        (MIP)
          ℓ ≤ x ≤ u
          x ∈ {0,1}^{n_b} × Z^{n_i − n_b} × Q^{n − n_i}

     Solution method:
     • typically solved with branch-and-cut
     • at each node, an LP relaxation is (re-)solved with the dual Simplex algorithm
     • primal heuristics, e.g., Large Neighborhood Search and diving methods, support the solution process
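The branch-and-cut loop can be illustrated on a toy problem. The sketch below is not SCIP's implementation: it is a minimal LP-based branch-and-bound for a 0-1 knapsack, where the fractional-knapsack optimum plays the role of the LP relaxation bound (no cutting planes, presolving, or heuristics).

```python
# Minimal LP-based branch-and-bound sketch for a 0-1 knapsack
# (illustration only; real MIP solvers like SCIP add cutting planes,
# presolving, and primal heuristics on top of this loop).

def lp_bound(values, weights, cap, fixed):
    """Fractional-knapsack optimum = LP relaxation bound of this node.
    fixed maps item index -> 0/1 for already-branched variables."""
    base = sum(values[i] for i, v in fixed.items() if v == 1)
    cap -= sum(weights[i] for i, v in fixed.items() if v == 1)
    if cap < 0:
        return None, None                      # infeasible node
    free = sorted((i for i in range(len(values)) if i not in fixed),
                  key=lambda i: values[i] / weights[i], reverse=True)
    bound, frac_item = base, None
    for i in free:
        if weights[i] <= cap:
            cap -= weights[i]
            bound += values[i]
        else:
            bound += values[i] * cap / weights[i]
            frac_item = i                      # fractional in the LP solution
            break
    return bound, frac_item

def branch_and_bound(values, weights, cap):
    best, stack = 0, [{}]                      # stack of nodes = fixings
    while stack:
        fixed = stack.pop()
        bound, frac = lp_bound(values, weights, cap, fixed)
        if bound is None or bound <= best:
            continue                           # prune infeasible/dominated
        if frac is None:                       # LP solution integral: incumbent
            best = max(best, bound)
            continue
        for v in (0, 1):                       # branch on the fractional item
            stack.append({**fixed, frac: v})
    return best
```

Branching always on the single fractional variable of the LP solution mirrors the variable-dichotomy branching used in MIP solvers.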

  5. The Multi-Armed Bandit Problem
     • Discrete time steps t = 1, 2, ...
     • Finite set of actions H
     In each step:
     1. Choose h_t ∈ H
     2. Observe reward r(h_t, t) ∈ [0, 1]
     3. Goal: maximize Σ_t r(h_t, t)
     Two main scenarios:
     • stochastic: i.i.d. rewards for each action over time
     • adversarial: an opponent tries to maximize the player's regret
     Literature: [Bubeck and Cesa-Bianchi, 2012]
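The three-step protocol can be written down directly. Below is a minimal sketch of the stochastic scenario with made-up Bernoulli reward means and a uniform-random baseline policy; all names and numbers are illustrative, not from the talk.

```python
import random

# Stochastic bandit protocol: at each step t, choose an action h_t,
# observe a reward r(h_t, t) in [0, 1], and accumulate the total.
# The Bernoulli means below are made up for the demo.

def play(policy, means, steps, seed=0):
    rng = random.Random(seed)
    total, history = 0.0, []                 # history of (action, reward)
    for t in range(1, steps + 1):
        h = policy(t, len(means), history)           # 1. choose h_t in H
        r = 1.0 if rng.random() < means[h] else 0.0  # 2. observe r(h_t, t)
        history.append((h, r))
        total += r                                   # 3. maximize the sum
    return total

def uniform_policy(t, n_actions, history):
    """Baseline: ignore the history, pick uniformly at random."""
    return random.Random(1000 + t).randrange(n_actions)
```

Any of the bandit algorithms on the next slide can be plugged in as `policy`; they differ only in how they use `history`.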

  8. Bandit Algorithms

     Let T_h(t) = Σ_{t'≤t} 1_{h=h_{t'}} and r̄_h(t) = (1/T_h(t)) Σ_{t'≤t} 1_{h=h_{t'}} r_{h,t'}.

     ε-greedy
     Select a heuristic at random with probability ε_t = ε·√(|H|/t), otherwise use the best one so far.

     Upper Confidence Bound (UCB)
     h_t ∈ argmax_{h∈H} { r̄_h(t−1) + √(α·ln(1+t) / T_h(t−1)) }   if t > |H|,
     h_t = the t-th action of H                                   if t ≤ |H|.

     Exp.3
     p_{h,t} = (1−γ) · exp(w_{h,t}) / Σ_{h'} exp(w_{h',t}) + γ/|H|

     Individual parameters α, ε, γ ≥ 0 can be calibrated to the problem at hand.
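The three rules can be sketched as selection functions over per-action counts T_h and mean rewards r̄_h (Exp.3 over weights w_h). The default α = 0.0016 appears later in the calibration slide; the other defaults here are illustrative.

```python
import math
import random

def ucb_choose(t, T, rbar, alpha=0.0016):
    """UCB: try each action once, then maximize mean + confidence bonus."""
    n = len(T)
    if t <= n:
        return t - 1                        # initial round-robin phase
    return max(range(n), key=lambda h:
               rbar[h] + math.sqrt(alpha * math.log(1 + t) / T[h]))

def eps_greedy_choose(t, T, rbar, eps=0.5, rng=random):
    """eps-greedy: explore with a probability that decays over time."""
    n = len(T)
    if rng.random() < eps * math.sqrt(n / t):   # eps_t = eps * sqrt(|H|/t)
        return rng.randrange(n)
    return max(range(n), key=lambda h: rbar[h])

def exp3_probs(w, gamma=0.1):
    """Exp.3: mix a softmax over the weights with uniform exploration."""
    z = sum(math.exp(wh) for wh in w)
    n = len(w)
    return [(1 - gamma) * math.exp(wh) / z + gamma / n for wh in w]
```

Exp.3 returns a probability distribution to sample from, whereas UCB and ε-greedy return an action index directly; that difference is inherent to the algorithms, not an artifact of the sketch.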

  12. Adaptive Large Neighborhood Search

  13. LNS and the auxiliary MIP
      Large Neighborhood Search (LNS) heuristics solve auxiliary MIPs and can be distinguished by their respective neighborhoods.

      Auxiliary MIP
      Let P be a MIP with solution set F_P. For a polyhedron N ⊆ Q^n and objective coefficients c_aux ∈ Q^n, a MIP P_aux defined as
          min { c_aux^T x | x ∈ F_P ∩ N }
      is called an auxiliary MIP of P, and N is called a neighborhood.
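To make the neighborhood definition concrete, here is a hypothetical sketch of RINS-style variable fixing: integer variables on which the incumbent and the LP relaxation solution agree are fixed, and the fixings define the sub-box that serves as the polyhedron N of the auxiliary MIP. All data and names below are made up for illustration.

```python
# RINS-style neighborhood: fix integer variables where the incumbent and
# the LP relaxation solution agree; the resulting sub-box is the
# polyhedron N of the auxiliary MIP min { c_aux^T x | x in F_P ∩ N }.

def rins_fixings(incumbent, lp_solution, int_vars, tol=1e-6):
    """Return {var index: fixed value} describing the neighborhood N."""
    return {j: incumbent[j] for j in int_vars
            if abs(incumbent[j] - lp_solution[j]) <= tol}

# Toy data: four binary variables; 0 and 2 agree, 1 and 3 stay free.
incumbent   = {0: 1.0, 1: 0.0, 2: 1.0, 3: 0.0}
lp_solution = {0: 1.0, 1: 0.3, 2: 1.0, 3: 0.7}
fixed = rins_fixings(incumbent, lp_solution, int_vars=[0, 1, 2, 3])
```

The auxiliary MIP is then the original problem with these variables fixed, which is typically much easier to solve within a small node budget.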

  14. Famous LNS Heuristics
      • RENS [Berthold, 2014]
      • Relaxation Induced Neighborhood Search (RINS) [Danna et al., 2005]
      • DINS [Ghosh, 2007]
      • Local Branching [Fischetti and Lodi, 2003]
      • Crossover, Mutation [Rothberg, 2007]
      • Proximity [Fischetti and Monaci, 2014]
      • Zeroobjective [in SCIP, Gurobi, XPress, …]
      • Analytic Center Search [Berthold et al., 2017]
      • …

  15. Rewarding Neighborhoods

      Goal: a suitable reward function r_alns(h_t, t) ∈ [0, 1]

      Solution Reward
          r_sol(h_t, t) = 1 if x_old ≠ x_new, else 0

      Gap Reward
          r_gap(h_t, t) = (c^T x_old − c^T x_new) / (c^T x_old − c_dual)

      Failure Penalty
          r_fail(h_t, t) = 1 if x_old ≠ x_new, else 1 − φ(h_t, t)/n(h_t)

      The combined reward r_alns(h_t, t) is a convex combination of r_sol, r_gap, and r_fail with mixing weights η_1 and η_2 (plus optional scaling).

      Default settings in ALNS: η_1 = 0.8, η_2 = 0.5
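The three reward ingredients can be written out directly. Note that the exact nesting of the convex combination in `r_alns` below is an assumption made for the sketch; only the weights η_1 = 0.8, η_2 = 0.5 come from the slide.

```python
# Sketch of the three ALNS reward ingredients from the slide. The nesting
# of the convex combination in r_alns is an assumption.

def r_sol(x_old, x_new):
    """Solution reward: did the call produce a new incumbent?"""
    return 1.0 if x_old != x_new else 0.0

def r_gap(c_x_old, c_x_new, c_dual):
    """Gap reward: fraction of the gap to the dual bound that was closed."""
    return (c_x_old - c_x_new) / (c_x_old - c_dual)

def r_fail(x_old, x_new, nodes_used, node_budget):
    """Failure penalty: failed calls hurt less the cheaper they were."""
    if x_old != x_new:
        return 1.0
    return 1.0 - nodes_used / node_budget

def r_alns(sol, gap, fail, eta1=0.8, eta2=0.5):
    # assumed nesting: first mix sol and gap, then mix in the failure term
    return eta2 * (eta1 * sol + (1 - eta1) * gap) + (1 - eta2) * fail
```

Since each ingredient lies in [0, 1] and the weights form convex combinations, the combined reward stays in [0, 1] as required by the bandit algorithms.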

  20. Simulation for parameter calibration

      Test set: 666 instances from MIPLIB3, MIPLIB2003, MIPLIB2010, and Cor@l; 5h time limit.
      • Always execute all 8 neighborhoods with ALNS (disable old LNS heuristics)
      • Disable solution transfer
      • Record each reward
      • Fixing rates 0.1 – 0.9
      [Figure: number of instances vs. number of ALNS calls, one curve per fixing rate 0.1, 0.3, 0.5, 0.7, 0.9]

  21. UCB Calibration

      Simulate 100 repetitions of UCB, Exp.3, and ε-greedy on the recorded data.
      [Figure: UCB solution rate vs. fixing rate, one curve per α ∈ {0, 0.2, 0.4, 0.6, 0.8, 1} and α = 0.0016, plus the average]
      UCB rule: h_t ∈ argmax_{h∈H} { r̄_h(t−1) + √(α·ln(1+t)/T_h(t−1)) } if t > |H|; each action is tried once while t ≤ |H|.
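The calibration loop can be sketched as a replay-style simulation. The per-neighborhood success rates below are synthetic stand-ins for the recorded reward data, and `simulate_ucb` is an illustrative name, not the authors' code.

```python
import math
import random

# Sketch of the calibration experiment: repeatedly run UCB against
# per-neighborhood success rates and measure the average observed reward.
# The reward table passed in is synthetic, not the paper's recorded data.

def simulate_ucb(rewards, calls, alpha, seed):
    """rewards: {neighborhood: Bernoulli success rate}."""
    rng = random.Random(seed)
    arms = list(rewards)
    T = {h: 0 for h in arms}        # times each arm was chosen
    S = {h: 0.0 for h in arms}      # summed rewards per arm
    got = 0.0
    for t in range(1, calls + 1):
        if t <= len(arms):
            h = arms[t - 1]         # initial round-robin
        else:
            h = max(arms, key=lambda a: S[a] / T[a]
                    + math.sqrt(alpha * math.log(1 + t) / T[a]))
        r = 1.0 if rng.random() < rewards[h] else 0.0
        T[h] += 1; S[h] += r; got += r
    return got / calls

# 100 repetitions, as in the slide's simulation setup
avg = sum(simulate_ucb({"rins": 0.5, "crossover": 0.3, "mutation": 0.1},
                       calls=60, alpha=0.0016, seed=s)
          for s in range(100)) / 100
```

Sweeping `alpha` over a grid in such a loop reproduces the shape of the calibration experiment: small values exploit aggressively, large values keep exploring.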
