SLIDE 1

Parallel PIPS-SBB

Multi-level parallelism for 2-stage SMIPS

Lluís-Miquel Munguia, Geoffrey M. Oxberry, Deepak Rajan, Yuji Shinano
SLIDE 2

Our contribution

PIPS-PSBB*: Multi-level parallelism for Stochastic Mixed-Integer Programs
  • A fully-featured MIP solver for any generic 2-stage Stochastic MIP.
  • Two levels of nested parallelism (B&B and LP relaxations).
  • Full parallelization of every component of Branch & Bound.
  • Handles large problems via parallel distribution of the problem data.
  • Distributed-memory parallelization.
  • Novel fine-grained load-balancing strategies.
  • Two parallel solvers:
    – PIPS-PSBB
    – ug[PIPS-SBB,MPI]

*PIPS-PSBB: Parallel Interior Point Solver – Parallel Simple Branch and Bound
SLIDE 3

Introduction
  • MIPs are NP-hard problems: theoretically and computationally intractable.
  • LP-based Branch & Bound allows us to systematically search the solution space by
subdividing the problem.
  • Upper bounds (UB) are provided by the integer solutions found along the Branch & Bound exploration. Lower bounds (LB) are provided by the optimal values of the LP relaxations. The optimality gap is

    $$\mathrm{GAP}(\%) = \frac{UB - LB}{UB} \cdot 100$$
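As a concrete numerical check, the gap computation can be written as a tiny helper. This is our illustration, not PIPS-SBB code; the function name and the zero-denominator guard are our own choices:

    #include <cmath>
    #include <limits>

    // Illustrative helper (not PIPS-SBB code): the optimality gap as
    // defined above; the zero-denominator guard is our own choice.
    double gapPercent(double ub, double lb) {
        if (std::abs(ub) < 1e-12)   // gap undefined for a zero upper bound
            return std::numeric_limits<double>::infinity();
        return (ub - lb) / ub * 100.0;
    }
    // Example: gapPercent(105.0, 100.0) returns ~4.76, i.e. a 4.76% gap.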

SLIDE 4

Coarse-grained Parallel Branch and Bound

  • Branch and bound is straightforward to parallelize: the processing of subproblems is
independent.
  • Standard parallelization present in most state-of-the-art MIP solvers.
  • Processing of a node becomes the sequential computation bottleneck.
  • Coarse-grained parallelizations are a popular option, but they carry potential performance pitfalls due to the master-slave approach, and the LP relaxations are hard to parallelize.
SLIDE 5

Coarse-grained Parallel Branch and Bound
  • Branch and Bound exploration is coordinated by a special process or thread (a sketch follows this list).
  • Worker threads solve open subproblems using a base MIP solver.
  • Centralized communication poses serious challenges, causing performance bottlenecks and a reduction in parallel efficiency:
    – Communication stress at ramp-up and ramp-down.
    – Limited rebalancing capability: suboptimal distribution of work.
    – Diffusion of information is slow.
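For concreteness, here is a minimal MPI sketch of such a centralized coordination loop. It is our illustration under stated assumptions, not code from any of the solvers discussed; the fixed-size node encoding, the tags, and the convention that a worker reports the children of its last node in one flat buffer are all hypothetical:

    #include <mpi.h>

    #include <deque>
    #include <vector>

    // Minimal sketch of centralized (master-slave) B&B coordination.
    // Node encoding, tags, and message layout are illustrative assumptions.
    static const int NODE_LEN = 64;  // hypothetical per-node encoding length
    enum Tag { TAG_REPORT = 1, TAG_WORK = 2, TAG_STOP = 3 };

    void masterLoop(std::deque<std::vector<double>> open, int nWorkers) {
        std::deque<int> idle;  // ranks currently waiting for a node
        int busy = nWorkers;   // every worker starts by sending an empty report
        while (busy > 0 || !open.empty()) {
            std::vector<double> buf(2 * NODE_LEN);
            MPI_Status st;
            // A worker reports the children of its last node (possibly none).
            MPI_Recv(buf.data(), 2 * NODE_LEN, MPI_DOUBLE, MPI_ANY_SOURCE,
                     TAG_REPORT, MPI_COMM_WORLD, &st);
            --busy;
            int n;
            MPI_Get_count(&st, MPI_DOUBLE, &n);
            for (int i = 0; i + NODE_LEN <= n; i += NODE_LEN)  // enqueue children
                open.emplace_back(buf.begin() + i, buf.begin() + i + NODE_LEN);
            idle.push_back(st.MPI_SOURCE);
            while (!open.empty() && !idle.empty()) {  // hand out open subproblems
                MPI_Send(open.front().data(), NODE_LEN, MPI_DOUBLE, idle.front(),
                         TAG_WORK, MPI_COMM_WORLD);
                open.pop_front();
                idle.pop_front();
                ++busy;
            }
        }
        for (int w : idle)  // tree exhausted: retire all workers
            MPI_Send(nullptr, 0, MPI_DOUBLE, w, TAG_STOP, MPI_COMM_WORLD);
    }

Every node transfer passes through the single master, which is precisely the communication bottleneck and rebalancing limitation described in the bullets above.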
SLIDE 6

Currently available coarse-grained parallelizations
  • Coarse-grained parallelizations may scale poorly.
  • Extra work is performed when compared to the sequential case.
  • Information required to fathom nodes is discovered through the optimization.
  • Powerful heuristics are necessary to find good feasible solutions early in the search.

SLIDE 7

Branch and Bound as a graph problem
  • We can regard parallel Branch and Bound as a parallel graph exploration problem.
  • Given P processors, we define the frontier of the tree as the set of P subproblems currently open. The subset currently being processed in parallel are the active nodes.
  • We additionally define a redundant node as a subproblem that would be fathomable if the optimal solution were known.
  • The goal is to increase the efficiency of Parallel Branch and Bound by reducing the
number of redundant nodes explored.
SLIDE 8

Our approach to Parallel Branch and Bound
  • To reduce the number of redundant nodes explored, the search must fathom subproblems by maintaining high-quality primal incumbents and focusing on the most promising nodes.
  • We increase parallel efficiency by:
    – Generating a set of active nodes composed of the most promising nodes.
    – Employing the processors to explore the smallest possible number of active nodes.
  • Two degrees of parallelism:
    – Processing of nodes in parallel (parallel LP relaxation, parallel heuristics, parallel branching, …).
    – Branch and Bound in parallel.
SLIDE 9

Fine-grained Parallel Branch and Bound
  • The smallest transferable unit of work is a Branch and Bound node.
  • Because of the exchange of nodes, queues in processors become a collection of
subtrees.
  • This allows for great flexibility and fine-grained control of the parallel effort.
  • Coordination of the parallel optimization is decentralized with the objective of
maximizing load balance.
SLIDE 10

All-to-all parallel node exchange
  • Load balancing is maintained via
synchronous MPI collective communications.
  • The lower bounds of the K most promising nodes of every processor are exchanged and ranked (sketched below).
  • The top K out of the K · N nodes are selected and redistributed in a round-robin fashion.
  • Because of the synchronous nature of the approach, communication must be used strategically in order to avoid parallel overheads.
  • Node transfers are synchronous, while the statuses of each solver (upper/lower bounds, tree sizes, times, solutions, …) are exchanged asynchronously.

[Figure: worked example with K = 3, N = 3: each solver's top K node bounds are gathered on every solver, the K · N bounds are sorted, and the top nodes are redistributed round-robin among Solvers 0-2.]
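The ranking step might look like the following sketch, under stated assumptions: every solver contributes exactly K bounds, ties are ignored, and the subsequent point-to-point transfer of the selected nodes is omitted. All names are ours, not PIPS-PSBB's:

    #include <mpi.h>

    #include <algorithm>
    #include <vector>

    // Sketch of the synchronous bound-ranking step described above.
    struct Entry {
        double bound;  // lower bound of the node
        int owner;     // rank that currently holds it
        int slot;      // position in the owner's top-K list
    };

    std::vector<Entry> rankTopNodes(const std::vector<double>& myBestK, int K,
                                    MPI_Comm comm) {
        int N;
        MPI_Comm_size(comm, &N);
        std::vector<double> all(static_cast<size_t>(K) * N);
        // Synchronous collective: exchange the K best bounds of every solver.
        MPI_Allgather(myBestK.data(), K, MPI_DOUBLE, all.data(), K, MPI_DOUBLE,
                      comm);
        std::vector<Entry> entries;
        for (int r = 0; r < N; ++r)
            for (int k = 0; k < K; ++k)
                entries.push_back({all[r * K + k], r, k});
        // Every rank sorts the same K*N bounds, so all compute the same ranking.
        std::sort(entries.begin(), entries.end(),
                  [](const Entry& a, const Entry& b) { return a.bound < b.bound; });
        entries.resize(K);  // keep the most promising K nodes (per the slide text)
        // Deterministic round-robin: entry i is destined for solver i % N.
        return entries;
    }

Because every rank sorts the identical gathered list, senders and receivers agree on the round-robin destinations without any further coordination.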
SLIDE 11

Stochastic Mixed Integer Programming: an overview
  • Stochastic programming models optimization problems involving
uncertainty.
  • We consider two-stage stochastic mixed-integer programs (SMIPs) with recourse:
    – 1st stage: deterministic "now" decisions.
    – 2nd stage: depends on the random event & the first-stage decisions.
  • The cost function includes the deterministic variables & the expected value function of the non-deterministic parameters.
SLIDE 12

Stochastic MIPs and their deterministic equivalent
  • We consider deterministic equivalent formulations of 2-stage SMIPs under the sample average approximation.
  • This assumption yields a characteristic dual block-angular structure:
    $$\min_{x}\; c^{\top}x \quad \text{s.t.} \quad
    \begin{bmatrix}
    A      &     &     &        &     \\
    T_1    & W_1 &     &        &     \\
    T_2    &     & W_2 &        &     \\
    \vdots &     &     & \ddots &     \\
    T_N    &     &     &        & W_N
    \end{bmatrix}
    \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}
    =
    \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ \vdots \\ b_N \end{bmatrix}$$

The first block row (A) holds the common first-stage constraints; each following row [Tᵢ Wᵢ] corresponds to one independent realization scenario.
SLIDES 13-15

PIPS-SBB: Design philosophy and features

  • PIPS-SBB is a specialized solver for two-stage Stochastic Mixed-Integer Programs that uses Branch and Bound to achieve finite convergence to optimality.
  • It addresses each of the issues associated with Stochastic MIPs:
    – A Distributed Memory approach allows the second-stage scenario data to be partitioned among multiple compute nodes.
    – As the backbone LP solver, we use PIPS-S: a Distributed Memory parallel Simplex solver for Stochastic Linear Programs.
    – PIPS-PSBB has a structured software architecture that is easy to expand in terms of functionality and features.
SLIDE 16

Our approach to Parallel Branch and Bound
  • Two levels of parallelism require a layered organization of the MPI processors (see the sketch after this list).
  • In the Branch and Bound communicator, processors exchange:
    – Branch and Bound nodes.
    – Solutions.
    – Lower bound information.
    – Queue sizes and search status.
  • In the PIPS-S communicator, processors perform in parallel:
    – LP relaxations.
    – Primal heuristics.
    – Branching and candidate selection.
  • Strategies for ramp-up:
    – Parallel Strong Branching.
    – Standard Branch and Bound.
  • Strategy for ramp-down: intensify the frequency of node rebalancing.

[Figure: grid of MPI processes P(i,j); each row P(i,0) … P(i,n) forms PIPS-SBB Solver i, and each column j forms Branch and Bound Comm j.]
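A minimal sketch of how such a layered organization can be built with MPI_Comm_split; the row-major rank layout and the function name are assumptions for illustration, and PIPS-PSBB's actual setup may differ:

    #include <mpi.h>

    // Sketch of the two-level layout above: rows of the process grid form
    // PIPS-S communicators (one per PIPS-SBB solver, used for parallel LP
    // relaxations); columns form Branch and Bound communicators (used to
    // exchange nodes, bounds, and solutions). Row-major layout is assumed.
    void buildLayeredComms(int procsPerSolver,
                           MPI_Comm* lpComm, MPI_Comm* bnbComm) {
        int world;
        MPI_Comm_rank(MPI_COMM_WORLD, &world);
        int solverId = world / procsPerSolver;  // which PIPS-SBB solver we join
        int lane     = world % procsPerSolver;  // our position inside that solver
        // All ranks of one solver share an LP communicator (a "row").
        MPI_Comm_split(MPI_COMM_WORLD, solverId, lane, lpComm);
        // Ranks in the same position across solvers share a B&B communicator
        // (a "column"), matching the P(i,j) grid on the slide.
        MPI_Comm_split(MPI_COMM_WORLD, lane, solverId, bnbComm);
    }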
SLIDE 17

ug[PIPS-SBB,MPI]

[Figure: software stack. PIPS-S provides data parallelization and parallel LP relaxations; PIPS-SBB adds sequential Branch and Bound on top; PIPS-PSBB extends it with fine-grained parallel Branch and Bound under decentralized, lightweight control of the workload, while ug[PIPS-SBB,MPI] extends it with coarse-grained, black-box parallel Branch and Bound under centralized control of the workload.]
  • In addition to PIPS-PSBB, we also introduce ug[PIPS-SBB,MPI]: a coarse-grained external parallelization of PIPS-SBB.
  • UG is a generic framework used to parallelize Branch & Bound based MIP solvers:
    – It exploits the powerful performance of state-of-the-art base solvers, such as SCIP, Xpress, Gurobi, and CPLEX.
    – It uses the base solver as a black box.
  • UG has been widely applied to parallelize many MIP solvers:
    – Distributed memory via MPI: ug[SCIP,MPI], ug[Xpress,MPI], ug[CPLEX,MPI].
    – Shared memory via Pthreads: ug[SCIP,Pth], ug[Xpress,Pth].
SLIDE 18

ug[PIPS-SBB,MPI]
  • UG has been successfully used to solve some open MIP problems using more than 80,000 cores; it has certainly proven to be scalable.
  • ug[PIPS-SBB,MPI] was co-developed with Yuji Shinano.
  • It is the second MIP solver in the world (after PIPS-PSBB) to use two levels of nested parallelism.
SLIDE 19

Experimental performance results
  • We test our solver on SSLP instances from the SIPLIB library.
  • SSLP instances model server locations under uncertainty.
  • Instances are coded as SSLPm_n_s, where s represents the number of scenarios.
  • A larger number of scenarios means a bigger problem:
    – LP relaxations of all instances fit in memory, even in CPLEX.
    – PIPS-SBB can handle much larger LP relaxations.
  • Details: see http://www2.isye.gatech.edu/~sahmed/siplib/sslp/sslp.html
  • PIPS-SBB was run on the Cab cluster:
    – Each node: Intel Xeon E5-2670, 2.6 GHz, 2 CPUs × 8 cores/CPU (16 cores/node).
    – 2 GB RAM/core, 32 GB RAM/node.
    – InfiniBand QDR interconnect.
  • CPLEX 12.6.2 was used in some comparisons, in its vanilla (default) configuration.
SLIDE 20

Experimental performance results
  • We measure parallel performance in terms of speedup, communication overhead, and node inefficiency:
    – Speedup: the ratio of the time T₁ needed by a sequential baseline to the time Tₚ needed to reach optimality with p processors.
    – Communication overhead: the fraction of time spent on communication (T_comm) and processor synchronization (T_sync) with respect to the total execution time T_exec.
    – Node inefficiency: the fraction of redundant nodes explored (N_r) with respect to the total number of nodes explored (N_total).

    $$S_p = \frac{T_1}{T_p}, \qquad O_{\mathrm{comm}} = \frac{T_{\mathrm{comm}} + T_{\mathrm{sync}}}{T_{\mathrm{exec}}}, \qquad I_{\mathrm{node}} = \frac{N_r}{N_{\mathrm{total}}}$$
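Stated as code, the three metrics are simple ratios (an illustrative helper of ours, not part of the solvers):

    // Illustrative helpers (not solver code): the three metrics above.
    double speedup(double t1, double tp) { return t1 / tp; }

    double commOverhead(double tComm, double tSync, double tExec) {
        return (tComm + tSync) / tExec;  // multiply by 100 for a percentage
    }

    double nodeInefficiency(long redundant, long total) {
        return static_cast<double>(redundant) / total;  // redundant fraction
    }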
SLIDE 21

PIPS-PSBB and ug[PIPS-SBB,MPI]: Performance comparison

PIPS-PSBB:
  • Scales up to 200 cores (66x).
  • Total work performed remains
within a factor of 2x w.r.t. sequential.
  • Communication overhead
dominates after 400 cores.
  • Node inefficiency grows at a slower rate than in ug[PIPS-SBB,MPI].

ug[PIPS-SBB,MPI]:
  • Scales up to 200 cores (33x).
  • Total work varies by processor
configuration.
  • Higher communication overhead and higher node inefficiency.

[Figure: four panels (communication overhead (%), time-to-optimality scaling, total tree size, and node inefficiency (%)), each vs. number of processors (1-400), comparing PIPS-PSBB and ug[PIPS-SBB,MPI] when optimizing small instances: sslp_15_45_5 (5 scenarios, 3390 binary variables, 301 constraints).]
SLIDE 22

Tuning the communication frequency of PIPS-PSBB
  • PIPS-PSBB allows the frequency of synchronous communications to be tuned.
  • The frequency is defined by a pair (x, y), where x and y represent the minimum and maximum number of B&B iterations that must be processed before communication takes place (see the sketch below).
  • Tighter communication increases communication overhead but reduces the total work performed.
  • The opposite takes place under loose communication.
[Figure: communication overhead (%), time-to-optimality scaling, and total tree size vs. number of processors (1-400) for tight (10-500), standard (50-1000), and loose (100-50000) communication settings.]
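Read as code, the (x, y) rule amounts to a simple predicate evaluated at every B&B iteration. This is our interpretation of the slide; the imbalance-based trigger between x and y is an assumption, not PIPS-PSBB's actual criterion:

    // Sketch of the (x, y) communication-frequency rule as we read it:
    // a solver may synchronize once at least x B&B iterations have passed
    // since the last exchange, and must synchronize after y iterations.
    bool shouldCommunicate(int itersSinceSync, int x, int y, bool imbalanced) {
        if (itersSinceSync >= y) return true;                // forced sync
        if (itersSinceSync >= x && imbalanced) return true;  // early, if useful
        return false;                                        // keep searching
    }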
SLIDE 23

PIPS-PSBB solver performance exposed: sslp_10_50_500 (500 scenarios, 250,010 binary variables, 30,001 constraints)

[Figure: root-node LP relaxation time (s), total tree size, communication overhead (%), and ramp-up communication overhead (%), each vs. the configuration "number of solvers [processors per PIPS-S solver]", from 1[500] to 500[1].]

PIPS-S:
  • Speedup to 10 cores is 6x.
  • Performance increases up to 20 cores.

PIPS-PSBB:
  • Communication overhead is minimal, except at ramp-up, when the LP solver is slow.
SLIDE 24

PIPS-SBB: Comparison against CPLEX 12.6.2

Entries show the optimality GAP(%) when the 1-hour time limit was reached, or the solve time in parentheses; (M) marks runs that hit the memory limit.

Instance          Scenarios  Solvers × PIPS-S procs  PIPS-PSBB   ug[PIPS-SBB,MPI]  CPLEX SM (procs)   CPLEX DM (procs)
sslp_5_25_50             50  2 × 2                   (7.45s)     (8.03s)           (0.27s)    (4)     (0.27s)    (4)
sslp_5_25_100           100  2 × 2                   (22.37s)    (17.79s)          (0.64s)    (4)     (0.64s)    (4)
sslp_15_45_5              5  200 × 2                 (107.11s)   (163.53s)         (1.97s)    (16)    (6.26s)    (400)
sslp_15_45_10            10  200 × 2                 0.09%       0.16%             (1.81s)    (16)    (15.04s)   (400)
sslp_15_45_15            15  200 × 2                 0.25%       0.30%             (7.80s)    (16)    (15.75s)   (400)
sslp_10_50_50            50  200 × 10                0.13%       0.21%             (43.88s)   (16)    0.15% (M)  (2000)
sslp_10_50_100          100  200 × 10                0.17%       0.20%             (221.69s)  (16)    0.16% (M)  (2000)
sslp_10_50_500          500  200 × 10                0.24%       0.24%             4.91% (M)  (16)    1.25% (M)  (2000)
sslp_10_50_1000        1000  200 × 10                0.24%       0.24%             9.91%      (16)    6.08%      (2000)
sslp_10_50_2000        2000  200 × 10                0.26%       0.26%             19.93%     (16)    8.11%      (2000)

Time limit: 1 hour.
  • Distributed-memory parallelization of CPLEX is often inferior to its shared-memory counterpart.
  • Both CPLEX versions run into Memory limits for some problems.
  • The superior performance of CPLEX’s base solver helps in trivial and small problems.
  • PIPS-SBB-based solvers show superior performance for large problems.
SLIDE 25

Conclusions
  • We developed a lightweight, decentralized, distributed-memory Branch and Bound implementation for PIPS-SBB with two degrees of parallelism:
    – Processing of nodes in parallel (parallel LP relaxation, parallel heuristics, parallel branching, …).
    – Branch and Bound in parallel.
  • Better parallel efficiency is achieved by focusing the parallel resources on the most promising nodes.
  • We try to reduce communication bottlenecks and achieve high processor occupancy
via a decentralized control of the tree exploration and a lightweight mechanism for exchanging Branch and Bound nodes.
  • Performance is competitive with state-of-the-art commercial MIP solvers on large instances.
SLIDE 26

A natural progression in the parallelization of Branch & Bound
  • New parallel heuristics that leverage parallelism in order to increase the effectiveness, speed, and scalability of primal heuristics.
  • New parallel algorithms for a better distribution of work in the context of Branch & Bound.
The presented work contributes to the ultimate goal of improving the parallel efficiency of Branch & Bound.

[Figure: timeline from scalable massively-parallel heuristics toward work-efficient Parallel Branch & Bound.]

The code of PIPS-PSBB is available at: https://github.com/LLNL/PIPS-SBB
SLIDE 27

Thank You!