libraries for large scale quadratic programming M. ermk, V. Hapla, - - PowerPoint PPT Presentation

libraries for large scale quadratic programming
SMART_READER_LITE
LIVE PREVIEW

libraries for large scale quadratic programming M. ermk, V. Hapla, - - PowerPoint PPT Presentation

libraries for large scale quadratic programming M. ermk, V. Hapla, D. Hork, J. Kruk, L. Pospil*, R. Sojka, A. Vaatov + prof. Z. Dostl PETSc User Meeting 2016, Vienna, June 28-30, 2016 * Universit dellaSvizzera italiana,


slide-1
SLIDE 1

.it4i.cz

  • M. Čermák, V. Hapla, D. Horák, J. Kružík, L. Pospíšil*,
  • R. Sojka, A. Vašatová + prof. Z. Dostál

libraries for large scale quadratic programming

PETSc User Meeting 2016, Vienna, June 28-30, 2016

* Università dellaSvizzera italiana, Switzerland

slide-2
SLIDE 2

.it4i.cz

IT4Innovations

Prague Brno Ostrava Anselm - 94 TFLOPs system

  • perational since June 2013

Salomon – 2 PFLOPs system from June/July 2015 2013 2014 2015 Vienna Czechia

  • HPC infrastructure
  • 8 research programmes
slide-3
SLIDE 3

.it4i.cz

Salomon in numbers

  • 1008 compute nodes
  • 24,192 Intel Haswell cores

§ 2 x 12 per node

  • 129,024 GB RAM

§ 128GB per node, 5.3GB per core

  • 864 2nd generation MICs (KNC)

§ Intel Xeon Phi 7120P § 52,704 cores

  • Rpeak 2.01 Pflop/s
  • Rmax 1.46 Pflop/s (LINPACK)

Iargest European installation of MICs ⇒ interested in PETSc on MIC

slide-4
SLIDE 4

.it4i.cz

in a nutshell

Parallel, Efficient, Robust, Modular, Object-oriented, Numerical http://permon.it4i.cz/

  • PermonQP

§ generic QP solution framework § QP problems, transforms, solvers

  • PermonFLLOP

§ FETI DDM implementation § extends PermonQP

  • both

§ based on/extendingPETSc open-source frameworkfor sci. comp. § together solve large contact mechanics problems

  • started in 2011, ~30 000 lines of effective C code
slide-5
SLIDE 5

.it4i.cz

PETSc + PERMON

Linear solvers (KSP) Preconditioners (PC) Nonlinear solvers (SNES) Matrices (Mat) Vectors (Vec) Index Sets (IS) PermonQP solvers (QPS) PermonQP problems (QP) PermonFLLOP TAO PCBDDC PCPERMON?

PERMON = external PETSc extension much like SLEPc

slide-6
SLIDE 6

.it4i.cz

Build

  • easy
  • just build PETSc
  • set PETSC_DIR and PETSC_ARCH
  • set PERMON_DIR
  • make
slide-7
SLIDE 7

.it4i.cz

  • 0.5
  • 0.4
  • 0.3
  • 0.2
  • 0.1
0.1 0.2 0.3 0.4 0.5
  • 0.5
0.5
  • 2
  • 1.5
  • 1
  • 0.5
0.5 1 1.5 2 x 10
  • 5
  • 0.5
  • 0.4
  • 0.3
  • 0.2
  • 0.1
0.1 0.2 0.3 0.4 0.5
  • 0.5
0.5
  • 0.4
  • 0.3
  • 0.2
  • 0.1
0.1 0.2 0.3 0.4 0.5
  • 0.5
  • 0.4
  • 0.3
  • 0.2
  • 0.1

0.1 0.2 0.3 0.4 0.5

  • 0.5

0.5

  • 0.1

0.1 0.2 0.3 0.4 0.5 0.6

PermonFLLOP

  • implements FETI

non-overlappingDDM

  • dual formulation
  • unknown optimal Lagrange

multipliers found by iterative process

  • they enforcecontinuity

across subdomains

  • mix iterative and

direct solvers

slide-8
SLIDE 8

.it4i.cz

FETI: primal formulation

𝐋 𝐂% 𝐂 𝐏 𝐯 𝛍 = 𝐠 𝐩 § K K SPD block-diagonal stiffness matrix (blocks ~ subdomains) § R columns span kernel of K § f f load vector § B signed boolean gluing matrix (can enforce also Dirichlet BC - TFETI)

slide-9
SLIDE 9

.it4i.cz

FETI: QP formulations

§ 𝐋 𝐂% 𝐂 𝐏 𝐯 𝛍 = 𝐠 𝐩 (primal) § 𝐂𝐋0𝐂% −𝐂𝐒 −𝐒%𝐂% 𝐏 𝛍 𝛃 = 𝐂𝐋0𝐠 −𝐒%𝐠 (dual) § min

𝐯 7 8

⁄ 𝐯%𝐋𝐯 − 𝐯%𝐠 s.t. 𝐂𝐯 = 𝐩 (primal) 𝐆 = 𝐂𝐋0𝐂%, 𝐞 = 𝐂𝐋0𝐠, 𝐇 = −𝐒%𝐂%, 𝐟 = −𝐒%𝐠 𝐯 = 𝐋> 𝐠 − 𝐂%𝛍 + 𝐒𝛃 𝛃 = − 𝐇𝐇% @7𝐇(𝐆𝛍 − 𝐞) § min

𝛍 7 8

⁄ 𝛍%𝐆𝛍 − 𝛍%𝐞 s.t. 𝐇𝛍 = 𝐟 (dual)

slide-10
SLIDE 10

.it4i.cz

FETI: improving the dual formulation

1) min

𝛍 7 8

⁄ 𝛍%𝐆𝛍− 𝛍%𝐞 s.t. 𝐇𝛍 = 𝐟 𝛍 = 𝛍 F + 𝛍G 𝛍G = 𝐇% 𝐇𝐇% @7𝐟 ∈ Im𝐇% 𝐞̅ = 𝐞 − 𝐆𝛍G 𝛍 F ∈ Ker𝐇 unknown 2) min

𝛍 M 7 8

⁄ 𝛍 F%𝐆𝛍 F − 𝛍 F% 𝐞̅s.t. 𝐇𝛍 F = 𝐩 𝐐 = 𝐉 − 𝐇% 𝐇𝐇% @7𝐇, [ 𝐇𝐇% @7 is coarse problem] 3) min

𝛍 M 7 8

⁄ 𝛍 F%𝐐𝐆𝐐𝛍 F − 𝛍 F% 𝐐𝐞̅ ⟺ 𝐐𝐆𝐐𝛍 F = 𝐐 [solve with CG]

slide-11
SLIDE 11

.it4i.cz

FETI: inequality constraints

  • primal formulation

min

𝐯 7 8

⁄ 𝐯%𝐋𝐯 − 𝐯%𝐠 s.t. 𝐂Q𝐯 = 𝐩 and 𝐂R𝐯 ≤ 𝐝R

  • dual formulation

min

𝛍 M 7 8

⁄ 𝛍 F%𝐐𝐆𝐐𝛍 F − 𝛍 F% 𝐐𝐞̅ s.t. 𝐇𝛍 F = 𝐩 and 𝛍 F ≥ 𝒎 (is no more a linear system !)

  • use SMALXE + MPGP (discussed later)

to solve this dual formulation

(nonpenetration -mortar conditions)

slide-12
SLIDE 12

.it4i.cz

F

Total FETI (TFETI)

  • Dirichlet boundary conditions handled same way as gluing
  • all subdomain floating

§ all stiffness matrices singular § no need to detect it § same nullspaces in typical cases § uniform handling

  • additional rows of B matrix
  • effect of richer coarse space?
slide-13
SLIDE 13

.it4i.cz

PetscErrorCode QPFetiPrepare(QP qp) { … QPTDualize(qp); QPTHomogenizeEq(qp); QPTEnforceEqByProjector(qp); … } MPI_Comm comm; Mat K, BE, BI; Věc f, cE, cI, x; QP qp; QPCreate(comm, &qp); QPSetOperator(qp, K); QPSetRhs(qp, f); QPSetEq(qp, BE, cE); QPSetIneq(qp, BI, cI); QPFetiPrepare(qp); QPSCreate(comm, &qps); QPSSetQP(qps, qp); QPSSetFromOptions(qps); QPSSolve(qps); QPGetSolutionVector(qp,&x);

PermonFLLOP calls PermonQP

Sequence of QP transforms

slide-14
SLIDE 14

.it4i.cz

PermonQP

  • QP solution framework
  • separation of concerns
  • QP problems (QP)
  • QP transforms (QPT)

and corresponding backward reconstructions

  • QP solvers (QPS)
  • “gradually get rid of difficulties in the original formulation”
  • modularity/composability
  • automatic/manual choice of solver
  • applications: contact problems, data fitting, support vector

machines, ...

QP0

parent child parent child

QP1 QP2

reconstruct. function

injects

reconstruct. function

injects

solver user

slide-15
SLIDE 15

.it4i.cz

PermonQP and PETSc TAO

TAO

  • general objective function
  • only unconstrained and box constrained

PermonQP

  • specific objective function (convex quadratic), Mat, Vec
  • more like KSP, less like SNES
  • can be coupled with FETI DDM (PermonFLLOP)
  • more general constraints

§ linear equality § linear inequality § separable convex constraints – next slide

  • applicable to any subset of indices (IS)
  • simple bound / box constraints
  • quadratic constraints (spherical, elliptical)
  • conical constraints, …
slide-16
SLIDE 16

.it4i.cz

Generalizing box constraints - QPC

QPC index set QPCBOUND lower bound 𝐲 ≥ 𝒎 QPCELLIPTIC major axis minor axis QPCCONICAL angles QPCQUADRATIC radii QPCBOX lower bound upper bound 𝐦 ≤ 𝐲 ≤ 𝐯

  • separable convex constraints
  • applicable to any subset of indices (IS)
  • simple bound / box constraints
  • quadratic constraints (spherical, elliptical)
  • conical constraints, …

min 1 2 [ 𝐲%𝐁𝐲 − 𝐲% 𝐜 s.t. …

slide-17
SLIDE 17

.it4i.cz

PermonQP – print KKT satisfaction

===================== QPChainViewKKT output follows QP Object: 4 MPI processes type: QP #0 in chain, derived by || x|| = 4.75e+00 max( x) = 2.00e-06 = x(132) min( x) = -3.23e-01 = x(483) || b|| = 4.75e-02 max( b) = -4.17e-04 = b(10) min( b) = -2.50e-03 = b(423) ||cE|| = 0.00e-00 max(cE) = 0.00e-00 = cE(0) min(cE) = 0.00e-00 = cE(0) ||cI|| = 1.63e+00 max(cI) = 4.34e-01 = cI(0) min(cI) = 3.01e-01 = cI(9) r = ||A*x - b + BE'*lambda_E + BI'*lambda_I|| = 6.78e-06 rO/||b|| = 1.43e-04 r = ||BE*x-cE|| = 1.58e-05 r/||b|| = 3.32e-04 r = ||max(BI*x-cI,0)|| = 0.00e+00 r/||b|| = 0.00e+00 r = ||min(lambda_I,0)|| = 0.00e+00 r/||b|| = 0.00e+00 r = lambda_I'*(BI*x-cI) = -1.39e-07 r/||b|| = -2.92e-06

  • QP Object: 4 MPI processes

type: QP #1 in chain, derived by QPTScale || x|| = 4.75e+00 max( x) = 2.00e-06 = x(132) min( x) = -3.23e-01 = x(483) || b|| = 4.75e-02 max( b) = -4.17e-04 = b(10) min( b) = -2.50e-03 = b(423) ||cE|| = 0.00e-00 max(cE) = 0.00e-00 = cE(0) min(cE) = 0.00e-00 = cE(0) ||cI|| = 1.63e+00 max(cI) = 4.34e-01 = cI(0) min(cI) = 3.01e-01 = cI(9) r = ||A*x - b + BE'*lambda_E + BI'*lambda_I|| = 6.78e-06 rO/||b|| = 1.43e-04 r = ||BE*x-cE|| = 1.58e-05 r/||b|| = 3.32e-04 r = ||max(BI*x-cI,0)|| = 0.00e+00 r/||b|| = 0.00e+00 r = ||min(lambda_I,0)|| = 0.00e+00 r/||b|| = 0.00e+00 r = lambda_I'*(BI*x-cI) = -1.39e-07 r/||b|| = -2.92e-06

  • QP Object:(dual_) 4 MPI processes

type: QP #2 in chain, derived by QPTDualize || x|| = 4.47e-01 max( x) = 1.30e-01 = x(112) min( x) = -1.47e-01 = x(106) || b|| = 2.62e+00 max( b) = 3.90e-01 = b(21) min( b) = -5.61e-01 = b(133) ||cE|| = 5.00e-01 max(cE) = -2.50e-01 = cE(3) min(cE) = -2.50e-01 = cE(0) ||lb|| = inf max(lb) = 0.00e+00 = lb(30) min(lb) = -4.49e+307 = lb(0) r = ||A*x - b + BE'*lambda_E - lambda_lb|| = 5.31e-05 rO/||b|| = 2.03e-05 r = ||BE*x-cE|| = 6.78e-06 r/||b|| = 2.59e-06 r = ||min(x-lb,0)|| = 0.00e+00 r/||b|| = 0.00e+00 r = ||min(lambda_lb,0)|| = 1.67e-05 r/||b|| = 6.37e-06 r = |lambda_lb'*(lb-x)| = 2.47e-06 r/||b|| = 9.42e-07

  • QP Object:(dual_) 4 MPI processes

type: QP #2 in chain, derived by QPTDualize || x|| = 4.47e-01 max( x) = 1.30e-01 = x(112) min( x) = -1.47e-01 = x(106) || b|| = 2.62e+00 max( b) = 3.90e-01 = b(21) min( b) = -5.61e-01 = b(133) ||cE|| = 5.00e-01 max(cE) = -2.50e-01 = cE(3) min(cE) = -2.50e-01 = cE(0) ||lb|| = inf max(lb) = 0.00e+00 = lb(30) min(lb) = -4.49e+307 = lb(0) r = ||A*x - b + BE'*lambda_E - lambda_lb|| = 5.31e-05 rO/||b|| = 2.03e-05 r = ||BE*x-cE|| = 6.78e-06 r/||b|| = 2.59e-06 r = ||min(x-lb,0)|| = 0.00e+00 r/||b|| = 0.00e+00 r = ||min(lambda_lb,0)|| = 1.67e-05 r/||b|| = 6.37e-06 r = |lambda_lb'*(lb-x)| = 2.47e-06 r/||b|| = 9.42e-07

min 7 8 ⁄ 𝐲%𝐁𝐲 − 𝐲% 𝐜 s.t. 𝐂Q𝐲 = 𝐝Q and 𝐲 ≥ 𝒎 𝐁𝐲 − 𝐜 + 𝐂Q

_𝛍Q − 𝛍` = 𝐩

𝐂Q𝐲 = 𝒅Q 𝐲 ≥ 𝒎 𝛍` ≥ 𝐩 𝛍` 𝒎 − 𝐲 = 𝐩

  • qp_chain_view_kkt
slide-18
SLIDE 18

.it4i.cz

PermonQP – how to solve any QP?

  • Dualization

§ crucial QP transform § new QP with smaller dim, better conditioned and simpler constraints

  • SMALXE

§ "pass-through" solver taking care of equality constraints

  • moved to the Hessian matrix by means of penalty term

§

  • aux. problem with the rest of constraints solved by inner solver
  • concrete solvers for convex QP

§ unconstrained – PETSc KSP wrapper (QPSKSP) § bound/box constraints § MPGP, APGD, PBBf, SPG-QP, ... § PETSc TAO wrapper

min 7 8 ⁄ 𝐯%𝐋𝐯 − 𝐯%𝐠 s.t. 𝐂Q𝐯 = 𝐩 and 𝐂R𝐯 ≤ 𝐝R min 7 8 ⁄ 𝛍%𝐁𝛍 − 𝛍% 𝐜 s.t. 𝐃𝛍 = 𝐞 and 𝛍 ≥ 𝒎 min 7 8 ⁄ 𝛍%𝐁𝛍 − 𝛍% 𝐜 s.t. 𝐃𝛍 = 𝐞 and 𝛍 ≥ 𝒎 min 7 8 ⁄ 𝛍%𝐁𝛍 − 𝛍% 𝐜 s.t. 𝛍 ≥ 𝒎

slide-19
SLIDE 19

.it4i.cz

SMALXE for min

𝐂𝐲 c 𝐩

d e 𝐲_𝐁𝐲 − 𝐲_𝐜 (SMALE)

  • Mainstream: solve KKT system 𝐁

𝐂_ 𝐂 𝐏 𝐲 𝛎 = 𝐜 𝐩

§ using direct/iterative linear system solver § 𝛎 is equality constraint Lagrange multiplier § inequality constrained → slacks, nonlin. eq. → Interior Point Method

  • SMALXE: solve directly QP min d

e 𝐲_𝐁𝐲 − 𝐲_𝐜 s. t. 𝐂𝐲 = 𝐩

§ using Augmented Lagrangians and inner iterativesolver § easier to extend for inequality constraints § equality and inequality constraints handled completely separately,

  • ne after another, in a specialized manner

§ matrix structure agnostic - both A and B may be even implicit

slide-20
SLIDE 20

.it4i.cz

𝑀 𝐲, 𝛎 = d

e 𝐲_𝐁𝐲 − 𝐲_𝐜 + 𝛎_𝐂𝐲

(Lagrangian) 𝐡 𝐲, 𝛎 = 𝐁𝐲 − 𝐜 + 𝐂_𝛎 (Lagrangian gradient)

Lagrangian

slide-21
SLIDE 21

.it4i.cz

𝑀 j 𝐲, 𝛎, 𝜍 = d

e 𝐲_𝐁𝐲 − 𝐲_𝐜 + 𝛎_𝐂𝐲 + d e 𝜍 𝐂𝐲 8

(augmented Lagrangian) 𝐡 l 𝐲, 𝛎,𝜍 = 𝐁𝐲 − 𝐜 + 𝐂_𝛎 + 𝜍𝐂_𝐂𝐲 (augmented Lagrangian gradient) Note1: 𝑀 j is Lagrangian of penalized problem min

𝐂𝐲 c 𝐩

d e 𝐲_(𝐁 + 𝜍𝐂_𝐂)𝐲 − 𝐲_𝐜

(QPP) 𝑀 j 𝐲, 𝛎, 𝜍 = d

e 𝐲_(𝐁 + 𝜍𝐂_𝐂)𝐲 − 𝐲_(𝐜 + 𝐂𝛎)

Note2: Solution 𝐲 m of unconstrained penalized problem min d

e 𝐲_(𝐁 + 𝜍𝐂_𝐂)𝐲 − 𝐲_𝐜

(QPPU) approximates that of original QP: 𝐡 l ≤ 𝜁 𝐜 ⟹ 𝐂𝐲 ≤

70p qrst 𝐁 u

  • Augmented Lagrangian

Dostál, Z.: Optimal Quadratic Programming Algorithms, Springer Optimization and Its Applications, 2009, ISBN 978-0387848051

slide-22
SLIDE 22

.it4i.cz

initialize: 𝐲w 𝐩 , 𝜍w > 0 2 𝐁 𝛎w = 𝐩, 𝑙 = 0 while 𝐡 l 𝐲{, 𝛎{,𝜍{ > 𝜁 𝐜 ∨ 𝐂𝐲{ > 𝜁 𝐜 𝛎{07 = 𝛎{ + 𝜍{𝐂𝐲{ 𝐲{07 = arg min

𝐲

𝑀 j 𝐲, 𝛎{07,𝜍{ starting with 𝐲{ 𝑙 = 𝑙 + 1 end

Exact aug. Lag. method for min

𝐂𝐲 c 𝐩

d e 𝐲_𝐁𝐲 − 𝐲_𝐜

(Method of multipliers)

𝐲{07 = arg min

𝐲 7 8 𝐲_(𝐁 + 𝜍{𝐂_𝐂)𝐲 − 𝐲_(𝐜 + 𝐂𝛎{07)

  • inner QP with penalized Hessian and updated RHS
  • unconstrained QP ~ solving linear system
  • solve with CG

Dostál, Z.: Optimal Quadratic Programming Algorithms, Springer Optimization and Its Applications, 2009, ISBN 978-0387848051

slide-23
SLIDE 23

.it4i.cz

initialize: 𝐲w 𝐩 , 𝛾 > 0 10 , 𝑁w > 0 100 𝐁

, 𝜍w > 0 2 𝐁 ,𝜃 > 0 0.1 𝐜

𝛎w = 𝐩, 𝑙 = 0 while 𝐡 l 𝐲{, 𝛎{,𝜍{ > 𝜁 𝐜 ∨ 𝐂𝐲{ > 𝜁 𝐜 𝛎{07 = 𝛎{ + 𝜍{𝐂𝐲{ find 𝐲{07 such that 𝐡 l 𝐲{07, 𝛎{07,𝜍{ ≤ min 𝑁{ 𝐂𝐲{07 ,𝜃 if 𝑀 j 𝐲{07,𝛎{07, 𝜍{ > 𝑀 j 𝐲{,𝛎{, 𝜍{@7 + d

e 𝜍{ 𝐂𝐲{07 8

𝜍{07 = 𝛾𝜍{ or 𝑁{07 = 𝑁{/𝛾 end 𝑙 = 𝑙 + 1 end

SMALXE for min

𝐂𝐲 c 𝐩

d e 𝐲_𝐁𝐲 − 𝐲_𝐜 (SMALE)

again solving min

𝐲

d e 𝐲_(𝐁 + 𝜍{𝐂_𝐂)𝐲 − 𝐲_(𝐜 + 𝐂𝛎{07)

BUT with early termination

  • inner QP with penalized Hessian and updated RHS
  • unconstrained QP ~ solving linear system
  • solve with e.g. CG

SMALE or SMALE-M

slide-24
SLIDE 24

.it4i.cz

initialize: 𝐲w 𝐩 , 𝛾 > 0 10 , 𝑁w > 0 100 𝐁

, 𝜍w > 0 2 𝐁 ,𝜃 > 0 0.1 𝐜

𝛎w = 𝐩, 𝑙 = 0 while 𝐡 lƒ 𝐲{,𝛎{, 𝜍{ > 𝜁 𝐜 ∨ 𝐂𝐲{ > 𝜁 𝐜 𝛎{07 = 𝛎{ + 𝜍{𝐂𝐲{ find 𝐲{07 ≥ 𝒎 such that 𝐡 lƒ 𝐲{07,𝛎{07, 𝜍{ ≤ min 𝑁{ 𝐂𝐲{07 ,𝜃 if 𝑀 j 𝐲{07,𝛎{07, 𝜍{ > 𝑀 j 𝐲{,𝛎{, 𝜍{@7 + d

e 𝜍{ 𝐂𝐲{07 8

𝜍{07 = 𝛾𝜍{ or 𝑁{07 = 𝑁{/𝛾 end 𝑙 = 𝑙 + 1 end

SMALXE for min

𝐂𝐲 c 𝐩 𝐲 „ 𝒎

d e 𝐲_𝐁𝐲 − 𝐲_𝐜 (SMALBE)

𝐡ƒ … projected gradient – see next slide now solving min

𝐲 „ 𝒎

d e 𝐲_(𝐁 + 𝜍{𝐂_𝐂)𝐲 − 𝐲_(𝐜 + 𝐂𝛎{07)

but using the same 𝑀 j corresponding to min

𝐂𝐲 c 𝐩

d e 𝐲_𝐁𝐲 − 𝐲_𝐜

SMALBE or SMALBE-M

Dostál, Z.: Optimal Quadratic Programming Algorithms, Springer Optimization and Its Applications, 2009, ISBN 978-0387848051

slide-25
SLIDE 25

.it4i.cz

Projected gradient

  • 𝐡

lƒ = 𝑕†

ƒ , 𝐡

l = 𝑕† , 𝐲 = 𝑦† , 𝒎 = 𝑚†

  • M … set of indices of inequality constrained entries of x

𝑕†

ƒ = ‹

𝑕† min(𝑕†,0) for for 𝑦† > 𝑚† ∨ 𝑗 ∉ 𝑁 𝑦† = 𝑚† ∧ 𝑗 ∈ 𝑁

Dostál, Z.: Optimal Quadratic Programming Algorithms, Springer Optimization and Its Applications, 2009, ISBN 978-0387848051

slide-26
SLIDE 26

.it4i.cz

MPGP (Modified Proportioning with Gradient Projection)

Dostál Z., Pospíšil L.: Minimization of the quadratic function with semidefinite Hessian subject to the bound constraints, Computers and Mathematics with Applications, (2015).

CG-step CG-halfstep + expansion step

  • expand active set
  • ptimal step on active set

(proportioning step)

  • reduce active set
slide-27
SLIDE 27

.it4i.cz

MPGP (Modified Proportioning with Gradient Projection)

Dostál Z., Pospíšil L.: Minimization of the quadratic function with semidefinite Hessian subject to the bound constraints, Computers and Mathematics with Applications, (2015).

slide-28
SLIDE 28

.it4i.cz

SMALXE+MPGP iteration progress

Inner stopping criterion: 𝐡 l ƒ ≤ min 𝑁 𝐂𝐲{ ,𝜃 ∧ 𝐡 l ƒ < ε Start to increase penalty 𝜍 by factor of 2 each outer iteration once 𝐡 l ƒ < ε is reached

slide-29
SLIDE 29

.it4i.cz

Experiments – linear elasticity

10 125 000 20 250 000 40 500 000 81 000 000 162 000 000 324 000 000 20 40 60 80 100 120 140 160 180 10 20 30 40 50 60 70 80 90 100 216 432 864 1728 3456 6912

# DOFs (log2 scale) solution time [sec] # Hessian mult # subdomains (log2 scale) # Hessian mult time

PermonCube + PermonFLLOP + PermonQP Machine: Salomon, IT4Innovations, CZ 8000 375,000,000

slide-30
SLIDE 30

.it4i.cz

10 125 000 20 250 000 40 500 000 81 000 000 162 000 000 324 000 000 20 40 60 80 100 120 140 160 180 50 100 150 200 250 300 216 432 864 1728 3456 6912

# DOFs (log2 scale) solution time [sec] # Hessian mult # subdomains (log2 scale) # Hessian mult time

Experiments – contact elasticity

x y z d

PermonCube + PermonFLLOP + PermonQP Machine: Salomon, IT4Innovations, CZ 8000 375,000,000

slide-31
SLIDE 31

.it4i.cz

52,728,000 105,456,000 210,912,000 421,824,000 20 40 60 80 100 120 140 160 180 50 100 150 200 250 300 1000 2000 4000 8000

# DOFs (log2 scale) solution time [sec] # Hessian mult # subdomains (log2 scale) # Hessian mult time

Very recent improvement

52,728,000 105,456,000 210,912,000 421,824,000 20 40 60 80 100 120 140 160 180 50 100 150 200 250 300 1000 2000 4000 8000

# DOFs (log2 scale) solution time [sec] # Hessian mult # subdomains (log2 scale) # Hessian mult time

adaptive length

  • f expansion step

(computed yesterday)

slide-32
SLIDE 32

.it4i.cz

Coercive and semicoercive variational inequalities

(older PERMON version)

slide-33
SLIDE 33

.it4i.cz

Experiments show

§ FETI DDM numerical scalability

  • condition number is O(n) for constant local problems size n

§ parallel scalability of PermonFLLOP’s FETI implementation § SMALXE numerical scalability

  • number of iterations given by Hessian spectrum

§ PermonQP infrastructure does not slow it down

slide-34
SLIDE 34

.it4i.cz

New feature: >1 subdomains per rank

  • 216 MPI processes/cores
  • cost of factorization and one iteration
  • MUMPS
slide-35
SLIDE 35

.it4i.cz

New feature: >1 subdomains per rank

  • 216 MPI processes/cores
  • time of setup phase and solve phase and total time
  • effect of improving condition number
slide-36
SLIDE 36

.it4i.cz

Bleeding edge

  • optimized assembly and storage of mortar B matrices

§ support for non-conforming submeshes – hanging nodes at interfaces

  • stiffness matrix kernel R black-box stochastic generation
  • other applications of QP – do you use some?

§ support vector machines, data fitting, SQP, Proximal Bundle Method,…

  • condensation of local problems

§ improve arithmetical density, accelerators

  • real-world problems using existing FEM software

§ libMesh, FEniCS, Sifel, Elmer, ViennaMesh/Grid tools

  • other QP algorithms – IP, semismooth Newton
  • gitpublishunder FreeBSD (2-clause BSD) license this year

§ merge separable convex constraints (QPC) § finish splitting into PermonQPand PermonFLLOP § basic documentation

slide-37
SLIDE 37

.it4i.cz

Acknowledgements

The EXA2CT European Project (EXascale Algorithms and Advanced Computational Techniques) Účinné metody odhadu životnosti pro

  • becné víceosé namáhání

(Efficient methods for life prediction for general multiaxial fatigue) GA15-18274S National Programme of Sustainability (NPU II) project “IT4Innovations excellence in science” LQ1602 IPCC-IT4I

slide-38
SLIDE 38

.it4i.cz

Additional acknowledgement

“Coffee-driven research” 1st parallel machine @ IT4I *

* It is able to prepare 2 espressos in parallel.

slide-39
SLIDE 39

.it4i.cz

PERMON team @ IT4I is your PETSc partner in Czechia J Thank you for your attention! Do you have any questions? See you also at our poster.