Multiprocessing in Athena

SLIDE 1

Multiprocessing in Athena

I. Performance study of Athena event- and job-level parallelism on multi-core systems.

II. Performance optimizations in AthenaMP.

SLIDE 2

AthenaMJ (Athena multi-jobs) - job-level parallelism

for i in range(4): $> Athena.py -c "EvtMax=25; SkipEvents=$i*25" Jobo.py

core-0: JOB 0: Events [0, 1, …, 24]
core-1: JOB 1: Events [25, …, 49]
core-2: JOB 2: Events [50, …, 74]
core-3: JOB 3: Events [75, …, 99]

LBL-ATLAS-Computing, 2010

[Timeline diagram: PARALLEL - four independent jobs, each with its own init, start, and end.]
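The launch loop above amounts to partitioning the event range into contiguous per-job slices via EvtMax and SkipEvents. A minimal sketch of that arithmetic (mj_slices is a hypothetical helper, not part of Athena):

```python
# Sketch: how AthenaMJ's EvtMax/SkipEvents settings slice 100 events into
# contiguous per-job ranges, one per core. mj_slices is a hypothetical helper.
def mj_slices(n_events: int, n_jobs: int) -> list:
    per_job = n_events // n_jobs                          # EvtMax for each job
    return [list(range(i * per_job, (i + 1) * per_job))   # SkipEvents = i * per_job
            for i in range(n_jobs)]

slices = mj_slices(100, 4)
# JOB 0 gets events 0..24, JOB 1 gets 25..49, ..., JOB 3 gets 75..99
```

Each job is a fully independent Athena process, so the only coordination needed is choosing non-overlapping slices up front.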

SLIDE 3

AthenaMP - event-level parallelism

$> Athena.py --nprocs=4 -c EvtMax=100 Jobo.py

[Diagram: the parent reads the input files, OS-forks the workers, and merges their outputs into the output files.]

core-0: WORKER 0: Events [0, 4, 8, …, 96]
core-1: WORKER 1: Events [1, 5, 9, …, 97]
core-2: WORKER 2: Events [2, 6, 10, …, 98]
core-3: WORKER 3: Events [3, 7, 11, …, 99]

Each worker writes its events to temporary output files, which the parent merges at the end.

Maximize the shared memory!

SERIAL: parent init and fork → PARALLEL: workers run the event loop → SERIAL: parent merge and finalize.

AthenaMP Status by S. Binet - http://indico.cern.ch/getFile.py/access?contribId=2&resId=0&materialId=slides&confId=92059
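The fork-then-round-robin scheme above can be sketched with Python's multiprocessing module (which forks on Linux, as AthenaMP does). This is a toy model, not AthenaMP itself: worker k simply claims events k, k+nprocs, k+2*nprocs, and so on.

```python
import multiprocessing as mp

# Toy model of AthenaMP's round-robin event assignment (not the real code):
# after the fork, worker k claims events k, k+nprocs, k+2*nprocs, ...
def worker(rank, nprocs, evt_max, results):
    events = list(range(rank, evt_max, nprocs))   # this worker's event numbers
    results.put((rank, events))                   # report back to the parent

def run(nprocs=4, evt_max=100):
    results = mp.Queue()
    procs = [mp.Process(target=worker, args=(k, nprocs, evt_max, results))
             for k in range(nprocs)]
    for p in procs:
        p.start()                                 # "OS-fork" into workers
    out = dict(results.get() for _ in procs)      # collect one report per worker
    for p in procs:
        p.join()
    return out
```

With nprocs=4 and evt_max=100 this reproduces the assignment in the diagram: worker 0 gets [0, 4, 8, …, 96] and worker 3 gets [3, 7, 11, …, 99].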

SLIDE 4

Memory footprint of AthenaMP & AthenaMJ

AthenaMP: ~0.5 GB of physical memory saved per process.
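The saving comes from forking after initialization: each worker inherits the parent's pages copy-on-write, so memory is only duplicated for the pages a worker actually writes to. A toy demonstration, with a Python list standing in for Athena's large post-init state:

```python
import multiprocessing as mp

geometry = []          # stand-in for Athena's large post-init state

def init_parent():
    global geometry
    geometry = list(range(1_000_000))   # pretend: the ~0.5 GB of shared state

def read_last(q):
    # the forked child reads the parent's pages without copying them (COW);
    # only pages a worker *writes* to get physically duplicated
    q.put(geometry[-1])

def demo():
    init_parent()
    ctx = mp.get_context("fork")        # the sharing relies on fork, not spawn
    q = ctx.Queue()
    p = ctx.Process(target=read_last, args=(q,))
    p.start()
    last = q.get()
    p.join()
    return last
```

The child never re-runs init_parent(), yet sees the initialized data: that is exactly why AthenaMP workers are cheaper than independent AthenaMJ jobs, each of which must hold its own full copy.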

SLIDE 5

Event throughput of AthenaMP and AthenaMJ

[Plot: AthenaMP vs AthenaMJ event throughput; annotation: "Hit the memory limit, swapping".]

SLIDE 6

1. External optimizations (no touching complex Athena code):
  • Hardware optimizations: Hyper-Threading, QPI, NUMA, affinity
  • OS optimizations: affinity, numactl, IO-related settings, disks, virtual machines, etc.
  • Compiler, malloc, etc.

2. Gains from AthenaMP/Athena design improvements:
  • Shared memory, forking later (after init)
  • Queue event distribution

Endless ground for improvements :)

SLIDE 7

Architecture upgrades

Intel pre-Nehalem (most LXPLUS machines: Voatlas91, lxplus250, lxplus251):
  • CPU-memory symmetric access

Intel Nehalem (coors.lbl.gov, rainier.lbl.gov):
  • Hyper-Threading: two logical cores on one physical core
  • QPI: Quick Path interconnect from CPU to CPU and from CPU to memory
  • Turbo Boost: dynamic change of CPU frequency
  • CPU-memory non-symmetric access (NUMA)
SLIDE 8

Event throughput per process for RDO-to-ESD reconstruction on different machines

SLIDE 9

Gain from Hyper-Threading

[Plot: Hyper-Threading gain for AthenaMP and AthenaMJ.]

SLIDE 10

Setting affinity of workers to CPU cores

Affinity: pinning each process to a separate CPU core. Floating: each process is scheduled by the OS; core switching is frequent.
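On Linux, this pinning can be done from inside a worker with os.sched_setaffinity (the same effect as running taskset or numactl externally). A minimal sketch:

```python
import os

# Sketch: pin the calling process to one CPU core using the Linux-only
# os.sched_setaffinity call (pid 0 means "this process"); this is what
# "pinning worker k to core-k" amounts to, vs. letting the OS float it.
def pin_to_core(core: int) -> None:
    os.sched_setaffinity(0, {core})

pin_to_core(0)                       # this process now runs only on core 0
allowed = os.sched_getaffinity(0)    # -> {0}
```

In the AthenaMP setting, the parent would call a function like this in each worker right after the fork, with a different core number per worker.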

SLIDE 11

Event workers throughput

[Plot: workers floating vs. workers pinned to CPU cores.]

SLIDE 12

Recent progress: event distribution using a Queue

core-0: WORKER 0: Events [0, 4, 5, …]
core-1: WORKER 1: Events [1, 6, 9, …]
core-2: WORKER 2: Events [2, 8, 10, …]
core-3: WORKER 3: Events [3, 7, 11, …]


events = multiprocessing.Queue(EvtMax + ncpus)
# queue contents: [0, 1, 2, 3, 4, …, 99, None, None, None, None]
# (one None sentinel per worker)
evt = events.get()
while evt is not None:
    evt_loop_mgr.seek(evt)
    evt_loop_mgr.nextEvent()
    evt = events.get()

Balance the arrival times of the workers! A slower worker doesn't get left behind, at the cost of losing the event order.
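A runnable version of the sketch above, using Python's multiprocessing (which AthenaMP builds on): the parent enqueues all event numbers plus one None sentinel per worker, and each worker pulls events until it sees a sentinel, so fast workers naturally take more events.

```python
import multiprocessing as mp

def queue_worker(rank, events, done):
    processed = []
    while True:
        evt = events.get()
        if evt is None:            # sentinel: no more events for this worker
            break
        processed.append(evt)      # stand-in for seek(evt) + nextEvent()
    done.put((rank, processed))

def run_queue(evt_max=100, ncpus=4):
    events, done = mp.Queue(), mp.Queue()
    for evt in range(evt_max):
        events.put(evt)
    for _ in range(ncpus):
        events.put(None)           # one sentinel per worker
    procs = [mp.Process(target=queue_worker, args=(k, events, done))
             for k in range(ncpus)]
    for p in procs:
        p.start()
    out = dict(done.get() for _ in procs)
    for p in procs:
        p.join()
    return out
```

Unlike the fixed round-robin split, the queue hands the next event to whichever worker asks first, which balances arrival times at the cost of a deterministic event order.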

SLIDE 13

Workers throughput for the Queue

[Plot: round-robin event distribution vs. Queue event distribution.]

SLIDE 14

  • AthenaMP shares memory: ~0.5 GB of real memory footprint saved per worker.
  • The Queue balances the workers' arrival times, thus improving MP scaling.
  • Hyper-Threading can give a 25-30% gain in event throughput.
  • Affinity settings exploit CPUs better than the Linux CPU scheduler.
  • NUMA effects take place on Nehalem CPUs.

SLIDE 15

1. Externally available performance gains (without touching the Athena code):
  • Architectural gains: Hyper-Threading, QPI, NUMA, etc.
  • OS gains: affinity, numactl, IO-related settings, disks, virtual machines, etc.
  • Compiler, malloc, etc.

2. Gains from Athena/AthenaMP design improvements:
  • Faster initialization…
  • Faster distribution of events to workers…
  • Faster merging: events processed by the workers are merged on the fly by one writer, without waiting for the workers to finish…
  • Faster finalization…

Endless ground for improvements :)

SLIDE 16

  • Paolo Calafiura, Sebastien Binet, Yushu Yao, Charles Leggett, Wim Lavrijsen
  • Keith Jackson, David Levinthal
  • Ian Hinchliffe and the LBL ATLAS Group
  • LBNL and DOE for funding
  • CERN for research