  1. Fast Parallel Event Reconstruction
     Ivan Kisel, GSI, Darmstadt
     CERN, 06 July 2010

  2. Tracking Challenge in CBM (FAIR/GSI, Germany)
     • Fixed-target heavy-ion experiment
     • 10^7 Au+Au collisions/s
     • 1000 charged particles/collision
     • Non-homogeneous magnetic field
     • Double-sided strip detectors (85% combinatorial space points)
     Track reconstruction in STS/MVD and displaced-vertex search are required at the first trigger level.
     Reconstruction packages:
     • track finding: Cellular Automaton (CA)
     • track fitting: Kalman Filter (KF)
     • vertexing: KF Particle

  3. Many-Core HPC: Cores, Threads and SIMD
     HEP: cope with high data rates!
     • Cores and threads realize the task level of parallelism: a process runs many threads, each executing and reading/writing independently on its core.
     • Vectors (SIMD) realize the data level of parallelism: one instruction operates on several data elements at once (SIMD = Single Instruction, Multiple Data), so performance scales with the SIMD width.
     [Figure: CPU evolution from scalar single-core machines (2000) through multi-threaded multi-core CPUs (2010) to wide vector many-core processors (2015).]
     A fundamental redesign of traditional approaches to data processing is necessary; exploiting SIMD is mandatory.

  4. Our Experience with Many-Core CPU/GPU Architectures
     • Intel/AMD CPU (2x4 cores, since 2005): 6.5 ms/event (CBM)
     • NVIDIA GPU (512 cores, since 2008): 63% of the maximal GPU utilization (ALICE)
     • IBM Cell (1+8 cores, since 2006): 70% of the maximal Cell performance (CBM)
     • Intel MIC (32 cores, since 2008): cooperation with Intel (ALICE/CBM)
     Future systems are heterogeneous.

  5. CPU/GPU Programming Frameworks
     • Intel Ct (C for Throughput)
       - Extension to the C language
       - Intel CPU/GPU specific
       - SIMD exploitation for automatic parallelism
     • NVIDIA CUDA (Compute Unified Device Architecture)
       - Defines the hardware platform
       - Generic programming
       - Extension to the C language
       - Explicit memory management
       - Programming on the thread level
     • OpenCL (Open Computing Language)
       - Open standard for generic programming
       - Extension to the C language
       - Supposed to work on any hardware
       - Specific hardware capabilities exposed via extensions
     • Vector classes (Vc)
       - Overload C operators with SIMD/SIMT instructions
       - Uniform approach to all CPU/GPU families
       - Developed at Uni-Frankfurt/FIAS/GSI
     Vector classes: cooperation with the Intel Ct group

  6. Vector Classes (Vc)
     Vector classes overload scalar C operators with SIMD/SIMT extensions:
     • Scalar: c = a + b
     • SIMD: vc = _mm_add_ps(va, vb)
     The vector classes:
     • provide full functionality for all platforms
     • support conditional operators, e.g. phi(phi < 0) += 360;
     • increase the speed by a factor of 4x (SSE2-SSE4), 8x (future CPUs), 16x (MIC/Larrabee); NVIDIA Fermi support is under research
     Vector classes enable easy vectorization of complex algorithms.

  7. Kalman Filter Track Fit on Cell
     The KF speed was increased by 5 orders of magnitude: about 10000x faster on each CPU compared with the initial version on an Intel P4 (Comp. Phys. Comm. 178 (2008) 374-383).
     Hardware: blade11bc4 @ IBM, Böblingen: 2 Cell Broadband Engines with 256 kB Local Store at 2.4 GHz.
     Motivated by, but not restricted to, Cell!

  8. Performance of the KF Track Fit on CPU/GPU Systems
     [Figures: scalability of the time per track on different CPU architectures (2xCell with 16 SPEs, Woodcrest with 2 cores, Clovertown with 4, Dunnington with 6); data-stream parallelism (scalar to double- to single-precision SIMD, ~10x) combined with task-level parallelism over cores and threads (~100x).]
     • Real-time performance on different Intel CPU platforms
     • Real-time performance on NVIDIA GPU graphics cards
     The Kalman Filter algorithm performs at the ns level (CBM Progr. Rep. 2008).

  9. CBM Cellular Automaton Track Finder
     [Figure: top and front views of a simulated event with 770 reconstructed tracks; scalability and efficiency plots.]
     • Fixed-target heavy-ion experiment
     • 10^7 Au+Au collisions/s
     • 1000 charged particles/collision
     • Non-homogeneous magnetic field
     • Double-sided strip detectors (85% combinatorial space points)
     • Full on-line event reconstruction
     Highly efficient reconstruction of 150 central collisions per second (Intel X5550, 2x4 cores at 2.67 GHz).

  10. Parallelization is now a Standard in the CBM Reconstruction

      Algorithm           | Vector SIMD | Multi-Threading | NVIDIA CUDA | OpenCL | Time/PC
      --------------------|-------------|-----------------|-------------|--------|--------
      STS Detector        |      +      |        +        |      +      |   +    | 6.5 ms
      Muon Detector       |      +      |        +        |             |        | 1.5 ms
      TRD Detector        |      +      |        +        |             |        | 1.5 ms
      RICH Detector       |      +      |        +        |             |        | 3.0 ms
      Vertexing           |      +      |                 |   Future    |        | 10 μs
      Open Charm Analysis |      +      |                 |   Future    |        | 10 μs
      User Reco/Digi      |             |                 |             |        |
      User Analysis       |             |                 |             |        |

      (Entries marked + were added in 2009 and 2010.)
      The CBM reconstruction is at the ms level (Intel X5550, 2x4 cores at 2.67 GHz).

  11. International Tracking Workshop
      45 participants from Austria, China, Germany, India, Italy, Norway, Russia, Switzerland, the UK and the USA.

  12. Workshop Program

  13. Software Evolution: Many-Core Barrier
      [Figure: timeline from scalar single-core OOP software (1990-2000s) to the many-core HPC era (from 2010).]
      Consolidate the efforts of:
      • Physicists
      • Mathematicians
      • Computer scientists
      • Developers of parallel languages
      • Many-core CPU/GPU producers
      Software redesign can be synchronized between the experiments.

  14. Track Reconstruction in CBM and ALICE
      • ALICE (CERN): collider, cylindrical geometry, 10^4 collisions/s; NVIDIA GPU with 240 cores (ALICE HLT group)
      • CBM (FAIR/GSI): fixed-target, forward geometry, 10^7 collisions/s; Intel CPU with 8 cores (CBM reco group)
      Different experiments have similar reconstruction problems. Track reconstruction is the most time-consuming part of event reconstruction, hence the move to many-core CPU/GPU platforms. In both cases track finding is based on the Cellular Automaton method and track fitting on the Kalman Filter method.

  15. Stages of Event Reconstruction: To-Do List
      • Track finding (detector dependent; time consuming!)
        - Generalized track finder(s)
        - Geometry representation
        - Interfaces
        - Infrastructure
      • Track fitting (Kalman Filter; track-model dependent)
        - Kalman Filter
        - Kalman Smoother
        - Deterministic Annealing Filter
        - Gaussian Sum Filter
        - Field representation
      • Vertex finding/fitting (Kalman Filter; detector/geometry independent)
        - 3D mathematics
        - Adaptive filters
        - Functionality
        - Physics analysis
      • Ring finding for PID (combinatorics; RICH specific)
        - Ring finders

  16. Consolidate Efforts: Common Reconstruction Package
      • GSI: algorithms development, many-core optimization
      • Uni-Frankfurt/FIAS: vector classes, GPU implementation
      • OpenLab (CERN): many-core optimization, benchmarking
      • HEPHY (Vienna)/Uni-Gjovik: Kalman Filter track fit, Kalman Filter vertex fit
      • Intel: Ct implementation, many-core optimization, benchmarking
      Host experiments: CBM (FAIR/GSI), ALICE (CERN), PANDA (FAIR/GSI), STAR (BNL)
      June 28, 2010, FIAS

  17. Follow-up Workshop
      November 2010 - February 2011, at GSI, CERN or BNL?
