Today's World-wide Computing Grid for the Large Hadron Collider (WLCG): A Petascale Facility - Moving to Exascale?



SLIDE 1

Today's World-wide Computing Grid for the Large Hadron Collider (WLCG): A Petascale Facility - Moving to Exascale?

Sverre Jarp, CERN openlab CTO, 18 May 2011

SLIDE 2

Agenda

  • Quick overview of CERN and the Large Hadron Collider
  • Computing by the LHC experiments
  • CERN openlab and future R&D
  • Conclusions

SLIDE 3

CERN and LHC

SLIDE 4

What is CERN?

  • CERN is the world's largest particle physics centre
  • Particle physics is about:
  • elementary particles, the constituents all matter in the Universe is made of
  • fundamental forces which hold matter together
  • Particle physics requires:
  • special tools to create and study new particles
  • Accelerators
  • Particle Detectors
  • Powerful computer systems

CERN is also:

  • 2250 staff (physicists, engineers, technicians, …)
  • Some 10’000 visiting scientists (most of the world's particle physicists). They come from 500 universities representing 80 nationalities.
SLIDE 5

What is the LHC?

  • The Large Hadron Collider can collide beams of protons at a design energy of 2 * 7 TeV
  • Inaugurated Sept. 2008; restart Nov. 2009
  • Reached 3.5 TeV (March 2010)
  • 2011/12: Two years at 3.5 TeV before upgrade
  • Using the latest super-conducting technologies, it operates at about –271ºC, just above the temperature of absolute zero. The coldest place in the Universe.
  • With its 27 km circumference, the accelerator is the largest superconducting installation in the world.
  • Four experiments, with detectors as ‘big as cathedrals’: ALICE, ATLAS, CMS, LHCb
SLIDE 6

Collisions at LHC

SLIDE 7

ATLAS

  • General purpose LHC detector – 7000 tons

SLIDE 8

ATLAS under construction (2005)

  • Picture taken in 2005
SLIDE 9

Compact Muon Solenoid (CMS) – 12500 tons
SLIDE 10

CMS event @ 3.5 TeV
SLIDE 11

A CMS collision
SLIDE 12

LHC Computing
SLIDE 13

Data Handling and Computation for Physics Analysis

  • Online: the detector feeds the online trigger and filtering system (selection & reconstruction), which writes out the raw data (100%)
  • Offline reconstruction (batch) turns the raw data into event summary data (~10% of the raw volume); event reprocessing repeats this step when needed
  • Batch physics analysis reduces the event summary data to analysis objects (~1%, extracted by physics topic)
  • Interactive physics analysis works on the analysis objects (offline analysis w/ROOT)
  • Event simulation (offline simulation w/GEANT4) feeds the same processing chain
SLIDE 14

HEP programming paradigm

  • All events are independent
  • Trivial parallelism has been exploited by High Energy Physics for decades
  • Compute one event after the other in a single process
  • Advantage:
  • Large jobs can be split into N efficient processes, each responsible for processing M events (see the sketch below)
  • Built-in scalability
  • Disadvantage:
  • Memory needed by each process
  • With 2 – 4 GB per process, a dual-socket server with octa-core processors needs 32 – 64 GB
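A minimal sketch of this paradigm, assuming a plain POSIX environment (process_event and all constants are hypothetical placeholders, not any experiment's framework): the job is split into N worker processes, each looping over its own slice of M events, with no communication between them.

```cpp
// Minimal sketch: event-level parallelism by splitting a run of kEvents
// across kProcs independent worker processes. Each worker owns its full
// memory image, which is why the per-process footprint (2 - 4 GB on the
// slide) multiplies with the core count.
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

// Hypothetical per-event work; a real job would run reconstruction/analysis here.
static void process_event(long i) { /* ... */ }

int main() {
    const long kEvents = 1000000;   // total events in the job
    const int  kProcs  = 8;         // e.g. one worker per core
    const long kChunk  = kEvents / kProcs;

    for (int p = 0; p < kProcs; ++p) {
        if (fork() == 0) {          // child: process its own slice of events
            const long first = p * kChunk;
            const long last  = (p == kProcs - 1) ? kEvents : first + kChunk;
            for (long i = first; i < last; ++i) process_event(i);
            _exit(0);               // events are independent, no communication needed
        }
    }
    for (int p = 0; p < kProcs; ++p) wait(nullptr);  // parent waits for all workers
    std::printf("Processed %ld events in %d processes\n", kEvents, kProcs);
    return 0;
}
```

The disadvantage listed on the slide falls out directly: every worker carries its own full memory image, so the footprint scales with the number of processes rather than being shared.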

SLIDE 15

Rationale for Grids

  • The LHC Computing requirements are simply too huge for a single site:
  • Impractical to build such a huge facility in one place
  • Modern wide-area networks have made distances shrink
  • But, latency still has to be kept in mind
  • The users are not necessarily at CERN
  • Political resistance to funding everything at CERN
  • So, we are spreading the burden!

Resource split shown in the slide's pie charts:
  CPU:  CERN 18%,  All Tier-1s 39%,  All Tier-2s 43%
  Disk: CERN 12%,  All Tier-1s 55%,  All Tier-2s 33%
  Tape: CERN 34%,  All Tier-1s 66%
SLIDE 16

World-wide LHC Computing Grid

  • W-LCG: Largest Grid service in the world
  • Built on top of EGEE and OSG
  • Almost 160 sites in 34 countries
  • More than 250’000 IA processor cores (w/Linux)
  • One hundred petabytes of storage
SLIDE 17

Excellent 10 Gb W-LCG connectivity

  • Tier-2 and Tier-1 sites are inter-connected by the general purpose research networks
  • Any Tier-2 may access data at any Tier-1

[Figure: map of the Tier-1 sites (Canada, USA, France, Germany, Italy, Netherlands, Nordic countries, Spain, Taiwan, United Kingdom) with their associated Tier-2s]
SLIDE 18

First year of LHC data (Tier0 and Grid)

  • Impressive numbers, we believe!
  • Writing up to 70 TB / day to tape (~ 70 tapes per day)
  • Stored ~ 15 PB in 2010
  • 1 M jobs/day

[Charts: “Data written to tape (GB/day)” and “Jobs run / month” per experiment (ALICE, ATLAS, CMS, LHCb), Jan 2008 – Apr 2011]
SLIDE 19

CERN’s offline capacity

  • High-throughput computing based on reliable “commodity” technology:
  • Scientific Linux
  • All inclusive: 7’800 dual-socket servers (64’000 cores)
  • Disk storage: 63’000 TB (usable) on 64’000 drives
  • Tape storage: 34’000 TB on 45’000 cartridges; 56’000 slots and 160 drives
SLIDE 20

Even CERN has a power problem

Computer Centre

We are going to move from 2.9 MW to 3.5 MW. Beyond this we will establish a remote Tier-0 in 2013!

SLIDE 21

W-LCG: A distributed supercomputer

  • Compared to TOP10 (Nov. 10):

  Name/Location               Core count
  Tianhe-1 (Tianjin)          186’368
  Jaguar (Oak Ridge)          224’162
  Nebulae – Dawning (NSCS)    120’640
  Tsubame 2.0 (GSIC, Tokyo)   73’278
  Hopper (DOE/NERSC)          153’408
  Tera-100 – Bull (CEA)       138’368
  Roadrunner (DOE/LANL)       122’400
  Kraken XT5 (Tennessee)      98’928
  Jugene (Jülich)             294’912
  Cielo (DOE/SNL)             107’152

  W-LCG                       250’000 IA cores
SLIDE 22

Insatiable appetite for computing

  • During the era of the LEP accelerator (and beyond)
  • Compute power doubled every year
  • We are desperately looking at all opportunities for this to continue

SLIDE 23

CERN openlab

  • IT Department’s main R&D focus
  • Framework for collaboration with industry
  • Evaluation, integration, validation of cutting-edge technologies that can serve the LHC Computing Grid
  • Sequence of 3-year agreements
  • 2003 – 2005: Phase I: the “opencluster” project
  • 2006 – 2011: Phase II & III: dedicated Competence Centres

[Timeline: openlab I through openlab V phases, Jan 2003 to Jan 2015, alongside WLCG and other CERN entities; 10 years of existence]
SLIDE 24

CERN openlab structure

  • A solid set of Competence Centres
  • With strong support from Management and Communications
  • Automation and Controls CC (Siemens)
  • Database CC (Oracle)
  • Networking CC (HP)
  • Platform CC (Intel)
SLIDE 25

EXASCALE Capacity Computing R&D

  • In openlab, we want to start an R&D project for Exascale
  • Project goals:
  • Identify constraints which might inhibit growth in CERN’s Tier0 and in the W-LCG in the future
  • Understand which software and hardware components must be moved towards the Exascale range
SLIDE 26

Intel’s “Many Integrated Core” Architecture

  • Announced at ISC10 (June 2010)
  • S. Jarp on stage with K. Skaugen/Intel
  • Current version (codename “Knights Ferry SDP”)
  • Enhanced x86 instruction set with vector extensions
  • 32 cores + 4-way multithreaded + 512-bit vector units
  • Successful (easy) porting of our benchmark applications (see the sketch after this list)
  • ALICE Trackfitter/Trackfinder
  • Multithreaded Geant4 prototype
  • Maximum Likelihood data analysis prototype
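As an illustration only, and not the openlab prototypes themselves, the sketch below shows the kind of kernel that ports easily to such a chip: an event-parallel negative-log-likelihood sum whose outer loop can be spread over many threads and whose per-event arithmetic is straight-line code a compiler can vectorize. The Gaussian model and all names are hypothetical.

```cpp
// Illustrative sketch only (not the openlab Maximum Likelihood prototype):
// an event-parallel negative-log-likelihood sum. The outer loop is threaded
// with OpenMP; the per-event arithmetic is independent straight-line code
// that a compiler can map onto wide vector units.
#include <cmath>
#include <cstdio>
#include <vector>

// NLL of a single Gaussian over the events in 'x'; the model is a hypothetical stand-in.
double negLogLikelihood(const std::vector<double>& x, double mu, double sigma) {
    const double kPi  = 3.141592653589793;
    const double norm = std::log(sigma * std::sqrt(2.0 * kPi));
    double nll = 0.0;
    #pragma omp parallel for reduction(+ : nll)   // one chunk of events per hardware thread
    for (long i = 0; i < static_cast<long>(x.size()); ++i) {
        const double t = (x[i] - mu) / sigma;     // each event is independent: vectorizable
        nll += 0.5 * t * t + norm;
    }
    return nll;
}

int main() {
    std::vector<double> data(1 << 20, 1.0);       // placeholder "events"
    std::printf("NLL = %f\n", negLogLikelihood(data, 0.0, 2.0));
    return 0;
}
```

Because every event contributes independently to the sum, the same loop structure scales from a few cores to the 32 four-way-multithreaded cores with 512-bit vectors mentioned above.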

SLIDE 27

Conclusions

  • The Large Hadron Collider is foreseen to operate for the next 20 years!
  • A Petascale Grid is currently in place for the computing tasks of the experiments
  • We want to increase considerably the capacity of our Grid
  • But, both power and cost are limiting factors
  • Planned and ongoing R&D activities should ease the move towards Exascale
SLIDE 28

BACKUP
SLIDE 29

High Energy Physics Computing Characteristics

> Independent events (collisions of particles)
  • trivial (read: pleasant) parallel processing
> Millions of lines of in-house C++ code
  • Most of the frameworks/toolkits are written by the physicists
> Compute power scales with combination of SPECint and SPECfp
  • Good double-precision floating-point (20% of total) is important!
  • Good math libraries needed
> Current “HEPSPEC 2006” throughput benchmark for acquisitions (based on performance/W/CHF):
  • 3 C++ jobs (INT) and 4 C++ jobs (FP); a toy illustration of the figure of merit follows below
> Huge, but chaotic workload
  • research environment - physics extracted by iterative analysis
  • Unpredictable → unlimited demand
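A toy illustration of that performance/W/CHF acquisition criterion; the ServerOffer struct and every number below are invented for the example, not actual procurement data.

```cpp
// Toy figure-of-merit sketch for the performance/W/CHF criterion mentioned above.
// All offers and numbers are hypothetical.
#include <cstdio>

struct ServerOffer {
    const char* name;
    double hepspec06;   // measured HEPSPEC 2006 throughput score
    double watts;       // power draw under load
    double priceCHF;    // purchase price in Swiss francs
};

// Higher is better: throughput per watt per franc.
static double figureOfMerit(const ServerOffer& s) {
    return s.hepspec06 / (s.watts * s.priceCHF);
}

int main() {
    const ServerOffer offers[] = {
        {"Offer A", 180.0, 350.0, 4000.0},
        {"Offer B", 210.0, 420.0, 4500.0},
    };
    for (const auto& o : offers)
        std::printf("%s: %.2e HEPSPEC/(W*CHF)\n", o.name, figureOfMerit(o));
    return 0;
}
```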