SLIDE 1

Computing

Christian Zeitnitz

Bergische Universität Wuppertal

  • Overview
  • Organization
  • Resource Utilization
  • Plans for 2015 and beyond
  • Summary
SLIDE 2
Overview

  • WLCG resource pledges in 2014
  • CPU: ~190,000 CPU cores
  • Disk: ~180 PByte
  • German share
  • Tier-1: 15%
  • Tier-2: 10%
  • Shares within Germany
  • Tier-1: ~60%
  • Tier-2: ~40%
  • Tier sites
  • Tier-1: KIT
  • Tier-2
  • HGF: DESY, GSI
  • MPI für Physik (MPP)
  • Universities: Aachen, Freiburg, Göttingen, Munich, Wuppertal

2

  • C. Zeitnitz - Computing
SLIDE 3

German WLCG Sites

  • Tier-1 at KIT
  • 4 LHC experiments, COMPASS, Auger
  • Previously Belle, D0, CDF
  • One of the largest Tier-1 centers in the WLCG
  • Tier-1 for A, CH, CZ, PL, SK
  • Main tasks: data (re-)processing and distribution
  • Tier-2
  • Main tasks: end-user analysis, simulation
  • Additional resources for analysis
  • National Analysis Facility
  • DESY (4000 cores, 1 PB)
  • GSI (ALICE)
  • Network
  • LHCONE to connect Tier-2 and Tier-1 sites
  • Personnel financed by sites (technical) and funding agencies (experiment-specific)

SLIDE 4

Organization

  • Cooperation between experiments and computing centers works extremely well
  • Three bodies
  • Overview Board of the Tier-1
  • KIT, funding agencies, experiments, users
  • "Technical Advisory Board" of the Tier-1
  • Technical decisions and coordination between experiments and the Tier-1
  • Finds solutions across experiments
  • Grid Project Board of the Terascale Alliance
  • General discussions about future development
  • Initiates projects across collaborations
  • Organizes workshops
  • The "GridKa School of Computing" grew out of this collaboration as early as 2003

SLIDE 5

Resource Utilization in Germany

  • German Tier sites are operated very successfully and with high reliability
  • Very high load on Tier-1 and Tier-2 over the last years
  • Tier-2 sites delivered substantially more CPU resources (up to 200%) than pledged
  • Reason
  • More CPU required for analysis as well as for simulation
  • Source of CPU cycles
  • Unused Tier-3 resources or other communities operated within the same cluster

Resource usage 7/2012–6/2013

SLIDE 6

CPU Shares

  • German Tier-1 and Tier-2, 7/2012–6/2014

SLIDE 7

DESY NAF Utilization

  • National Analysis Facilities
  • End-user analysis
  • Direct connection to the Tier-2 for good data access

NAF CPU usage by institutes, Dec. 2009 – Apr. 2013:
  • DESY: 31%
  • Uni-HH: 30% (partially own resources)
  • 16 further German institutes: 39%

SLIDE 8

What are the CPUs used for?

ATLAS jobs, Jan–Jul 2012: ~100,000 simulation jobs vs. ~30,000 analysis jobs

  • Simulation dominates CPU usage on Tier-1/2 centers (~60%)
  • Analysis ~20-30%
  • Requires high data throughput → fast disk systems
  • 2/3 of the hardware investment is spent on disk systems
  • Utilization varies substantially
  • Not included: NAF and Tier-3 usage

SLIDE 9
Future Development of Computing Resources

  • Installed, pledged and required resources
  • Up to 2013: installed resources
  • 2014/15: Computing Resources Review Board (C-RRB), April 2014
  • 2016-19: 20% increase per year for CPU and 15% for disk resources (flat budget)
  • Resource requirements already assume a substantial optimization of the computing models and software efficiency
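The flat-budget assumption above (20%/year more CPU, 15%/year more disk for 2016-19) can be made concrete with a quick back-of-the-envelope sketch. The baseline values below simply reuse the 2014 pledge figures from the overview slide (~190,000 cores, ~180 PB) as an illustration; they are not the actual 2015 starting values scrutinized by the C-RRB.

```python
# Sketch of the flat-budget growth assumption: constant spending buys
# ~20%/year more CPU and ~15%/year more disk through technology gains.
# Baseline figures are illustrative (2014 pledges), not real 2015 values.

def project(base, rate, years):
    """Compound growth: (year, resource) after each year of growth."""
    out = []
    value = base
    for year in years:
        value *= 1 + rate
        out.append((year, round(value, 1)))
    return out

years = range(2016, 2020)
cpu = project(190.0, 0.20, years)   # kCores; ~190,000 cores in 2014
disk = project(180.0, 0.15, years)  # PByte;  ~180 PB in 2014

print(cpu)   # CPU roughly doubles over the four years (1.2^4 ≈ 2.07)
print(disk)  # disk grows by about 75% (1.15^4 ≈ 1.75)
```

Even under this optimistic constant-money scaling, capacity roughly doubles in four years, which is why the later slides stress that resources will grow substantially slower than the data volume.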

SLIDE 10

Development of the LHC-Computing

  • Computing models are changing
  • Network performance substantially better than expected
  • More data copied between Tier centers
  • System becomes less hierarchical
  • Computing gets more complex and less plannable
  • Computing technology
  • Adapt to technological advances in hard- and software
  • Utilization of vectorization and parallelization on standard CPUs and GPUs
  • Utilize Cloud and HPC systems
  • Optimize components for data storage and distribution
  • Need to improve the overall efficiency by approximately a factor of 4
  • Computing resources will increase substantially slower than the amount of data!
  • German groups are contributing to these activities

SLIDE 11

Current Plans for 2015-2019 in Germany

  • General concept
  • Operation cost and technical personnel are provided by the corresponding institution of the Tier center (approx. 50% of the total cost)
  • Hardware and experiment-specific tasks are funded by external sources
  • Tier-1 (GridKa at KIT) and Tier-2/NAF at DESY and GSI
  • Resource increase according to requirements scrutinized by the C-RRB
  • Financed by the Helmholtz Association
  • Base funds
  • Extra investment funds – not secured yet (application in 2014)
  • Tier-2 at the MPI für Physik
  • Resource increase according to requirements scrutinized by the C-RRB
  • Investment funds secured
  • Tier-2 at 5 university sites
  • Very heterogeneous financing in the past (mainly Helmholtz Association and BMBF)
  • No hardware investment since 2013
  • Currently no funding source for hardware for the upcoming years

SLIDE 12

Options outside the Tier-Structure

  • Data analysis is only possible with very high-performance disk systems
  • Only Tier-1, Tier-2 and NAF centers are suitable
  • Simulation could be run on different resources
  • Opportunistic usage of Tier-3 resources
  • Resources at Tier-2 sites already used
  • Non-Tier sites
  • Might be possible to cover up to 10-15% of the CPU requirement
  • Not plannable, hence partially unreliable
  • HPC and Cloud resources
  • Test beds exist in the WLCG as well as in Germany
  • Cloud still too expensive
  • All options require substantial personnel for development and operation
  • Could provide up to 20% of CPU resources, but no solution for the shortage of analysis resources

SLIDE 13

Summary

  • Computing was essential for the strong contribution of the German groups to the success of the LHC experiments
  • The German Tier sites are among the most reliable worldwide and provided substantially more CPU cycles than pledged
  • Very good cooperation within the German computing community was essential for this success
  • Future development of the Tier centers
  • Operation cost and technical personnel are secured
  • Tier-1 and some Tier-2 sites (DESY, GSI) will apply for funds from the Helmholtz Association
  • No visible funding option for the Tier-2 at the universities
  • Looking at different possibilities to find at least part of the required resources (Tier-3, HPC, Cloud …) – no solution for analysis
  • German groups need reliable computing for the upcoming LHC run

SLIDE 14

Backup

SLIDE 15

Financing of the Tier Centers in the Past

  • Tier-1
  • Investment: ca. 4 M€/year (BMBF)
  • Operation and personnel: ca. 3.8 M€ (HGF PoF I and II)
  • Tier-2 and NAF
  • Average over 2008-12
  • Investment: ca. 2.9 M€/year
  • Operation and personnel: ca. 2.6 M€/year (DESY, GSI, MPP, Universities)
  • Additional personnel
  • Experiment-specific tasks
  • 19.5 FTE
  • Financed by: BMBF, DESY, KIT, GSI, MPP
  • GRID tools development
  • ~10 FTE
  • Financed by: Terascale Alliance, DESY, KIT, GSI, BMBF

SLIDE 16

NAF Utilization at DESY

The University of Hamburg owns part of the NAF resources, hence its large share.
