Virtualized HPC infrastructure of the Novosibirsk Scientific Center for HEP data analysis


SLIDE 1

Virtualized HPC infrastructure of the Novosibirsk Scientific Center

for HEP data analysis

D. Maximov
on behalf of the NSC/SCN consortium

International Workshop on Antiproton Physics and Technology at FAIR

Budker INP, Novosibirsk, Russia

19 November 2015


SLIDE 2

Contents

1. History
2. Supercomputing Network of the Novosibirsk Scientific Center
3. Dynamical Virtualized Computing Cluster
4. Experiments integrated with GCF
5. Results and conclusion


SLIDE 3

BINP/GCF — History

Started in 2004.
Initial goal: participate in the LCG project.
Currently: a gateway to the NSC computing resources for the BINP experimental groups.


SLIDE 4

Supercomputing Network of the Novosibirsk Scientific Center

An isolated 10 Gbps network connecting the main computing resources of Akademgorodok.

Organizations involved:
◮ Institute of Computational Technologies (ICT SB RAS)
◮ Novosibirsk State University (NSU)
◮ Institute of Computational Mathematics and Mathematical Geophysics (ICM&MG SB RAS)
◮ Budker Institute of Nuclear Physics (BINP SB RAS)

Expansion prospects: other NSC institutes, Tomsk State University.


SLIDE 5

Supercomputing Network of the NSC

[Network diagram: the NSC Supercomputing Network core connects the BINP network core and storage (the BINP/GCF computing cluster serving KEDR, CMD-3, SND, Belle2, ATLAS, BaBar, CMS, LHCb, PANDA, Super c-τ, plasma physics, and other groups), NUSC, SSCC SB RAS, TSU SKIF-Cyberia, the ICT storage system, and the SB RAS network core, with external links to CERN, KEK, broadband providers, and remote sites.]


SLIDE 6

Novosibirsk State University (NSU) Supercomputer Center (NUSC)

http://nsu.ru http://nusc.ru

29 TFlops (2432 physical CPU cores) + 8 TFlops (GPUs), 108 TB of storage


SLIDE 7

Siberian Supercomputer Center (SSCC) at the Institute of Computational Mathematics & Mathematical Geophysics (ICM&MG)

SSCC was created in 2001 to provide computing resources to SB RAS research organizations and external users (including ones from industry).
NKS-30T: 30 + 85 TFlops of combined computing performance (CPU + GPU) since 2011Q4.

http://www2.sscc.ru

2x 70 sq. m of raised-floor space
Up to 140 kVA of power input and 120 kW of heat removal capacity (combined)
120 TB of storage


SLIDE 8

CPUs/cores: 128/512, 5 TFlops peak


SLIDE 9

Key properties of the HEP computing environment

Each experiment has a unique computing environment:

◮ a wide variety of OS and standard package versions
◮ a lot of specifically developed software

The software can be easily parallelized over the data.
Mostly non-interactive programs, executed via some batch system (a minimal submission sketch follows).
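Since the workload is data-parallel and non-interactive, the natural pattern is to split an input file list into chunks and submit one independent batch job per chunk. A minimal sketch of that pattern in Python (the queue name, the qsub flags, and the analyze.sh job script are illustrative assumptions, not the actual NSC setup):

```python
import subprocess
from pathlib import Path

def submit_data_parallel(input_files, chunk_size, queue="hep"):
    """Split the input file list into chunks and submit one batch
    job per chunk. Illustrative: assumes a PBS/SGE-style 'qsub'
    and a hypothetical 'analyze.sh' job script."""
    chunks = [input_files[i:i + chunk_size]
              for i in range(0, len(input_files), chunk_size)]
    for n, chunk in enumerate(chunks):
        filelist = Path(f"job_{n}.list")
        filelist.write_text("\n".join(chunk) + "\n")
        # Each job processes its own chunk independently; no
        # inter-job communication is needed for this workload.
        subprocess.run(["qsub", "-q", queue, "-v",
                        f"FILELIST={filelist}", "analyze.sh"],
                       check=True)

if __name__ == "__main__":
    runs = [f"run_{i:05d}.dat" for i in range(100)]
    submit_data_parallel(runs, chunk_size=10)
```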


SLIDE 10

How to glue HEP and HPC together?

We want:

On the HEP side: keep the specific computing environment and the users' experience.

On the supercomputer side: look like a normal SC user.

The answer: run HEP tasks inside virtual machines, and run the VMs inside the supercomputer's batch-system jobs.
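From the batch system's point of view, the VM itself is then just the job payload. A hedged sketch of such a submission (the image path, queue name, PBS Pro-style resource request, and QEMU flags are assumptions for illustration; the talk does not spell them out):

```python
import subprocess
import textwrap

def submit_vm_job(image, vcpus=8, mem_mb=16384, queue="hep"):
    """Submit a batch job whose payload boots a KVM guest.
    Sketch only: paths, queue, and flags are assumed, not the
    actual DVCC configuration."""
    script = textwrap.dedent(f"""\
        #!/bin/sh
        # To the batch system this is an ordinary job; the HEP
        # computing environment lives entirely inside the VM.
        exec qemu-system-x86_64 -enable-kvm \\
            -smp {vcpus} -m {mem_mb} \\
            -drive file={image},if=virtio \\
            -nographic
        """)
    # PBS-style submission, with the job script fed via stdin.
    subprocess.run(["qsub", "-q", queue,
                    "-l", f"select=1:ncpus={vcpus}:mem={mem_mb}mb"],
                   input=script, text=True, check=True)

if __name__ == "__main__":
    # Hypothetical VM image holding one experiment's frozen environment.
    submit_vm_job("/shared/images/kedr-env.qcow2")
```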


SLIDE 11

Batch System Integration Mechanisms

[Architecture diagram: the SND and KEDR detector user groups submit work through SGE batch systems at BINP/GCF, and the ATLAS data analysis group through an SGE batch system at NSU; orchestration services at BINP/GCF and NUSC tie these to the PBS Pro batch system at NUSC across the BINP, NSU, and NSC/SCN networks.]


SLIDE 12

Batch System Integration Mechanisms


STAGE 1: job submission and the automated VM group deployment sequence.


SLIDE 13

Batch System Integration Mechanisms


A set of computing nodes with KVM, IPoIB, and HT support is enabled on demand; a group of VMs of a particular type is created on demand.


SLIDE 14

Batch System Integration Mechanisms

STAGE 2: handling the late stages of VM deployment and configuring the network storage layout.
STAGE 3: automated VM group discovery and analysis job submission; a VM performs a self-shutdown when no suitable pending jobs are left.


SLIDE 15

Batch System Integration Mechanisms

STAGE 4: the computing nodes are returned to their original state.
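Stage 3 can be pictured as a small agent inside each VM that keeps pulling analysis jobs and powers the machine off once no suitable work remains, which hands the node back to the supercomputer for Stage 4. A minimal sketch, assuming an SGE-style qstat query and a hypothetical run_analysis_job.sh wrapper (the real discovery mechanism belongs to the orchestration services):

```python
import subprocess
import time

POLL_SECONDS = 60

def pending_jobs_exist():
    """Illustrative check: SGE's 'qstat -s p' lists only pending
    jobs, so any output means suitable work is still queued."""
    out = subprocess.run(["qstat", "-s", "p"],
                         capture_output=True, text=True).stdout
    return bool(out.strip())

def main():
    # Stage 3: discover work, run analysis jobs, then self-shut-down
    # when the queue runs dry so the batch slot is freed (Stage 4).
    while pending_jobs_exist():
        subprocess.run(["./run_analysis_job.sh"], check=False)  # hypothetical wrapper
        time.sleep(POLL_SECONDS)
    subprocess.run(["shutdown", "-h", "now"])

if __name__ == "__main__":
    main()
```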


SLIDE 16

Virtualized computing infrastructure

In this way we obtain a dynamical virtualized computing cluster (DVCC). Physicists use the computing resources in a conventional way.


SLIDE 17

Experiments at Budker INP integrated with GCF

High Energy Physics

Local:
◮ KEDR
◮ CMD-3
◮ SND
◮ Super-cτ (planned)

External:
◮ ATLAS
◮ Belle2
◮ BaBar

Other activities: plasma and accelerator physics, engineering calculations, . . .


SLIDE 18

Virtualized infrastructure: what we’ve learnt so far

HEP data analysis can be successfully performed using the virtualized HPC infrastructure of the Novosibirsk Scientific Center.
Long-term VM stability has been achieved (up to a month at NUSC, up to a year at BINP).
Most of the underlying implementation details are completely hidden from the users.
No changes were required to the experimental groups' software and/or its execution environment.


SLIDE 19

Virtualized infrastructure: what we’ve learnt so far (2)

Main benefits:
Ability to use the free capacity of supercomputer sites to run much simpler (from the HPC point of view) single-threaded HEP software.
Ability to freeze an experimental group's software and its execution environment and reproduce it exactly when needed.
This scheme can easily be extended to other experimental groups and computing centers.


SLIDE 20

Milestones

Initial deployment of GCF at BINP in 2004.
DVCC middleware development started in 2010.
KEDR has run production jobs at NUSC since 2011Q1.
Other BINP groups joined the activity in 2012Q1 (started by ATLAS).
SSCC connected in August 2012, in production since 2013Q1.
The Belle2 experiment joined in 2014Q2; it works through the DIRAC system.


SLIDE 21

Usage of the local cluster

Average utilization (last year): 49%.


SLIDE 22

Participation in Belle2

Produced about 3% of the CPU hours for simulation.


SLIDE 23

Conclusion

Results:
NSC/SCN resources are successfully used for HEP data processing at BINP, making analysis cycles 10-100 times faster.
Our experience can be applied to other computing centers and HEP experiments.


SLIDE 24

Plans

Install new computing and storage hardware.
Extend the number of user groups.
Further develop the DVCC middleware.
Deploy special storage for the BaBar experiment.
Get access to other computing centers.
Present our resources to LCG.


SLIDE 25

Thank you for your attention!


SLIDE 26

Key features of the virtualized infrastructure

Virtualization by KVM.
VM disk images are located on the SC's file system.
Input/output data are located at BINP/GCF and accessed by the VMs over the network.
VMs are just regular batch tasks at a supercomputer.
VMs are started automatically on the users' demand.
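A minimal sketch of this storage layout as seen from inside a guest at boot: the VM mounts its experiment data from BINP/GCF over the network, while its own disk image stays on the supercomputer's file system. The export host, path, and mount options are invented for illustration:

```python
import os
import subprocess

DATA_EXPORT = "gcf-storage.binp.example:/data"  # assumed NFS export at BINP/GCF
MOUNT_POINT = "/mnt/gcf-data"

def mount_experiment_data():
    # Only experiment input/output crosses the network back to
    # BINP/GCF; the guest's root disk image is served from the
    # supercomputer's own file system.
    os.makedirs(MOUNT_POINT, exist_ok=True)
    subprocess.run(["mount", "-t", "nfs", "-o", "rw,hard",
                    DATA_EXPORT, MOUNT_POINT], check=True)

if __name__ == "__main__":
    mount_experiment_data()
```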


SLIDE 27

Results from KEDR detector

Measurement of the main parameters of the ψ(2S) resonance (http://arxiv.org/abs/1109.4215).
Measurement of the ψ(3770) parameters (http://arxiv.org/abs/1109.4205).

[Plots: the multihadron cross section as a function of the c.m. energy for three scans in the ψ(2S) region; the cross section of e+e− → hadrons vs. the c.m. energy in the vicinity of the ψ(3770).]


SLIDE 28

ATLAS analysis

G. Aad et al. (ATLAS Collaboration), "Search for heavy neutrinos and right-handed W bosons in events with two leptons and jets in pp collisions at sqrt(s) = 7 TeV with the ATLAS detector", Eur. Phys. J. C72 (2012) 2056.

[Plots: experimentally observed and MC-expected 95% CL limits on the Majorana neutrino mass MN and the right-handed W-boson mass MWR, for Dirac and Majorana heavy neutrinos.]
