KOLKATA Tier-2@Alice Grid
ALICE GRID & Kolkata Tier-2
Site Name: IN-DAE-VECC-01 & IN-DAE-VECC-02
VO: ALICE
City: Kolkata
Country: India
Vikas Singhal, VECC, Kolkata
Events at LHC
Luminosity: 10^34 cm^-2 s^-1; bunch crossings at 40 MHz (every 25 ns); 20 events overlaying
The Grid Computing Model
- Experiments shown: CMS, ATLAS, LHCb
- Tier 0: CERN (Tier 0 centre at CERN)
- Tier 1: Germany, USA, UK, France, Italy, Scandinavia, Japan, CERN Tier 1
- Tier 2: labs and universities (Lab a, Lab b, Lab c, Lab m, Uni a, Uni b, Uni n, Uni x, Uni y)
- Tier 3: physics department
- Desktop
ALICE computing model
[Diagram: tiered ALICE computing model]
- Tier 0: Online System, Online Farm, CERN Computer Center (~40 Gb/s)
- Tier 1: regional centers (APROC Taiwan, France, Italy, Germany)
- Tier 2: Tier-2 centers, including Kolkata
- Tier 3: institutes, with a physics data cache
- Tier 4
- Link speeds labelled in the diagram: 10 Gb/s, 1-10 Gb/s, 100-1000 Mb/s, 155/622 Mb/s
RAW data delivered by the DAQ undergo calibration and reconstruction, which produce 3 kinds of objects for each event:
- 1. ESD object
- 2. AOD object
- 3. Tag object
This is done at the Tier-0 site.
Further reconstruction and calibration of RAW data will be done at Tier 1 and Tier 2.
DPD (Derived Physics Data) objects will be processed in Tier 3 and Tier 4.
The generation, reconstruction, storage and distribution of Monte Carlo simulated data will be the main task of Tier 1 and Tier 2.
LHC Utilization -- ALICE
ALICE Setup
- Detectors: HMPID, Muon Arm, TRD, PHOS, PMD, ITS, TOF, TPC
- Indian contribution to ALICE: PMD, Muon Arm
- Size: 16 x 26 meters; weight: 10,000 tons
- Total weight: 10,000 t; overall diameter: 16.00 m; overall length: 25 m; magnetic field: 0.4 T
ALICE Collaboration
- ~1100 people, 30 countries, 80 institutes
- About 1/2 the size of ATLAS or CMS, about 2x LHCb
The ALICE collaboration & detector
Data volumes
- RAW data – 2.5 PB/year
- Two distinct periods –
- p+p (~7.5 months) and
- Pb+Pb (~40 days)
- Reconstructed and simulated data
- 1.5PB – first level RAW filtering (ESDs)
- 200TB – second level RAW filtering (AODs)
- 1PB of simulated data
- User generated data ~500TB
- Total ~5 PB of data per year (without replicas)
- Replication 2x RAW, 3x ESD/AODs, 2x user files
Taken from L. Betev Slides in T1-T2 Meeting at Karlsruhe during Jan 2012
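The yearly volumes and replication factors above can be combined in a few lines. This is a minimal sketch, not an official ALICE estimate: the slide gives no replication factor for simulated data, so a factor of 1 is assumed here, and the names `VOLUMES_TB`, `REPLICAS` and `total_tb` are made up for illustration. Note that the listed items add up to 5.7 PB, which the slide rounds to "~5 PB".

```python
# Yearly data volumes in TB, copied from the list above.
VOLUMES_TB = {
    "raw": 2500,
    "esd": 1500,
    "aod": 200,
    "simulated": 1000,
    "user": 500,
}

# Replication factors from the slide: 2x RAW, 3x ESD/AODs, 2x user files.
# ASSUMPTION: simulated data is not mentioned there, so 1 copy is assumed.
REPLICAS = {"raw": 2, "esd": 3, "aod": 3, "simulated": 1, "user": 2}


def total_tb(volumes, replicas=None):
    """Sum the volumes, optionally multiplying each by its replica count."""
    if replicas is None:
        return sum(volumes.values())
    return sum(size * replicas[kind] for kind, size in volumes.items())


unreplicated = total_tb(VOLUMES_TB)            # single copy of everything
replicated = total_tb(VOLUMES_TB, REPLICAS)    # with the replication factors

print(f"without replicas: {unreplicated / 1000:.1f} PB")
print(f"with replicas:    {replicated / 1000:.1f} PB")
```

Under these assumptions the replicated volume is roughly twice the single-copy volume, which is the number a storage plan would have to cover.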
Processing
- RAW data reconstruction ~10K CPU cores
- MC processing ~15K CPU cores
- User analysis ~7K CPU cores (450 distinct users)
- ~40 million jobs per year
- ~1.3 jobs completed every second
- ½ production, ½ user jobs
- 200 million files per year
Taken from L. Betev's slides at the T1-T2 meeting at KIT, Karlsruhe, January 2012
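The job rate quoted above follows directly from the yearly job count. The short check below is only a sketch of that arithmetic, assuming a 365-day year; the variable names are illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a 365-day year

JOBS_PER_YEAR = 40_000_000     # ~40 million jobs per year
FILES_PER_YEAR = 200_000_000   # 200 million files per year

# CPU cores per activity, from the list above.
CORES = {
    "raw_reconstruction": 10_000,
    "mc_processing": 15_000,
    "user_analysis": 7_000,
}

jobs_per_second = JOBS_PER_YEAR / SECONDS_PER_YEAR
files_per_job = FILES_PER_YEAR / JOBS_PER_YEAR
total_cores = sum(CORES.values())

print(f"{jobs_per_second:.2f} jobs/s, {files_per_job:.0f} files/job, "
      f"{total_cores} cores")
```

The result is consistent with the "~1.3 jobs completed every second" figure and implies about five files per job on average.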
KOLKATA TIER-2 @ ALICE
ALICE Sites on MonALISA
72 active computing sites across Europe, Asia, North America, South America and Africa
Why Tier 2 ?
- 1. Tier-2 is the lowest level accessible to the entire collaboration.
- 2. Each sub-detector of ALICE has to be associated with at least one Tier-2 because of the large volume of calibration and simulated data.
- 3. PMD is one of the important sub-detectors of ALICE.
- 4. We are solely responsible for PMD, from conception to commissioning.
Grid Site As per WLCG & Experiment Requirement
- SE (pure XRootD)
- WNs (more and more WNs)
- Disks (more and more disks)
- WMS, MyProxy, VOMS, ...
- CREAM-CE
- Site BDII
- LCG-UI
KOLKATA or General Site
- Grid services: site BDII, central services, WMS, MyProxy, VO-BOX, CREAM-CE
- Storage services: DPM, pure XRootD, XRootD redirector, XRootD disk server
- Storage hardware: disk arrays (more and more arrays), new SAN box, old NAS, older NAS, even older DAS
- Infrastructure servers: NFS server, PBS server, DNS server, UI server, HA server, monitoring server, installation and DHCP server, Tier-3 management server and cluster
- Compute hardware: 32- or 64-bit servers, 64-bit blade servers with blade enclosures, 1U & 2U servers, a few tower servers (HP, Dell, IBM, etc.)
- Network: local and global network, fibre line
- Facility: cooling, UPS, fire alarm, access control, etc.
Frontend component of Site & Installation
Components: LCG-CE, SE, CREAM-CE, site BDII, LCG-UI, VO-BOX, pure XRootD
- Grid middleware meta-packages are installed through YUM and configured through YAIM.
- The middleware changes from time to time, e.g. from gLite to EMI (follow the manual).
- During the Kolkata site installation and configuration we ran into RPM dependency problems with Java, security packages, etc.
- The community and mailing lists help a lot. For most problems we got the solution from a mailing list.
- Thanks to APROC, Taiwan for helping at each stage.
Middleware installed on IN-DAE-VECC-02 Site
- 1. Installed the SLC 5.8 (x86_64) operating system on x86_64 machines.
- 2. Upgrading the middleware packages below to EMI middleware.
Packages: glite-VOBOX, CREAM-CE (64-bit), glite-BDII, pure XRootD redirector as Storage Element, glite-WN (64-bit)
Hosts: grid01.tier2-kol.res.in, gridce02.tier2-kol.res.in, dcache-server.tier2-kol.res.in
Worker nodes: 79 nodes (476 cores), wn045-wn123.internal.tier2-kol.res.in
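The worker-node range wn045-wn123 can be expanded into individual host names, for instance when generating a node list for the batch system. The helper below is a hypothetical sketch (the function name and its use are assumptions, not part of the site's actual tooling); it also confirms that the range contains 79 nodes.

```python
def worker_node_names(first, last, domain="internal.tier2-kol.res.in"):
    """Expand a numeric range into zero-padded worker-node host names."""
    return [f"wn{i:03d}.{domain}" for i in range(first, last + 1)]


nodes = worker_node_names(45, 123)

print(len(nodes))   # number of worker nodes in the range
print(nodes[0])     # first host name
print(nodes[-1])    # last host name
```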
Backend Component of SITE
- Router & switch: 2 networks, one public and one private.
- Domain Name Server: the DNS server is a critical component. We have 2 redundant name servers, naamak and suchak, for high availability.
- Time server: configured with NTP.
- Installer: network installation and automated configuration using Quattor-like tools.
- Storage server: common shared space mounted over NFS.
- PBS server: CE and PBS batch scheduler on a server. Configured a firewall (through iptables) and NAT on it.
- Tier-3 cluster: separate cluster for local users with interactive and non-interactive nodes.
- Monitoring server: configured MRTG (network traffic monitoring) and a cluster monitoring tool.
Preventive Maintenance Once a Year
Kolkata TIER-2 centre logical diagram
[Diagram: logical layout of the Kolkata Tier-2 centre]
- External connectivity: Internet at 300 Mbps through the router and switch; SINP 1 Gbps fibre backbone.
- Addressing: public network 144.16.112.xx/27; private networks 192.168.x.x (stand by).
- IN-DAE-VECC-02 site with 64-bit machines: computing nodes wn045 to wn123 behind Switch-1 and Switch-2; Dell and HP blade servers with multi-core Xeon 3.0 GHz; 4 XRootD disk servers with 230 TB of IBM and HP SAN storage.
- Grid-Peer Tier-3 cluster with 32- and 64-bit machines: 25 computing nodes (wn001 to wn025) on Dell and Wipro blades behind Switch-1 and Switch-2, with 25 TB of storage.
- Other hosts shown: gridce02, grid01, grid, grid-peer, gridse001, dcache-server, naamak, suchak, installer, backup server (130 TB backup).
ALICE Tier-2 Grid Started in 2002
- Link to CERN: 512 Kbps Ethernet bandwidth
- Operating system: Scientific Linux 3.05
- Middleware: ALICE Environment (AliEn) with PBS as batch system
- Hardware (CPU, disk):
  - 1 x dual Xeon, 4 GB compute node
  - 2 x dual Xeon, 2 GB WNs
  - 2 x 80 GB disk space
- Bandwidth: 512 Kbps shared
- S. K. Pal & T. Samanta
From 2 Core to 700 Cores
- 2002: started with 2 desktop machines
- 2003: 2 tower-like servers
- 2004: 9 HP 1U servers
- 2006: 17 Wipro 1U servers, single core
- 2008: 40 HP blades, dual core
- 2009: 8 HP blades, quad core
- 2011: 32 Dell blades, dual processor, dual core
- 2012: GPU server with Tesla 2070 with 448 cores
Kolkata Tier-2 on MonALISA
[Figure: MonALISA views of the site for 2007, 2009, 2010 and 2011]
From 512MB Disk to 300TB Disk
- 2002: started with 512 MB in a desktop machine
- 2003: 40 GB in tower-like servers as DAS
- 2004: 400 GB in HP MSA 500
- 2006: 2 TB Wipro NAS
- 2008: 108 TB HP EVA SAN
- 2009: 25 TB iSCSI
- 2011: 200 TB IBM DS 5100
- 2012: 2 TB hard disk in GPU server
[Figure: panels labelled 2006, 2008, 2010 and 2012]
From 128 Kbps to 1 Gbps
- 2002: started with a 128 Kbps shared link
- 2003: 512 Kbps
- 2004: 2 Mbps dedicated link
- 2006: 4 Mbps from Bharti
- 2008: 30 Mbps from Reliance
- 2009: 100 Mbps from VSNL (ERNET)
- 2011: 300 Mbps from NKN
- 2012: upgrading to 1 Gbps
Efficient Cooling Concept and Implementation
- Hot and cold air are separated.
- For air separation, a cold aisle containment is created.
- The cold aisle containment is the least accessible area.
- Cool only the hardware racks, not people, walls, etc.
- Human access to the cold aisle containment is restricted.
- All management and monitoring of servers and storage is done from outside the cold aisle containment.
- All power and Ethernet cables are also routed from outside the cold aisle containment.
- The temperature gradient between the cold and hot aisles is 5 °C.
Kolkata Tier-2 After renovation
Major Achievements
- Consistently more than 400 ALICE jobs are running after commissioning of the efficient cooling solution.
- Kolkata Tier-2 provided a total of 6.0K HEP-SPEC06 CPU and 230 TB of disk storage, achieving its pledged resources.
1M ALICE Jobs Completed During the Last Year
Performance: ~1M jobs successfully completed during the last one year
[Plot: jobs completed vs. time]
Total Kolkata Tier-2 Resources
- Computing resources: 476 cores in total
  - Dell blades: 32 * 8 = 256
  - HP quad-core blades: 8 * 8 = 64
  - HP dual-core blades: 39 * 4 = 156
- Storage: 230 TB under one HP 2U management server
  - 74 TB: HP EVA 6100 under 2 * 2U HP disk servers
  - 156 TB: IBM DS 5100 under 2 * 1U IBM disk servers
- Network speed: 300 Mbps. It will be increased up to 1 Gbps during this year.
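The totals above can be cross-checked from the per-model figures. The snippet below is just that arithmetic written out, with illustrative names; the core counts and capacities are the ones listed on the slide.

```python
# (label, number of blades, cores per blade), from the resource list above.
BLADES = [
    ("Dell blades", 32, 8),
    ("HP quad-core blades", 8, 8),
    ("HP dual-core blades", 39, 4),
]

# Storage systems and their capacity in TB.
STORAGE_TB = {"HP EVA 6100": 74, "IBM DS 5100": 156}

total_cores = sum(count * cores for _, count, cores in BLADES)
total_storage_tb = sum(STORAGE_TB.values())

for label, count, cores in BLADES:
    print(f"{label}: {count} * {cores} = {count * cores}")
print(f"total cores: {total_cores}, total storage: {total_storage_tb} TB")
```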
After NKN Network, Speed Increased to 300 Mbps
Grid-Peer Tier-3 Cluster
- 1U sliding LCD monitor with 16-port KVM.
- Dell(TM) PowerEdge(TM) M1000e blade server chassis.
- 16 Dell(TM) PowerEdge(TM) M610 high-performance Intel blades.
- Each blade has the latest Nehalem-based 2 * Intel quad-core E5530 Xeon 2.4 GHz CPUs with 8 MB cache.
- Each blade has 16 GB RAM.
- Each blade has 2 * 146 GB disks mounted as RAID 1.
- Installed SLC 5.6 x86_64 OS (kernel version 2.6.18-164.6.1.el5).
- Dell(TM) iSCSI EqualLogic storage
  - 16 * 2 TB SAS hard disks.
  - 24.88 TB usable space after RAID 5 and hot spare.
Grid-Peer Tier-3 Cluster (cont.)
- Total of 25 nodes for VECC users and PMD collaborators: 12 32-bit nodes and 13 64-bit computing nodes.
- The 32-bit nodes are on the oldest hardware, procured in 2004. We will slowly retire them because of their high noise, power consumption and heat generation.
- 25 TB of total storage.
- 50+ active users across India; 30+ active users in VECC.
- Quotas implemented.
- ROOT, Geant3, AliRoot, AliEn, Fortran and other user-specific software installed according to the hardware (32-bit or 64-bit).
- Extensively used by the users; needs to be extended.
By-products of the WLCG Grid
- Intra-DAE Grid
- EU-India Grid
- Health Grid
- IGCA
- GARUDA Grid
Thank You
Supporting Slides
Main data types in ALICE
- ESD: run/event numbers, trigger word, primary vertex, arrays of tracks/vertices, detector info
- AOD standard: cleaned-up ESDs, reducing the size by a factor of 5
  - Can be extended on user demand with extra information
- ESD and AOD inherit from the same base class (keeping the same event interface)
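The point about a shared base class can be illustrated with a small sketch: analysis code written against the common interface runs unchanged on either data type. The class and function names below (`VirtualEvent`, `ESDEvent`, `AODEvent`, `mean_pt`) are hypothetical stand-ins chosen for this example, not the actual AliRoot classes, and tracks are reduced to plain dictionaries.

```python
from abc import ABC, abstractmethod


class VirtualEvent(ABC):
    """Hypothetical common event interface shared by ESD and AOD."""

    def __init__(self, run, event):
        self.run = run
        self.event = event

    @abstractmethod
    def tracks(self):
        """Return the list of tracks stored in this event."""

    def n_tracks(self):
        return len(self.tracks())


class ESDEvent(VirtualEvent):
    """Full reconstruction output: tracks plus detector information."""

    def __init__(self, run, event, tracks, detector_info):
        super().__init__(run, event)
        self._tracks = list(tracks)
        self.detector_info = detector_info

    def tracks(self):
        return self._tracks


class AODEvent(VirtualEvent):
    """Cleaned-up ESD: keeps selected tracks only, drops detector info."""

    def __init__(self, run, event, tracks):
        super().__init__(run, event)
        self._tracks = list(tracks)

    @classmethod
    def from_esd(cls, esd, keep):
        """Filter an ESD event into an AOD event using a selection function."""
        return cls(esd.run, esd.event, [t for t in esd.tracks() if keep(t)])

    def tracks(self):
        return self._tracks


def mean_pt(event):
    """Analysis code that relies on the base interface only."""
    pts = [t["pt"] for t in event.tracks()]
    return sum(pts) / len(pts) if pts else 0.0


esd = ESDEvent(run=1, event=7,
               tracks=[{"pt": 0.2}, {"pt": 1.0}, {"pt": 3.0}],
               detector_info={"tpc_clusters": 120})
aod = AODEvent.from_esd(esd, keep=lambda t: t["pt"] > 0.5)

print(mean_pt(esd), mean_pt(aod))  # same function, two data types
```

Because `mean_pt` only touches `tracks()`, switching an analysis from ESD to AOD input does not require changing the analysis code itself.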
[Diagram: reconstruction and analysis flow]
- Inputs: raw data; conditions, calibration and alignment data from the OCDB (updated by pass0 to passN)
- AliRoot reconstruction, with the AliEn FC
- Event Summary Data: Pass1 at T0, Pass2 at T1, PassN at T1
- ESD filtering produces the standard AOD
- Analysis, and analysis + extra
- Monte Carlo
Job submission
[Diagram: job flow between the user, the ALICE central services and the site]
- The user submits jobs to the ALICE Job Catalogue. Each job lists its input files by logical file name (lfn): Job 1 (lfn1, lfn2, lfn3, lfn4), Job 2 (lfn1, lfn2, lfn3, lfn4), Job 3 (lfn1, lfn2, lfn3).
- The Optimizer splits the jobs into sub-jobs: Job 1.1 (lfn1), Job 1.2 (lfn2), Job 1.3 (lfn3, lfn4); Job 2.1 (lfn1, lfn3), Job 2.2 (lfn2, lfn4); Job 3.1 (lfn1, lfn3), Job 3.2 (lfn2).
- The AliEn CE submits the job and sends a job agent to the site through the VO-Box and the LCG chain (WMS, CE, WN).
- On the worker node the job agent checks the environment ("Env OK?"). If not, it dies with grace. If yes, it asks for a workload.
- Matchmaking uses close SEs and software; the agent receives and retrieves the workload, executes it, sends the job result, and the TQ is updated.
- The output is registered in the ALICE File Catalogue as entries of the form: lfn, guid, {SEs}.
- packman is also shown in the diagram.
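The Optimizer step in the job submission diagram, where Job 1 becomes Job 1.1, 1.2 and 1.3, can be sketched as grouping a job's input files by the storage element that holds them. This is an assumption-laden illustration rather than the AliEn implementation: the `split_job` function, the `location` mapping and the SE names are all invented for the example, chosen so that the result matches the Job 1 split shown on the slide.

```python
from collections import defaultdict


def split_job(job_id, lfns, location):
    """Split one job into sub-jobs, one per storage element holding its inputs.

    location maps each logical file name to the SE assumed to hold it.
    Sub-jobs are numbered in the order their SE is first encountered.
    """
    by_se = defaultdict(list)
    for lfn in lfns:
        by_se[location[lfn]].append(lfn)
    return {
        f"{job_id}.{n}": files
        for n, files in enumerate(by_se.values(), start=1)
    }


# Hypothetical file placement: lfn3 and lfn4 happen to sit on the same SE.
location = {"lfn1": "SE_A", "lfn2": "SE_B", "lfn3": "SE_C", "lfn4": "SE_C"}

sub_jobs = split_job("1", ["lfn1", "lfn2", "lfn3", "lfn4"], location)
for name, files in sub_jobs.items():
    print(f"Job {name}: {', '.join(files)}")
```

Grouping by location means each sub-job can be matched to a site close to its data, which is the role the "Close SE's & Software" matchmaking label plays in the diagram.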
Xrootd architecture
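[Diagram: client, redirector (head node) and data servers A, B, C forming one cluster]
- 1. The client asks the redirector to open file X.
- 2. The redirector asks the data servers: who has file X?
- 3. The redirector answers the client: go to C.
- 4. On a 2nd open of X the answer is again "go to C": redirectors cache the file location.
- The client sees all servers as xrootd data servers.
- All storages are on WAN.
- Global redirector (not in picture): intra-site storage collaboration.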
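The redirect-and-cache behaviour in the Xrootd architecture slide can be modelled in a few lines. This is a toy simulation under stated assumptions, not the xrootd protocol or its API: the `DataServer` and `Redirector` classes are invented here, and the servers are polled one after another for simplicity.

```python
class DataServer:
    """Toy data server that knows which files it stores."""

    def __init__(self, name, files):
        self.name = name
        self.files = set(files)

    def has(self, path):
        return path in self.files


class Redirector:
    """Toy head node: locates files and caches where they live."""

    def __init__(self, servers):
        self.servers = servers
        self.cache = {}    # path -> server name
        self.queries = 0   # number of "who has file X?" questions asked

    def locate(self, path):
        """Return the name of the server the client should go to."""
        if path in self.cache:
            return self.cache[path]          # 2nd open: no new queries
        for server in self.servers:
            self.queries += 1
            if server.has(path):
                self.cache[path] = server.name
                return server.name
        raise FileNotFoundError(path)


redirector = Redirector([
    DataServer("A", []),
    DataServer("B", []),
    DataServer("C", ["X"]),
])

first = redirector.locate("X")
queries_after_first = redirector.queries
second = redirector.locate("X")

print(f"go to {first}; queries so far: {queries_after_first}")
print(f"go to {second}; queries so far: {redirector.queries}")
```

The second lookup is answered from the cache, so the query counter does not grow, which is the saving the "Redirectors cache file location" note refers to.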
Grid security (in a nutshell!)
Important to be able to identify and authorise users
- Possibly to enable/disable certain actions
Using X509 certificates
- The Grid passport, delivered by a certification authority. (IGCA for India)
For using the Grid, create short-lived “proxies”
- Same information as the certificate
- … but only valid for the time of the action
Possibility to add a “group” and a “role” to a proxy
- Using the VOMS extensions
- Allows the same person to wear different hats (e.g. normal user or production manager)
Your certificate is your passport: you should sign whenever you use it, and don’t give it away!
- Less danger if a proxy is stolen (short-lived)
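The idea of a short-lived proxy carrying the holder's identity plus an optional group and role can be sketched as a small data structure. This is a conceptual model only: it performs no cryptography and is not the X509 or VOMS format; the `Proxy` class, its fields and the example subject string are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Proxy:
    """Toy model of a proxy: same identity as the certificate, short lifetime."""

    subject: str             # identity copied from the certificate
    issued_at: datetime
    lifetime: timedelta      # short on purpose, to limit damage if stolen
    group: str = ""          # optional VOMS-style group
    role: str = ""           # optional VOMS-style role

    def is_valid(self, now):
        """A proxy is usable only inside its validity window."""
        return self.issued_at <= now < self.issued_at + self.lifetime


issued = datetime(2012, 1, 10, 9, 0)

# The same (hypothetical) person wearing two different hats.
user_proxy = Proxy("CN=Example User", issued, timedelta(hours=12),
                   group="alice")
prod_proxy = Proxy("CN=Example User", issued, timedelta(hours=12),
                   group="alice", role="production")

print(user_proxy.is_valid(datetime(2012, 1, 10, 15, 0)))  # inside the window
print(user_proxy.is_valid(datetime(2012, 1, 11, 9, 0)))   # a day later
```

A stolen proxy stops being useful once `is_valid` turns false, whereas the long-lived certificate would remain a risk, which is why the certificate itself is never handed over.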
The VOBOX
The VOBOX is a WLCG service developed in 2006 to provide the experiments with a service to:
a) Run their own services.
b) In addition, it also provides file system access to the experiment software area.
The concept of VOBOX is not the same for the 4 LHC experiments:
a) ALICE requires the STANDARD WLCG VOBOX
Storage strategy
[Diagram: storage access from a worker node (WN)]
- The WN contacts the SE head node, which runs xrootd (manager).
- xrootd (worker) instances sit in front of the storage back ends: disk, DPM and Castor; dCache is reached through xrootd emulation (worker).
- SRM and MSS labels are attached to the back ends.
- DPM, CASTOR and dCache are LCG-developed SEs; xrootd is entering as a strategic solution.
- Annotations in the diagram: "Old implementation"; "Current version 2.1.8"; "Working, but severe limits with multiple clients".
What is MonALISA ?
- Caltech project started in 2002: http://monalisa.caltech.edu/
- Java-based set of distributed, self-describing services
- Offers the infrastructure to collect any type of information
- Can process it in near real time
- The services can cooperate in performing the monitoring tasks
- Can act as a platform for running distributed user agents
MonALISA software components and the connections between them
[Diagram: MonALISA layers]
- Data consumers: clients, HL services
- Multiplexing layer: proxies; helps firewalled endpoints connect
- Registration and discovery: JINI lookup services, secure & public
- Data gathering services: network of MonALISA services, with agents
Fully distributed system with no single point of failure
PROOF
- Parallel ROOT Facility
- Interactive parallel analysis on a local cluster
- Parallel processing of (local) data
- Fast feedback
- Output handling with direct visualization
PROOF is part of ROOT
PROOF Schema
[Diagram: PROOF workflow]
- Client (local PC): runs root, sends the analysis macro ana.C to the remote PROOF cluster and gets back stdout/result.
- PROOF master: passes ana.C on to the PROOF slaves.
- PROOF slaves (node1, node2, node3, node4): each runs root on its own data and returns a result.
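The master/slave pattern in the PROOF schema, where the same analysis runs on several nodes over different parts of the data and the partial results are merged, can be imitated with Python's standard library. This is an analogy, not PROOF itself: threads stand in for the four nodes, `analyse` stands in for ana.C, and the data is split statically, which is an assumption of this sketch rather than a description of how PROOF distributes work.

```python
from concurrent.futures import ThreadPoolExecutor


def analyse(chunk):
    """Stand-in for the analysis macro (ana.C): here, just sum the values."""
    return sum(chunk)


def run_parallel(data, n_workers=4):
    """Act as the master: split the data, run workers, merge the results."""
    # One interleaved slice per worker (static split, assumed for simplicity).
    chunks = [data[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(analyse, chunks))
    return sum(partial_results)  # merge step done by the master


events = list(range(100))
result = run_parallel(events)

print(result, result == analyse(events))  # merged result vs. serial run
```

The merged result equals the serial one because the analysis here is a plain sum; any analysis whose partial results can be combined this way fits the same pattern.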