HTCondor at HEPiX, WLCG and CERN – Status and Outlook
Helge Meinhard / CERN HTCondor week 2018 Madison (WI) 22 May 2018 CERN material courtesy by Ben Jones
HEPiX
From our Web site:
“The HEPiX forum brings together worldwide Information Technology staff, including system administrators, system engineers, and managers from the High Energy Physics and Nuclear Physics laboratories and institutes, to foster a learning and sharing experience between sites facing scientific computing and data challenges.”
Upcoming meetings: spring 2019, San Diego (CA); autumn/fall 2019, Amsterdam (The Netherlands)
(“opportunistic usage”)
Site        Batch scheduler
CERN        See later
BNL         HTCondor
FNAL        HTCondor
KIT         HTCondor
Nordic T1   Slurm
CC-IN2P3    UGE, considering HTC
RAL         HTCondor
Nikhef      PBS
PIC         Migration to HTC 60% done
CNAF        Migration to HTC started
Site        Batch scheduler
US T2       Mostly HTCondor
LBNL        Slurm
IHEP        HTCondor, (Slurm)
DESY        HTCondor, (Slurm)
FZU         Migration to HTCondor
U T         LSF
CSCS        Slurm
GRIF        HTCondor
CoEPP       HTCondor
Cores per batch system (chart):
  2016: LSF 75000, HTCondor 20000
  2018: LSF 46000, HTCondor 185000
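The shift between the two batch systems is easier to see as a share of the total. The sketch below is a minimal illustration that uses only the four core counts from the chart. It assumes, from the order of the chart labels, that the first value for each system belongs to 2016 and the second to 2018. The function name `htcondor_share` is made up for this example.

```python
# Core counts as read off the chart; the year assignment is an assumption
# based on the order of the labels (first value 2016, second value 2018).
cores = {
    2016: {"LSF": 75000, "HTCondor": 20000},
    2018: {"LSF": 46000, "HTCondor": 185000},
}


def htcondor_share(year):
    """Return the fraction of batch cores managed by HTCondor in a year."""
    by_system = cores[year]
    return by_system["HTCondor"] / sum(by_system.values())


for year in sorted(cores):
    total = sum(cores[year].values())
    print(f"{year}: {total} cores in total, HTCondor share {htcondor_share(year):.0%}")
```

Under that assumption the total grows from 95000 to 231000 cores, and the HTCondor share rises from about 21% to about 80%.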
Are we there yet? Almost…
Batch:
More than ... physics jobs per day
More than ... of cloud capacity
Grid vs. Local

Authentication
  Grid:  X509 Proxy
  Local: Kerberos
Submitters
  Grid:  LHC experiments, COMPASS, NA62, ILC, DUNE…
  Local: Local users of experiments, Beams, Theorists, AMS, ATLAS Tier-0
Submission method
  Grid:  Submission frameworks: GlideinWMS, Dirac, PanDA, AliEn
  Local: From condor_submit by hand, to complicated DAGs, to Tier-0 submit frameworks
Storage
  Grid:  Grid protocols: SRM, XRootD…
  Local: AFS, EOS
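For the simplest local case, "condor_submit by hand", a user writes a submit description file and passes it to condor_submit. The sketch below is a hedged illustration that only builds the text of such a file. The helper `submit_description`, the script name `analyse.sh`, the argument `run42.root` and the `job` file prefix are all hypothetical and do not come from the talk. The submit commands (`executable`, `arguments`, `output`, `error`, `log`, `queue`) and the `$(ClusterId)` and `$(ProcId)` macros are standard HTCondor.

```python
def submit_description(executable, arguments="", n_jobs=1, tag="job"):
    """Build the text of a minimal HTCondor submit description file.

    All parameter names are illustrative; only the submit commands and
    the $(ClusterId) / $(ProcId) macros are HTCondor's own.
    """
    lines = [
        f"executable = {executable}",
        f"arguments  = {arguments}",
        # One stdout/stderr file per job, one shared log per cluster.
        f"output     = {tag}.$(ClusterId).$(ProcId).out",
        f"error      = {tag}.$(ClusterId).$(ProcId).err",
        f"log        = {tag}.$(ClusterId).log",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example: three copies of a local analysis script.
text = submit_description("analyse.sh", "run42.root", n_jobs=3)
print(text)
```

Saved to a file such as `analyse.sub` (again a made-up name), the text could then be handed to `condor_submit analyse.sub`. DAGs and the Tier-0 submit frameworks sit on top of the same mechanism.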
[Chart: GRID, ATLAS, CMS, LHCb and ALICE over Run 1, Run 2, Run 3 and Run 4; axis ticks from 20 to 160]
https://indico.cern.ch/event/676324/contributions/2981816/
[Timeline, 2015 and 2016: 2nd Mar., 23rd Mar., 6th Nov., 20th Nov., 1st Aug.]
across clouds
have integrated GCE, Azure, AWS, CERN…
https://kccnceu18.sched.com/event/Duoa
engine or just IAAS
container layer batch team
Host: Schedd, Negotiator, Collector
kubefed init fed --host-cluster-context=condor-host ...
kind: DaemonSet
...
hostNetwork: true
containers:
- image: .../cloud/condor-startd
  command: ["/usr/sbin/condor_startd", "-f"]
  securityContext:
    privileged: true
  livenessProbe:
    exec:
      command:
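The DaemonSet fragment above leaves out the surrounding structure. The sketch below shows where the visible fields would sit in a complete manifest, built as a plain Python dictionary and printed as JSON. It is a minimal sketch under stated assumptions: the `apps/v1` API version, the `condor-startd` name and the `app` label are not in the slide, and the function name `startd_daemonset` is made up. The image path is kept exactly as elided in the source, and the liveness probe is omitted because its command is cut off.

```python
import json


def startd_daemonset(image, name="condor-startd"):
    """Return a DaemonSet manifest that runs condor_startd on every node.

    The name and label are illustrative assumptions. The command,
    hostNetwork and privileged settings are the ones on the slide.
    """
    labels = {"app": name}
    container = {
        "name": name,
        "image": image,
        "command": ["/usr/sbin/condor_startd", "-f"],
        "securityContext": {"privileged": True},
    }
    return {
        "apiVersion": "apps/v1",  # assumption: not shown in the slide
        "kind": "DaemonSet",
        "metadata": {"name": name},
        "spec": {
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"hostNetwork": True, "containers": [container]},
            },
        },
    }


# The image path is left elided, exactly as it appears in the source.
manifest = startd_daemonset(".../cloud/condor-startd")
print(json.dumps(manifest, indent=2))
```

A DaemonSet schedules one pod per node, so every node that joins the cluster would start one StartD.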
Host: Schedd, Negotiator, Collector
StartD ... StartD ... StartD ...
kubefed init fed --host-cluster-context=condor-host ...
kubefed join --context fed tsystems \