ALBA Synchrotron Data analysis IT infrastructure status report
IT Systems - Computing
Antoni Pérez ALBA DA / DaaS infrastructure 23/05/2017
Overview
- HPC Infrastructure
- DaaS 1st use case: BL01 MIRAS
– XenDesktop Windows VDI with OPUS 7.5 & Unscrambler X 10.3
- Pilot container use:
– Kubernetes for Controls Continuous Integration, KPI Dashboards (Apache)
- Géant Cloud Framework
- Configuration Management with “SaltStack”
- Orchestrators
HPC infrastructure
12 nodes running CentOS 7, with jobs managed by Slurm
- 11 CPU nodes and 1 GPU node (NVIDIA Tesla P100 16GB)
- 2.4 TFLOPS of computing capacity
- 216 CPU cores with 1408 GB RAM
- 1 Gbps and 10 Gbps Ethernet connections
- MPI 3.0
- 40 TB of distributed BeeGFS scratch space on 2 nodes
- CLAs (CLuster Access VMs)
– Dedicated VMs per user group
– Isolated HPC management network with HPC nodes and CLAs (through a VyOS virtual router)
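As a rough illustration of how the figures above translate into job requests, the following Python sketch computes the cluster-wide average memory per core (1408 GB over 216 cores) and renders a minimal Slurm batch script. The job name, task count, time limit, and executable are hypothetical, only standard `sbatch` options are used, and individual nodes may differ from the average.

```python
# Cluster-wide figures quoted on the slide.
TOTAL_CORES = 216
TOTAL_RAM_GB = 1408


def avg_mem_per_core_mb(total_ram_gb: int = TOTAL_RAM_GB,
                        total_cores: int = TOTAL_CORES) -> int:
    """Average RAM per core in MB (1 GB taken as 1024 MB), rounded down."""
    return total_ram_gb * 1024 // total_cores


def render_sbatch(job_name: str, ntasks: int, mem_per_cpu_mb: int,
                  time_limit: str, command: str) -> str:
    """Render a minimal Slurm batch script using standard sbatch options."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --ntasks={ntasks}",
        f"#SBATCH --mem-per-cpu={mem_per_cpu_mb}M",
        f"#SBATCH --time={time_limit}",
        f"srun {command}",  # srun launches the tasks inside the allocation
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job: all values below are placeholders for illustration.
    mem = avg_mem_per_core_mb()
    print(render_sbatch("demo_mpi", 16, mem, "01:00:00", "./my_mpi_app"))
```

The average is an upper bound for sizing: the operating system and services on each node also need memory, so a real request would normally stay below it.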
Remarks
- Using the Spack package management tool to install software with packages on top of the installed distro
Next steps:
- Study offering remote access to CLAs as DaaS
- Renew and increase the capacity of the node infrastructure
- Identify useful use cases for Docker containers vs Spack
- Evaluate use of rCUDA (remote CUDA - http://www.rcuda.net/ - Universitat Politècnica de València)
– Runs the same CUDA code; GPU nodes are shared through Slurm with a 2-10% performance loss
– Remote GPUs can be assigned concurrently to several VMs
- Evaluate possible FPGA use for processing acceleration, for the latest Deep Learning BL requests to come
DaaS: BL01 MIRAS
First DaaS case at ALBA: remote access to process and reprocess acquired data using Windows virtual desktops (Citrix XenDesktop VDI)
- Remotely available
- Application license restrictions due to per-country embargoes are managed through groups in the VMs
XenDesktop VDI technology advantages for remote DaaS:
- Single GPU partitioned among several VMs
- Compressed traffic (default HDX 3D graphics encoder: H.264)
- Low latency, adaptive encoding, even over mobile networks
- NetScaler firewalled access portal: https://beamlinesvdi.cells.es
- Site licensing: per maximum concurrent users, 78 (XenApp & XenDesktop)
Next steps:
- Adding Linux XenDesktop VDI machines (technology supported since 3 years ago)
- Evaluate possible connection to the Calipso+ access portal
- Compare XenDesktop performance vs alternatives like Guacamole for remote access
Other VDI technology use cases:
- Windows CAD design VMs with shared NVIDIA K2 GPUs
- Thin clients with local GPU HDX 3D decoding (Dell Wyse 3040, < 300 EUR)
Kubernetes containers pilot
Pilot use cases:
- Controls software Continuous Integration
- Hosting the ELK KPI dashboards portal
Next steps:
- Evaluating the design of a future production environment:
– Secured: multi-tenant
– HA strategies
– Proper VLAN execution
– Repositories
– ...
Configuration Management with “SaltStack”
- “SaltStack” is the main CM tool, used to store and distribute software configurations (recipes)
– In use:
- Installing machines with Cobbler via PXE
- Configuring all Debian CR/BL Linux workstations
- Configuring CentOS HPC cluster nodes & CLAs
git pull; salt -E 'cla*|slurm*|hpcnode0*' -b 1000 state.highstate saltenv=staging
– Being introduced progressively on new Control & infrastructure servers
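The `-E` option in the command above selects minions by regular expression rather than by shell glob, so `cla*` reads as "cl followed by zero or more a", not "cla followed by anything". The Python sketch below illustrates that behaviour, assuming the expression is applied with `re.match` semantics (anchored at the start of the minion ID). The minion names are hypothetical and only serve as examples.

```python
import re
from typing import List

# Target expression copied from the salt command on the slide.
TARGET = r"cla*|slurm*|hpcnode0*"

# Hypothetical minion IDs, for illustration only.
MINIONS = ["cla01", "slurm-master", "hpcnode01", "hpcnode12",
           "cluster-db", "ws-bl01"]


def select(target: str, minion_ids: List[str]) -> List[str]:
    """Return the IDs whose beginning matches the expression."""
    pattern = re.compile(target)
    # pattern.match anchors at the start of the string only,
    # so any ID that merely begins with a matching prefix is selected.
    return [m for m in minion_ids if pattern.match(m)]


if __name__ == "__main__":
    print(select(TARGET, MINIONS))
```

Under these assumptions, names such as `cluster-db` or `hpcnode12` would also be selected, because `a*` and `0*` accept zero repetitions. If only the intended prefixes should match, an expression like `cla.*|slurm.*|hpcnode0.*` would be stricter.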
Géant Cloud Framework
- ALBA: framework contract with AWS through Sparkle, in the last legal review steps:
– Remote archive on Glacier as the 1st foreseen pilot project
- Géant Cloud Framework: https://clouds.geant.org/
– Already operative and joinable by Géant network members
– Framework ends 31 December 2020
– New edition in preparation
– Géant eduGAIN authentication through the Géant national organizations
– Each country is represented by its Géant national organization (RedIRIS in Spain)
– Country and EU data regulations and restrictions already addressed
– Independent contracts to sign with a company awarded in a country and the selected cloud service (Azure, AWS, other proprietary)
- Each company offers different control tools for groups and limits
- Monthly invoicing with annual order limits
- Different hosting discounts among reselling companies and cloud providers
VMs Orchestrators
- Existing “in-house cloud” based on Citrix XenDesktop and XenApp
- Still no relevant experience at ALBA with other VM orchestrators like OpenStack, OpenNebula, …
– Waiting to sync with the blueprints and common architecture use cases of international DaaS collaborations