PROOF as a Service on the Cloud: a Virtual Analysis Facility based on the CernVM ecosystem
Dario Berzano, R. Meusel, G. Lestaris, I. Charalampidis, G. Ganis, P. Buncic, J. Blomer
CERN PH-SFT
CHEP2013, Amsterdam, 15.10.2013
Dario.Berzano@cern.ch - PROOF as a Service on the Cloud - http://chep2013.org/contrib/308
A cloud-aware analysis facility
- Geographically distributed, independent cloud providers
- From IaaS to SaaS: admins provide virtual clusters, and the user's workflow does not change
- Virtual Analysis Facility → an analysis cluster on the cloud in one click
Clouds can be a troubled environment:
- Resources are diverse
→ Like the Grid, but at the virtual machine level
- Virtual machines are volatile
→ They might appear and disappear without notice

Building a cloud-aware application for HEP:
- Scale promptly when resources vary
→ No prior pinning of the data to process to the workers
- Deal smoothly with crashes
→ Automatic failover and clear recovery procedures

The usual Grid workflow (static job pre-splitting) is not cloud-aware.
PROOF is cloud-aware
PROOF: the Parallel ROOT Facility
- Based on unique advanced features of ROOT
- Event-based parallelism
- Automatic merging and display of results
- Runs on batch systems and Grid with PROOF on Demand
PROOF is interactive
- Constant control and feedback of attached resources
- Data is not preassigned to the workers → pull scheduler
- New workers dynamically attached to a running process (new in ROOT v5.34.10)

Interactivity is what makes PROOF cloud-aware
PROOF is cloud-aware
- Zero configuration
No system-wide installation
- Sandboxing
User crashes don’t propagate to others
- Self-servicing
User can restart her PROOF server
- Advanced scheduling
Leverage policies of underlying WMS
PROOF on Demand (PoD) runs PROOF on top of batch systems: http://pod.gsi.de (a usage sketch follows below)
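As an illustration of the user side, the sketch below (PyROOT) connects to a PoD-provided PROOF cluster and processes a chain. It assumes `pod-server start` and `pod-submit` have already been run; the tree name, file paths and selector are hypothetical, and if the "pod://" shorthand is not available, the connection string printed by `pod-info -c` can be passed instead.

```python
# Illustrative sketch (PyROOT): process a chain on a PoD-provided PROOF cluster.
# Assumes the PoD server is running and workers have been submitted; the tree
# name, file paths and selector below are hypothetical.
import ROOT

# "pod://" picks up the local PoD connection string; alternatively pass the
# output of `pod-info -c` explicitly.
proof = ROOT.TProof.Open("pod://")
print("Attached workers: %d" % proof.GetParallel())

chain = ROOT.TChain("events")                                # hypothetical tree
chain.Add("root://eos.example.org//data/sample_001.root")    # hypothetical files
chain.Add("root://eos.example.org//data/sample_002.root")
chain.SetProof()                 # route Process() through the PROOF cluster
chain.Process("MySelector.C+")   # hypothetical TSelector, compiled with ACLiC
```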
PROOF is cloud-aware

Adaptive workload: a very granular (down to per-event) pull architecture
[Diagram: master–worker pull protocol — a worker signals "ready", the master's packet generator sends the next packet, the worker processes it and asks for the next one ("get next").]

[Plots: packets processed per worker and worker activity stop time — the workload distribution across workers is nonuniform, but completion is uniform: the query processing time has mean 2287 s and RMS 16.61 s, i.e. all workers are done within ~20 s of each other.]
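The mechanism can be sketched in a few lines of Python (an illustration only, not PROOF's actual code): the master owns the packet generator and hands out the next packet whenever a worker declares itself ready, so faster workers simply process more packets.

```python
# Illustrative pull scheduler (not the actual PROOF code): the master owns the
# packet generator; workers ask for the next packet as soon as they are ready,
# so faster workers naturally take a larger share of the workload.
from queue import Queue
from threading import Thread

def packet_generator(n_events, packet_size):
    """Split [0, n_events) into (first, last) packets."""
    for first in range(0, n_events, packet_size):
        yield (first, min(first + packet_size, n_events))

def worker(name, packets, results):
    processed = 0
    while True:
        packet = packets.get()      # "get next"
        if packet is None:          # sentinel: no more work
            break
        first, last = packet
        processed += last - first   # stand-in for the real event loop
    results.put((name, processed))

packets, results = Queue(), Queue()
workers = [Thread(target=worker, args=("worker-%d" % i, packets, results))
           for i in range(4)]
for w in workers:
    w.start()
for packet in packet_generator(n_events=100000, packet_size=1000):
    packets.put(packet)
for _ in workers:                   # one termination sentinel per worker
    packets.put(None)
for w in workers:
    w.join()
while not results.empty():
    print(results.get())            # packets processed per worker will differ
```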
PROOF is cloud-aware

Dynamic addition of workers (new in ROOT v5.34.10): new workers can join and offload a running process
[Diagram: timeline of a running query — the master and the initially available workers perform a bulk init and start processing; workers that appear later register with the master, go through a deferred init and join the ongoing processing.]
PROOF dynamic workers

The user requests N workers: a bunch of workers starts immediately, the others gradually become available.

Old workflow:
- Wait until "some" workers are ready
- Run the full analysis on those workers only
- Late workers will be available only at the next run

New workflow:
- Wait until at least one worker becomes available
- Run the analysis
- Additional workers join the processing as they become available

Minimal latency and optimal resource usage. See the ATLAS use case: http://chep2013.org/contrib/256
PROOF dynamic workers

Measured the time taken for 100 Grid jobs, requested at the same time, to start on various ATLAS Grid sites. See the ATLAS talk: http://chep2013.org/contrib/256

[Plot: number of available workers (up to 100) vs. time (up to 3500 s) for the CERN, CNAF, ROMA1, NAPOLI and MILANO sites.]
PROOF dynamic workers

[Plot: actual time to results (up to ~1200 s) vs. total required computing time (up to ~35000 s), comparing Grid batch jobs (with the ideal number of workers) and PROOF with pull scheduling and dynamic workers.]

Analytically derived from the actual startup latency measurements: by design, PROOF is up to 30% more efficient on the same computing resources (analytical upper limit).
- PROOF with dynamic workers: all the job time is spent computing (never idle, no latencies)
- Batch jobs: results are collected only when the last workers have finished (latencies and dead times)
The virtual analysis facility
- What: a cluster of µνCernVMs with HTCondor
→ One head node plus a scalable number of workers
- How: contextualization configured on the Web
→ Simple web interface: http://cernvm-online.cern.ch
- Who: so easy that it can even be created by end users
→ You can have your personal analysis facility
- When: scales up/down automatically
→ Optimal usage of resources: fundamental when you pay for them!
[Stack diagram of the VAF components: PROOF, PoD, Elastiq, HTCondor, µνCernVM, CernVM-FS, CernVM Online, authn/authz.]
The virtual analysis facility
VAF leverages the CernVM ecosystem and HTCondor
- µCernVM: SLC6 compatible OS on demand
→ See previous talk: http://chep2013.org/contrib/213
- CernVM-FS: HTTP-based cached FUSE filesystem
→ Both the OS and the experiment software are downloaded on demand
- CernVM Online: safe context GUI and repository
→ See previous talk: http://chep2013.org/contrib/185
- HTCondor: light and stable workload management system
→ Workers auto-register to the head node: no static resource configuration
The full stack of components is cloud-aware
Elastiq queue monitor
[Diagram: elastiq watches the HTCondor queue (waiting and running jobs) and the running VMs (idle and working), and asks the cloud controller to start new VMs or shut down idle ones.]

elastiq: a Python app that monitors HTCondor and scales the cluster up or down (sketched below)
- Jobs waiting too long trigger a scale up
- EC2 interface (credentials are given securely in the context)
- Can be used on any HTCondor cluster and has a trivial configuration
- Code available at http://bit.ly/elastiq

CernVM Cloud: an experimental meta cloud controller
- Accepts scale requests
- Translates them to multiple clouds
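The core loop of such a queue monitor can be sketched roughly as follows (a simplified Python sketch, not the actual elastiq code; the `condor_q` invocation, thresholds and slot count are assumptions, and the symmetric shutdown of idle VMs is omitted):

```python
# Rough sketch of an elastiq-like loop (not the actual elastiq code): count idle
# HTCondor jobs and request more worker VMs when they have waited too long.
# Thresholds and the condor_q invocation are assumptions.
import subprocess
import time

WAITING_THRESHOLD_S = 100   # scale up if jobs stay idle longer than this
JOBS_PER_VM = 4             # assumed number of job slots per worker VM

def idle_jobs():
    """Number of idle jobs (JobStatus == 1) in the HTCondor queue."""
    out = subprocess.check_output(["condor_q", "-format", "%d\n", "JobStatus"])
    return sum(1 for line in out.decode().splitlines() if line.strip() == "1")

def scale_up(n_vms):
    """Request n_vms worker VMs from the cloud (EC2 RunInstances, e.g. via
    boto), passing the CernVM Online context as user-data."""
    print("Would start %d worker VM(s)" % n_vms)   # placeholder for the EC2 call

idle_since = None
while True:
    n_idle = idle_jobs()
    if n_idle == 0:
        idle_since = None                  # queue drained, reset the timer
    elif idle_since is None:
        idle_since = time.time()           # first time we see idle jobs
    elif time.time() - idle_since > WAITING_THRESHOLD_S:
        scale_up((n_idle + JOBS_PER_VM - 1) // JOBS_PER_VM)
        idle_since = None
    time.sleep(60)
```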
Elastic cloud computing in action
Context creation with CernVM Online: http://cernvm-online.cern.ch
1. Create a new special context
2. Customize a few options
3. Get the generated user-data
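The generated user-data can then be fed unchanged to any EC2-compatible cloud. A minimal sketch with boto follows; the endpoint, credentials, image ID, flavor and key name are all placeholders:

```python
# Minimal sketch: start one contextualized VM on an EC2-compatible cloud using
# the user-data generated by CernVM Online. Endpoint, credentials, image ID,
# flavor and key name are placeholders.
import boto
from boto.ec2.regioninfo import RegionInfo

conn = boto.connect_ec2(
    aws_access_key_id="EC2_ACCESS_KEY",
    aws_secret_access_key="EC2_SECRET_KEY",
    region=RegionInfo(name="mycloud", endpoint="cloud.example.org"),
    port=8773, path="/services/Cloud", is_secure=False)

with open("cernvm-online-context.txt") as f:   # user-data from CernVM Online
    user_data = f.read()

conn.run_instances("ami-00000001",             # placeholder image ID
                   instance_type="m1.large",   # placeholder flavor
                   key_name="mykey",
                   user_data=user_data)
```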
Elastic cloud computing in action
Screencast: http://youtu.be/fRq9CNXMcdI
µνCernVM+PROOF startup latency
Measured the delay before requested resources become available
Target clouds:
- Small: OpenNebula @ INFN Torino
- Large: OpenStack @ CERN (Agile)
Test conditions:
- µνCernVMs use an HTTP caching proxy
→ Precaching via a dummy boot
- The µνCernVM image is only 12 MB
→ Image transfer time negligible
- VMs deployed when resources are available
→ Rule out delay and errors due to lack of resources
Measuring latency due to:
- µνCernVM boot time
- HTCondor automatic registration of new nodes
- PoD and PROOF reaction time
Note: this is not a comparison of cloud infrastructures; only the µCernVM+PROOF latencies are measured.
µνCernVM+PROOF startup latency

[Plot: time to wait for workers [m:ss], ranging from 0:00 to 8:00, for CERN OpenStack and Torino OpenNebula.]

Measured the time elapsed between the request of the PoD workers and their availability (pod-info -l), with 10 VMs started in the test. The results are compatible: the latency is ~6 minutes from scratch.
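The measurement itself can be reproduced with a small polling script along these lines (a sketch; it assumes `pod-info -n` prints the number of PoD workers currently online and that the workers have already been requested with `pod-submit`):

```python
# Sketch: measure how long the requested PoD workers take to become available,
# by polling `pod-info -n` (assumed to print the number of online PoD agents).
import subprocess
import time

def online_workers():
    out = subprocess.check_output(["pod-info", "-n"]).decode().strip()
    return int(out) if out else 0

requested = 10                      # workers requested via pod-submit
start = time.time()
while online_workers() < requested:
    time.sleep(5)
print("All %d workers available after %.0f s" % (requested, time.time() - start))
```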
Conclusions
Every VAF layer is cloud-aware
- PROOF+HTCondor deal with “elastic” addition/removal of workers
- µCernVM is very small and fast to deploy
- CernVM-FS downloads only what is needed
Consistent configuration of solid and independent components
- No login to configure: all done via CernVM Online context
- PROOF+PoD also work dynamically on the Grid
- Elastiq can scale any HTCondor cluster, not PROOF-specific
- Reused existing components wherever possible
Thank you for your attention!
References
- PROOF (the Parallel ROOT Facility)
http://root.cern.ch/drupal/content/proof
- Virtual Analysis Facility client and Elastiq
https://github.com/dberzano/virtual-analysis-facility
- The CernVM Ecosystem
http://cernvm.cern.ch/portal/publications
- Cloud @ INFN Torino
http://chep2013.org/contrib/474
- CERN Agile Infrastructure