13 Mar 2013 John Hover
Large-scale Cloud-based clusters using Boxgrinder, Condor, Panda, and APF
John Hover
OSG All-Hands Meeting 2013
Indianapolis, Indiana

Outline
Rationale
In general...
OSG-specific
Dependencies/Limitations
name: sl5-x86_64-base
os:
  name: sl
  version: 5
hardware:
  partitions:
    "/":
      size: 5
packages:
repos:
    baseurl: "http://host/path/repo"
files:
  "/root/.ssh":
  "/etc":
post:
  base:
name: sl5-x86_64-batch
appliances:
packages:
repos:
    baseurl: "http://research.cs.wisc.edu/htcondor/yum/stable/rhel5"
files:
  "/etc":
post:
  base:
name: sl5-x86_64-wn-osg
summary: OSG worker node client.
appliances:
packages:
repos:
    baseurl: "http://dev.racf.bnl.gov/yum/snapshots/rhel5/osg-release-2012-07-10/x86_64"
    baseurl: "http://dev.racf.bnl.gov/yum/grid/osg-epel-deps/rhel/5Client/x86_64"
files:
  "/etc":
post:
  base:
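
The three definitions above are named as layers (base, batch, wn-osg), and two of them carry an appliances: key. The entries under that key are not preserved in these slides, so the parent links in the following Python sketch are an assumption inferred from the names, not something the slides confirm. The helper build_order is hypothetical and only illustrates how a layered set of definitions resolves to a parents-first order.

```python
# Hypothetical parent links: the entries under "appliances:" are not
# preserved in the slides, so this chain is assumed from the names alone.
APPLIANCES = {
    "sl5-x86_64-base": [],
    "sl5-x86_64-batch": ["sl5-x86_64-base"],
    "sl5-x86_64-wn-osg": ["sl5-x86_64-batch"],
}


def build_order(name, appliances, seen=None):
    """Return the definitions needed for `name`, parents first.

    Assumes the parent links contain no cycles.
    """
    if seen is None:
        seen = []
    if name in seen:
        return seen
    for parent in appliances[name]:
        build_order(parent, appliances, seen)
    seen.append(name)
    return seen


if __name__ == "__main__":
    for step in build_order("sl5-x86_64-wn-osg", APPLIANCES):
        print(step)
```

Under this assumption each layer only needs to declare what it adds (the Condor repository for the batch layer, the OSG repositories for the worker-node layer), which is why the later definitions above omit the os: and hardware: blocks.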
# ~/.boxgrinder/config
plugins:
  openstack:
    username: jhover
    password: XXXXXXXXX
    tenant: bnlcloud
    host: cldext03.usatlas.bnl.gov
    port: 9292
  s3:
    access_key: AKIAJRDFC4GBBZY72XHA
    secret_access_key: XXXXXXXXXXX
    bucket: racf-cloud-1
    account_number: 4159-7441-3739
    region: us-east-1
    snapshot: false
# /etc/apf/queues.conf
[BNL_CLOUD]
wmsstatusplugin = Panda
wmsqueue = BNL_CLOUD
batchstatusplugin = Condor
batchsubmitplugin = CondorLocal
schedplugin = Activated
sched.activated.max_pilots_per_cycle = 80
sched.activated.max_pilots_pending = 100
batchsubmit.condorlocal.proxy = atlas-production
batchsubmit.condorlocal.executable = /usr/libexec/wrapper.sh

[BNL_CLOUD-ec2-spot]
wmsstatusplugin = CondorLocal
wmsqueue = BNL_CLOUD
batchstatusplugin = CondorEC2
batchsubmitplugin = CondorEC2
schedplugin = Ready,MaxPerCycle,MaxToRun
sched.maxpercycle.maximum = 100
sched.maxtorun.maximum = 5000
batchsubmit.condorec2.gridresource = https://ec2.amazonaws.com/
batchsubmit.condorec2.ami_id = ami-7a21bd13
batchsubmit.condorec2.instance_type = m1.xlarge
batchsubmit.condorec2.spot_price = 0.156
batchsubmit.condorec2.access_key_id = /home/apf/ec2-racf-cloud/access.key
batchsubmit.condorec2.secret_access_key = /home/apf/ec2-racf-cloud/secret.key
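
The schedplugin line of the EC2 queue chains three plugins and gives two of them a maximum. The Python sketch below is one possible reading of such a chain, written only to make the arithmetic concrete. It is not APF's implementation: the functions ready, max_per_cycle, max_to_run and decide are hypothetical, and the rule that each plugin takes the previous plugin's number and may only lower it is an assumption. The configuration keys and values are taken from the queue definition above.

```python
import configparser

# Excerpt of the [BNL_CLOUD-ec2-spot] queue definition (only the keys used here).
QUEUES_CONF = """
[BNL_CLOUD-ec2-spot]
schedplugin = Ready,MaxPerCycle,MaxToRun
sched.maxpercycle.maximum = 100
sched.maxtorun.maximum = 5000
"""


def load_queue(text, section):
    """Parse INI text and return one section as a plain dict of strings."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser[section])


# Hypothetical plugin semantics (assumed, not taken from APF itself).
def ready(n, queue, state):
    # Start from the work that is waiting and not already covered by pending VMs.
    return max(state["ready"] - state["pending"], 0)


def max_per_cycle(n, queue, state):
    # Never submit more than the per-cycle maximum.
    return min(n, int(queue["sched.maxpercycle.maximum"]))


def max_to_run(n, queue, state):
    # Keep running + pending + new submissions at or below the overall maximum.
    room = int(queue["sched.maxtorun.maximum"]) - state["running"] - state["pending"]
    return max(min(n, room), 0)


PLUGINS = {"Ready": ready, "MaxPerCycle": max_per_cycle, "MaxToRun": max_to_run}


def decide(queue, state):
    """Run the configured plugin chain in order and return the submit count."""
    n = 0
    for name in queue["schedplugin"].split(","):
        n = PLUGINS[name.strip()](n, queue, state)
    return n


if __name__ == "__main__":
    queue = load_queue(QUEUES_CONF, "BNL_CLOUD-ec2-spot")
    print(decide(queue, {"ready": 250, "pending": 20, "running": 0}))
```

Under these assumptions the per-cycle cap limits how quickly the pool ramps up, while the overall cap bounds its final size.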
Type       Memory  VCores  "CUs"  CU/Core  Typical $Spot/hr  $On-Demand/hr  Slots?
m1.small   1.7G    1       1      1        .007              .06
m1.medium  3.75G   1       2      2        .013              .12            1
m1.large   7.5G    2       4      2        .026              .24            3
m1.xlarge  15G     4       8      2        .052              .48            7
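
The prices in the table can be compared per compute unit. The Python sketch below does that arithmetic for the three rows whose type names appear in the table. The function cost_summary and its field names are illustrative, and the spot figures are the "typical" values listed above, not guaranteed prices.

```python
# (type, compute units, typical spot $/hr, on-demand $/hr), copied from the table.
INSTANCES = [
    ("m1.small", 1, 0.007, 0.06),
    ("m1.large", 4, 0.026, 0.24),
    ("m1.xlarge", 8, 0.052, 0.48),
]

# Spot bid from the [BNL_CLOUD-ec2-spot] queue definition (instance type m1.xlarge).
SPOT_BID_XLARGE = 0.156


def cost_summary(instances):
    """Return per-CU hourly prices and the on-demand/spot ratio for each type."""
    rows = []
    for name, cus, spot, on_demand in instances:
        rows.append({
            "type": name,
            "spot_per_cu": spot / cus,
            "on_demand_per_cu": on_demand / cus,
            "on_demand_over_spot": on_demand / spot,
        })
    return rows


if __name__ == "__main__":
    for row in cost_summary(INSTANCES):
        print(
            f'{row["type"]:10s} '
            f'spot/CU={row["spot_per_cu"]:.4f} '
            f'on-demand/CU={row["on_demand_per_cu"]:.4f} '
            f'ratio={row["on_demand_over_spot"]:.2f}'
        )
    # How the configured bid compares with the typical m1.xlarge spot price.
    print(f"bid / typical spot = {SPOT_BID_XLARGE / 0.052:.2f}")
```

With these figures on-demand costs the same per compute unit for all three types ($0.06 per hour), and roughly 8.6 to 9.2 times the typical spot price. The spot_price of 0.156 in the queue configuration is three times the typical m1.xlarge spot price.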