Build Your Cluster with Rocks
Yu Fu
University of Florida
2011 OSG Summer Workshop
Lubbock, TX
Build Your Cluster with Rocks Build Your Cluster with Rocks Yu Fu - - PowerPoint PPT Presentation
You need a cluster
Clusters are cost/performance-oriented computational engines, but they are hard to manage.
Cluster management gets linearly harder as the cluster scales out and as more and more frequent updates come for modern operating systems.
Heterogeneous nodes are a bummer.
Rocks Cluster Distribution is a Linux distribution intended for high-performance computing clusters.
Started by the National Partnership for Advanced Computational Infrastructure and the SDSC in 2000.
Used by 1800+ registered clusters so far and many more unregistered clusters.
The biggest registered academic Rocks cluster as of now: 8632 CPUs.
Based on CentOS Linux.
Uses standard RPM as its package tool.
Uses automatically generated custom RedHat kickstart files to control the whole installation.
Is NOT “system imager” based.
All nodes are “installed”, not “imaged”.
Managed with MySQL and XML.
Highly customizable.
Supports heterogeneous nodes, even cross-installing a hybrid of i386 and x86_64.
Excellent scalability with the “Avalanche” install mode, based on the BitTorrent p2p transfer tool.
A bunch of x86/x86_64 machines; you can even build a Rocks cluster on virtual machines.
1GB memory on each machine for Rocks 5.4.
30GB hard drive for Rocks 5.4 default install.
Two Ethernet ports on one of them: the frontend / headnode.
PXE booting feature on worker nodes; can use CD or floppy emulators.
A network switch to connect them up.
Download Rocks ISO images and burn them to CDs/DVDs. A single “Jumbo” DVD would meet needs in most startup cases; you can add more rolls later.
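The checklist above can be turned into a quick pre-flight check. This is only a sketch: the `Machine` record and `check_machine` helper are hypothetical names, and it assumes the 1GB / 30GB figures apply to every machine.

```python
from dataclasses import dataclass

# Minimums quoted above for a Rocks 5.4 default install.
MIN_MEMORY_GB = 1
MIN_DISK_GB = 30


@dataclass
class Machine:
    """Hypothetical inventory record for one machine."""
    name: str
    memory_gb: float
    disk_gb: float
    ethernet_ports: int


def check_machine(machine, is_frontend):
    """Return a list of problems; an empty list means the machine qualifies."""
    problems = []
    if machine.memory_gb < MIN_MEMORY_GB:
        problems.append(f"{machine.name}: needs at least {MIN_MEMORY_GB} GB memory")
    if machine.disk_gb < MIN_DISK_GB:
        problems.append(f"{machine.name}: needs at least {MIN_DISK_GB} GB disk")
    # Only the frontend / headnode needs two Ethernet ports.
    if is_frontend and machine.ethernet_ports < 2:
        problems.append(f"{machine.name}: frontend needs two Ethernet ports")
    return problems
```

For example, `check_machine(Machine("head", 4, 250, 1), is_frontend=True)` reports only the missing second Ethernet port.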
Rocks provides for installation of a cluster with minimum user interaction.
Steps involved
– Boot from CD or DVD on the frontend.
– Answer a few questions.
– Get favorite beverage …
– You’re on your way to a full-fledged cluster in 2 hours or less: ~1 hour for the frontend server install, ~30 min for a worker node, ~20 extra sec for each additional worker node.
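The timing figures above add up as follows. This is a rough sketch using only the approximate numbers quoted; `estimated_install_minutes` is a hypothetical helper, and real install times depend on hardware and network.

```python
FRONTEND_MINUTES = 60        # ~1 hour for the frontend server install
FIRST_WORKER_MINUTES = 30    # ~30 min for a worker node
EXTRA_WORKER_SECONDS = 20    # ~20 extra sec for each additional worker node


def estimated_install_minutes(worker_nodes):
    """Estimate the total install time in minutes for a frontend plus workers."""
    if worker_nodes < 0:
        raise ValueError("worker_nodes must be non-negative")
    total = float(FRONTEND_MINUTES)
    if worker_nodes >= 1:
        total += FIRST_WORKER_MINUTES
        total += (worker_nodes - 1) * EXTRA_WORKER_SECONDS / 60
    return total
```

With these figures, a frontend plus 64 worker nodes comes to 60 + 30 + 63 × 20/60 = 111 minutes, which is still under the 2-hour mark.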
Minimum Rolls for a frontend: base, kernel, os, web-server
Default Rocks 5.4 frontend server disk partitions:
    /          16 GB
    /var       4 GB
    swap       1 GB
    /export    remainder of the primary disk (symbolic link to /state/partition1)
Manual partitioning:
– Minimum of 16GB for the / partition.
– Must have a /export partition.
– Software RAIDs are supported, but LVM is not supported as of 5.4.
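The manual partitioning rules can be checked before starting the install. This is a minimal sketch; `validate_partitions` and the plan format (mount point mapped to size in GB) are hypothetical.

```python
MIN_ROOT_GB = 16  # minimum size of the / partition quoted above


def validate_partitions(plan, uses_lvm=False):
    """Check a {mount_point: size_gb} plan against the manual partitioning rules."""
    problems = []
    if plan.get("/", 0) < MIN_ROOT_GB:
        problems.append(f"/ must be at least {MIN_ROOT_GB} GB")
    if "/export" not in plan:
        problems.append("a /export partition is required")
    if uses_lvm:
        problems.append("LVM is not supported as of Rocks 5.4")
    return problems
```

A plan such as `{"/": 16, "/var": 4, "swap": 1, "/export": 200}` passes with an empty problem list.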
Run “insert-ethers” on the frontend:
Boot worker node from PXE.
insert-ethers will automatically detect it and start the install on it.
Worker nodes will be named compute-<rack #>-<slot #> by default.
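The default naming scheme is easy to reproduce when planning a rack layout. This sketch assumes slots are numbered from 0 in discovery order; both helper names are hypothetical.

```python
def compute_node_name(rack, slot):
    """Build the default worker node name compute-<rack #>-<slot #>."""
    return f"compute-{rack}-{slot}"


def rack_node_names(rack, node_count):
    """List the default names for one rack, assuming slots start at 0."""
    return [compute_node_name(rack, slot) for slot in range(node_count)]
```

For instance, `rack_node_names(0, 3)` gives `compute-0-0`, `compute-0-1` and `compute-0-2`.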
From this point, “rocks-console” can display the node’s screen.
Based on VNC, no KVM switch is needed!
Once a worker node is recognized by insert-ethers, you can go to the next node.
Repeat the above procedure until a whole rack is finished.
Then do the next rack:
– Stop insert-ethers by F8.
– Re-launch insert-ethers with the rack number:
    insert-ethers --rack 1
– Boot up worker nodes in this rack one by one from PXE until the last node.
Repeat until the last rack.
Your cluster is built!
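The rack-by-rack procedure can be written out as a checklist. This is a sketch that assumes the first rack is rack 0 and is started with a plain `insert-ethers`; `insert_ethers_command` and `rack_plan` are hypothetical helpers that only print instructions and do not run anything.

```python
def insert_ethers_command(rack):
    """Return the insert-ethers invocation for a rack (rack 0 is assumed to be the default)."""
    if rack == 0:
        return "insert-ethers"
    return f"insert-ethers --rack {rack}"


def rack_plan(nodes_per_rack):
    """Expand a list of per-rack node counts into ordered operator steps."""
    steps = []
    for rack, node_count in enumerate(nodes_per_rack):
        steps.append(f"run: {insert_ethers_command(rack)}")
        steps.append(f"PXE-boot {node_count} worker nodes one by one")
        steps.append("press F8 to stop insert-ethers")
    return steps
```

Calling `rack_plan([16, 16])` yields six steps, with `insert-ethers --rack 1` starting the second rack.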
Rocks kickstart is generated on-the-fly, controlled by the graph XML files:
    /export/rocks/install/site-profiles/5.4/nodes
Customization can be done in user-defined XMLs.
– Partitions
– Packages
– Extra files
– Pre-install scripts
– Post-install scripts
Check kickstart before install:
    /export/rocks/install/sbin/kickstart.cgi –c node_name
An Example of Custom XML:
<?xml version="1.0" standalone="no"?>
<kickstart>
<description>
Enable SSH
</description>
<main>
    <!-- kickstart 'main' commands go here -->
</main>
<pre>
    <!-- partitioning commands go here -->
</pre>
<package> ssh </package>
<package> ssh-clients </package>
<package> ssh-server </package>
<package> ssh-askpass </package>
<post>
cat &gt; /etc/ssh/ssh_config &lt;&lt; 'EOF'
Host *
        CheckHostIP             no
        ForwardX11              yes
        ForwardAgent            yes
        StrictHostKeyChecking   no
        UsePrivilegedPort       no
        FallBackToRsh           no
        Protocol                1,2
EOF
</post>
</kickstart>
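Before handing a custom XML like the one above to Rocks, it can help to confirm that it is well-formed. This sketch uses only the Python standard library and a shortened copy of the example; `summarize_node_xml` is a hypothetical helper, not a Rocks tool.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the custom XML example, for illustration only.
NODE_XML = """<?xml version="1.0" standalone="no"?>
<kickstart>
<description>Enable SSH</description>
<package> ssh </package>
<package> ssh-clients </package>
<post>
cat &gt; /etc/ssh/ssh_config &lt;&lt; 'EOF'
Host *
        ForwardX11 yes
EOF
</post>
</kickstart>
"""


def summarize_node_xml(xml_text):
    """Parse a node XML (raises ParseError if malformed) and list its contents."""
    root = ET.fromstring(xml_text)
    if root.tag != "kickstart":
        raise ValueError("top-level element must be <kickstart>")
    packages = [p.text.strip() for p in root.findall("package") if p.text]
    post_scripts = [p.text or "" for p in root.findall("post")]
    return {"packages": packages, "post_scripts": post_scripts}


summary = summarize_node_xml(NODE_XML)
```

The parser turns `&gt;` and `&lt;` back into `>` and `<`, so `summary["post_scripts"]` shows the shell script as it will actually run.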
A single drive example:
<pre>
echo "clearpart --all --initlabel --drives=sda
part / --size 8000 --ondisk=sda
part swap --size 1000 --ondisk=sda
part /data --size 0 --grow --ondisk=sda" &gt; /tmp/user_partition_info
</pre>
Software RAID
<pre>
echo "clearpart --all --initlabel --drives=sda,sdb
part / --size 8000 --ondisk=sda
part swap --size 1000 --ondisk=sda
part raid.00 --size=20000 --ondisk=sda
part raid.01 --size=20000 --ondisk=sdb
raid /data --level=1 --device=md0 raid.00 raid.01" &gt; /tmp/user_partition_info
</pre>
Manual partitioning
echo "rocks manual" &gt; /tmp/user_partition_info
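The single-drive layout can be generated for any disk name. This is a sketch: `single_drive_layout` is a hypothetical helper, the sizes mirror the single drive example above, and it assumes the standard kickstart `--ondisk` option selects the drive.

```python
def single_drive_layout(disk, root_mb=8000, swap_mb=1000, data_mount="/data"):
    """Render kickstart partitioning lines for one disk (sizes in MB)."""
    lines = [
        f"clearpart --all --initlabel --drives={disk}",
        f"part / --size {root_mb} --ondisk={disk}",
        f"part swap --size {swap_mb} --ondisk={disk}",
        # The data partition grows to fill whatever space is left.
        f"part {data_mount} --size 0 --grow --ondisk={disk}",
    ]
    return "\n".join(lines)


layout = single_drive_layout("sda")
```

The resulting text is what the `<pre>` section would echo into /tmp/user_partition_info.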
All Rocks packages are installed and managed through RPM.
– Prepare the RPM
– Put it in /export/rocks/install/contrib/5.4/arch/RPMS
– Add it in the XML:
    <package> your_package </package>
– Re-create the Rocks distro:
    # cd /export/rocks/install
    # rocks create distro
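The `arch` part of the contrib path depends on the node architecture. This small sketch builds the directory and the matching XML line; both helper names are hypothetical.

```python
import posixpath

CONTRIB_ROOT = "/export/rocks/install/contrib"


def contrib_rpm_dir(version="5.4", arch="x86_64"):
    """Return the contrib RPM directory for a Rocks version and architecture."""
    return posixpath.join(CONTRIB_ROOT, version, arch, "RPMS")


def package_tag(name):
    """Return the <package> line to add to the custom XML."""
    return f"<package> {name} </package>"
```

With the defaults, `contrib_rpm_dir()` returns `/export/rocks/install/contrib/5.4/x86_64/RPMS`; pass `arch="i386"` for 32-bit nodes.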
The same method can also be used to update software in Rocks. Rocks will automatically pick up and install the newer version on worker nodes.
Get the RPM of the new version, e.g.:
    wget ftp://ftp.scientificlinux.org/linux/scientific/5x/x86_64/SL/kernel-2.6.18-238.9.1.el5.x86_64.rpm
Put it in:
    /export/rocks/install/contrib/5.4/arch/RPMS
    /export/rocks/install/rolls/os/5.4/arch/RedHat/RPMS
Re-create the Rocks distro:
    # cd /export/rocks/install
    # rocks create distro
To share files/software not available in RPM, you can put them in the Rocks system NFS:
– On the frontend, go to the directory /share/apps.
– Add the files you'd like to share within this directory.
– All files there will be available on the compute nodes under: /share/apps
Post-install custom config scripts can be put in the <post> </post> section of the XMLs.
Just like regular shell scripts, but be careful with XML reserved characters, e.g. replace “>” with “&gt;” etc.
<post>
cat &gt; /etc/ssh/ssh_config &lt;&lt; 'EOF'
Host *
        CheckHostIP             no
        ForwardX11              yes
        ForwardAgent            yes
        StrictHostKeyChecking   no
        UsePrivilegedPort       no
        FallBackToRsh           no
        Protocol                1,2
EOF
</post>
Other scripts such as Python are also possible.
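Escaping reserved characters by hand is error-prone, so one option is to write the script normally and escape it mechanically before pasting it into the <post> section. This sketch uses the Python standard library; `escape_post_script` is a hypothetical helper.

```python
from xml.sax.saxutils import escape, unescape


def escape_post_script(script):
    """Replace XML reserved characters (&, <, >) so the script fits inside <post>."""
    return escape(script)


script = "cat > /etc/ssh/ssh_config << 'EOF'\nHost *\n        ForwardX11 yes\nEOF\n"
escaped = escape_post_script(script)
```

Here `escaped` begins with `cat &gt; /etc/ssh/ssh_config &lt;&lt; 'EOF'`, and `unescape(escaped)` gives back the original script.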
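[Diagram: insert-ethers performs automated node discovery (Node 0, Node 1, … Node N) and feeds the MySQL DB; makehosts generates /etc/hosts and makedhcp generates /etc/dhcpd.conf from the DB; Kickstart.cgi generates the kickstart from the XMLs.]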
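The idea that configuration files are generated from node records, as makehosts and makedhcp do from the MySQL DB, can be shown with a toy model. The record fields, output formats, and addresses below are simplified assumptions, not the actual Rocks schema or output.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node record, standing in for a row in the database."""
    name: str
    ip: str
    mac: str


def render_hosts(nodes):
    """Render /etc/hosts-style lines, one per node."""
    return "\n".join(f"{n.ip}\t{n.name}" for n in nodes)


def render_dhcp(nodes):
    """Render ISC dhcpd-style host entries, one per node."""
    return "\n".join(
        f"host {n.name} {{ hardware ethernet {n.mac}; fixed-address {n.ip}; }}"
        for n in nodes
    )


# Made-up example node; the address and MAC are placeholders.
nodes = [Node("compute-0-0", "10.1.255.254", "00:11:22:33:44:55")]
```

Because both files come from the same records, adding a node in one place keeps name resolution and DHCP consistent.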
Older versions of Rocks had a convenient web-based tool to manage its MySQL
related tasks and no longer encourages users to directly access its MySQL.
Rolls are Rocks Modules.
A roll is the mechanism for delivery of packages and configuration in Rocks.
What is inside a roll?
– Packaged binaries: RPMs
– Automatic configuration data
– Installation map
– A roll is basically a collection of RPMs plus Kickstart graph XMLs packed in ISO form.
There are tons of rolls readily available: ganglia, hpc, bio, sge, torque, condor, xen ……
You can build your own roll!
As OSG releases are moving to RPM, we may have an OSG roll in the future!
The 411 Secure Information Service provides NIS-like functionality for Rocks clusters with better security and scalability.
It is basically a file sync mechanism with listener and poll agents.
Unlike NIS, changes must be made on the frontend.
Rocks is able to support heterogeneous nodes, even with different architectures.
Rocks can install i386 nodes from an x86_64 frontend.
Rocks can install x86_64 nodes from an i386 frontend.
It is quite simple to do so in Rocks 5.4; see details in the User’s Guide:
http://www.rocksclusters.org/roll-documentation/base/5.4/cross.html
Very simple!
1. Create a backup roll of the current custom version:
    # cd /export/site-roll/rocks/src/roll/restore
    # make roll
2. Install a clean new version.
3. Include the backup roll at the beginning of the install.
4. Done!
It is also a handy way to back up the frontend.
Rocks User’s Guide: http://www.rocksclusters.org/roll-documentation/base/5.4/
Rocks Wiki: https://wiki.rocksclusters.org
Rocks User’s Mailing List: https://lists.sdsc.edu/mailman/listinfo/npaci-rocks-discussion