

slide-1
SLIDE 1

LA-UR-

Approved for public release; distribution is unlimited.

Title: Booting Over Infiniband With Perceus Cluster Management

Author(s): Matthew Dosanjh, INST-OFF; William Pickett, INST-OFF; Graham Van Heule, INST-OFF

Intended for: Academic Distribution

Los Alamos National Laboratory, EST. 1943

Los Alamos National Laboratory, an affirmative action/equal opportunity employer, is operated by the Los Alamos National Security, LLC for the National Nuclear Security Administration of the U.S. Department of Energy under contract DE-AC52-06NA25396. By acceptance of this article, the publisher recognizes that the U.S. Government retains a nonexclusive, royalty-free license to publish or reproduce the published form of this contribution, or to allow others to do so, for U.S. Government purposes. Los Alamos National Laboratory requests that the publisher identify this article as work performed under the auspices of the U.S. Department of Energy. Los Alamos National Laboratory strongly supports academic freedom and a researcher's right to publish; as an institution, however, the Laboratory does not endorse the viewpoint of a publication or guarantee its technical correctness.

Form 836 (7/06)

slide-2
SLIDE 2

Abstracts

Booting Over Infiniband With Perceus Cluster Management

Matthew Dosanjh, UNM; William Pickett, NMT; Graham Van Heule, MTU

Abstract: Two main network fabrics are used in large diskless HPC clusters: Ethernet is typically used for cluster management tasks such as booting, and Infiniband (IB) is typically used for fast data communication. Configuring a cluster of diskless nodes to boot over the IB fabric using Perceus could help eliminate the need for Ethernet in clusters, reducing costs and the number of parts. The motivation behind this project is a situation currently facing the Coyote supercomputer. It is wired exclusively with IB and uses a two-stage boot process: it loads a small kernel from flash memory and then downloads the rest over IB. Those who manage the cluster would prefer to move away from flash memory, leaving only two viable options: purchase and install an expensive Ethernet network, or configure the computers to boot fully over IB. To boot over IB, the IB cards must be upgraded to use the gPXE network boot firmware, after which the cluster management software must be configured to recognize and work with the IB cards, allowing for a diskless boot. The potential implications include the evaluation of scalability in a large cluster such as Coyote. Because IB has higher bandwidth than Ethernet, clusters would gain more computing time by decreasing boot time. This also leads to potential research into multicast booting over IB.

slide-3
SLIDE 3

Booting Over Infiniband With Perceus Cluster Management

PRESENTED BY

Matthew Dosanjh - UNM
William Pickett - NMT
Graham Van Heule - MTU

On 8/3/2009

UNCLASSIFIED

Operated by Los Alamos National Security, LLC for NNSA

slide-4
SLIDE 4

Outline

  • Motivation
  • Goals
  • What We Did
  • Issues Faced
  • Future Research
  • Conclusions


slide-5
SLIDE 5

Motivation

  • Coyote
    • Has no Ethernet network
    • Uses a two-stage boot
      • Stage 1 is a small kernel loaded from local flash memory
      • Stage 2 is downloaded by stage 1 over Infiniband
    • Local flash memory will eventually deteriorate
  • There exist two solutions
    • Purchase and install an expensive Ethernet network
    • Configure the cluster to grab the stage 1 image over Infiniband


slide-6
SLIDE 6

Our Project

  • Our goal is to get this cluster to boot over Infiniband, to determine whether doing the same on a larger cluster in a production environment is feasible
  • Perceus - cluster management software
  • DHCP - Dynamic Host Configuration Protocol
  • Infiniband - high-bandwidth, low-latency network fabric


slide-7
SLIDE 7

Outline

  • Motivation
  • Goals
  • What We Did
  • Issues Faced
  • Future Research
  • Conclusions

slide-8
SLIDE 8

Steps On The Road To Completion

  • Created Perceus VNFS image with Infiniband drivers
  • Burned gPXE into Infiniband card firmware
  • Added Infiniband drivers to stage 1 image
  • Patched DHCP to recognize the 32-digit MAC address of Infiniband
  • Patched Perceus to accept Infiniband MAC addresses

[Diagram labels: gPXE, Stage 1 Image, VNFS Image]


slide-9
SLIDE 9

Issues Encountered

  • DHCP does not support Infiniband in its current version
  • When patched for Infiniband, DHCP does not send the correct MAC address

Ethernet MAC: 00:01:02:03:04:05
Infiniband MAC: 00:01:02:03:04:05:06:07:08:09:10:11:12:13:14:15:16:17:18:19:20

  • The default initramfs does not contain Infiniband drivers
  • The kernel is not configured to handle Infiniband by default
  • Very little documentation exists for Perceus' Infiniband capabilities
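
The address-length issue above can be illustrated with a minimal Python sketch. It is not the patch applied to DHCP or Perceus; it only shows why code that assumes a 6-octet Ethernet MAC cannot hold an Infiniband hardware address. The 20-octet length is the IPoIB link-layer address size from RFC 4391; the function names and sample addresses are hypothetical.

```python
import string

ETHERNET_HWADDR_LEN = 6     # octets in an Ethernet MAC address
INFINIBAND_HWADDR_LEN = 20  # octets in an IPoIB link-layer address (RFC 4391)


def parse_hwaddr(text):
    """Parse a colon-separated hex address into bytes without assuming a length."""
    parts = text.strip().split(":")
    for part in parts:
        if len(part) != 2 or any(c not in string.hexdigits for c in part):
            raise ValueError(f"malformed octet {part!r} in {text!r}")
    return bytes(int(part, 16) for part in parts)


def classify_hwaddr(addr):
    """Name the fabric based only on the number of octets in the address."""
    if len(addr) == ETHERNET_HWADDR_LEN:
        return "ethernet"
    if len(addr) == INFINIBAND_HWADDR_LEN:
        return "infiniband"
    raise ValueError(f"unexpected hardware address length: {len(addr)}")


def format_hwaddr(addr):
    """Render bytes back into lowercase colon-separated hex."""
    return ":".join(f"{octet:02x}" for octet in addr)


# Hypothetical sample addresses, for illustration only.
SAMPLE_ETHERNET = "00:01:02:03:04:05"
SAMPLE_INFINIBAND = "80:00:00:48:fe:80:00:00:00:00:00:00:00:02:c9:03:00:00:10:39"

if __name__ == "__main__":
    for sample in (SAMPLE_ETHERNET, SAMPLE_INFINIBAND):
        addr = parse_hwaddr(sample)
        print(classify_hwaddr(addr), len(addr), format_hwaddr(addr))
```

Keeping the address as a variable-length byte string, rather than a fixed 6-octet field, is the design choice that lets one code path serve both fabrics.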


slide-10
SLIDE 10

Outline

  • Motivation
  • Goals
  • What We Did
  • Issues Faced
  • Future Research
  • Conclusions

slide-11
SLIDE 11

Ideas For Future Research

  • Multicast boot over Infiniband may be a quick and efficient solution for a larger cluster
  • Using iSCSI rather than NFS when booting over Infiniband
  • Bottleneck research
  • A quantitative comparison of boot speed over Ethernet and over Infiniband
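
As a starting point for the proposed boot-speed comparison, the following Python sketch estimates only the raw image transfer time on each fabric. Every number in it is an assumption chosen for illustration (the image size, the link rates, and the efficiency factor are not measurements from this project), and real boot time also depends on server load, protocol overhead, and the other bottlenecks listed above.

```python
def transfer_seconds(image_bytes, link_bits_per_second, efficiency=1.0):
    """Estimate the time to move an image over a link.

    efficiency is the assumed fraction of the nominal link rate that is
    actually achieved, in the range (0, 1].
    """
    if image_bytes < 0:
        raise ValueError("image_bytes must be non-negative")
    if link_bits_per_second <= 0:
        raise ValueError("link_bits_per_second must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return image_bytes * 8 / (link_bits_per_second * efficiency)


# Hypothetical inputs, not taken from the slides.
IMAGE_BYTES = 200 * 1024**2          # assumed 200 MiB boot image
LINK_RATES = {
    "gigabit ethernet": 1e9,         # nominal 1 Gb/s
    "infiniband (assumed)": 16e9,    # assumed 16 Gb/s data rate
}

if __name__ == "__main__":
    for name, rate in LINK_RATES.items():
        seconds = transfer_seconds(IMAGE_BYTES, rate, efficiency=0.8)
        print(f"{name}: {seconds:.3f} s per node")
```

A measured comparison would replace the assumed rates and efficiency with timings taken on the actual cluster.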


slide-12
SLIDE 12

Conclusions

  • We have successfully booted over Infiniband
  • However, we still have issues getting a unique hardware identifier
  • The setup can currently boot only one node
  • Further research would be required for large-scale deployment
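
One possible source of a unique per-node identifier is sketched below in Python. It is an illustration under stated assumptions, not the authors' diagnosis or fix. Per RFC 4391, a 20-octet IPoIB link-layer address ends in a 16-octet GID whose last 8 octets are the port GUID. If software kept only the first 6 octets, as for an Ethernet MAC, nodes on the same subnet could all look identical. The node addresses and function names are hypothetical.

```python
IPOIB_HWADDR_LEN = 20  # octets: 1 flags + 3 QPN + 16 GID (RFC 4391)
PORT_GUID_LEN = 8      # the GID's low 8 octets


def port_guid(ipoib_addr):
    """Return the port GUID, the last 8 octets of a 20-octet IPoIB address."""
    if len(ipoib_addr) != IPOIB_HWADDR_LEN:
        raise ValueError(f"expected {IPOIB_HWADDR_LEN} octets, got {len(ipoib_addr)}")
    return ipoib_addr[-PORT_GUID_LEN:]


def first_six_octets(addr):
    """Mimic code that truncates every hardware address to Ethernet length."""
    return addr[:6]


def find_collisions(addresses, key):
    """Group addresses by key(addr) and return only groups with several members."""
    groups = {}
    for addr in addresses:
        groups.setdefault(key(addr), []).append(addr)
    return {k: v for k, v in groups.items() if len(v) > 1}


# Hypothetical nodes on one subnet: same flags, QPN and subnet prefix,
# different port GUIDs.
PREFIX = bytes.fromhex("80000048" + "fe80000000000000")
NODE_A = PREFIX + bytes.fromhex("0002c90300001039")
NODE_B = PREFIX + bytes.fromhex("0002c9030000103a")
NODES = [NODE_A, NODE_B]

if __name__ == "__main__":
    print("collisions with 6-octet key:", len(find_collisions(NODES, first_six_octets)))
    print("collisions with port GUID key:", len(find_collisions(NODES, port_guid)))
```

Whether the port GUID is the right key for Perceus node records would need to be confirmed against the cluster management software itself.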


slide-13
SLIDE 13

Questions
