Jenkins + CVMFS: Distributed Development, Centralised Delivery



SLIDE 1

Jenkins + CVMFS:

Distributed Development, Centralised Delivery

Bruce Becker | bbecker@csir.co.za
Coordinator: SAGrid
SANREN, Meraka Institute, CSIR

SLIDE 2

Bruce Becker: Coordinator, SAGrid | bbecker@csir.co.za | http://www.sagrid.ac.za

Outline

  • What users want
  • SAGrid VO: a catch-all VO with many applications
  • Problem statements:
      • Problem 1: "the usual problem": maintaining applications in a distributed computing environment
      • Problem 2: "another usual problem": maintaining a complex application inventory
  • General solution: CVMFS + Jenkins
  • Some specifics of the SAGrid CI platform
  • Outlook
SLIDE 3

SAGrid as a catch-all VO

  • The South African National Grid operates a catch-all VO which all South African researchers can use to access computing and data resources.
  • The SAGrid VO is not a domain-specific VO, so the applications it supports have several widely varying uses.
  • Applications are requested by users or communities themselves.

SLIDE 4

What users want

  • Amazing infrastructure
  • Some users want highly varied, modular application selection
  • Vertically integrated
  • Highly specialised applications
  • Highly trained support

SLIDE 5

What users get sometimes

SLIDE 6

The problem (1): "the usual problem"

  • Software distribution was done mostly "by hand":
      • Someone from the ops team develops a script to install the application
      • Apps are installed via job submission
      • Tags are applied via script or by the job itself
  • Issues:
      • Major overhead of work
      • Inconsistent installation procedures between applications and sites
      • Bottleneck in porting applications (has to be done by someone in the VO)
      • Duplication of effort, especially in dependencies of applications
      • Difficult to manage application lifecycles
SLIDE 7

The problem (2): what about the community?

  • Managing the inventory in a catch-all VO can be complex when there are many applications
  • Prioritising porting requests depends on the knowledge of the expert porting the application
  • Can lead to major delays in porting and deploying applications
  • However, a user or community usually has an expert who knows how to tune, port and configure the application properly, as well as its dependencies
  • Usually, "they" have to conform to "us": learn grid tools and terminology, etc.

SLIDE 8

Problem (3): Changes to the playing field

  • New middleware stacks
  • New architectures – GPGPU, ARM
SLIDE 9

Questions to answer

  • How do we lower the barrier to entry to the grid or cloud infrastructure?
  • How can the application expert prove to the resource provider that the application will actually run on the execution environment of the site?
  • How can we manage the lifecycle of applications across multiple versions, architectures and configurations?
  • How can we ensure that once applications are "certified", they are actually available on as many sites as possible?

SLIDE 10

General Solution: Jenkins + CVMFS

  • The issues outlined are "typical" in a large software project
  • Usually solved by judicious use of a Continuous Integration system
  • Once applications have been "ported", put them into a trusted repository
  • Previously: built RPMs, but these required site-admin intervention
  • One-time configuration with CVMFS
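
The "one-time configuration" on a site amounts to a CVMFS client configuration. As a rough sketch only: the parameter names below are standard CVMFS client settings and the repository name `sagrid.ac.za` is the one used later in this talk, but the proxy host, server URL, key path and cache limit are placeholder assumptions, not SAGrid's actual values.

```shell
# /etc/cvmfs/default.local (site-wide client settings)
CVMFS_REPOSITORIES=sagrid.ac.za
CVMFS_HTTP_PROXY="http://squid.example.org:3128"   # hypothetical site proxy
CVMFS_QUOTA_LIMIT=20000                            # local cache limit in MB (example value)

# /etc/cvmfs/config.d/sagrid.ac.za.conf (per-repository settings)
CVMFS_SERVER_URL="http://stratum1.example.org/cvmfs/@fqrn@"   # hypothetical server
CVMFS_PUBLIC_KEY=/etc/cvmfs/keys/sagrid.ac.za.pub              # hypothetical key path
```

Once files like these are in place, `cvmfs_config probe` can be used on the client to check that the repository mounts; no further per-application action should be needed from the site admin.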
SLIDE 11

First, some changes

  • Distribute the effort, centralise the tools
  • Move the repository from the "closed" SVN repo (https://ops.sagrid.ac.za/trac/svn/repo) to git (https://github.com/SAGridOps/SoftwareInstallation)
  • Don't have to give write access to a single repo; instead, accept pull requests
  • Take advantage of all the GitHub infrastructure
  • Expand possible contributors to those "outside" the infrastructure
  • Recognise individuals' contributions
SLIDE 12

Recognise individuals...

SLIDE 13

Decentralise the team

SLIDE 14

Collaborate with code

SLIDE 15

Let the robots do the work

  • Define what we want to deploy; let the experts take care of how to deploy
  • DevOps paradigm: same review/tag/release mechanisms on operations code as we have for scientific applications
  • Teach a marketable skill
  • Allow specialisation
  • Enable remote management of complex services
  • Ensure that the published methodology is the adopted methodology

SLIDE 16

Quality Control and feedback

  • Ensure that requested applications are included in the repo
  • Provide testing and QA infrastructure
  • Self-service for users
SLIDE 17

The CI environment

  • Jenkins is extremely flexible... can do almost anything
  • AuthN/AuthZ
      • Currently using GitHub OAuth
      • Take advantage of future Identity Federation
  • We wanted to simulate different execution environments
      • Already in production
      • Planned for future
  • Track and re-use dependencies
SLIDE 18

Matrix-based builds

  • Independent builds and build statuses for different configurations:
      • Application name
      • Version
      • OS
      • Architecture
      • ... can add specific tuning configurations...
  • We can see exactly what's broken where, and so build more resilient integration code.
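
A matrix build can be read as the Cartesian product of the configuration axes, with one independent status per cell. Jenkins handles this natively in its multi-configuration projects; the loop below is only an illustration of the idea. The axis values and the `build_cell` stand-in are hypothetical and do not reflect the real SAGrid matrix.

```shell
# Hypothetical axis values; the real matrix is defined in Jenkins.
APPS="gadget"
VERSIONS="1.0"
OSES="sl6 u1404"
ARCHS="x86_64"

# Stand-in for the real build step. It deliberately fails on one OS
# so that the per-cell statuses differ.
build_cell() {
    # $1 = application name, $2 = version, $3 = OS, $4 = architecture
    [ "$3" != "u1404" ]
}

# Build every combination and print one status line per cell.
run_matrix() {
    for name in $APPS; do
        for version in $VERSIONS; do
            for os in $OSES; do
                for arch in $ARCHS; do
                    if build_cell "$name" "$version" "$os" "$arch"; then
                        status=SUCCESS
                    else
                        status=FAILURE
                    fi
                    echo "$name/$version/$os/$arch $status"
                done
            done
        done
    done
}
```

Calling `run_matrix` prints one line per cell, so a failure on one OS or architecture is visible without hiding the cells that still build.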

SLIDE 19

Typical workflow

Workflow diagram (labels only):

  • Infrastructure expert
  • Application developer
  • Testing matrix
  • Defines relevant tests in Jenkins
  • Reads description of execution environment tests
  • Writes code to pass required tests
  • Dev/Stage env.
  • Promote a build to CVMFS
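
The last step, promoting a build to CVMFS, could look roughly like the sketch below. It assumes it runs on the repository's release manager host, where `cvmfs_server transaction`, `publish` and `abort` are available, and that the build artifact is a gzipped tarball. The function name and the `CVMFS_REPO` and `CVMFS_ROOT` variables are assumptions made for illustration; this is not the mechanism SAGrid actually implemented.

```shell
# Sketch only: unpack a build artifact into an open CVMFS transaction.
# CVMFS_REPO and CVMFS_ROOT are hypothetical overrides for this example.
promote_build() {
    artifact=$1
    repo=${CVMFS_REPO:-sagrid.ac.za}
    root=${CVMFS_ROOT:-/cvmfs}

    if [ ! -f "$artifact" ]; then
        echo "no such artifact: $artifact" >&2
        return 1
    fi

    # Open a writable transaction on the repository.
    cvmfs_server transaction "$repo" || return 1

    # Publish only if the artifact unpacks cleanly; otherwise roll back.
    if tar xzf "$artifact" -C "$root/$repo"; then
        cvmfs_server publish "$repo"
    else
        cvmfs_server abort -f "$repo"
        return 1
    fi
}
```

Wrapping the unpack step in a transaction means a broken artifact is rolled back rather than published to every site.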

SLIDE 20

Dependency management: simple case

  • Common problem with applications: need a specific version of a compiler
  • Compiling the compiler can itself be tricky...
  • Jenkins tests the full dependency chain necessary

SLIDE 21

Real-world application

  • GADGET: astrophysics hydrodynamic simulations
  • Many (levels of) dependencies

SLIDE 22

Public Application Dashboard

SLIDE 23

Authenticated view

SLIDE 24

Generic build script

Set up the environment:

    # GADGET requires HDF5 FFTW2 ZLIB and openmpi
    module add ci
    module add fftw/2.1.5
    module add hdf5
    module add openmpi
    module add gsl

Clean build, retrieve dependency artifacts:

    rm -rf $FFTW_DIR
    tar xvfz /repo/$SITE/$OS/$ARCH/fftw/$FFTW_VERSION/build.tar.gz -C /
    rm -rf $HDF5_DIR
    tar xvfz /repo/$SITE/$OS/$ARCH/hdf5/$HDF5_VERSION/build.tar.gz -C /
    rm -rf $OPENMPI_DIR
    tar xvfz /repo/$SITE/$OS/$ARCH/openmpi/$OPENMPI_VERSION/build.tar.gz -C /
    rm -rf $GSL_DIR
    tar xvfz /repo/$SITE/$OS/$ARCH/gsl/$GSL_VERSION/build.tar.gz -C /
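
The repeated `rm`/`tar` pairs in the script above could be folded into one helper. The sketch below keeps the `/repo/$SITE/$OS/$ARCH/<name>/<version>/build.tar.gz` layout from the slide; the function name and the `REPO_ROOT` and `DEST` overrides are assumptions added so the helper can be tried outside the CI host.

```shell
# Sketch: restore one dependency artifact into a clean install directory.
# REPO_ROOT and DEST are hypothetical overrides; their defaults match the
# paths used in the build script (/repo and /).
restore_dep() {
    name=$1
    version=$2
    install_dir=$3

    # Never run rm -rf on an empty path.
    if [ -z "$install_dir" ]; then
        echo "install dir for $name is not set" >&2
        return 1
    fi

    artifact="${REPO_ROOT:-/repo}/$SITE/$OS/$ARCH/$name/$version/build.tar.gz"
    if [ ! -f "$artifact" ]; then
        echo "missing artifact: $artifact" >&2
        return 1
    fi

    # Clean any previous copy, then unpack the artifact.
    rm -rf "$install_dir"
    tar xzf "$artifact" -C "${DEST:-/}"
}
```

With this in place each dependency becomes one line, for example `restore_dep fftw "$FFTW_VERSION" "$FFTW_DIR"`, and a missing artifact fails the build early with a clear message.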

SLIDE 25

Generic build script

Actually build...

    make install DESTDIR=$WORKSPACE/build

Create the artifact:

    mkdir -p $REPO_DIR
    rm -rf $REPO_DIR/*
    tar -cvzf $REPO_DIR/build.tar.gz -C $WORKSPACE/build apprepo

Create the modulefile:

    (
    cat <<MODULE_FILE
    #%Module1.0
    ## $NAME modulefile
    ##
    proc ModulesHelp { } {
        puts stderr " This module does nothing but alert the user"
        puts stderr " that the [module-info name] module is not available"
    }
    # One prereq line per module, so that all three are required
    prereq gsl
    prereq fftw/2.1.5
    prereq hdf5
    module-whatis "$NAME $VERSION."
    setenv GSL_VERSION $VERSION
    setenv GSL_DIR /apprepo/$::env(SITE)/$::env(OS)/$::env(ARCH)/$NAME/$VERSION
    prepend-path LD_LIBRARY_PATH $::env(GSL_DIR)/lib
    MODULE_FILE
    ) > modules/$VERSION

SLIDE 26

So, it works! ... almost

Next steps

  • We have an open, collaborative, low-barrier platform for researchers to bring applications to the grid
  • Small technical tasks:
      • Implement a promoted-builds mechanism to populate the sagrid.ac.za CVMFS repo
      • Implement SAML AuthN, integrate IdF
      • Probes to check that CVMFS is mounted on sites (?)
  • Operating in "stealth mode" at the moment: not advertising, but open to anyone who is interested, in order to collect feedback
  • Addressing specific user communities to test-drive the system:
      • Machine learning astro applications (rapid prototyping)
      • Bioinformatics application suites (complex ecosystem)
  • Present the next phase of the project in November in Cape Town: move to production
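
The proposed probe for checking that CVMFS is mounted could be as small as the following sketch. It assumes the repository appears under the conventional `/cvmfs/<repository>` path and uses Nagios-style exit codes (0 for OK, 2 for CRITICAL). The function name is hypothetical and the talk does not specify how the probe would be implemented.

```shell
# Nagios-style probe sketch: return 0 = OK, 2 = CRITICAL.
# The default path assumes the repository is mounted at /cvmfs/sagrid.ac.za.
check_cvmfs() {
    mountpoint=${1:-/cvmfs/sagrid.ac.za}

    # Accessing the path triggers the mount where autofs is in use;
    # an empty or missing directory is treated as "not mounted".
    if [ -d "$mountpoint" ] && [ -n "$(ls -A "$mountpoint" 2>/dev/null)" ]; then
        echo "OK: $mountpoint is mounted and not empty"
        return 0
    fi

    echo "CRITICAL: $mountpoint is not available"
    return 2
}
```

Run as a grid job or a monitoring check on each site, this would show where certified applications are actually reachable.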