SLIDE 1

A DevOps Approach to Integration of Software Components in an EU Research Project

Mark Stillwell Jose G. F. Coutinho

Department of Computing Imperial College London, UK

September 1, 2015

SLIDE 2

Software as Research

"An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment, [the complete data,] and the complete set of instructions which generated the figures."

— David Donoho, "WaveLab and Reproducible Research", 1995

SLIDE 3

Funder Expectations

◮ multi-partner collaborations
◮ results data and code as research outputs
◮ reusable and maintainable software
◮ plans for long-term stewardship

SLIDE 4

HARNESS Project Summary

◮ EU FP7-funded cloud computing project
◮ makes available various heterogeneous resources
◮ multiple sub-projects developed by independent teams
◮ need to provide a coherent demonstrator platform

SLIDE 5

HARNESS Project Teams

◮ Integration Team
◮ Storage Team
◮ Network Team
◮ Platform Team
◮ Compute Team

SLIDE 6

HARNESS Project Architecture

[Architecture diagram, three layers. Service Layer: users submit an application, manifest, and SLO to ConPaaS, which manages Application Managers (AMs) and ConPaaS agents on virtual machines to deploy and execute services and applications. Platform Layer: the Cross-Resource Scheduler (CRS) brokers reservation requests through per-resource infrastructure resource managers — IRM-NOVA (VMs), IRM-NET (networked VMs), IRM-XtreemFS (storage devices), IRM-SHEPARD (hardware accelerators), and IRM-NEUTRON (network resources). Infrastructure Layer: OpenStack Nova and Neutron controllers (Nova Compute, Neutron Agents), the XtreemFS directory, scheduler, MRCs, and OSDs, the MaxelerOS Orchestrator, and SHEPARD Compute, managing the underlying servers, switches, and accelerators (MPC-X DFEs, GPGPUs, AlphaData FPGAs).]

SLIDE 7

Testbed Environments

◮ Imperial Testbed

  ◮ small scale
  ◮ static environment with shared systems
  ◮ specialized hardware (GPU, MPC-X, SSD cards)

◮ Grid'5000 Testbed

  ◮ medium to large scale, some multi-site deployments
  ◮ dynamic environment
  ◮ virtual networking links
  ◮ some specialized hardware (GPU, Intel Xeon Phi)

SLIDE 8

Initial Approach

◮ developer virtual machine images
◮ interactive configuration with some scripting (bash, devstack)
◮ scheduled releases of updated images

SLIDE 9

Significant Issues

◮ difficulty merging, managing, and tracking changes
◮ individual developer VMs tend to "drift" over time...
◮ fragmentation: hard to point to a definitive latest version
◮ difficult to debug or identify differences between images
◮ time-consuming and error-prone deployment to testbeds

SLIDE 10

Objectives for New Approach

◮ let developers easily work individually
◮ turn configuration/setup issues into software issues
◮ allow for version control and merging
◮ allow for automated acceptance testing

SLIDE 11

Differences from Commercial Requirements

◮ priority is individual research contributions
◮ lower focus on ease of use
◮ more need for customization
◮ need for reproducibility

SLIDE 12

Technologies

◮ Git / GitLab / GitHub
◮ Ansible
◮ Docker
◮ Vagrant
◮ Buildbot

SLIDE 13

DevOps Workflow

[Workflow diagram: the development team checks code into version control, which triggers automated unit tests. A pass (P) triggers automated integration tests, run as a deployment with automated testing in a virtual development environment; a fail (F) at any stage is fed back to the responsible team. When integration tests pass and the integration team gives release approval, a release is made and testbed deployment is triggered, with deployment and testing in the Grid'5000 and Imperial cluster testbeds.]
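The gating in this workflow can be sketched as a small script. This is a stand-in for what Buildbot actually orchestrates in the project; the stage commands below are stubs, not the real test suites.

```shell
# Sketch of the pipeline gating: each stage reports P (pass) or
# F (fail); the first F stops the pipeline and becomes feedback
# to the developers.
run_stage() {
    name=$1; shift
    if "$@"; then
        echo "$name: P"
    else
        echo "$name: F"
        return 1
    fi
}

pipeline() {
    run_stage "unit-tests" "$1" &&
    run_stage "integration-tests" "$2" &&
    run_stage "testbed-deployment" "$3" &&
    echo "release"
}

pipeline true true true            # every stage passes
pipeline false true true || true   # unit tests fail: later stages never run
```

The release approval shown in the diagram would sit between the integration-test and deployment stages as a manual step.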

SLIDE 14

Role of Docker

◮ Docker used whenever possible

  ◮ some services need global machine state

◮ provides static release images, with some configuration
◮ isolates projects from each other
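A static release image of the kind described might be built from a Dockerfile roughly like this (the base image, paths, service name, and port are illustrative; the slides do not show the actual HARNESS images):

```dockerfile
# Hypothetical release image for one sub-project's service.
# Dependencies are frozen into the image at build time; settings
# that vary per deployment are injected as environment variables.
FROM ubuntu:14.04
RUN apt-get update && apt-get install -y python python-pip
COPY . /opt/irm-example
RUN pip install -r /opt/irm-example/requirements.txt
ENV IRM_PORT=8080
EXPOSE 8080
CMD ["python", "/opt/irm-example/server.py"]
```

Each service then runs isolated in its own container; the exceptions are the services that need global machine state (e.g. kernel modules or direct device access), which have to run directly on the host.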

SLIDE 15

Deployment Projects

◮ use Ansible for orchestration and configuration management
◮ unify sub-projects, pulling in from multiple repositories
◮ Ansible ensures configuration changes are "idempotent", so they can be run repeatedly on a static testbed
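Idempotence here means a configuration step converges on the desired state rather than compounding changes on each run. A minimal shell illustration of the idea that Ansible modules implement for real (the config file and setting below are invented):

```shell
# ensure_line adds a setting only if it is not already present,
# so running it once or many times yields the same file.
CONF=./example.conf
ensure_line() {
    grep -qxF "$1" "$CONF" 2>/dev/null || echo "$1" >> "$CONF"
}

ensure_line "listen_port = 8080"
ensure_line "listen_port = 8080"   # re-run: no duplicate appended
```

A naive `echo ... >> file` without the guard would not be idempotent: every run would append another copy, which is exactly the kind of drift repeated runs on a static testbed must avoid.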

SLIDE 16

Virtual Machine Environments

◮ configured using Vagrant + Ansible
◮ developer just checks out the deployment and runs "vagrant up"
◮ allows developers to work independently
◮ easy to re-initialize or update
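Such a deployment project might wire Vagrant to Ansible roughly as follows (the box, VM name, IP, and playbook name are illustrative, not taken from the HARNESS repositories):

```ruby
# Vagrantfile (sketch): `vagrant up` creates the VM and runs the
# Ansible playbook against it; `vagrant destroy` throws it away.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"
  config.vm.define "harness-dev" do |dev|
    dev.vm.network "private_network", ip: "192.168.50.10"
    dev.vm.provision "ansible" do |ansible|
      ansible.playbook = "site.yml"
    end
  end
end
```

Because the whole environment is regenerated from the checked-out deployment project, refreshing a drifted developer VM is just `vagrant destroy -f && vagrant up`.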

SLIDE 17

Reproducible Deployment

◮ developers work in the same environment, so changes are easily merged
◮ experiments and benchmarks can be validated at a later date
◮ software can be deployed on novel testbeds
◮ rapid recovery in case of hardware failure

SLIDE 18

Automated Testing

◮ unit tests for individual projects
◮ integration tests for the full deployment

SLIDE 19

Buildbot

[Diagram: the BuildBot master polls the Git repositories on the GitLab server for changes, issues build commands, and reports status. (a) Unit-test buildslaves run on the Imperial DoC Cloud, where each component is tested in isolation within a Docker container. (b) The integration-test buildslave runs on a physical server and uses Vagrant to manage a multi-VM deployment running all the HARNESS cloud services, which interact with each other; some of the VMs run services deployed within Docker containers.]

SLIDE 20

Shortcomings

◮ difficult for non-experts to make configuration changes
◮ difficult to manage temporary branch changes for developers
◮ considerable development burden shifted to the integration team
◮ running the full virtual deployment in Vagrant on a server takes a long time

SLIDE 21

Lessons Learned

◮ ability to refresh developer VMs is extremely useful
◮ automated deployment is also very useful, but requires expert supervision
◮ testing results are often ignored

SLIDE 22

Future Recommendations

◮ gatekeeper for authoritative versions

  ◮ do not publish/merge changes until tests have passed!

◮ look into other techniques for projects with multiple repositories
◮ run integration tests against a cloud back end
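The gatekeeper can be enforced mechanically, for instance with a server-side hook that refuses updates to the authoritative branch unless tests pass. In this sketch `run_tests` is a stand-in for whatever check the gatekeeper trusts:

```shell
# pre-receive-style gate (sketch): updates to master are accepted
# only when run_tests succeeds for the pushed revision; pushes to
# other branches pass through untouched.
run_tests() { true; }   # stub: substitute the project's test command

gate_push() {
    ref=$1; rev=$2
    if [ "$ref" != "refs/heads/master" ]; then
        echo "accepted $ref"
        return 0
    fi
    if run_tests "$rev"; then
        echo "accepted $ref at $rev"
    else
        echo "rejected $ref: tests failed" >&2
        return 1
    fi
}

gate_push refs/heads/feature-x abc123
gate_push refs/heads/master abc123
```

Hosted GitLab can achieve a similar effect with protected branches combined with merge requests that are only accepted once CI passes.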

SLIDE 23

Discussion Points

Request for feedback/discussion about a particular point in our work:

How can we better manage projects that pull from multiple repositories?

A thought-provoking statement or discussion question about the area:

What are appropriate metrics of success for this type of project?