Efficient unpacking of required software from CernVM-FS




slide-1
SLIDE 1

Efficient unpacking of required software from CernVM-FS

Samuel Teuber • EP-SFT Openlab Summer Student Nicholas Hazekamp, Jakob Blomer, Gerardo Ganis 13.08.2018

slide-2
SLIDE 2

Why is this necessary? (1)

Challenges faced in some HPC environments (e.g. NERSC)

  • Lack of internet connection on compute nodes

  • Lack of local hard disks
  • Lack of system level privileges
slide-3
SLIDE 3

Why is this necessary? (2)

Challenges faced in Benchmarking

  • Run the same workflow every time

  • Minimize storage needs
  • Minimize external factors (e.g. internet connection)

slide-4
SLIDE 4

How are these challenges tackled today?

  • No hard disk: cache on HPC file system
  • No internet connection: cvmfs_preload (prepopulate CVMFS cache)
  • No FUSE client: uncvmfs (download entire CVMFS repositories, filter afterwards)
  • Benchmarking: ?

slide-5
SLIDE 5

https://assets.nst.com.my/images/articles/26_bajajaaa_1521955064.jpg

slide-6
SLIDE 6

Shrink Wrapping

A method for efficiently packaging required software from CVMFS into standalone images

https://en.wikipedia.org/wiki/Stretch_wrap#/media/File:Pallet_wrapper.jpg

slide-7
SLIDE 7

Specification
Building a specification describing the necessary files for a software run

^/bar/etc/*
/bar/Modules/setup.sh
/foo/Packages/ROOT/*
^/foo/Packages/AliRoot/*

Export
Export with cvmfs_shrinkwrap (tar, squashfs, docker, ...)

Run Independently

  • No internet
  • No disk cache
  • No FUSE client
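
The specification entries on this slide can be read as path rules. The slides do not define the syntax, so the following is only a minimal sketch under assumed semantics: a trailing `/*` is taken to include a whole subtree, a leading `^` is taken to restrict the entry to the direct children of the directory, and any other line names a single path. The function names `parse_spec` and `is_included` are hypothetical and not part of cvmfs_shrinkwrap.

```python
from pathlib import PurePosixPath


def parse_spec(lines):
    """Split specification lines into (recursive, flat, exact) rule sets.

    ASSUMPTION: "dir/*" means the whole subtree, "^dir/*" means only the
    direct children of dir, anything else is one exact path.
    """
    recursive, flat, exact = set(), set(), set()
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("^") and line.endswith("/*"):
            flat.add(line[1:-2])
        elif line.endswith("/*"):
            recursive.add(line[:-2])
        else:
            exact.add(line)
    return recursive, flat, exact


def is_included(path, rules):
    """Return True if an absolute path is selected by the parsed rules."""
    recursive, flat, exact = rules
    if path in exact:
        return True
    p = PurePosixPath(path)
    if str(p.parent) in flat:
        return True
    # Any ancestor listed as recursive pulls in the path.
    return any(str(parent) in recursive for parent in p.parents)


rules = parse_spec([
    "^/bar/etc/*",
    "/bar/Modules/setup.sh",
    "/foo/Packages/ROOT/*",
    "^/foo/Packages/AliRoot/*",
])
print(is_included("/foo/Packages/ROOT/lib/libCore.so", rules))
print(is_included("/foo/Packages/AliRoot/a/b", rules))
```

Keeping the rules in sets makes each check a handful of hash lookups per path component, which is in the spirit of the fast lookups mentioned later in the talk, although the real tool may use a different data structure.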

slide-8
SLIDE 8

Application design

slide-9
SLIDE 9

Image architecture

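/cvmfs/
  • .provenance/: information for image reproducibility
  • repo.cern.ch/: exported repository structure
  • .data/ (00/ … ff/): content-addressed file links
  • .garbage/: garbage collection information

Files in the exported repository structure are hardlinks to the content-addressed files in .data/.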
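
The combination of a content-addressed `.data/` directory and hardlinks can be illustrated with a short sketch. It is not the shrinkwrap implementation: the hash algorithm (SHA-1), the exact file naming below `.data/`, and the helper name `add_file` are assumptions made for illustration.

```python
import hashlib
import os
import shutil


def add_file(image_root, repo_name, rel_path, source_file):
    """Store source_file once under .data/ and hardlink it into the repo tree.

    ASSUMPTION: objects are named by their SHA-1 hex digest, with the first
    two hex characters used as the 00..ff subdirectory.
    """
    with open(source_file, "rb") as f:
        digest = hashlib.sha1(f.read()).hexdigest()
    data_path = os.path.join(image_root, ".data", digest[:2], digest[2:])
    if not os.path.exists(data_path):
        os.makedirs(os.path.dirname(data_path), exist_ok=True)
        shutil.copyfile(source_file, data_path)
    link_path = os.path.join(image_root, repo_name, rel_path)
    os.makedirs(os.path.dirname(link_path), exist_ok=True)
    if os.path.lexists(link_path):
        os.remove(link_path)
    # Identical content anywhere in the tree shares one inode.
    os.link(data_path, link_path)
    return data_path
```

With this layout, two files with identical content occupy disk space only once, and the link count of an object in `.data/` shows how many paths still reference it.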

slide-10
SLIDE 10

FS Traversal

  • Single-threaded (for now)
  • Matches paths to specification (~3 µs lookup)
  • In-memory ls for directories in specification
  • Responsible for file creation & hardlinks
  • Copies files between abstract interfaces (extensible to other fs architectures)
  • Responsible for IO-intensive copying

Thread pool
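
The split between a single-threaded traversal and a thread pool for copying can be sketched as follows. This is a simplified illustration on local POSIX directories only; the function name `export_tree`, the `wanted` callback, and the default worker count are assumptions and do not reflect the tool's actual interfaces.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def export_tree(src_root, dst_root, wanted, workers=4):
    """Walk src_root in one thread and hand file copies to a thread pool.

    `wanted` is a callable taking a path relative to src_root and returning
    True if the file belongs in the export.
    """
    futures = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for dirpath, _dirnames, filenames in os.walk(src_root):
            rel_dir = os.path.relpath(dirpath, src_root)
            for name in filenames:
                rel = os.path.normpath(os.path.join(rel_dir, name))
                if not wanted(rel):
                    continue
                dst = os.path.join(dst_root, rel)
                # Directory creation stays in the traversal thread.
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                src = os.path.join(dirpath, name)
                # The IO-heavy copy runs in the pool.
                futures.append(pool.submit(shutil.copyfile, src, dst))
    # Leaving the with-block waits for all copies; result() re-raises errors.
    return [f.result() for f in futures]
```

Keeping metadata work in one thread avoids races on directory creation, while the copies, which mostly wait on IO, overlap in the pool.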

slide-11
SLIDE 11

Docker Injection

slide-12
SLIDE 12

Replacing CVMFS docker layers

  • OS Base Layer
  • Custom Container Layers 1
  • CVMFS Layer

slide-13
SLIDE 13

1. Identify CVMFS Layer: Hash of layer as Image Label
2. Download “old” layer version
3. Update through shrinkwrap utility
4. Upload “new” layer version
5. Update Image Labels & Manifest

  • OS Base Layer
  • Custom Container Layers 1
  • CVMFS Layer
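
The bookkeeping in steps 1 and 5 can be sketched on a simplified in-memory image description. The dictionary layout is not the real Docker or OCI manifest format, and the label key `cvmfs.layer.digest` and the function name `replace_cvmfs_layer` are hypothetical; the download, update, and upload of steps 2 to 4 are outside this sketch.

```python
import copy

CVMFS_LABEL = "cvmfs.layer.digest"  # hypothetical label key


def replace_cvmfs_layer(image, new_digest):
    """Return a copy of `image` whose CVMFS layer digest is swapped.

    `image` is a simplified stand-in: {"layers": [digest, ...],
    "labels": {CVMFS_LABEL: digest}}.
    """
    old_digest = image["labels"][CVMFS_LABEL]    # step 1: identify the layer
    updated = copy.deepcopy(image)
    index = updated["layers"].index(old_digest)  # ValueError if not present
    updated["layers"][index] = new_digest        # new layer takes the old slot
    updated["labels"][CVMFS_LABEL] = new_digest  # step 5: update the label
    return updated


image = {
    "layers": ["sha256:os", "sha256:custom1", "sha256:cvmfs-old"],
    "labels": {CVMFS_LABEL: "sha256:cvmfs-old"},
}
print(replace_cvmfs_layer(image, "sha256:cvmfs-new")["layers"])
```

Recording the layer hash as a label means the CVMFS layer can be found again later regardless of its position in the layer list.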

slide-14
SLIDE 14

1. Identify CVMFS Layer: Hash as Image Label
2. Download “old” image version
3. Update through shrink wrap utility
4. Upload “new” image version
5. Update Image Labels

  • OS Base Layer
  • Custom Container Layers 1
  • CVMFS Layer
  • New Custom Container Layers 2

slide-15
SLIDE 15

Example & Evaluation

slide-16
SLIDE 16

From a vanilla docker image...

FROM centos:7
...
ADD HEP_OSlibs.repo /etc/yum.repos.d/HEP_OSlibs.repo
RUN yum install -y HEP_OSlibs

slide-17
SLIDE 17

$ cvmfs_shrinkwrap oci ROOT/root-demo -c hub.docker.com.conf*

Making image CVMFS injectable (injecting empty CVMFS layer)...
Generating local copy of specified cvmfs repository subset...
Packing tar layer...
Compressing to gzip...
Injecting updated cvmfs layer into hub.docker.com/ROOT/root-demo...

* Command line interface interaction is still subject to change

...to a CVMFS injected image that can run ROOT demos
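
The "Packing tar layer" and "Compressing to gzip" steps in the output above can be approximated with the Python standard library. This is a sketch, not the tool's code: the function name `pack_layer` and the `arcname` default are assumptions. Container registries address layer blobs by a SHA-256 digest, which is why the sketch also returns one.

```python
import hashlib
import tarfile


def pack_layer(src_dir, out_path, arcname="cvmfs"):
    """Pack src_dir into a gzip-compressed tar and return its SHA-256 digest."""
    with tarfile.open(out_path, "w:gz") as tar:
        # arcname sets the top-level directory name inside the archive.
        tar.add(src_dir, arcname=arcname)
    sha = hashlib.sha256()
    with open(out_path, "rb") as f:
        # Hash in 1 MiB chunks so large layers do not need to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha.update(chunk)
    return "sha256:" + sha.hexdigest()
```

A call such as `pack_layer("/tmp/export/cvmfs", "/tmp/layer.tar.gz")` (hypothetical paths) would produce the archive and the digest that a later upload step could reference.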

slide-18
SLIDE 18

~70 MB/s

Export data rate with warm cache from CVMFS to POSIX folder

slide-19
SLIDE 19

Tracing & Specification Building

A method for automated image specification

slide-21
SLIDE 21

Trace
Trace by enabling CVMFS_TRACEFILE during workflow

Specification
Automated specification building based on trace file

Export
Export with cvmfs_shrinkwrap (tar, squashfs, docker, ...)

Run Independently

  • No internet
  • No disk cache
  • No FUSE client

slide-22
SLIDE 22

From >50k lines (trace) to O(1)k lines (specification)
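
One way such a reduction could work is to collapse many accessed files in the same directory into a single directory entry. The sketch below is only an illustration of that idea, not the algorithm used by the tool: the input is assumed to be a plain list of accessed paths already extracted from the trace file, the threshold is arbitrary, and the `^dir/*` output notation assumes that this form selects the direct children of a directory.

```python
import posixpath
from collections import defaultdict


def build_spec(paths, threshold=3):
    """Collapse accessed paths into a shorter specification.

    ASSUMPTION: a directory with at least `threshold` distinct accessed
    files is emitted as one "^dir/*" entry instead of one line per file.
    """
    by_dir = defaultdict(set)
    for p in paths:
        by_dir[posixpath.dirname(p)].add(p)
    spec = []
    for directory in sorted(by_dir):
        files = by_dir[directory]
        if len(files) >= threshold:
            spec.append("^" + directory.rstrip("/") + "/*")
        else:
            spec.extend(sorted(files))
    return spec


accessed = ["/a/x", "/a/y", "/a/z", "/b/q", "/a/x"]
print(build_spec(accessed))
```

Duplicate accesses disappear because each directory keeps a set of paths, and busy directories shrink to a single line, which is where most of the reduction would come from.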

slide-23
SLIDE 23

Future Work

Improve shrink wrapping workflow

Understand exact use cases and optimize system based on these needs

Improve automated specification building

Make use of traces from multiple software runs to build more reliable specifications

Direct exports to other formats than POSIX?

Might be more efficient to avoid the “middleman”