SLIDE 1

StashCache

Derek Weitzel, Open Science Grid (with slides from Brian Bockelman)

SLIDE 2

2015 OSG All Hands Meeting, Northwestern University

SLIDE 3

Motivation

Opportunistic Computing is like giving away empty airline seats; the plane was going to fly regardless. Opportunistic Storage is like giving away real estate.

SLIDE 4

Motivation

  • Using the SE paradigm has been a colossal failure for opportunistic VOs.
  • Systems for CMS and ATLAS are robust and efficient, but have proven impossible for others. The cost of management is too high, and opportunistic VOs are unable to command site admin time.
  • Key to this failure is the underlying assumption in the SE paradigm that file loss is an exceptional event.
  • Again, “Storage is like real estate.”
  • To be successful, opportunistic storage must treat file loss as an everyday, expected occurrence.
  • The lack of high-speed local storage significantly decreases the range of workflows opportunistic VOs can run on the OSG.

SLIDE 5

Caching

  • A file is downloaded locally to the cache from an origin server on first access.
  • On future accesses, the local copy is used.
  • When more room needs to be made, “old” files are removed (by some algorithm which decides the definition of “old”); see the sketch after this list.
  • Downsides:
  • Caching is only useful if the working set size is less than the cache size.
  • Otherwise, the system’s performance is limited to the bandwidth of the system feeding the cache.
  • Working set size is difficult to estimate for a multi-VO system.
  • Not all workflows are supported. This does not work well if files need to be modified.
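A minimal sketch of that eviction idea, assuming a simple least-recently-used policy; this is only an illustration, not the actual XRootD/StashCache cache code:

# Minimal LRU-eviction sketch of the caching behaviour described above.
# Illustration only, not the actual XRootD/StashCache cache implementation.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity_bytes, fetch_from_origin):
        self.capacity = capacity_bytes      # total cache size in bytes
        self.used = 0                       # bytes currently cached
        self.files = OrderedDict()          # path -> data, ordered by last access
        self.fetch = fetch_from_origin      # callable(path) -> bytes (origin server)

    def read(self, path):
        if path in self.files:              # hit: serve the local copy
            self.files.move_to_end(path)
            return self.files[path]
        data = self.fetch(path)             # miss: pull from the origin server
        # Evict the least-recently-used ("old") files until the new file fits.
        while self.files and self.used + len(data) > self.capacity:
            _, old = self.files.popitem(last=False)
            self.used -= len(old)
        self.files[path] = data
        self.used += len(data)
        return data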

SLIDE 6

StashCache


[Map of StashCache cache sites: UCSD, UNL, UChicago, Syracuse, BNL, Illinois]

SLIDE 7

Growth of StashCache

  • Syracuse was not one of the initial StashCache sites.
  • They wanted StashCache for two reasons:
  • Decrease the network load on the WAN from OSG jobs
  • Cache the LIGO data set locally (discussed later)

SLIDE 8

Growth

  • Now the Syracuse StashCache is contributing to the OSG StashCache federation.
  • For example, over the last 24 hours it transferred data out of the cache at an average of 7.7 Gbps.

SLIDE 9

StashCache

  • 1. The user places files on the OSG-Connect “origin” server.
  • 2. Jobs request the file from the nearby caching proxy.
  • 3. The caching proxy queries the federation for the location of the file.


[Diagram: jobs download from caching proxies; the proxies discover file locations through the OSG redirector and are redirected to the OSG-Connect origin source]

SLIDE 10

How is it used?

  • CVMFS - Most common
  • StashCP - custom-developed tool
  • Uses CVMFS when possible, falls back to XRootD tools

SLIDE 11

CVMFS

  • FUSE-based filesystem

/cvmfs/stash.osgstorage.org/user/dweitzel/public/blast/data/yeast.aa

Path components: /cvmfs — uses the CVMFS service; stash.osgstorage.org — domain for CVMFS (not necessarily a web address); user/dweitzel/public/blast/data/yeast.aa — cached filesystem namespace, with the data transferred through StashCache.
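Because the namespace appears as a regular filesystem, a job can read the file with ordinary POSIX calls. A minimal Python example using the path above (reading only the first kilobyte, purely for illustration):

# Files published through StashCache appear as ordinary POSIX paths under
# /cvmfs, so a job can simply open() them; CVMFS pulls the data through
# StashCache on first access.
path = "/cvmfs/stash.osgstorage.org/user/dweitzel/public/blast/data/yeast.aa"

with open(path, "rb") as f:
    header = f.read(1024)        # read the first kilobyte of the data file
print("read", len(header), "bytes")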

SLIDE 12

CVMFS

  • The filesystem namespace is cached on the site’s HTTP proxy infrastructure.
  • Read-only filesystem.
  • Users can run regular commands on the directories (ls, cp, …).

SLIDE 13

StashCache + CVMFS

  • A service periodically scans the origin server and publishes the filesystem to CVMFS (a rough sketch of this scan follows below).
  • Looks for changes
  • Checksums the changed files
  • The actual CVMFS namespace only stores the checksums and meta information.
  • DOI: 10.1088/1742-6596/898/6/062044
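A rough sketch of that periodic scan, assuming simple mtime/size change detection and SHA-1 checksums; it illustrates the idea only and is not the actual CVMFS publisher (the scan_origin name and the catalog layout are invented for the example):

# Rough sketch of the publication step described above: walk the origin
# export, re-checksum files that changed, and keep only checksums and
# metadata for the namespace.  Illustration only, not the real publisher.
import hashlib, os

def scan_origin(export_root, previous):
    """Return {path: {"sha1": ..., "size": ..., "mtime": ...}} for the origin export."""
    catalog = {}
    for dirpath, _, filenames in os.walk(export_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            prev = previous.get(path)
            if prev and prev["mtime"] == st.st_mtime and prev["size"] == st.st_size:
                catalog[path] = prev            # unchanged file: keep the old checksum
                continue
            digest = hashlib.sha1()             # changed or new file: re-checksum it
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
            catalog[path] = {"sha1": digest.hexdigest(),
                             "size": st.st_size, "mtime": st.st_mtime}
    return catalog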

SLIDE 14

StashCP

  • Custom tool developed by the StashCache team
  • Uses GeoIP to determine the “nearest” cache
  • Uses CVMFS if available, otherwise uses the XRootD tools to copy from the cache (see the sketch below)
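A simplified sketch of that fallback logic; the cache endpoint below is a placeholder, and the real stashcp also handles GeoIP ordering, timeouts, and retries:

# Simplified sketch of the StashCP behaviour above: prefer the CVMFS mount
# if it is present, otherwise copy from a cache with the XRootD command-line
# tool.  The cache endpoint is a placeholder, not a real StashCache server.
import os, shutil, subprocess

def stash_fetch(stash_path, dest, cache="root://cache.example.org"):
    cvmfs_path = "/cvmfs/stash.osgstorage.org" + stash_path
    if os.path.exists(cvmfs_path):                  # CVMFS is mounted on this node
        shutil.copy(cvmfs_path, dest)
    else:                                           # fall back to the XRootD tools
        subprocess.check_call(["xrdcp", cache + "/" + stash_path, dest])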

SLIDE 15

What to Use?

  • CVMFS:
  • Takes up to 8 hours for files to appear
  • POSIX-like interface; you can even open() the file
  • StashCP:
  • Files are instantly available to jobs
  • Batch-copy mode only

SLIDE 16

Monitoring / Accounting

SLIDE 17

Per-File Monitoring (beta)

[Plot: per-file transfer monitoring for Minerva (FNAL)]

SLIDE 18

Science Enabled

  • Minerva - Public data
  • LIGO - Private data
  • Bioinformatics - Public data

SLIDE 19

Minerva adopts StashCache

  • Minerva was seeing very poor efficiency in jobs; lots of waiting to copy “flux” files (inputs to neutrino MC).
  • Jobs could not proceed until copying finished.
  • Suggested switching to StashCache over CVMFS to alleviate the load of simultaneous copies.
  • Made symlinks to files in /cvmfs/minerva.osgstorage.org/ in the same place the previous copies were going (no change to downstream code); see the sketch after this list.
  • Worked very well at first, but a large volume of jobs eventually seemed to slow down. Pulling too much from HCC?
  • Supposed to be set up for on-site jobs to read directly from the source (FNAL dCache) rather than going to the Nebraska redirector. Currently verifying that was set up correctly. Expect redirector load to decrease once that's verified and corrected as needed.
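A hedged sketch of that symlink step (both directory paths are hypothetical, not Minerva's actual layout):

# Sketch of the symlink approach above: point the job's expected local file
# names at the CVMFS/StashCache copies so no downstream code changes.
# Both directory paths here are hypothetical.
import os

cvmfs_dir = "/cvmfs/minerva.osgstorage.org/flux"   # hypothetical CVMFS location of the flux files
local_dir = "flux"                                 # where the jobs previously copied files

os.makedirs(local_dir, exist_ok=True)
for name in os.listdir(cvmfs_dir):
    link = os.path.join(local_dir, name)
    if not os.path.lexists(link):
        os.symlink(os.path.join(cvmfs_dir, name), link)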

SLIDE 20

LIGO + StashCache

  • LIGO data is private for a few years.
  • Data is protected by using a secure federation.
  • CVMFS uses the X.509 certificate from the user’s environment.
  • The certificate is propagated to the cache server to access the data.
  • Publication: DOI 10.1145/3093338.3093363

SLIDE 21

LIGO Data Access

  • Roughly 1 Mbps per core
  • 2016: 13.8 million hours - 5.8 PB
  • 2017: 8.2 million hours - 3.4 PB

SLIDE 22

UNL Bioinformatics Core Research Facility: microbiome composition changes (often rapidly) over time

SLIDE 23

Bioinformatics (JeanJack)

  • Each job scans a 25 GB data set 3 times.
  • The 25 GB data set is stored within StashCache and pulled down for each job.
  • It is copied to the local node to speed up the second and third scans (see the sketch below).
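A small sketch of that access pattern, with a hypothetical data-set path and scan function; the first copy is pulled through StashCache and the later scans read the node-local copy:

# Sketch of the access pattern above: stage the data set from the
# StashCache/CVMFS path onto node-local scratch once, then scan it 3 times.
# The dataset path and the scan() function are hypothetical.
import shutil

def run_job(scan, dataset="/cvmfs/stash.osgstorage.org/user/someuser/public/dataset"):
    local_copy = "./dataset"                 # node-local scratch space
    shutil.copytree(dataset, local_copy)     # first access is pulled through the cache
    for _ in range(3):                       # three scans; later ones hit local disk
        scan(local_copy)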

SLIDE 24

Summary

  • Due to CVMFS caching on the local filesystem, we only have lower-bound estimates.
  • Over the last year:
  • ~10 PB of data transferred
  • ~88% cache hit rate

SLIDE 25

What’s Next

  • Writable Stash
  • Uses Authentication with SciTokens
  • File Based Monitoring

SLIDE 26

Writable Stash

  • We have always had issues with writing back to Stash.
  • Options can include:
  • HTCondor’s Chirp: requires going back through the submit host
  • SSH key: have to transfer your SSH key onto the OSG

SLIDE 27

Writable Stash

  • Uses Bearer Tokens — SciTokens
  • Short-lived tokens with very restrictive capabilities

> PUT /user/dweitzel/stuff HTTP/1.1
> Host: demo.scitokens.org
> User-Agent: curl/7.52.1
> Accept: */*
> Authorization: Bearer XXXXXXXX

XRootD / Stash
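For illustration, the same request can be issued from Python with the requests library; the token value and the local file name are placeholders:

# Equivalent of the curl request above using the requests library.  The
# token and the local file name are placeholders; a real SciToken would be
# obtained out of band and scoped to this path.
import requests

token = "XXXXXXXX"
with open("stuff", "rb") as payload:
    resp = requests.put(
        "https://demo.scitokens.org/user/dweitzel/stuff",
        data=payload,
        headers={"Authorization": "Bearer " + token},
    )
resp.raise_for_status()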

SLIDE 28

Resources

  • Admin Docs:
  • https://opensciencegrid.github.io/StashCache/
  • User Docs (OSG User Support maintained):
  • https://support.opensciencegrid.org/support/solutions/articles/12000002775-transferring-data-with-stashcache
