The dCache labs: 7th International dCache Workshop



SLIDE 1

Welcome to 7th International dCache Workshop | HTW Berlin, Berlin | Patrick Fuhrmann | 27 May 2013

7th International dCache Workshop, Patrick Fuhrmann

The dCache labs

SLIDE 2

Content

  • CMS Disk / Tape separation
  • dCache supporting federated IdM
  • Multi Tier Storage
  • Small file support to optimize tape
  • Single client performance
  • Scientific Storage Cloud
SLIDE 3

Completed

  • gPlazma 2
  • NFS 4.1
  • WebDAV
SLIDE 4

CMS Tape Disk Separation

SLIDE 5

CMS Disk / Tape separation

  • CMS is planning to strictly separate disk and tape storage elements at the Tier-1 level.

– With the available network bandwidth of the OPN, it should be faster to take data from another disk Tier-1 than from tape.
– CMS would like to reduce the number of Tier-1s with tape (complex and expensive management).

[Diagram: PhEDEx Tape Node and PhEDEx Disk Node, e.g. Bring Online]

SLIDE 6

A possible solution

[Diagram: PhEDEx Tape Node and PhEDEx Disk Node]

  • A single dCache pretends to be ‘two’
  • Highly customized PhEDEx adapter

– Stat of a file has to be replaced by a location query

  • Transitions (limited selection)

– Get file from tape to disk: done (Bring Online)
– Migrate file to tape (selectively)
– Accept file to disk (from another Tier-1) which is already on tape locally
– Remove file from tape but keep file on disk
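The "stat replaced by a location query" idea can be sketched roughly as follows: the customized adapter decides which virtual endpoint (disk node or tape node) sees a file from its storage locality rather than from a plain stat. The locality values mirror the SRM terms ONLINE/NEARLINE; the lookup table here is a stand-in for a real dCache namespace query, and the paths are illustrative.

```python
# Illustrative catalogue snapshot: file path -> set of localities.
LOCALITY = {
    "/store/a.root": {"ONLINE"},              # disk only
    "/store/b.root": {"NEARLINE"},            # tape only
    "/store/c.root": {"ONLINE", "NEARLINE"},  # on disk and on tape
}

def visible_on(path: str, endpoint: str) -> bool:
    """Report whether `path` exists from the view of the given virtual node."""
    locality = LOCALITY.get(path, set())
    if endpoint == "disk":
        return "ONLINE" in locality
    if endpoint == "tape":
        return "NEARLINE" in locality
    raise ValueError(f"unknown endpoint: {endpoint}")

# The same physical file can thus appear on both virtual nodes at once:
print(visible_on("/store/c.root", "disk"))   # True
print(visible_on("/store/c.root", "tape"))   # True
print(visible_on("/store/a.root", "tape"))   # False
```

This is why a plain stat is not enough: both virtual nodes would answer "the file exists", while the locality query distinguishes which node should claim it.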

SLIDE 7

Another solution

[Diagram: PhEDEx Tape Node and PhEDEx Disk Node]

  • A single dCache with two similar name space trees

– One as the tape endpoint and the other as the disk endpoint

  • PhEDEx adapter nearly unchanged
  • Transitions

– Get file from tape to disk: done (Bring Online)
– Migrate file to tape (selectively)
– Accept file to disk which is already on tape locally (a different file in dCache)
– Remove file from tape but keep file on disk

SLIDE 8

CMS Disk / Tape separation (cont)

  • Plan

– PIC (Pepe) is organizing the effort and will help us evaluate solutions. Support from other sites is welcome.
– We can begin right away with two completely independent name spaces in one dCache.
– We can work on the optimization gradually.
– Interesting: flush files to tape individually or conditionally.
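A conditional flush rule for the two-tree setup could be sketched like this: only replicas under the tape tree are flushed, and optionally only above a size threshold ("flush files to tape individually or conditionally"). The `/pnfs/example.org` prefix, the tree names, and the threshold are illustrative, not the convention of any real site.

```python
# Hypothetical roots of the two independent namespace trees in one dCache.
TAPE_ROOT = "/pnfs/example.org/data/cms-tape"
DISK_ROOT = "/pnfs/example.org/data/cms-disk"

MIN_FLUSH_BYTES = 50 * 1024 * 1024  # illustrative: don't flush tiny files

def should_flush(path: str, size: int) -> bool:
    """Flush only replicas in the tape tree, and only above the threshold."""
    return path.startswith(TAPE_ROOT + "/") and size >= MIN_FLUSH_BYTES

print(should_flush(TAPE_ROOT + "/store/a.root", 2_000_000_000))  # True
print(should_flush(DISK_ROOT + "/store/a.root", 2_000_000_000))  # False: disk endpoint, never flushed
print(should_flush(TAPE_ROOT + "/store/tiny.log", 4096))         # False: below threshold
```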

SLIDE 9

Federated Identities

SLIDE 10

Federated Identities

  • General issue:

– Use credentials from site-A to access data at site-B.

  • Plenty of possible combinations

– SAML or X509, including conversion (e.g. STS)
– Web-based (including the ECP profile)
– Generic (no portals involved)
– And all possible combinations

  • We will agree on an example setup

– “Relying Party”: dCache for sure.
– Likely SAML support
– Details need to be negotiated in LSDMA WP1

  • Goal for dCache:

– Accept (federated) Identity Providers

  • OpenID (Google, Facebook), Shibboleth, SAML, Umbrella
SLIDE 11

Multi Tier Storage

SLIDE 12

Multi Tier Storage

[Diagram: an SSD tier for chaotic access and a DISK tier for streaming, with tape behind them, all served over NFS, xRoot, dCap, gridFTP, http(s) and WebDAV.]

This you can already do with dCache, BUT

SLIDE 13

Multi Tier Storage

  • Can already be configured, but: DISK, then SSD
  • Tigran: better would be SSD, then DISK
  • Will be done if we find resources
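The SSD-first ordering suggested above can be sketched as a simple placement and demotion policy: new replicas land on the fast tier (where chaotic first access hits) and are demoted to the disk tier once they go cold. The tier names and the age threshold are illustrative, not dCache configuration.

```python
# Illustrative threshold: seconds since last access before demotion.
COLD_AFTER = 3600

def place_new_replica() -> str:
    """New data lands on the fast tier first (SSD in front of DISK)."""
    return "SSD"

def demote_if_cold(tier: str, idle_seconds: float) -> str:
    """Move a replica from SSD to DISK when it has gone cold."""
    if tier == "SSD" and idle_seconds > COLD_AFTER:
        return "DISK"
    return tier

print(place_new_replica())          # SSD
print(demote_if_cold("SSD", 7200))  # DISK
print(demote_if_cold("SSD", 60))    # SSD
```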
SLIDE 14

Small file support for tape

SLIDE 15

What’s the issue? (Or: why do small files kill tape systems?)

  • Zero-byte files occupy between 0.5 and 1.6 MB on tape, so small files waste space.
  • Writing file marks forces the drive to synchronize tape writing (halts streaming).
  • LTO spec:

– 80 seconds maximum seek time
– 50 seconds average
– Which means: for reading files from tape that are not exactly in order, each transfer takes about 50 seconds minimum.

  • If the data is not on the same tape, mount/dismount time has to be added (30 – 60 seconds).
  • Tape systems consist of three non-shareable units:

– Robot (arm and gripper)
– Drive
– Tape
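The arithmetic behind these numbers makes the point: with ~50 s average seek per out-of-order file, positioning dominates small-file recalls completely. The sustained streaming rate used here (160 MB/s, roughly LTO-5 class) is an assumption for illustration.

```python
SEEK_S = 50      # average seek time per out-of-order file (from the LTO figures above)
RATE_MBS = 160   # assumed sustained streaming rate in MB/s

def recall_time_s(n_files: int, file_mb: float) -> float:
    """Seconds to recall n scattered files of file_mb each from one mounted tape."""
    return n_files * (SEEK_S + file_mb / RATE_MBS)

small = recall_time_s(1000, 1)   # 1000 scattered 1 MB files
large = recall_time_s(1, 1000)   # one 1000 MB file, same total data
print(round(small))  # 50006 -> almost 14 hours, nearly all of it seeking
print(round(large))  # 56
```

The same gigabyte of data costs roughly 900 times more drive time as scattered small files, which is exactly why the container approach on the next slides pays off.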

SLIDE 16

Our suggestion

  • The decision on whether files are “large” or “small” will initially be based on directories.
  • Transparent for the user:

– We ‘tar’ or ‘cpio’ files before they are flushed to tape.
– We extract the correct file from the archive if needed.

  • Options:

– Only the requested file is extracted, or
– when the first file of a container is requested, dCache could extract all files of the container.

  • As the container file is still on disk for a while after the first file has been extracted (depending on space availability), subsequent requests for small files will be handled without further tape access.
  • We could even pin recalled containers for some time.
  • “On top” service: runs on already supported dCache versions.
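The merge-and-extract mechanism described above can be sketched with the standard `tarfile` module: pack small files into one archive before the flush to tape, and on recall pull only the requested member back out. File names and directories are illustrative; real dCache would run this internally as the "on top" service.

```python
import tarfile
import tempfile
from pathlib import Path

def merge_small_files(paths: list, container: Path) -> None:
    """Pack the given small files into one archive destined for tape."""
    with tarfile.open(container, "w") as tar:
        for p in paths:
            tar.add(p, arcname=Path(p).name)

def extract_one(container: Path, name: str, dest: Path) -> Path:
    """Recall a single member from a container without unpacking the rest."""
    with tarfile.open(container, "r") as tar:
        tar.extract(name, path=dest)
    return dest / name

def demo() -> bool:
    """Round-trip: merge three small files, then recall just one of them."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        for i in range(3):
            (tmp / f"small{i}.dat").write_bytes(b"x" * 100)
        merge_small_files([tmp / f"small{i}.dat" for i in range(3)], tmp / "pack.tar")
        restored = extract_one(tmp / "pack.tar", "small1.dat", tmp / "restored")
        return restored.read_bytes() == b"x" * 100

print(demo())  # True
```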
SLIDE 17

Merging small files

[Diagram: on the dCache side, a merging process packs small files into a TAR container, which is then flushed to the tape system.]

SLIDE 18

Extracting small file(s)

[Diagram: on recall, the TAR container comes back from the tape system and the requested small file(s) are extracted on the dCache side.]

SLIDE 19

The Dynamic http/WebDAV federation

SLIDE 20

Dynamic Federation

[Diagram: a Federation Service combines a Candidate Collection Engine (querying the LFC catalogue, dCache, other http-enabled SEs and any cloud provider) with a Best Match Engine (GeoIP); clients such as ROOT, wget, curl, Nautilus, Dolphin, Konqueror, DavIX or a portal receive one or more candidates.]

SLIDE 21

Single access performance

SLIDE 22

Single client performance

  • Up to now, dCache focused on the optimization of overall performance

– Transaction rates (stats)
– Transfer speed

  • Consequence:

– Single-client transaction time is high compared to high-end systems, e.g. GPFS.

  • With new requirements from new communities, this needs some adjustment.

– Tigran has already started to profile meta-data transactions (open, …)
– Already clear: head-room for improvements
– Work will continue; we’ll keep you updated.

SLIDE 23

How does all this fit together?

  • Supporting individual identity management, remote IdPs
  • Allowing gPlazma to be integrated into the site infrastructure (Ron’s presentation)
  • Supporting ‘small’ files for tape
  • Supporting individual disk-to-tape transactions (CMS request)
  • Improving single-client transaction rate
SLIDE 24

How does all this fit together?

  • We are working towards an individualized dCache.
  • All supported protocols (WebDAV, NFS, …) will present the same view of the repository.
  • Various authentication mechanisms (Kerberos, X509, SAML) point to the same identity.
  • Authorization is based only on the object (file, directory) and the subject (user) -> protocol independent.
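The "many mechanisms, one identity" point above can be sketched as a gPlazma-style mapping step: whatever credential was presented is resolved to one internal subject, so authorization only ever sees subject and object. The mapping table, names and principals are illustrative.

```python
# Hypothetical mapping: (mechanism, external principal) -> internal subject.
CREDENTIAL_MAP = {
    ("x509", "/DC=org/DC=example/CN=Jane Doe"): "jane",
    ("kerberos", "jane@EXAMPLE.ORG"): "jane",
    ("saml", "jane.doe@idp.example.org"): "jane",
}

def map_to_subject(mechanism: str, principal: str) -> str:
    """Resolve any presented credential to the one internal subject."""
    try:
        return CREDENTIAL_MAP[(mechanism, principal)]
    except KeyError:
        raise PermissionError(f"no mapping for {mechanism}:{principal}")

# Three different credentials, one subject, hence one set of permissions:
print(map_to_subject("x509", "/DC=org/DC=example/CN=Jane Doe"))  # jane
print(map_to_subject("kerberos", "jane@EXAMPLE.ORG"))            # jane
print(map_to_subject("saml", "jane.doe@idp.example.org"))        # jane
```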

Scientific Storage Cloud

SLIDE 25

Scientific Storage Cloud

  • The same dCache instance can serve

– Globus Online transfers via gridFTP
– FTS transfers for WLCG via gridFTP or WebDAV
– Private upload and download via WebDAV
– Public anonymous access via plain http(s)
– Direct fast access from worker nodes via NFS 4.1

  • The same user can use all those access mechanisms with a variety of credentials.

– User/password
– Kerberos
– X509
– SAML assertions

SLIDE 26

Scientific Storage Cloud

SLIDE 27

Questions?

Further reading: www.dCache.org