dCache: status update and future directions, Paul Millar, TERENA Storage TF



slide-1
SLIDE 1

dCache: status update and future directions Paul Millar TERENA Storage TF Uppsala, Sweden

slide-2
SLIDE 2

What is dCache?

slide-3
SLIDE 3

Introducing dCache

  • OpenSource software for aggregating heterogeneous storage
  • Immutable filesystem with its own namespace, independent of data location

  • Integrates with tertiary storage (tape)
  • Sophisticated data-placement
  • Built-in support for multiple protocols (NFS, FTP, HTTP/WebDAV, …)
  • Consistent and coherent view of the files.
  • Pluggable authentication / identity system
  • Supports X.509 client cert, username+password and Kerberos
  • Integrates with site IdM: NIS, LDAP, Active Directory, Kerberos, ...
slide-4
SLIDE 4

dCache| Paul Millar | 2014.9.22 | Page 4

dCache in one slide

[Architecture diagram. Labelled elements:]

  • Door(s) (clients' entry point)
  • Pool Manager (requests scheduler)
  • Name Space (MetaData Server)
  • Pools (Data Server), shown three times
  • DBMS
  • JVMs
  • Message passing layer
  • Protocols: dcap, ftp, http, nfs

(Slide stolen from Tigran)

slide-5
SLIDE 5

dCache: people and support

  • Core team (8 people): collaboration between DESY, Fermilab and NEIC,

  • Students: HTW Berlin,
  • External contributors: people making infrequent contributions
  • German support group: volunteer dCache admins who organise and run workshops
  • Support channels:
  • User forum where users (i.e., admins) help each other
  • Direct channels (support@dcache.org and security@dcache.org)
slide-6
SLIDE 6

dCache: funding

  • Core partners: DESY, Fermilab, NEIC
  • German government: LSDMA project → PoF
  • EU projects: FP7 projects (EMI) and in three H2020 proposals.

slide-7
SLIDE 7

WLCG dCache instances (non-WLCG not shown)

slide-8
SLIDE 8

Deployments (just some of 'em...)

  • WLCG: 44 sites (world-wide) together provide 100 PB, satisfying ~50% of the LHC's current requirement.

  • DESY: HERA, ATLAS, CMS, LHCb, Photon science, ...
  • Fermilab: CMS, general storage, Intensity Frontier, ...
  • BNL: ATLAS and RHIC.
  • SNIC: SweStore.
  • NDGF: geographically largest single instance, spread over 5 countries.
  • ... <Your Name Here>

slide-9
SLIDE 9

dCache server releases

... along with the series support durations.

[Timeline chart spanning June 2014 to January 2016, with a "TODAY" marker, showing the support duration of each series:]

  • 2.13 series (anticipated golden release)
  • 2.12 series (anticipated release)
  • 2.11 series (anticipated release)
  • 2.10 series (golden release)
  • 2.9 series
  • 2.8 series
  • 2.7 series
  • 2.6 series (golden release)

slide-10
SLIDE 10

The code-base

  • Open Source license (AGPL)
  • Code available on GitHub

Four commands (one of which is 'cd') give you a fully functional, running dCache on your laptop.

  • All changes subject to code-review,
  • Large sections of the functionality are extensible / pluggable.
  • Spun off some functionality as independent libraries:
  • Code used by banks, other storage system vendors, ..
  • We only know who from the bug reports and bugfixes
slide-11
SLIDE 11

Status updates and Future directions

slide-12
SLIDE 12

dCache the scientific cloud

slide-13
SLIDE 13

Improving data-injection performance

slide-14
SLIDE 14

How to store small files on tape

  • Small files are bad for tapes
  • Load/seek time vs read time.
  • Random selection → many tape mounts → slow access & broken tapes.

  • Solution: dCache collects files in a container (a zip file) before writing to tape

Replacing lots of shuttle-buses with one big bus

  • When user writes new files:
  • “Small files” are written into dCache,
  • dCache groups files and, based on policies, writes a container back into dCache,
  • Containers are written to tape.
  • When user opens a file for reading:
  • Fetch container from tape, if not cached
  • Extract file from container
  • User sees no difference, yet tape is better utilised.
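
The container idea above can be illustrated with Python's standard zipfile module. This is only a minimal sketch, not dCache's actual implementation: the function names pack_container and read_from_container, the file names, and the in-memory handling are all assumptions made for illustration.

```python
import io
import zipfile


def pack_container(files: dict) -> bytes:
    """Bundle many small files (name -> bytes) into one zip container."""
    buf = io.BytesIO()
    # ZIP_STORED keeps the members uncompressed: the aim here is
    # aggregation (one large tape write), not saving space.
    with zipfile.ZipFile(buf, mode="w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def read_from_container(container: bytes, name: str) -> bytes:
    """Extract a single member once the container is back from tape."""
    with zipfile.ZipFile(io.BytesIO(container)) as zf:
        return zf.read(name)


if __name__ == "__main__":
    # Hypothetical small files written by a user.
    small_files = {f"run42/event-{i:04d}.dat": f"payload {i}".encode()
                   for i in range(1000)}
    container = pack_container(small_files)   # one object goes to tape
    data = read_from_container(container, "run42/event-0007.dat")
    print(len(small_files), "files in one container of", len(container), "bytes")
    print(data)
```

One container means one tape write and, on recall, one mount and seek for the whole group, which is the "one big bus" trade-off described on the slide.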
slide-15
SLIDE 15

HTTP and WebDAV

  • Added support for HTTP and WebDAV.
  • Support redirection on read, redirection on write.
  • Metadata operations can be encrypted; when redirected, data is transported unencrypted.
  • Found problems with (almost) all WebDAV clients.
  • Extending WebDAV to include additional functionality:

  • Added support for triggering 3rd party copy,
  • Added support for recovery in dynamic data federation.
slide-16
SLIDE 16

HTTP Federation

  • Project in collaboration with CERN
  • Multiple HTTP/WebDAV servers provide users with an overlapping namespace

Like partial mirrors of some central repository

  • Central server provides an aggregate view
  • Assume that if a file exists on multiple servers, the copies are identical replicas
  • Client sees all available files
  • When reading, the client is redirected to “best” replica.
  • Available as a demo; being evaluated by WLCG experiments
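
A rough sketch of the federation idea, assuming a central catalogue that maps each path to the servers holding a replica. The class names, the example.org URLs and the use of latency as the measure of "best" are illustrative assumptions; the slide does not say how the real federation ranks replicas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    server: str        # hypothetical base URL of an HTTP/WebDAV server
    latency_ms: float  # assumed stand-in for whatever "best" means


class Federation:
    """Central server: aggregate view plus redirect-on-read."""

    def __init__(self) -> None:
        self._catalogue = {}  # path -> list of Replica

    def register(self, path: str, replica: Replica) -> None:
        self._catalogue.setdefault(path, []).append(replica)

    def list_files(self) -> list:
        # The client sees the union of all servers' namespaces.
        return sorted(self._catalogue)

    def redirect(self, path: str) -> str:
        replicas = self._catalogue.get(path)
        if not replicas:
            raise FileNotFoundError(path)
        # Replicas are assumed identical, so any one can serve the read.
        best = min(replicas, key=lambda r: r.latency_ms)
        return best.server.rstrip("/") + path


if __name__ == "__main__":
    fed = Federation()
    fed.register("/data/a.root", Replica("https://site-a.example.org", 40.0))
    fed.register("/data/a.root", Replica("https://site-b.example.org", 12.0))
    fed.register("/data/b.root", Replica("https://site-a.example.org", 40.0))
    print(fed.list_files())
    print(fed.redirect("/data/a.root"))
```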
slide-17
SLIDE 17

Developing dCache sync-n-share

  • Provide unlimited storage:
  • Access via web-browser:
  • Synchronisation:
  • Sharing:
  • how do we present shared data to the user?
  • how do users share data with others?
slide-18
SLIDE 18

DESY sync-and-share service

  • DESY users needed to stop using DropBox.
  • dCache had already started working on adding sync-and-share facilities.
  • For DESY, using dCache and ownCloud to build a DropBox-like service was the best option.
slide-19
SLIDE 19

dCache with ownCloud

  • Use ownCloud on top of dCache, via NFS

Files in dCache owned by the user (not ownCloud process)

  • Users can write data into dCache

Immediately visible through ownCloud.

  • Users can write data into ownCloud (sync client)

Immediately visible through dCache

  • Limitations:
  • If a user shares data with you, you can only read it through ownCloud.
  • If you set an ACL in dCache, it is not reflected in ownCloud
  • Service goes live today (for the brave); DESY-wide in two weeks.
slide-20
SLIDE 20

What is the sync-n-share future?

  • Have the client sync directly with dCache.

maybe the ownCloud client

  • Add support for sharing within dCache.

enhanced web interface

  • Drop ownCloud and provide a pure dCache solution.

slide-21
SLIDE 21

CDMI: managing cloud storage

  • Network protocol for Cloud storage
  • initially by SNIA, now an ISO standard
  • with many, many features
  • Limited vendor uptake:

Catch-22: demand and availability

  • Some IaaS systems use CDMI internally,

the EGI FedCloud has CDMI as a common requirement

  • Preliminary CDMI support for dCache from a student project,

Not available now, but plan to integrate (after code review)

  • What is the demand?
slide-22
SLIDE 22

gPlazma: flexible identity management

  • dCache's identity management (IdM) system:
  • (mostly) authenticates user,
  • figures out their uid, gid(s),
  • rejects banned users,
  • discovers session information: home directory …
  • Public API: anyone can write a plugin.
  • We supply plugins for NIS, LDAP, ActiveDirectory, Kerberos, X.509, VOMS, XACML, PAM and some local files (e.g., htaccess).
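
The four jobs listed above (authenticate, find uid and gids, reject banned users, discover session information) can be pictured as a chain of plugins that each enrich or veto a login. The sketch below is a Python illustration only: gPlazma's real plugin API is not shown in these slides, so every name here (Login, run_login, the plugin functions and the hard-coded tables) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


class LoginFailed(Exception):
    """Raised when any plugin rejects the login."""


@dataclass
class Login:
    credentials: dict              # what the client presented
    user: Optional[str] = None     # set once authenticated
    uid: Optional[int] = None
    gids: list = field(default_factory=list)
    session: dict = field(default_factory=dict)


# Hypothetical site data; a real site would consult LDAP, NIS, etc.
PASSWORDS = {"alice": "s3cret", "mallory": "hunter2"}
ACCOUNTS = {"alice": (1000, [1000, 2000]), "mallory": (1001, [1001])}
BANNED = {"mallory"}


def authenticate(login: Login) -> None:
    name = login.credentials.get("username")
    if name not in PASSWORDS or PASSWORDS[name] != login.credentials.get("password"):
        raise LoginFailed("bad username or password")
    login.user = name


def map_ids(login: Login) -> None:
    if login.user not in ACCOUNTS:
        raise LoginFailed("no uid/gid mapping")
    uid, gids = ACCOUNTS[login.user]
    login.uid = uid
    login.gids = list(gids)


def reject_banned(login: Login) -> None:
    if login.user in BANNED:
        raise LoginFailed("user is banned")


def add_session(login: Login) -> None:
    login.session["home"] = f"/home/{login.user}"


PLUGINS = [authenticate, map_ids, reject_banned, add_session]


def run_login(credentials: dict, plugins=PLUGINS) -> Login:
    """Run every plugin in order; any plugin may veto by raising."""
    login = Login(credentials=dict(credentials))
    for plugin in plugins:
        plugin(login)
    return login


if __name__ == "__main__":
    print(run_login({"username": "alice", "password": "s3cret"}))
```

Swapping a plugin (for example, replacing the password table with an LDAP lookup) changes one entry in the list and nothing else, which is the appeal of a pluggable design.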

slide-23
SLIDE 23

Federated Identity

  • Increasing need to “do something”
  • SAML seems to be the prevalent system

OpenID Connect is also gaining traction.

  • With LSDMA: initial work on credential translation (SAML → X.509)

  • Later, add native SAML support:

Initially with Web-SSO, later maybe Moonshot/AbFab.

slide-24
SLIDE 24

Globus (Online)

  • Globus (Online) provides a file-movement service,

  • Data connections always authenticate via X.509

Globus can use externally-generated credentials

  • LSDMA providing a “glue” service:
  • Germany's DFN-AAI runs an SLCS (a bit like TCS).
  • The glue service allows Globus users to use the SLCS.
slide-25
SLIDE 25

Software Defined Storage & QoS

  • dCache can already provide differentiated QoS (Quality of Service):

Different files can have different replication factors, multi-tier (SSD, HDD, tape) usage, and utilise different hardware

  • Currently these QoS attributes are mostly configured by the dCache admin.
  • We are investigating SDS to allow:
  • Modification of QoS after data is written,
  • Finer-grained user control of QoS choices.
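
To make "modification of QoS after data is written" concrete, here is a small sketch that models a file's QoS as a replica count plus a set of media tiers, and derives the actions needed to move from one QoS to another. The QoS class, the Tier names and plan_transition are assumptions for illustration; they do not describe an existing dCache interface.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    SSD = "ssd"
    HDD = "hdd"
    TAPE = "tape"


@dataclass(frozen=True)
class QoS:
    replicas: int      # number of disk replicas
    tiers: frozenset   # media the file must be present on


def plan_transition(current: QoS, target: QoS) -> list:
    """List the (hypothetical) actions needed to reach the target QoS."""
    actions = []
    for tier in sorted(target.tiers - current.tiers, key=lambda t: t.value):
        actions.append(f"copy to {tier.value}")
    for tier in sorted(current.tiers - target.tiers, key=lambda t: t.value):
        actions.append(f"drop from {tier.value}")
    delta = target.replicas - current.replicas
    if delta > 0:
        actions.append(f"create {delta} replica(s)")
    elif delta < 0:
        actions.append(f"remove {-delta} replica(s)")
    return actions


if __name__ == "__main__":
    disk_only = QoS(replicas=1, tiers=frozenset({Tier.HDD}))
    archived = QoS(replicas=2, tiers=frozenset({Tier.HDD, Tier.TAPE}))
    print(plan_transition(disk_only, archived))
```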
slide-26
SLIDE 26

Summary

  • We are adding Cloud-like features, both interactive (currently via ownCloud) and through protocols (like CDMI), and rolling out a production service at DESY.
  • Investigating how to integrate support for Federated Identity into storage software
  • For more than 10 years, dCache has provided Big Data storage software that:
  • focuses on users' needs,
  • implements state-of-the-art features,
  • pushes user expectations by exposing users to innovation.
slide-27
SLIDE 27

Backup slides

slide-28
SLIDE 28

The grid solution: X.509 (user) certificates

The Grid

Proxy Certificate User Certificate

slide-29
SLIDE 29

Federated Identity

Check who you are & Authorisation decision Check who you are Authorisation decision Identity Provider (IdP) Assertion Service Provider (SP) Record information

slide-30
SLIDE 30

SAML Web Single Sign-On (Web SSO)

Normal logging in (between User and Service):

  • 1. I want to log in
  • 2. Prove to me who you are
  • 3. Some proof (name + password)
  • 4. OK, I believe you, you're logged in

Logging in with SAML (Web-SSO) (between User, Service Provider (SP) and Identity Provider (IdP)):

  • 1. I want to log in
  • 2. Go to the IdP and come back with proof you've proved who you are.
  • 3. I want to log in, Service sent me
  • 4. Prove to me who you are
  • 5. Some proof (name & password)
  • 6. OK, I believe you, hand this assertion back to Service
  • 7. Back again! Here's your proof
  • 8. Looks good, you're logged in
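
The eight-step exchange can be mimicked in a few lines of Python. This is a toy model, not SAML: real assertions are XML documents signed with the IdP's private key, whereas this sketch uses a JSON payload and a shared-secret HMAC purely to show who checks what. All names (IdentityProvider, ServiceProvider, SHARED_KEY, the host name) are made up for the example.

```python
import hashlib
import hmac
import json

# Stand-in for the trust relationship between IdP and SP.
# ASSUMPTION: a shared secret replaces SAML's public-key XML signatures.
SHARED_KEY = b"demo-trust-anchor"


def _sign(payload: str) -> str:
    return hmac.new(SHARED_KEY, payload.encode(), hashlib.sha256).hexdigest()


class IdentityProvider:
    def __init__(self, users: dict) -> None:
        self._users = users  # name -> password

    def authenticate(self, name: str, password: str, audience: str) -> dict:
        # Steps 4-6: check the proof, then issue an assertion for the SP.
        if name not in self._users or self._users[name] != password:
            raise PermissionError("authentication failed")
        payload = json.dumps({"subject": name, "audience": audience},
                             sort_keys=True)
        return {"payload": payload, "signature": _sign(payload)}


class ServiceProvider:
    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions = set()

    def accept_assertion(self, assertion: dict) -> str:
        # Steps 7-8: verify the assertion came from the trusted IdP
        # and was issued for this service, then log the user in.
        expected = _sign(assertion["payload"])
        if not hmac.compare_digest(expected, assertion["signature"]):
            raise PermissionError("assertion signature is invalid")
        claims = json.loads(assertion["payload"])
        if claims["audience"] != self.name:
            raise PermissionError("assertion was issued for another service")
        self.sessions.add(claims["subject"])
        return claims["subject"]


if __name__ == "__main__":
    idp = IdentityProvider({"alice": "s3cret"})
    sp = ServiceProvider("storage.example.org")
    # Steps 1-3: the user asks the SP to log in and is sent to the IdP.
    assertion = idp.authenticate("alice", "s3cret", audience=sp.name)
    print("logged in as", sp.accept_assertion(assertion))
```

Note that the service never sees the password; it only has to decide whether it trusts the party that vouches for the user.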

slide-31
SLIDE 31

“Where Are You From?” the WAYF

SAML WebSSO without WAYF (between User, Service Provider (SP) and Identity Provider (IdP)):

  • 1. Start login
  • 2. Redirect to IdP
  • 3. Authenticate
  • 4. Redirect back to Service with assertion.
  • 5. Present assertion.
  • 6. Looks good, you're logged in

Logging in with WAYF (between User, Service Provider (SP), WAYF and Identity Provider (IdP)):

  • 1. Start login
  • 2. Redirect to WAYF
  • 3. Choose IdP
  • 4. Redirect to IdP
  • 5. Authenticate
  • 6. Redirect back to Service, with assertion.
  • 7. Present assertion.
  • 8. Looks good, you're logged in

slide-32
SLIDE 32

Who do you trust?

[Diagram labels: Identity Provider (IdP), Assertion, Service Provider (SP)]

  • Will information be abused or leaked?
  • Will they track users' activities?
  • Will they tell me if there is suspicious behaviour?
  • Is this really the same person as before?
  • Is the information accurate?

slide-33
SLIDE 33

How to trust lots of people?

Point-to-point trust doesn't scale! Federation Inter-Federation

e.g., EduGain

slide-34
SLIDE 34

Using (remote) computers

Identity Provider (IdP) Service Provider (SP) – A web portal Computing Resource Web portal

slide-35
SLIDE 35

Using (remote) computers

Identity Provider (IdP) Service Provider (SP) – Access portal Computing Resource LDAP Substitute Credential (upload ssh public key)

slide-36
SLIDE 36

Using (remote) computers

Identity Provider (IdP) Computing Resource Service Provider (SP) Project Moonshot

slide-37
SLIDE 37

Managing (remote) data

Identity Provider (IdP) Service Provider (SP) – A web portal Storage Resource Web portal

slide-38
SLIDE 38

Managing (remote) storage

Identity Provider (IdP) Service Provider (SP) – Access portal Storage Resource Fetch Substitute Credential

Token: Amazon AWS/S3 SAML support, X.509: SLCS, TCS, CI-Login, EMI STS, ...

slide-39
SLIDE 39

Managing (remote) storage

Identity Provider (IdP) Storage Resource Service Provider (SP) Project Moonshot

slide-40
SLIDE 40

Identity: Theory, Practice and Future directions | Paul Millar | 2014-03-24 | Page 40

Credential vs Principal

Credentials Principals

  • Name: Wile E. Coyote
  • ACME customer ID: 11493
  • Member-of: Antagonists Anonymous
  • Passport number: 0008103314
  • Bank account number: 001213921
  • Banks with: United ACME Bank

slide-41
SLIDE 41


Authentication: door, both or gPlazma

[Diagram labels: WebDAV door, NFS door, gPlazma (Authn, Map), Principal, Kerberos ticket, X.509 certificate]

slide-42
SLIDE 42


Logging in: four phases, using plugins

slide-43
SLIDE 43


Something extra: identity mapping