gLite Data Management Agenda gLite Data Management Introduction - - PowerPoint PPT Presentation



SLIDE 1

gLite Data Management

SLIDE 2

Agenda

  • gLite Data Management
    – Introduction
    – Examples
    – Name Convention
    – Storage Elements
    – LCG File Catalog
  • FTS Overview
SLIDE 3

Data Management System (DMS)

  • Provides file manipulation services for users and other Grid services.
  • DMS enables the location, access and transfer of data
    – Users do not need to know the data location, just the logical name
    – Data is accessed through standard interfaces
    – Data can be replicated or transferred to several locations as needed
    – Data is shared within a VO

SLIDE 4

Scope of data services in gLite

  • Put simply, DMS provides the operations that all of us are used to performing
  • Uploading/downloading files
  • Creating files/directories
  • Renaming files/directories
  • Deleting files/directories
  • Moving files/directories
  • Listing directories
  • Creating symbolic links
  • Note: Files are write-once, read-many
    – Files cannot be changed unless removed or replaced
    – No intention of providing a global file management system
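
The write-once, read-many rule can be illustrated with a small in-memory sketch. This is not gLite code: the class `WriteOnceStore` and its method names are hypothetical, chosen only to show that a stored file is never modified in place and can only be removed or replaced as a whole.

```python
class WriteOnceStore:
    """Toy file store (hypothetical): files are write-once, read-many."""

    def __init__(self):
        self._files = {}

    def put(self, name: str, data: bytes) -> None:
        """Store a new file; refuse to overwrite an existing one."""
        if name in self._files:
            raise FileExistsError(f"{name} already exists; remove or replace it")
        self._files[name] = bytes(data)

    def get(self, name: str) -> bytes:
        """Read a file; this can be done any number of times."""
        return self._files[name]

    def remove(self, name: str) -> None:
        """Delete a file so that its name can be reused."""
        del self._files[name]

    def replace(self, name: str, data: bytes) -> None:
        """Swap the whole file: remove the old copy, then store the new one."""
        self.remove(name)
        self.put(name, data)


store = WriteOnceStore()
store.put("Myfile.dat", b"first version")
store.replace("Myfile.dat", b"second version")
print(store.get("Myfile.dat"))
```

There is deliberately no method that edits part of a file, which mirrors the slide's point that files are not updated in place.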

SLIDE 5

Data Issues and Grid Solutions

  • Resource centers have growing demand for storage
    – Storage Element capable of managing multiple disk pools
      • Disk Pool Manager (DPM), dCache, CASTOR
  • Data is stored on different storage system technologies
    – Common interface required to hide underlying complexity
      • Storage Resource Manager (SRM) – storage management protocol
      • GridFTP – secure file transfer
  • Data is stored at different locations with separate namespaces
    – File catalogue to provide a uniform view of Grid data
      • LCG File Catalog (LFC)
  • Applications need to access Grid data management services
    – Data management API
      • GFAL
SLIDE 6

Data management example

[Diagram components: “User interface”, Resource Broker, Computing Element, LCG File Catalogue (LFC), two Storage Elements. Labels: DataSets info, input “sandbox”, output “sandbox”.]

  • File replicated onto 2 SEs
SLIDE 7

Data management example

[Diagram components: “User interface”, LCG File Catalogue (LFC), Storage Element 1, Storage Element 2. Labels: “Myfile.dat”, Myfile.dat, File_on_se1, File_on_se2, guid.]

  • File replicated onto 2 SEs

SLIDE 8

Data management example

[Diagram components: “User interface”, LCG File Catalogue (LFC), Storage Element 1, Storage Element 2.]

  • “Myfile.dat” (Myfile.dat): the “logical filename”
  • File_on_se1, File_on_se2: each a “SURL” (site URL)
  • “GUID”: Global Unique Identifier
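
The relationship in this example (one logical filename, one GUID, several SURLs) can be sketched as a toy in-memory catalogue. This is an illustration only, not the LFC implementation or its API: the class `ReplicaCatalog`, the VO name `myvo` and the `example.org` hosts are all assumptions made for the sketch.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ReplicaCatalog:
    """Toy stand-in for a file catalogue: LFN -> GUID -> list of SURLs."""
    lfn_to_guid: dict = field(default_factory=dict)
    guid_to_surls: dict = field(default_factory=dict)

    def register(self, lfn: str, surl: str) -> str:
        """Register a new logical file with its first replica; return its GUID."""
        if lfn in self.lfn_to_guid:
            raise ValueError(f"{lfn} is already registered")
        guid = f"guid:{uuid.uuid4()}"
        self.lfn_to_guid[lfn] = guid
        self.guid_to_surls[guid] = [surl]
        return guid

    def add_replica(self, lfn: str, surl: str) -> None:
        """Record an additional copy of an existing logical file."""
        self.guid_to_surls[self.lfn_to_guid[lfn]].append(surl)

    def replicas(self, lfn: str) -> list:
        """Resolve a logical name to every known physical location."""
        return list(self.guid_to_surls[self.lfn_to_guid[lfn]])


# Hypothetical names: the VO "myvo" and both hosts are made up for this sketch.
catalog = ReplicaCatalog()
lfn = "lfn:/grid/myvo/Myfile.dat"
guid = catalog.register(lfn, "srm://se1.example.org/data/File_on_se1")
catalog.add_replica(lfn, "srm://se2.example.org/data/File_on_se2")
print(guid)
print(catalog.replicas(lfn))
```

The user only ever handles the logical name; the GUID and the two SURLs stay behind the lookup, which is the point the diagram makes.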

SLIDE 9

Name conventions

  • Logical File Name (LFN)
    – An alias created by a user to refer to some item of data, e.g. “lfn:/grid/cms/20030203/run2/track1”
  • Globally Unique Identifier (GUID)
    – A non-human-readable unique identifier for an item of data, e.g. “guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6”
  • Storage URL (SURL) or Physical File Name (PFN)
    – The location of an actual piece of data on a storage system, e.g.
      “srm://pcrd24.cern.ch/flatfiles/cms/output10_1” (SRM)
      “sfn://lxshare0209.cern.ch/data/alice/ntuples.dat” (Classic SE)
  • Transport URL (TURL)
    – Temporary locator of a replica plus an access protocol, understood by an SE, e.g. “rfio://lxshare0209.cern.ch//data/alice/ntuples.dat”
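
As a rough illustration of these conventions, the kind of name can be told apart by its prefix. The sketch below is an assumption-laden helper, not part of gLite: the function names are hypothetical, and the scheme sets contain only the prefixes that appear in the examples on this slide, so real deployments may use others.

```python
from urllib.parse import urlsplit

# Prefixes taken from the examples on this slide; the sets are not exhaustive.
SURL_SCHEMES = {"srm", "sfn"}
TURL_SCHEMES = {"rfio"}


def classify(name: str) -> str:
    """Return 'LFN', 'GUID', 'SURL', 'TURL' or 'unknown' for a grid file name."""
    scheme = name.partition(":")[0].lower()
    if scheme == "lfn":
        return "LFN"
    if scheme == "guid":
        return "GUID"
    if scheme in SURL_SCHEMES:
        return "SURL"
    if scheme in TURL_SCHEMES:
        return "TURL"
    return "unknown"


def storage_host(url: str):
    """Host part of a SURL or TURL, or None for names that carry no host."""
    return urlsplit(url).hostname


for example in (
    "lfn:/grid/cms/20030203/run2/track1",
    "guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
    "srm://pcrd24.cern.ch/flatfiles/cms/output10_1",
    "rfio://lxshare0209.cern.ch//data/alice/ntuples.dat",
):
    print(classify(example), storage_host(example))
```

Only SURLs and TURLs name a host, which reflects the split between location-independent names (LFN, GUID) and location-bound ones.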

SLIDE 10

Storage Element

  • Provides
    – Storage space for files
    – SRM interface
    – Transfer protocol (gsiFTP), a GSI-based FTP server
    – POSIX-like file access
      • Accessed via the Grid File Access Layer (GFAL)
      • API interface
      • To read parts of files too big to copy
  • Example is Disk Pool Manager (DPM)
    – Scalable management of independent disk pools for sites
    – Easy to install, configure and manage
    – Secure remote and local transfer protocols
      • GridFTP, secure RFIO
SLIDE 11

LFC Service

  • LFC = LCG File Catalogue
    – LCG = LHC Computing Grid
    – LHC = Large Hadron Collider
  • Provides
    – Mapping between LFN, GUID and SURL
    – Transactions, sessions, bulk queries
    – Hierarchical namespace, symbolic links
    – System metadata
    – Single-string user metadata
  • All members of a given VO have read-write permissions in their directory
  • Commands look like UNIX commands, often with “lfc-” in front
SLIDE 12

LFC Continued

  • Users primarily access and manage files through “logical filenames”
  • Mapping by the “LFC” catalogue server

LFC Namespace (defined by the user)

LFC has a directory tree structure: /grid/<VO_name>/<you create it>

SLIDE 13

LFC Catalog commands

lfc-chmod        Change access mode of the LFC file/directory
lfc-chown        Change owner and group of the LFC file/directory
lfc-delcomment   Delete the comment associated with the file/directory
lfc-getacl       Get file/directory access control lists
lfc-ln           Make a symbolic link to a file/directory
lfc-ls           List file/directory entries in a directory
lfc-mkdir        Create a directory
lfc-rename       Rename a file/directory
lfc-rm           Remove a file/directory
lfc-setacl       Set file/directory access control lists
lfc-setcomment   Add/replace a comment

Summary of the LFC Catalog commands
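
A script that drives these commands might first build the argument list and check the command name against the table. The sketch below does only that and never runs anything. The command names and descriptions are copied from the summary; the helper `lfc_command`, the path `/grid/myvo/mydir`, and the assumption that arguments follow the UNIX counterparts are illustrative, so check each command's manual page before relying on it.

```python
import shlex

# Command names and descriptions copied from the summary table.
LFC_COMMANDS = {
    "lfc-chmod": "Change access mode of the LFC file/directory",
    "lfc-chown": "Change owner and group of the LFC file/directory",
    "lfc-delcomment": "Delete the comment associated with the file/directory",
    "lfc-getacl": "Get file/directory access control lists",
    "lfc-ln": "Make a symbolic link to a file/directory",
    "lfc-ls": "List file/directory entries in a directory",
    "lfc-mkdir": "Create a directory",
    "lfc-rename": "Rename a file/directory",
    "lfc-rm": "Remove a file/directory",
    "lfc-setacl": "Set file/directory access control lists",
    "lfc-setcomment": "Add/replace a comment",
}


def lfc_command(name: str, *args: str) -> list:
    """Build an argument vector for a known LFC command without running it."""
    if name not in LFC_COMMANDS:
        raise ValueError(f"unknown LFC command: {name}")
    return [name, *args]


# Hypothetical directory under the VO's part of the namespace.
argv = lfc_command("lfc-mkdir", "/grid/myvo/mydir")
print(shlex.join(argv))
```

Passing the resulting list to `subprocess.run` on a machine with the LFC client installed would be the natural next step; that part is left out here because it depends on the local setup.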

SLIDE 14

File Transfer Service

  • FTS is a low-level data movement service
  • Why is it needed?
    – Improves reliability for transfers
    – Provides asynchronous file transfer
      • Schedules transfers when resources are available
    – Provides control of transfer properties (channel concept)

SLIDE 15

FTS Concepts

  • Transfer Job
    – A set of source/destination pairs specifying files to transfer
    – Submitted to FTS for processing
  • Channel
    – A job is assigned to a channel after submission
    – Represents a point-to-point network link
    – Catch-all channels are possible: any-to-me, me-to-any
    – Similar to a queue where you can specify
      • VO share for the queue
      • Number of concurrent file transfers
      • Number of concurrent streams (GridFTP)
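
The job and channel concepts can be modelled in a few lines. This is a simplified sketch and not the FTS data model or API: the class and field names, the `*` wildcard for catch-all channels, the first-match assignment rule and the `example.org` hosts are all assumptions, and VO shares are left out.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit

ANY = "*"  # hypothetical wildcard marking the open end of a catch-all channel


@dataclass
class TransferJob:
    # (source SURL, destination SURL) pairs, as described on the slide
    pairs: list


@dataclass
class Channel:
    source: str              # source host, or ANY
    destination: str         # destination host, or ANY
    max_transfers: int = 1   # number of concurrent file transfers
    streams: int = 1         # number of concurrent GridFTP streams
    queue: list = field(default_factory=list)

    def serves(self, src_url: str, dst_url: str) -> bool:
        """True if this channel links the hosts of the two URLs."""
        src, dst = urlsplit(src_url).hostname, urlsplit(dst_url).hostname
        return self.source in (ANY, src) and self.destination in (ANY, dst)


def assign(job: TransferJob, channels: list) -> Channel:
    """Queue the job on the first channel that serves all of its pairs."""
    for channel in channels:
        if all(channel.serves(s, d) for s, d in job.pairs):
            channel.queue.append(job)
            return channel
    raise LookupError("no channel serves this job")


# Dedicated channels are listed first so they win over catch-all ones.
channels = [
    Channel("se1.example.org", "se2.example.org", max_transfers=5, streams=4),
    Channel(ANY, "se2.example.org", max_transfers=2),  # "any-to-me"
]
job = TransferJob([("srm://se1.example.org/data/a", "srm://se2.example.org/data/a")])
other = TransferJob([("srm://se3.example.org/data/b", "srm://se2.example.org/data/b")])
print(assign(job, channels).source, assign(other, channels).source)
```

Treating the channel as a queue with its own limits is what lets an administrator tune one link without touching the jobs themselves.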
SLIDE 16

FTS architecture

  • All components are decoupled from each other
    – Each interacts only with the database
  • Experiments interact via a web service
    – User: FileTransfer
    – Admin: ChannelManagement
  • VO agents assign jobs to channels
  • Channel agents manage assigned file transfers
  • Monitoring and statistics can be collected via the DB

SLIDE 17

Thank you
