gLite Data Management Agenda gLite Data Management Introduction - - PowerPoint PPT Presentation
2
Agenda
- gLite Data Management
– Introduction
– Examples
– Name Convention
– Storage Elements
– LCG File Catalog
- FTS Overview
3
Data Management System (DMS)
- Provides file manipulation services for users and other Grid services.
- DMS enables the location, access and transfer of data
– Users do not need to know where data is located, only its logical name
– Data is accessed through standard interfaces
– Data can be replicated or transferred to several locations as needed
– Data is shared within a VO
4
Scope of data services in gLite
- Put simply, the DMS provides all the operations we are used to performing:
– Uploading/downloading files
– Creating files/directories
– Renaming files/directories
– Deleting files/directories
– Moving files/directories
– Listing directories
– Creating symbolic links
- Note: files are write-once, read-many
– Files cannot be changed unless they are removed or replaced
– There is no intention of providing a global file management system
5
Data Issues and Grid Solutions
- Resource centers have a growing demand for storage
– Storage Elements capable of managing multiple disk pools: Disk Pool Manager (DPM), dCache, CASTOR
- Data is stored on different storage system technologies
– A common interface is required to hide the underlying complexity: Storage Resource Manager (SRM), a storage management protocol, and GridFTP for secure file transfer
- Data is stored at different locations with separate namespaces
– A file catalogue provides a uniform view of Grid data: LCG File Catalog (LFC)
- Applications need to access Grid data management services
– Data management API: GFAL
6
Data management example
[Diagram: “User interface”, Resource Broker, Computing Element, two Storage Elements and the LCG File Catalogue (LFC); labels: DataSets info, Input “sandbox”, Output “sandbox”]
- File replicated onto 2 SEs
7
Data management example
[Diagram: “User interface”, LCG File Catalogue (LFC), Storage Element 1, Storage Element 2; labels: “Myfile.dat”, Myfile.dat, File_on_se1, File_on_se2, guid]
- File replicated onto 2 SEs
8
Data management example
[Diagram: “User interface”, LCG File Catalogue (LFC), Storage Element 1, Storage Element 2]
- Myfile.dat: the “logical filename”
- File_on_se1, File_on_se2: “SURL” (site URL)
- “GUID”: Globally Unique Identifier
9
Name conventions
- Logical File Name (LFN)
– An alias created by a user to refer to some item of data, e.g.
“lfn:/grid/cms/20030203/run2/track1”
- Globally Unique Identifier (GUID)
– A non-human-readable unique identifier for an item of data, e.g.
“guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6”
- Storage URL (SURL) or Physical File Name (PFN)
– The location of an actual piece of data on a storage system, e.g.
“srm://pcrd24.cern.ch/flatfiles/cms/output10_1” (SRM)
“sfn://lxshare0209.cern.ch/data/alice/ntuples.dat” (Classic SE)
- Transport URL (TURL)
– Temporary locator of a replica plus an access protocol, understood by an SE, e.g.
“rfio://lxshare0209.cern.ch//data/alice/ntuples.dat”
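The four kinds of name can be told apart by their prefix. The sketch below is only an illustration built from the example strings on this slide: the helper names `classify` and `storage_host` are hypothetical, and the prefix table lists only the schemes shown above (an SE may offer other TURL protocols).

```python
from typing import Optional
from urllib.parse import urlparse

# Prefix -> kind of name, taken from the examples on this slide only.
# Assumption: real deployments may use further schemes not listed here.
NAME_KINDS = {
    "lfn": "LFN",     # logical file name, chosen by the user
    "guid": "GUID",   # globally unique identifier
    "srm": "SURL",    # storage URL on an SRM-managed SE
    "sfn": "SURL",    # storage URL on a Classic SE
    "rfio": "TURL",   # transport URL: replica location plus access protocol
}


def classify(name: str) -> str:
    """Return 'LFN', 'GUID', 'SURL', 'TURL' or 'unknown' for a grid file name."""
    prefix, sep, _rest = name.partition(":")
    if not sep:
        return "unknown"
    return NAME_KINDS.get(prefix.lower(), "unknown")


def storage_host(name: str) -> Optional[str]:
    """Host part of a SURL or TURL; None for LFNs and GUIDs, which carry no location."""
    if classify(name) not in ("SURL", "TURL"):
        return None
    return urlparse(name).hostname


if __name__ == "__main__":
    for example in (
        "lfn:/grid/cms/20030203/run2/track1",
        "guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
        "srm://pcrd24.cern.ch/flatfiles/cms/output10_1",
        "rfio://lxshare0209.cern.ch//data/alice/ntuples.dat",
    ):
        print(classify(example), storage_host(example), example)
```

Only SURLs and TURLs contain a host name, which is why users normally work with LFNs and leave the location to the catalogue.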
10
Storage Element
- Provides
– Storage space for files
– SRM interface
– Transfer protocol (gsiFTP), a GSI-based FTP server
– POSIX-like file access
- Accessed via the Grid File Access Layer (GFAL)
– Provides an API
– Used to read parts of files too big to copy
- Example: Disk Pool Manager (DPM)
– Scalable management of independent disk pools for sites
– Easy to install, configure and manage
– Secure remote and local transfer protocols: GridFTP, secure RFIO
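To illustrate what "POSIX-like file access" buys you when a file is too big to copy, here is a local analogue: open, seek, read a small range, close. This sketch uses Python's standard `os` module on a temporary local file, not GFAL itself; the function names `read_range` and `demo` and the marker bytes are made up for the example.

```python
import os
import tempfile


def read_range(path: str, offset: int, length: int) -> bytes:
    """Read `length` bytes starting at `offset` without loading the whole file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)  # jump straight to the part we need
        return os.read(fd, length)
    finally:
        os.close(fd)


def demo() -> bytes:
    """Build a 1 MiB stand-in for a large file and read 12 bytes from its middle."""
    half = 512 * 1024
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * half + b"EVENT-HEADER" + b"\0" * half)
        path = tmp.name
    try:
        return read_range(path, half, len(b"EVENT-HEADER"))
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The same open/seek/read/close pattern is what a POSIX-like grid access layer lets an application apply to a file held on a Storage Element.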
11
LFC Service
- LFC = LCG File Catalogue
– LCG = LHC Computing Grid
– LHC = Large Hadron Collider
- Provides
– Mapping between LFN, GUID and SURL
– Transactions, sessions, bulk queries
– Hierarchical namespace, symbolic links
– System metadata
– Single-string user metadata
- All members of a given VO have read-write permissions in their directory
- Commands usually look like their UNIX counterparts with an “lfc-” prefix
12
LFC Continued
- Users primarily access and manage files through “logical filenames”
- The mapping is done by the “LFC” catalogue server
[Diagram labels: “Defined by the user”, “LFC Namespace”]
- The LFC has a directory tree structure: /grid/<VO_name>/<you create it>
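The mapping the catalogue maintains can be pictured as two lookups: logical filename to GUID, and GUID to the SURLs of its replicas. The class below is a toy in-memory stand-in, not the LFC API; `ToyCatalogue`, its methods, the VO name `myvo` and the `example.org` hosts are all hypothetical.

```python
import uuid


class ToyCatalogue:
    """In-memory stand-in for the LFN -> GUID -> SURL mapping (not the LFC API)."""

    def __init__(self):
        self._guid_by_lfn = {}    # logical filename -> GUID
        self._surls_by_guid = {}  # GUID -> list of replica SURLs

    def register(self, lfn, surl):
        """Register a new file: mint a GUID and bind the LFN and first replica to it."""
        if lfn in self._guid_by_lfn:
            # Files are write-once: an existing name cannot be registered again.
            raise ValueError(f"{lfn} already exists")
        guid = f"guid:{uuid.uuid4()}"
        self._guid_by_lfn[lfn] = guid
        self._surls_by_guid[guid] = [surl]
        return guid

    def add_replica(self, lfn, surl):
        """Record a further copy of an already registered file."""
        self._surls_by_guid[self._guid_by_lfn[lfn]].append(surl)

    def replicas(self, lfn):
        """All known physical locations for a logical filename."""
        return list(self._surls_by_guid[self._guid_by_lfn[lfn]])


if __name__ == "__main__":
    cat = ToyCatalogue()
    lfn = "lfn:/grid/myvo/Myfile.dat"  # hypothetical VO and file name
    cat.register(lfn, "srm://se1.example.org/data/File_on_se1")
    cat.add_replica(lfn, "srm://se2.example.org/data/File_on_se2")
    print(cat.replicas(lfn))
```

The user only ever handles the logical filename; which replica is read is decided by looking up the SURLs behind it.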
13
LFC Catalog commands
Summary of the LFC Catalog commands

Command          Description
lfc-chmod        Change access mode of the LFC file/directory
lfc-chown        Change owner and group of the LFC file/directory
lfc-delcomment   Delete the comment associated with the file/directory
lfc-getacl       Get file/directory access control lists
lfc-ln           Make a symbolic link to a file/directory
lfc-ls           List file/directory entries in a directory
lfc-mkdir        Create a directory
lfc-rename       Rename a file/directory
lfc-rm           Remove a file/directory
lfc-setacl       Set file/directory access control lists
lfc-setcomment   Add/replace a comment
14
File Transfer Service
- FTS is a low-level data movement service
- Why is it needed?
– Improves reliability of transfers
– Provides asynchronous file transfer: transfers are scheduled when resources are available
– Provides control of transfer properties (channel concept)
15
FTS Concepts
- Transfer Job
– A set of source/destination pairs specifying files to transfer
– Submitted to FTS for processing
- Channel
– A job is assigned to a channel after submission
– Represents a point-to-point network link
– Catch-all channels are possible: any-to-me, me-to-any
– Similar to a queue, for which you can specify the VO share, the number of concurrent file transfers and the number of concurrent streams (GridFTP)
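The job and channel concepts can be sketched as two small data types plus a matching rule. Everything named here is hypothetical (`Channel`, `TransferJob`, `assign_channel`, the `"*"` wildcard, the site names), and the rule that a point-to-point channel is preferred over a catch-all one is an assumption made for the illustration, not a statement about how FTS chooses.

```python
from dataclasses import dataclass, field

ANY = "*"  # hypothetical wildcard standing for the "any" side of a catch-all channel


@dataclass(frozen=True)
class Channel:
    source: str
    dest: str
    max_transfers: int = 5  # number of concurrent file transfers
    streams: int = 1        # number of concurrent GridFTP streams

    def matches(self, src_site: str, dst_site: str) -> bool:
        return self.source in (ANY, src_site) and self.dest in (ANY, dst_site)

    def specificity(self) -> int:
        # 2 for a point-to-point channel, 1 for any-to-me or me-to-any.
        return int(self.source != ANY) + int(self.dest != ANY)


@dataclass
class TransferJob:
    vo: str
    src_site: str
    dst_site: str
    files: list = field(default_factory=list)  # (source URL, destination URL) pairs


def assign_channel(job: TransferJob, channels: list) -> Channel:
    """Pick the most specific channel that matches the job's two sites."""
    candidates = [c for c in channels if c.matches(job.src_site, job.dst_site)]
    if not candidates:
        raise LookupError(f"no channel for {job.src_site} -> {job.dst_site}")
    return max(candidates, key=Channel.specificity)


if __name__ == "__main__":
    channels = [Channel("CERN", "RAL"), Channel(ANY, "RAL"), Channel("CERN", ANY)]
    job = TransferJob("cms", "FNAL", "RAL", [("srm://a.example.org/x", "srm://b.example.org/x")])
    print(assign_channel(job, channels))
```

Here a transfer from a site with no dedicated link still finds a home in the any-to-me channel of its destination.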
16
FTS architecture
- All components are decoupled from each other
– Each interacts only with the database
- Experiments interact via a web service
– User: FileTransfer
– Admin: ChannelManagement
- VO agents assign jobs to channels
- Channel agents manage the assigned file transfers
- Monitoring and statistics can be collected via the DB
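A minimal sketch of this decoupling, assuming a single table shared by all components: the submission function, the VO agent and the channel agent never call each other, they only read and update rows. The schema, the state names (`Submitted`, `Pending`, `Done`, `Failed`) and all function names are invented for the illustration and are not the FTS schema; an in-memory SQLite database stands in for the real one.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create the shared database (in memory, for the sake of the example)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE transfers ("
        " id INTEGER PRIMARY KEY, vo TEXT, source TEXT, dest TEXT,"
        " channel TEXT, state TEXT NOT NULL DEFAULT 'Submitted')"
    )
    return db


def submit(db, vo, source, dest) -> int:
    """Web-service side: only records the request, nothing is copied yet."""
    cur = db.execute(
        "INSERT INTO transfers (vo, source, dest) VALUES (?, ?, ?)", (vo, source, dest)
    )
    db.commit()
    return cur.lastrowid


def vo_agent(db, channel_for) -> None:
    """Assign every newly submitted transfer to a channel."""
    rows = db.execute(
        "SELECT id, source, dest FROM transfers WHERE state = 'Submitted'"
    ).fetchall()
    for tid, source, dest in rows:
        db.execute(
            "UPDATE transfers SET channel = ?, state = 'Pending' WHERE id = ?",
            (channel_for(source, dest), tid),
        )
    db.commit()


def channel_agent(db, channel, copy) -> None:
    """Run the pending transfers of one channel; `copy` returns True on success."""
    rows = db.execute(
        "SELECT id, source, dest FROM transfers WHERE channel = ? AND state = 'Pending'",
        (channel,),
    ).fetchall()
    for tid, source, dest in rows:
        state = "Done" if copy(source, dest) else "Failed"
        db.execute("UPDATE transfers SET state = ? WHERE id = ?", (state, tid))
    db.commit()


def stats(db) -> dict:
    """Monitoring: count transfers per state straight from the database."""
    return dict(db.execute("SELECT state, COUNT(*) FROM transfers GROUP BY state").fetchall())
```

Because each step only looks at the table, any component can be stopped, restarted or replaced without the others noticing, and monitoring is just another query.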
Thank you
17