SAM Data Management Services, Adam Lyon, SAMGrid Project Leader



SLIDE 1

Adam Lyon / Fermilab CD DØ / Neutrino Computing Workshop

SAM Data Management Services

Adam Lyon

SAMGrid Project Leader – CD/REX/PS Leader – DØ

SAM is a multi-level system for data management in use by DØ, CDF and MINOS

SLIDE 2

SAM as a Data Catalog (all)

Store metadata about files
  File type, run information, stream names, MC info, …
Create datasets (lists of files) based on metadata queries
  Datasets are “live”; the query language is simpler than SQL
Replica catalog
  Maintain a list of file locations (including pnfs and SRM locations)
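A minimal sketch of what a “live” dataset could look like, assuming the dataset stores its query rather than a fixed file list and is re-evaluated against the catalog on every use. The `Catalog` and `Dataset` classes, the metadata field names, and the file names are hypothetical illustrations, not SAM's actual interface or query language.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Catalog:
    # Hypothetical in-memory stand-in for a metadata catalog: file name -> metadata.
    files: Dict[str, dict] = field(default_factory=dict)

    def declare(self, name: str, **metadata) -> None:
        """Register a file together with its metadata."""
        self.files[name] = metadata


@dataclass
class Dataset:
    # A "live" dataset keeps the query, not the list of matching files.
    query: Callable[[dict], bool]

    def resolve(self, catalog: Catalog) -> List[str]:
        """Evaluate the query now, so newly declared files are picked up."""
        return sorted(n for n, md in catalog.files.items() if self.query(md))


catalog = Catalog()
catalog.declare("raw_001.dat", file_type="raw", run=1001, stream="muon")
catalog.declare("raw_002.dat", file_type="raw", run=1002, stream="electron")

muon_raw = Dataset(lambda md: md["file_type"] == "raw" and md["stream"] == "muon")
before = muon_raw.resolve(catalog)

# A file declared after the dataset was defined still shows up in it.
catalog.declare("raw_003.dat", file_type="raw", run=1003, stream="muon")
after = muon_raw.resolve(catalog)
print(before, after)
```

Storing the query instead of the result is what keeps the dataset current without anyone having to rebuild it.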

SLIDE 3

SAM as a data “deliverer” (all, especially CDF, MINOS)

Deliver file URLs upon request (interoperates with dCache, BlueArc and SRM)
Throttle deliveries to protect the underlying cache and storage systems
Move files from storage to cache and from cache to cache worldwide
Track file usage by projects and jobs
Easy creation of recovery jobs
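The throttling and recovery ideas above can be sketched roughly as follows, assuming a project hands out one URL at a time, caps the number of deliveries in flight, and records which files each job finished. The `DeliveryProject` class, its method names, and the URLs are made up for illustration and do not reflect SAM's real API.

```python
from collections import deque


class DeliveryProject:
    """Hypothetical project that hands out file URLs with a cap on in-flight deliveries."""

    def __init__(self, urls, max_in_flight):
        self.all_urls = list(urls)
        self.pending = deque(self.all_urls)
        self.max_in_flight = max_in_flight
        self.in_flight = set()
        self.consumed = set()

    def next_file(self):
        """Return the next URL, or None when throttled or nothing is pending."""
        if len(self.in_flight) >= self.max_in_flight or not self.pending:
            return None
        url = self.pending.popleft()
        self.in_flight.add(url)
        return url

    def release(self, url, ok):
        """Record the outcome of a delivery; only successful files count as consumed."""
        self.in_flight.discard(url)
        if ok:
            self.consumed.add(url)

    def recovery_list(self):
        """Files a recovery job would need: everything not successfully consumed."""
        return [u for u in self.all_urls if u not in self.consumed]


urls = ["dcap://example/a", "dcap://example/b", "dcap://example/c"]
project = DeliveryProject(urls, max_in_flight=2)

first = project.next_file()
second = project.next_file()
throttled = project.next_file()      # cap reached, so nothing is handed out
project.release(first, ok=True)
project.release(second, ok=False)    # the job failed on this file
third = project.next_file()
project.release(third, ok=True)
recovery = project.recovery_list()
print(throttled, recovery)
```

Because usage is tracked per file, the recovery list falls out as a simple difference between what was requested and what was consumed.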

SLIDE 4

SAM as a cache management system (DØ)

Operate a system of multi-tiered caches (large cache nodes, small cache nodes)
Distributed cache for Grid jobs worldwide
Complex routing possible
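As a rough illustration of a multi-tiered cache, the sketch below looks for a file in a small cache node first, then in a large cache node, and finally falls back to storage, copying the file into every tier that missed. The `CacheTier` class, the `fetch` function, and the least-recently-used eviction policy are assumptions made for this example; the slides do not say which replacement or routing policy SAM uses.

```python
from collections import OrderedDict


class CacheTier:
    """Hypothetical cache node with a fixed capacity and LRU eviction (an assumption)."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.files = OrderedDict()

    def get(self, filename):
        """Return True on a hit and mark the file as most recently used."""
        if filename in self.files:
            self.files.move_to_end(filename)
            return True
        return False

    def put(self, filename):
        """Store a file, evicting the least recently used ones if over capacity."""
        self.files[filename] = True
        self.files.move_to_end(filename)
        while len(self.files) > self.capacity:
            self.files.popitem(last=False)


def fetch(filename, tiers):
    """Return where the file was found and copy it into every tier that missed."""
    missed = []
    source = "storage"
    for tier in tiers:
        if tier.get(filename):
            source = tier.name
            break
        missed.append(tier)
    for tier in missed:
        tier.put(filename)
    return source


small = CacheTier("small-node", capacity=2)
large = CacheTier("large-node", capacity=4)
route = [fetch(f, [small, large]) for f in ["a", "b", "c", "a", "c"]]
print(route)
```

In this toy run the second request for `a` is served by the large node because the small node has already evicted it, which is the kind of behavior a tiered layout is meant to provide.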

SLIDE 5

SAM has a “request system” (DØ for MC)

Store metadata for an MC production request
Track produced files (real and virtual) and apply metadata
Easy queries for MC files

SLIDE 6

SAM is maintainable and scalable

Tiny operational load at MINOS
~0.75 FTE for the entire DØ data management system
~1.5 FTE for the CDF data management system (this also includes dCache monitoring and operations)
Low ongoing development (still tuning caching and keeping the system up to date with the latest infrastructure)
SAM works with SL 4 and 5
SAM is easily scalable (including database access via DB servers)

SLIDE 7

[Figure: “Data delivered to projects”, January 2009, daily ticks from 01/08/09 to 01/28/09; series fnal-cabsrv1 and fnal-cabsrv2; vertical axis from 0 to 200, unit label not recoverable]

SLIDE 8

SAM has client and developer features

Command-line, Python and C++ interfaces
Interface SAM with your framework

SLIDE 9

SAM - you can take what you need

Data catalog
Data delivery (real or virtual)
Data caching

SLIDE 10

Items to consider

How much of SAM do you need to satisfy your requirements?
Can your needs be consolidated? For example:
  Can multiple experiments share a database?
  Share a SAM installation?
  Share a caching system?
How much development is required (by you and us)?
