SLIDE 1

Open Science Grid Consortium

Storage on Open Science Grid

Placing, Using and Retrieving Data on OSG Resources

Abhishek Singh Rana

OSG Users Meeting, July 26-27, 2007, Fermi National Accelerator Laboratory

SLIDE 2
Outline

  • Storage local/mounted on Compute Elements

– $OSG_APP, $OSG_WN_TMP, $OSG_DATA

  • Condor-G based file transfer
  • SRM based Storage Elements

– SRM-dCache on OSG
– DCap in dCache

  • Typical clients
  • Snapshots
  • Squid based caching mechanisms

SLIDE 3
$OSG_APP

  • Write access from GridFTP and fork via the GRAM host.
  • Read-only access from all WN’s via a mounted filesystem.
  • Intended for a VO to install application software, to be later accessed by users.

  • Size > ~50 GB per VO.
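
A hedged sketch of the intended pattern (the VO name, directory layout, and application are hypothetical): software is installed into $OSG_APP through the GRAM host and later read from the worker nodes.

   # Install step, run at the CE via a GridFTP upload or a fork/GRAM job:
   mkdir -p $OSG_APP/myvo/app-1.0
   tar -xzf app-1.0.tar.gz -C $OSG_APP/myvo/app-1.0

   # Later, inside a batch job on a worker node (read-only mounted filesystem):
   export PATH=$OSG_APP/myvo/app-1.0/bin:$PATH
   myapp --help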

SLIDE 4
$OSG_WN_TMP

  • Specific to each batch slot.
  • Local filesystem during job’s execution.
  • Read/write access from a job.
  • Generally cleaned up at the end of the batch slot lease.

  • Size ~ 15 GB per batch slot.

SLIDE 5
$OSG_DATA

  • Read/write access from GridFTP and fork at the GRAM host.

  • Read/write access from batch slot.
  • Intended as stage-in/stage-out area for a job.
  • Persistent across job boundaries.
  • No quotas or guaranteed space.
  • Its usage is discouraged because it has led to complications in the past.

SLIDE 6
Condor-G based file transfer

  • Condor-G JDL using ‘transfer_input_files’ and ‘transfer_output_files’ (see the sketch after this list).
  • Transfers get spooled via the CE headnode, severely overloading it if filesize is large.
  • Its usage is discouraged for files larger than a few MB’s.
  • GB-size files should be stored in the dedicated stage-out spaces, and pulled from outside rather than spooled via the CE headnode by Condor file transfer.
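
A minimal Condor-G submit-description sketch of these attributes (the gatekeeper contact, executable, and file names are hypothetical, and only small files should be moved this way):

   universe              = grid
   grid_resource         = gt2 ce.example.org/jobmanager-condor
   executable            = analyze.sh
   transfer_input_files  = config.txt, small_input.dat
   transfer_output_files = summary.txt
   when_to_transfer_output = ON_EXIT
   output = job.out
   error  = job.err
   log    = job.log
   queue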

SLIDE 7
SRM

  • Storage Resource Management.
  • SRM is a specification - a ‘grid protocol’ formulated by agreements between institutions with very large storage needs, such as LBNL, FNAL, CERN, etc.

  • v1.x in usage, v2.x in implementation/usage.
  • Many interoperable implementations!
  • A user needs to get familiar with only the client-side software suite of SRM.

– E.g., ‘srmcp’ - it is easy to use!
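
A hedged example of srmcp usage (the SRM endpoint, port, and PNFS path are hypothetical):

   # Copy a local file into SRM-managed storage:
   srmcp file:///$PWD/results.tar srm://srm.example.org:8443/pnfs/example.org/data/myvo/results.tar

   # Retrieve it again later:
   srmcp srm://srm.example.org:8443/pnfs/example.org/data/myvo/results.tar file:///$PWD/results.tar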

SLIDE 8
SRM-dCache

[Diagram: SRM server, two GridFTP servers, a DCap server, and four pools]

Example of site architecture with pools behind NAT

SLIDE 9
SRM-dCache

[Diagram: SRM server, GridFTP servers, DCap server and pools, with PNFS mapping the logical namespace to the physical namespace]

Example of site architecture with pools behind NAT

SLIDE 10
SRM-dCache

[Diagram: the same components with PNFS namespace mapping and GSI role-based security]

Example of site architecture with pools behind NAT

SLIDE 11
SRM-dCache on OSG

  • Packaged ready-to-deploy as part of VDT.
  • Intended for large scale grid storage.
  • Scheduling, load balancing, fault tolerance.
  • GSI and role-based secure access.
  • Implicit space reservation available.
  • Transfer types:

– Localhost <---> SRM
– SRM <---> SRM (see the sketch after this list)

  • Widely deployed at production scale on OSG.

– More than 12 sites with ~100 TB each.
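
For the SRM <---> SRM transfer type, a hedged sketch of a copy between two hypothetical sites:

   srmcp srm://srm.site-a.example.org:8443/pnfs/site-a.example.org/data/myvo/file.dat \
         srm://srm.site-b.example.org:8443/pnfs/site-b.example.org/data/myvo/file.dat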

SLIDE 12
SRM-dCache on OSG

  • Usage strategy (short term)

– Your own SRM.

  • Have your own VO SRM server at your home site. (However, it requires deploying and operating SRM-dCache, which may be non-trivial).
  • Stage-in/stage-out using SRM client tools.
  • Use ‘srmcp’ installed on WNs of all OSG sites to stage-out files from a WN to the home site.

– Opportunistic access on other sites with SRM.

  • Negotiate access for your VO (and users) with sites where SRM servers are already deployed.
  • Use ‘srmcp’ installed on WNs of all OSG sites to stage-out files from a WN to remote sites with SRM.

SLIDE 13
SRM-dCache on OSG

  • Usage strategy (long term)

– ‘Leased’ storage of many TB’s of space for several weeks to months at a time.
– Explicit space reservation.
– Expected to be in OSG 1.0.0.

SLIDE 14
DCap in dCache

  • Has both server/client components.
  • A user uses ‘dccp’ to read data already in dCache at the local site.
  • Libraries and client API available (libdcap.so) and can be integrated/used within applications. Provides a set of POSIX-like functions.
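
A hedged example of dccp (the DCap door, port, and PNFS path are hypothetical):

   # Copy a file already in the local site's dCache to worker-node scratch via DCap:
   dccp dcap://dcap.example.org:22125/pnfs/example.org/data/myvo/events.root $OSG_WN_TMP/events.root

Applications can instead link against libdcap.so and open such dcap:// URLs through the POSIX-like calls mentioned above.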

SLIDE 15
Typical clients

  • Command-line read/write client tools.

– Usual arguments are src_URL and dest_URL
– globus-url-copy (gsiftp://GridFTP_server:port)
– srmcp (srm://SRM_server:port)
– srm-ls, srm-rm, srm-mv, srm-mkdir, …
– dccp (dcap://DCap_server:port)

  • Interactive tool.

– uberftp
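
Hedged command-line sketches for the remaining tools (hostnames, ports, and paths are hypothetical):

   # GridFTP copy of a local file to a remote GridFTP server:
   globus-url-copy file:///home/user/data.tar gsiftp://gridftp.example.org:2811/data/myvo/data.tar

   # Namespace operations on an SRM server:
   srm-mkdir srm://srm.example.org:8443/pnfs/example.org/data/myvo/run42
   srm-ls srm://srm.example.org:8443/pnfs/example.org/data/myvo/run42

   # Interactive GridFTP session:
   uberftp gridftp.example.org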

SLIDE 16
Snapshot: Legacy storage

  • Data is written to OSG_DATA using GridFTP or, if necessary, by fork jobs (unpack tarballs, etc.).
  • Job is staged into the cluster.
  • Job copies its data to the worker node (OSG_WN_TMP) or reads data sequentially from OSG_DATA (if the data is read once). The latter can be a significant performance issue on typical network file systems.
  • Job output is placed in OSG_WN_TMP.
  • At the end of the job, results from OSG_WN_TMP are packaged, staged to OSG_DATA and picked up using GridFTP.
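
A minimal job-script sketch of this legacy flow (the VO name, application, and file names are hypothetical):

   # Runs inside the batch job on a worker node:
   cd $OSG_WN_TMP
   cp $OSG_DATA/myvo/input/events.dat .            # stage-in from the shared area
   $OSG_APP/myvo/app-1.0/bin/myapp events.dat      # run the VO application
   tar -czf results.tar.gz output/                 # package the results
   cp results.tar.gz $OSG_DATA/myvo/outbox/        # stage to OSG_DATA; picked up later via GridFTP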

SLIDE 17
Snapshot: Legacy storage

[Diagram: worker nodes, each with OSG_WN_TMP, alongside the shared OSG_APP and OSG_DATA areas]

SLIDE 18
Snapshot: Legacy storage

[Diagram: a single WN with OSG_WN_TMP, plus the shared OSG_APP and OSG_DATA areas]

SLIDE 19
Snapshot: Legacy storage

[Diagram: the same layout annotated with data-flow steps 1-4]

SLIDE 20
Snapshot: SRM storage

  • Read access is usually by DCap (or SRM); write access is by SRM.

  • Data is written to SRM-dCache.
  • Job is staged into the cluster.
  • Job execution (open/seek/read using DCap).
  • Job output is placed in OSG_WN_TMP.
  • At the end of the job, results from OSG_WN_TMP are packaged and staged-out using SRM.
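
A corresponding hedged sketch for the SRM flow (the DCap and SRM endpoints and paths are hypothetical):

   # Runs inside the batch job on a worker node:
   cd $OSG_WN_TMP
   dccp dcap://dcap.example.org:22125/pnfs/example.org/data/myvo/events.root .   # read input via DCap
   $OSG_APP/myvo/app-1.0/bin/myapp events.root                                   # run the VO application
   tar -czf results.tar.gz output/
   srmcp file:///$OSG_WN_TMP/results.tar.gz \
         srm://srm.example.org:8443/pnfs/example.org/data/myvo/results.tar.gz    # stage-out via SRM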

SLIDE 21
Snapshot: SRM storage

[Diagram: WN with OSG_WN_TMP and OSG_APP, reading via DCap and writing via SRM]

SLIDE 22
Snapshot: SRM storage

[Diagram: the same layout annotated with data-flow steps 1-4]

SLIDE 23
Squid based caching

  • Intended to provide read-only HTTP caching mechanisms.

  • Used by CMS and CDF.
  • (Details in Dave Dykstra’s talk).

SLIDE 24
Summary

  • OSG has vigorously supported distributed storage since early days.
  • Legacy mechanisms with local or mounted access to data have been in common use on OSG. As expected, a few limitations exist.
  • SRM based storage is widely available now.

– SRM deployments with total space ~O(1000 TB). A fraction of this space on OSG is available for opportunistic usage by all interested VO’s.
– SRM clients available on WN’s of all OSG sites.
– Easy to use!

SLIDE 25
Contacts

  • Please email us for more information:

– OSG Storage Team

  • osg-storage@opensciencegrid.org

– OSG User Group Team

  • osg-user-group@opensciencegrid.org