

SLIDE 1

Data Systems Modernization (DSM) Project: Development, Deployment, and Direction

Robert Whitten Jr.

SLIDE 2

OLCF/NCCS Computing Complex

  • Jaguar: Dept. of Energy’s most powerful computer (#2)

– Peak performance 2.33 PF/s
– Memory 300 TB
– Disk bandwidth > 240 GB/s
– Square feet 5,000
– Power 7 MW

  • Kraken: National Science Foundation’s most powerful computer (#8)

– Peak performance 1.03 PF/s
– Memory 132 TB
– Disk bandwidth > 50 GB/s
– Square feet 2,300
– Power 3 MW

  • NOAA Gaea: National Oceanic and Atmospheric Administration’s most powerful computer (#32)

– Peak performance 1.1 PF/s
– Memory 248 TB
– Disk bandwidth 104 GB/s
– Square feet 1,600
– Power 2.2 MW

SLIDE 3

What is DSM?

  • Data Systems Modernization (DSM)
  • Software project to consolidate data sinks
  • Business intelligence tool
  • Data warehouse
  • Extract-transform-load (ETL) tool
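
As a rough illustration of the extract-transform-load idea listed above, the sketch below copies user records from two separate source databases into one warehouse table. It is a minimal sketch only: the table names (`users`, `dw_users`), the columns, the source names, and the use of in-memory SQLite are all assumptions made for illustration, not the DSM schema or tooling.

```python
import sqlite3


def make_source(rows):
    """Build a throwaway in-memory source database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (username TEXT, project TEXT)")
    db.executemany("INSERT INTO users VALUES (?, ?)", rows)
    return db


def etl(sources, warehouse):
    """Extract rows from each named source, normalize them, and load each once."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS dw_users ("
        "username TEXT, project TEXT, origin TEXT, "
        "UNIQUE (username, project))"
    )
    for origin, db in sources.items():
        # Extract: read every record from the source.
        rows = db.execute("SELECT username, project FROM users").fetchall()
        for username, project in rows:
            # Transform: normalize whitespace and case so duplicates match.
            record = (username.strip().lower(), project.strip().upper(), origin)
            # Load: the UNIQUE constraint makes redundant records a no-op.
            warehouse.execute(
                "INSERT OR IGNORE INTO dw_users VALUES (?, ?, ?)", record
            )
    warehouse.commit()


# Two hypothetical sources that partly overlap.
allocations = make_source([("Alice ", "abc123"), ("bob", "abc123")])
accounts = make_source([("alice", "ABC123"), ("carol", "xyz789")])

warehouse = sqlite3.connect(":memory:")
etl({"allocations": allocations, "accounts": accounts}, warehouse)
loaded = warehouse.execute(
    "SELECT username, project, origin FROM dw_users ORDER BY username"
).fetchall()
```

The record for `alice` appears in both sources but is stored once, which is the kind of redundancy a consolidated warehouse is meant to remove.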

SLIDE 4

What is DSM? (cont.)

  • Resource Allocation and Tracking System (RATS)

– Projects, users, and allocations

  • NACS (New Account Creation System)

– System accounts (usernames, file system areas, etc.)

  • DowntimeDB

– System status

  • HPSS stats

– Archival usage

SLIDE 5

What is DSM? (cont.)

Components

  • All middleware components used a combination of:

– MySQL database
– LDAP
– Accessor/mutator scripts (Perl, Python, etc.)

  • DSM adds:

– ProcessMaker
– LDAP Sync Script
– Isolation Layer
– System Sync Scripts (SSS)
– Interface Scripts
– LogiXML

SLIDE 6

RATS

[Figure: RATS architecture diagram. Recoverable labels: Metascheduler, Cycle Servers, Jobs Monitor, Jobs ID Log Manager, Scheduled Jobs Manager, Admissibility Tester, schedulers Sch 0 ... Sch N, and datasets for Resources, Projects, RATS Users, Platform Users, Job Status, Job Statistics, Resource Status, Host Configuration, Static Attributes, and Scheduled Jobs.]

SLIDE 7

NACS

[Figure: NACS diagram. Labels: Data Source, NACS Database, NACS Scripts, LDAP, Lustre, NFS.]

SLIDE 8

DowntimeDB

  • Manual entry of downtime information

[Figure: DowntimeDB diagram. Labels: Data Source, Downtime Database, Reports.]

SLIDE 9

HPSS Stats

  • Data read directly from HPSS metadata

[Figure: HPSS Stats diagram. Labels: HPSS, Reports.]

SLIDE 10

Why DSM?

  • Multiple middleware applications are used

– To manage allocations (RATS)

  • Projects, Users / PIs, CPU Hours

– To manage user system accounts (NACS)
– To track downtime information (DowntimeDB)
– To track storage usage (HPSS)

  • Redundant data
  • Inconsistent interfaces
  • Difficult report generation

SLIDE 11

DSM

  • Combine best features, remove inconsistencies

[Figure: DSM architecture diagram. Labels: DSM Database, SSS Views, Report Views, DSM_NACS, DSM_RATS, Interface Scripts, ProcessMaker, LogiXML.]
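
One way to picture report views over a single DSM database is as SQL views defined on shared tables. The following is a sketch under stated assumptions: the `projects` and `job_usage` tables, the `report_usage` view, the sample data, and the use of SQLite are hypothetical stand-ins, not the actual DSM schema or its MySQL deployment.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE projects (project TEXT PRIMARY KEY, allocated_hours REAL);
    CREATE TABLE job_usage (project TEXT, username TEXT, cpu_hours REAL);

    -- Hypothetical report view: hours used and remaining per project.
    CREATE VIEW report_usage AS
    SELECT p.project,
           p.allocated_hours,
           COALESCE(SUM(u.cpu_hours), 0) AS used_hours,
           p.allocated_hours - COALESCE(SUM(u.cpu_hours), 0) AS remaining_hours
    FROM projects AS p
    LEFT JOIN job_usage AS u ON u.project = p.project
    GROUP BY p.project, p.allocated_hours;
    """
)

# Illustrative data only.
db.executemany(
    "INSERT INTO projects VALUES (?, ?)",
    [("ABC123", 1000.0), ("XYZ789", 500.0)],
)
db.executemany(
    "INSERT INTO job_usage VALUES (?, ?, ?)",
    [("ABC123", "alice", 250.0), ("ABC123", "bob", 100.0)],
)

# Reporting tools query the view and never touch the base tables directly.
report = db.execute("SELECT * FROM report_usage ORDER BY project").fetchall()
```

Keeping consumers on views rather than base tables means the underlying tables can change without breaking every report or sync script that reads them.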

SLIDE 12

ProcessMaker

  • ProcessMaker is an open-source workflow software solution

– Business process management tool

  • Initially using it for account/project creation

SLIDE 13

Interface Scripts

  • Developed at ORNL to allow staff to modify user, group, project, and other attributes

– Add/remove user
– Add/remove user from project
– Create project

  • Written in Python
  • Plan to migrate to ProcessMaker
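
The operations listed above (add/remove user, add/remove user from project, create project) can be sketched as a small Python class. This is an illustrative sketch only: the `Directory` class, its method names, and the in-memory storage are hypothetical, and the actual ORNL interface scripts are not shown in the slides.

```python
class Directory:
    """In-memory stand-in for the user/project store (hypothetical)."""

    def __init__(self):
        self.users = set()
        self.projects = {}  # project name -> set of member usernames

    def add_user(self, username):
        self.users.add(username)

    def remove_user(self, username):
        # Removing a user also drops them from every project.
        self.users.discard(username)
        for members in self.projects.values():
            members.discard(username)

    def create_project(self, project):
        if project in self.projects:
            raise ValueError(f"project already exists: {project}")
        self.projects[project] = set()

    def add_user_to_project(self, username, project):
        # Only known users may join a project.
        if username not in self.users:
            raise KeyError(f"unknown user: {username}")
        self.projects[project].add(username)

    def remove_user_from_project(self, username, project):
        self.projects[project].discard(username)


# Usage with made-up names.
d = Directory()
d.add_user("alice")
d.add_user("bob")
d.create_project("ABC123")
d.add_user_to_project("alice", "ABC123")
d.add_user_to_project("bob", "ABC123")
d.remove_user("bob")
```

Routing every change through one set of functions like this is what keeps user and project data consistent across systems, whether the functions live in scripts or in a workflow tool.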

SLIDE 14

LogiXML

  • Business Intelligence Tool
  • Management reports made easy?

SLIDE 15

When?

  • Phase 1

– Deploy on NOAA systems
– No LogiXML
– No ProcessMaker
– Remote LDAP synchronization
– Completed FY11 Q1

  • Phase 2

– Deploy on DOE systems
– LogiXML
– ProcessMaker
– Target FY11 Q4

SLIDE 16

Future Plans

  • Phase 3

– Expand role of ProcessMaker

  • Added functionality beyond account creation

– RATS has an open-source descendant

  • DataMux (available on SourceForge)
  • Replace the current isolation layer with DataMux components

– Consolidate NOAA and DOE instances of DSM

SLIDE 17

Questions?