

SLIDE 1

http://www.hpss-collaboration.org

HPSS Treefrog Introduction

HUF 2017

SLIDE 2

Disclaimer

Forward-looking information, including schedules and future software, reflects current planning that may change and should not be taken as a commitment by IBM or the other members of the HPSS Collaboration.


SLIDE 3

HPSS Treefrog Goals

§ Manage and share data across the life of your mission's projects, procurements, infrastructure, deployment, user access, and staffing cycles.
§ Store, protect, and error correct project data across a wide variety of local and remote, classic and cloud storage products and services.
§ Effectively exploit and scale tape and other high-latency storage by using data containers to group and store files and data objects.


SLIDE 4

A Single User Namespace

Managed across industry storage devices and solutions called storage endpoints:

§ Cloud
§ HSMs, including HPSS
§ Optical
§ Tape
§ File system
§ Disk
§ SSD

Managed across data repositories

§ Storage endpoints provide real storage for data repositories.
§ Repositories are wholly contained inside a storage endpoint (sketched below).
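A minimal sketch of that containment relationship; the names below are hypothetical, since the deck does not define an API:

```java
// Hypothetical data model illustrating the containment described above:
// a repository lives inside exactly one storage endpoint.
import java.util.List;

enum EndpointType { CLOUD, HSM, OPTICAL, TAPE, FILE_SYSTEM, DISK, SSD }

record Repository(String name) {}

record StorageEndpoint(String name, EndpointType type, List<Repository> repositories) {}
```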


SLIDE 5

Manage Data by Project

§ Projects provide the nexus between data management and data organization.
§ Administrators manage project policies, including:
  § Storage quotas
  § Storage access
  § Service limits
  § Access authorization

§ Users store data within projects and group it within data containers called managed data sets.

§ Data are shared among project members (allowed users).

§ Project members will have different roles:
  § Owner, reader, writer, modify, delete

§ Data will be owned by the project (see the sketch below):
  § Ensures data will always have an owner.
  § Allows for easy onboarding and offboarding of users.


SLIDE 6

Policy-Defined Storage Management

§ Policies determine how and where data are stored.
§ Make multiple copies of data:
  § At ingest, from the golden copy
  § After a delay, from a managed copy

§ Control data recall (a policy sketch follows this list):
  § Assign a primary recall copy
  § Assign failover copies
  § Block recall of copies from a storage endpoint, requiring administrator authorization
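As a sketch of what such a policy might look like; the field names are hypothetical, since the deck does not specify a schema:

```java
// Hypothetical storage-policy sketch combining the copy and recall
// controls listed above.
import java.time.Duration;
import java.util.List;

record CopyRule(String targetRepository, boolean atIngest, Duration delay) {}

record RecallRule(String primaryCopy, List<String> failoverCopies,
                  List<String> blockedEndpoints /* recall needs admin approval */) {}

record StoragePolicy(List<CopyRule> copies, RecallRule recall) {}
```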


SLIDE 7

Smart Data Storage

§ Manage data containers, not individual data objects and files.
§ Grouped data will be stored as an immutable collection of files or objects called a managed data set.
§ As a bonus, grouping data benefits high-latency storage:
  § Decreases the number of tape syncs.
  § Allows all data to be recalled with a single I/O.
§ Data will be grouped into data sets using a data retention format.
§ The Treefrog interface will make grouping data simple (a bundling sketch follows).
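A minimal sketch of the grouping idea, using a ZIP archive as a stand-in container; the deck does not specify Treefrog's actual on-storage format:

```java
// Hypothetical sketch: bundle many small files into one immutable
// container so high-latency storage sees a single large object.
import java.io.OutputStream;
import java.nio.file.*;
import java.util.List;
import java.util.zip.*;

public class DataSetBundler {
    // Writes all input files into one archive: one store, one recall I/O.
    static void bundle(List<Path> files, Path container) throws Exception {
        try (OutputStream out = Files.newOutputStream(container);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Path f : files) {
                zip.putNextEntry(new ZipEntry(f.getFileName().toString()));
                Files.copy(f, zip);
                zip.closeEntry();
            }
        }
    }
}
```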


SLIDE 8

Parallel Data Transfer

§ Managed data sets may be broken into smaller fragments:
  § Based on storage policy settings.
  § Fragments are contiguous sections of a Treefrog managed data set that are distributed across repositories.
§ The maximum degree of parallelism will be based on configuration (see the sketch below).
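A minimal sketch of the fragmentation step, under the assumption of fixed-size contiguous fragments; the real fragment sizing comes from storage policy:

```java
// Hypothetical sketch: split a data set into contiguous, equal-size
// fragments that can be transferred to repositories in parallel.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Fragmenter {
    static List<byte[]> fragment(byte[] dataSet, int fragmentSize) {
        List<byte[]> fragments = new ArrayList<>();
        for (int off = 0; off < dataSet.length; off += fragmentSize) {
            int end = Math.min(off + fragmentSize, dataSet.length);
            fragments.add(Arrays.copyOfRange(dataSet, off, end));
        }
        return fragments; // each fragment may land in a different repository
    }
}
```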


[Diagram: a managed data set (manifest, huge object fragments #1-#2, a large object, and small objects) is split into Dataset Fragments #1-#3 and transferred in parallel across Repositories 1-3.]

SLIDE 9

Data Redundancy via Erasure Coding

§ Parity fragments will be generated based on storage policy settings (a single-parity sketch follows):
  § The number of fragments that may be recovered will be based on the number of parity fragments created.
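A minimal sketch using single XOR parity, the simplest erasure code; the deck does not say which code Treefrog will use, and schemes tolerating more losses (e.g., Reed-Solomon) work on the same principle:

```java
// Hypothetical sketch: one XOR parity fragment lets one lost data
// fragment be rebuilt from the survivors.
import java.util.List;

public class XorParity {
    static byte[] parity(List<byte[]> fragments) {
        byte[] p = new byte[fragments.get(0).length]; // assumes equal-size fragments
        for (byte[] f : fragments)
            for (int i = 0; i < p.length; i++) p[i] ^= f[i];
        return p;
    }

    // Rebuild a missing fragment by XOR-ing the parity with the survivors.
    static byte[] recover(List<byte[]> survivors, byte[] parity) {
        byte[] lost = parity.clone();
        for (byte[] f : survivors)
            for (int i = 0; i < lost.length; i++) lost[i] ^= f[i];
        return lost;
    }
}
```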


[Diagram: the same data set as before, now with a Dataset Parity Fragment added, distributed across Repositories 1-4.]

SLIDE 10

More About Storage Policies

§ A copy of a data set may be:
  § Stored to a single repository
  § Fragmented to a single repository
  § Fragmented across multiple repositories
§ Changing storage policies only moves data when required.


[Diagram: the first copy of a data set is fragmented (Dataset Fragments #1-#3 plus a parity fragment and manifest) across four repositories, while a second copy is stored whole to a single repository.]

SLIDE 11

Simple Insertion of New Storage Endpoints

§ Copy agent based on Apache jclouds BlobStore:
  § The copy agent interface will be extensible.
  § AWS, Google Cloud Storage, Azure, and Rackspace are already supported.
  § An HPSS interface is planned.
§ Adding a storage endpoint will be as simple as adding a new jclouds interface (see the sketch below).
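A minimal sketch of why jclouds makes endpoints pluggable: the BlobStore API is provider-neutral, so only the provider ID and credentials change per endpoint. The container and blob names here are placeholders:

```java
// Hypothetical copy-agent sketch using the Apache jclouds BlobStore API.
import org.jclouds.ContextBuilder;
import org.jclouds.blobstore.BlobStore;
import org.jclouds.blobstore.BlobStoreContext;
import org.jclouds.blobstore.domain.Blob;

public class CopyAgentSketch {
    public static void main(String[] args) {
        // "aws-s3" could be swapped for "google-cloud-storage", "azureblob",
        // etc.; the code below stays the same.
        try (BlobStoreContext context = ContextBuilder.newBuilder("aws-s3")
                .credentials("ACCESS_KEY", "SECRET_KEY") // placeholder credentials
                .buildView(BlobStoreContext.class)) {
            BlobStore store = context.getBlobStore();
            store.createContainerInLocation(null, "treefrog-repo");
            Blob blob = store.blobBuilder("dataset-fragment-1")
                    .payload("fragment bytes".getBytes())
                    .build();
            store.putBlob("treefrog-repo", blob);
        }
    }
}
```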


SLIDE 12

Data and Metadata Verification

§ Each fragment will be stored with a checksum.
§ Treefrog can verify both the metadata and data of managed data sets:
  § Administrators use storage policies to control the verification settings and subsequent overhead.
§ Metadata verification will check that the location, checksum, and size of each fragment in the repository match the values Treefrog has stored:
  § Metadata verification will not access the data.
§ Data verification will verify the checksum of each fragment (a sketch follows):
  § Data verification may access the data.
  § Treefrog will use the built-in verification on storage systems that have it.
  § Otherwise, Treefrog will stage fragments to verify their checksums.
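A minimal sketch of the data-verification step; the choice of SHA-256 is an assumption, since the deck does not name a checksum algorithm:

```java
// Hypothetical sketch of data verification: recompute a fragment's
// checksum and compare it with the value recorded at store time.
import java.security.MessageDigest;
import java.util.HexFormat;

public class FragmentVerifier {
    static String checksum(byte[] fragment) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        return HexFormat.of().formatHex(md.digest(fragment));
    }

    static boolean verify(byte[] fragment, String recorded) throws Exception {
        return checksum(fragment).equals(recorded); // mismatch => corruption
    }
}
```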


SLIDE 13

All of That in an Extreme-Scale Architecture

§ Scale-out design allows incremental horizontal growth by adding new servers and devices.
§ Load balancing using HAProxy.
§ Agents may run at the client to take advantage of available processing power and reduce store-and-forwards.


SLIDE 14

But wait, there’s more!

In addition, HPSS Treefrog will:
§ Decrease software development delivery time.
§ Decrease software deployment time.
§ Enable user installation.
§ Increase timely access to trending technology.
§ Increase use of trending programming-language skills and open software.
§ Avoid impact to ongoing HPSS core services development.


SLIDE 15

Treefrog will be an HPSS Interface


[Architecture diagram: the HPSS core (Spectrum Scale interface, SwiftOnHPSS, FUSE filesystem, Parallel FTP, and HPSS Client API; RHEL core server and mover computers on Intel and Power; a massively scalable global HPSS namespace enabled by DB2; extreme-scale, high-performance automated HSM across block or filesystem disk tiers and hardware-vendor-neutral enterprise and LTO tape from IBM, Oracle, and Spectra Logic; Spectrum Scale Client API for third-party applications) with the HPSS Treefrog interface and services layered on top, connecting to cloud, object, and file storage and services, including LTFS.]

SLIDE 16

Treefrog will use Existing Technologies

Existing Products

§ Only configuration changes are required

Extendable Functionality

§ Open Source code or library

Treefrog Specific Code

§ Code specific to the Treefrog application
§ Requires from-scratch development


SLIDE 17

Treefrog will use Existing Technologies


SLIDE 18

Questions?
