HPC File Systems and Storage
Irena Johnson, University of Notre Dame



SLIDE 1

HPC File Systems and Storage

Irena Johnson University of Notre Dame Center for Research Computing

SLIDE 2

HPC (High Performance Computing)

  • Aggregating computing power for higher performance than that of a typical desktop computer/workstation, for solving large problems in science, engineering, and business

  • Large systems perform calculations
  • Data access is critical

HPC:

  • Compute node
  • Head node
  • File System
  • Storage
  • Networking
SLIDE 3

File

Collection of data/information:

  • Document
  • Picture
  • Audio or video stream
  • Application
  • Other collection of data
SLIDE 4

Metadata

The information that describes the data contained in files:

  • Size
  • Date created
  • Date modified
  • Location on disk
  • Permissions (who can view/modify your file)
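This metadata can be read programmatically on any UNIX/Linux system; a minimal Python sketch using only the standard library (the file name `example.txt` is just an illustration):

```python
import os
import stat
import time

# Create a small example file, then read its metadata back with os.stat().
with open("example.txt", "w") as f:
    f.write("hello HPC\n")

info = os.stat("example.txt")

size = info.st_size                        # size in bytes
modified = time.ctime(info.st_mtime)       # date modified
permissions = stat.filemode(info.st_mode)  # e.g. '-rw-r--r--' (who can view/modify)

print(size, modified, permissions)
```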

SLIDE 5

File System

Definition from TLDP (The Linux Documentation Project): "On a UNIX system, everything is a file; if something is not a file, it is a process*."

Many Types of File Systems

  • Not all file systems are equal
  • Designed for different uses
  • Data is organized in different ways
  • Some are faster than others
  • Some are more robust/reliable
  • Some support large storage drives

*Process - a task (a process is started when a program is initiated)

SLIDE 6

UNIX/Linux File System

  • Hierarchical file structure

  • Tree-structured file system (upside down tree)

  • Everything starts from the root directory / and expands into sub-directories, and so forth

  • Unlike Windows, which uses 'drives'
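A UNIX path simply records the walk down this tree from the root; Python's standard `pathlib` makes the structure visible (the path below is a made-up example):

```python
from pathlib import PurePosixPath

# A UNIX path encodes the walk down the tree starting at the root '/'.
p = PurePosixPath("/home/jen/thesis/results.csv")

print(p.root)    # '/'
print(p.parts)   # ('/', 'home', 'jen', 'thesis', 'results.csv')
print(p.parent)  # /home/jen/thesis

# Unlike Windows paths, there is no drive letter:
print(p.drive)   # '' (empty string)
```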

SLIDE 7

UNIX/Linux File System

/
├── boot/
├── bin/
├── dev/
├── etc/
├── home/
│   ├── bea/
│   ├── ed/
│   └── jen/
├── lib/
├── media/
├── mnt/
├── opt/
└── proc/

SLIDE 8

UNIX/Linux File System

Serial File System (Traditional)

  • A single server controls the users and data
  • Can be faster for one user
  • No redundancy
  • Simple

SLIDE 9

UNIX/Linux File System

Distributed / Parallel File System

  • Data is spread out across many systems on a network
  • Single shared global namespace
  • Supports multiple users (can be distributed)
  • Supports high bandwidth
  • More storage than on a single system
  • Fault tolerant
  • Reliable
  • Scalable
  • Complex
SLIDE 10

Parallel File System

[Diagram: clients issue parallel reads/writes to storage devices; metadata servers handle management]

SLIDE 11

Parallel File System

  • Breaks up a data set and distributes (stripes) the blocks to multiple storage drives (local and/or remote servers).
  • Users do not need to know the physical location of the data blocks to retrieve a file.
  • Data access is done via a global namespace.
  • A metadata server stores the file name, location, owner, and access permissions.
  • Reads and writes data to distributed storage devices using multiple I/O paths concurrently.
  • Capacity and bandwidth can be scaled.
  • Storage features: high availability, mirroring, replication, snapshots.
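The striping idea can be sketched in a few lines of Python. This is a toy illustration, not a real parallel file system client; the block size, node count, and function names are all invented for the example:

```python
# Toy sketch of striping: a file is split into fixed-size blocks and dealt
# out round-robin across storage nodes; a metadata record keeps the layout
# so the file can be reassembled without the user knowing where each block
# physically lives.

BLOCK_SIZE = 4  # bytes; tiny, for illustration only
NODES = 3       # number of simulated storage nodes

def stripe(data: bytes):
    nodes = [[] for _ in range(NODES)]  # simulated storage nodes
    layout = []                         # metadata: (node, index) per block
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    for n, block in enumerate(blocks):
        node = n % NODES                # round-robin placement
        layout.append((node, len(nodes[node])))
        nodes[node].append(block)
    return nodes, layout

def read_back(nodes, layout) -> bytes:
    # The "global namespace": the reader follows the metadata, not the disks.
    return b"".join(nodes[node][idx] for node, idx in layout)

nodes, layout = stripe(b"parallel file systems stripe data")
assert read_back(nodes, layout) == b"parallel file systems stripe data"
```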
SLIDE 12

File Systems at CRC

AFS (Andrew File System)

  • Developed in 1982, as part of the Andrew project at Carnegie Mellon University.
  • Named after Andrew Carnegie and Andrew Mellon
  • Client-server architecture
  • Federated file sharing
  • Provides location independence
  • Scalable
  • Secure (Kerberos for authentication, and ACLs - access control lists - on directories for users and groups)
  • Available for a wide range of heterogeneous systems - UNIX/Linux, Mac OS X, and Microsoft Windows

SLIDE 13

File Systems at CRC

Panasas - High Performance Parallel scratch File System

/scratch365

  • Parallel access to data
  • Data is striped across multiple storage nodes, providing increased capacity and/or performance
  • Concurrent reading and writing (scalable performance to individual files)
  • Global Namespace - all compute nodes accessing the storage see the same namespace (same name and pathname); management is done through one system only

SLIDE 14

Overview CRC File Systems

User's Home Directories (globally accessible home and project directories)
  • File system: AFS - crc.nd.edu, /afs/crc.nd.edu/user/first/netid ($HOME)
  • Access: directly, using the OpenAFS client (open source)
  • Space available: 100GB - 2TB volume
  • Aggregated bandwidth (approx.): up to 70-85 MB/sec per node; approximately 200 MB/sec aggregated using multiple nodes

Group Directories
  • File system: AFS - crc.nd.edu, /afs/crc.nd.edu/group/
  • Access: directly, using the OpenAFS client
  • Space available: 100GB - 2TB volume

Pseudo-temporary File System
  • File system: Panasas high-performance parallel scratch file system, /scratch365/netid
  • Access: directly, using the Panasas proprietary PanFS client
  • Space available: 500GB - 1TB
  • Aggregated bandwidth (approx.): 70-90 MB/sec per node with a 1 Gb network

Local File Systems (node-local temporary scratch)
  • File system: local disks, /scratch (link to /tmp)
  • Access: directly - shared with other users on the node
  • Space available: R815 - 500GB; HP DL160 (d6copt) - 100GB; IBM/Lenovo nx360M4 - 400GB; daccssfe - 5TB
  • Aggregated bandwidth (approx.): R815 (H700 RAID ctrl) - 250-300 MB/sec; HP DL160 (d6copt) - 50-60 MB/sec; IBM/Lenovo - 90-100 MB/sec; daccssfe - 800-1,000 MB/sec

SLIDE 15

RAID

Redundant Array(s) of Inexpensive/Independent Disks

  • Physical disks bound together with hardware or software
  • Used to create larger filesystems out of standard drive technology
  • Configurations optimize cost vs capability

RAID Levels: 0, 1, 3, 4, 5, 6, 0+1, 1+0

  • RAID 0 - striped (performance and capacity)
  • RAID 1 - mirrored (read performance, fault tolerance FT)
  • RAID 5 - striped with distributed parity (performance, capacity, FT N+1)
  • RAID 6 - striped with distributed parity (performance, capacity, FT N+2)

https://searchstorage.techtarget.com/definition/RAID
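The parity idea behind RAID 5/6 can be shown with XOR: the parity block is the XOR of the data blocks, and if any one block is lost, XOR-ing the parity with the survivors recovers it. A minimal Python sketch (illustrative only; real controllers stripe and rotate parity across drives):

```python
# XOR parity, the core of RAID 5 fault tolerance (N+1): one lost data
# block can be rebuilt from the remaining blocks plus the parity block.

def xor_blocks(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"       # data blocks on 3 drives
parity = xor_blocks(xor_blocks(d0, d1), d2)  # parity block on a 4th drive

# Simulate losing the drive holding d1 and rebuilding its block:
rebuilt_d1 = xor_blocks(xor_blocks(d0, d2), parity)
assert rebuilt_d1 == d1
```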

SLIDE 16

Data Storage

  • How information is kept in a digital format so that it may be retrieved later
  • Computers/laptops/tablets/smartphones/other devices - all store data
  • Hard drive/disk/flash drive/SSD (solid state drive)/cloud
  • Is not the same as RAM memory

* Hard drive - think long-term memory
* RAM - think short-term memory

SLIDE 17

Data Storage Types

  • File-based storage
  • Block-based storage
  • Object-based storage
SLIDE 18

File Storage

  • Also called file-level or file-based storage
  • You use file storage when you access documents/pictures saved in files on your computer
  • Data is stored as a single piece of information inside a file, inside a directory
  • A single path to the data
  • Hierarchical in nature (called a tree-structured system)
  • Oldest type of storage
  • Inexpensive
  • Simple
SLIDE 19

Block Storage

  • Breaks a file into individual blocks of data
  • The blocks are stored as separate pieces of data
  • No need for a file-folder structure, because each block of data has a unique address
  • The smaller blocks of data are spread out to wherever is most efficient
  • The storage system software pulls all the blocks back together to assemble the file when it is accessed
  • The more data you need to store, the better suited block storage becomes
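The block model can be sketched in Python: chunk a file, store each chunk under a unique address with no folder hierarchy, and reassemble on access. This is a toy illustration (using a content hash as the "unique address" is one possible scheme, not how every block device works):

```python
import hashlib

# Toy sketch of block storage: each fixed-size block is stored under a
# unique address (here, its SHA-256 hash) in a flat store with no
# file/folder structure; the storage software keeps the ordered list of
# addresses needed to reassemble the file when it is accessed.

BLOCK_SIZE = 8  # bytes; tiny, for illustration only

def write_blocks(data: bytes, store: dict) -> list:
    addresses = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        addr = hashlib.sha256(block).hexdigest()  # unique block address
        store[addr] = block
        addresses.append(addr)
    return addresses

def read_blocks(addresses: list, store: dict) -> bytes:
    # Pull all the blocks back together to assemble the file.
    return b"".join(store[a] for a in addresses)

store = {}
addrs = write_blocks(b"block storage has no directories", store)
assert read_blocks(addrs, store) == b"block storage has no directories"
```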
SLIDE 20

Block Storage

  • Used in storage-area network (SAN) environments, where data is stored in volumes (blocks)
  • Data is divided into blocks (which can be different sizes) that are stored separately on hard drive(s)
  • Consistent I/O performance, low-latency connectivity
  • More expensive, complex
  • Good for data that has to be frequently accessed and updated
  • Usage examples: database storage; applications like Java
SLIDE 21

Object Storage

  • Also called object-based storage
  • Files are broken into units called objects and spread out among hardware
  • The objects are kept in a single repository, instead of being kept as files in directories or as blocks on servers
  • The blocks of data that make up a file, along with its metadata, are kept in a storage pool
  • A unique identifier is assigned to each object
  • Cost efficient: you only pay for what you use
  • Usage examples: big data, web applications, backup archives
  • Good for data that doesn't need to be modified (just READ)
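The object model reduces to a flat pool of (data + metadata) bundles keyed by unique identifiers; a toy Python sketch (names and the in-memory dict are invented for illustration - real object stores expose this over web-service APIs):

```python
import uuid

# Toy sketch of object storage: each object bundles data with custom
# metadata and gets a unique identifier; objects live in one flat
# repository rather than a directory tree or block volumes.

pool = {}  # flat repository: identifier -> object

def put_object(data: bytes, metadata: dict) -> str:
    object_id = str(uuid.uuid4())  # unique identifier for the object
    pool[object_id] = {"data": data, "metadata": metadata}
    return object_id

def get_object(object_id: str) -> dict:
    return pool[object_id]

oid = put_object(b"backup-2020.tar", {"owner": "jen", "type": "backup"})
assert get_object(oid)["metadata"]["owner"] == "jen"
```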
SLIDE 22

File/Block/Object Storage Comparison

Aspect             File-based storage       Block-based storage            Object-based storage
Transaction units  Files                    Blocks                         Objects
Protocols          CIFS, NFS                SCSI, Fibre Channel, SATA      Web services (XML-based messaging)
Metadata           File-system attributes   File-system attributes         Custom metadata
Recommended for    Shared file data         Transactional, frequently      Static file data,
                                            changing data                  cloud storage
Strength           Simplified access and    High performance               Scalable, distributed
                   management of shared                                    access
                   files

SLIDE 23

SAN (Storage Area Network)

  • dedicated high-speed network that interconnects and shares pools of storage devices with multiple servers
  • each server accesses the shared storage as if it were directly attached to it
  • raw storage is treated as a pool of resources that can be centrally managed and allocated
  • highly scalable - capacity can be added as needed
  • disadvantages: cost and complexity

[Diagram: clients connect over the network to a metadata server and a dedicated storage network]

SLIDE 24

NAS (Network Attached Storage)

  • dedicated file storage device that provides nodes within the same network with file-based storage via an Ethernet connection
  • storage appliance, connected to a network switch
  • reliable, flexible
  • highly scalable network storage
  • speed

[Diagram: clients connect over the network to a NAS storage appliance]

SLIDE 25

Panasas - object-based storage cluster

  • Performance improves with scale - linear scalability
  • Data protection improves with scale
  • Scalable storage
  • Easy to access, deploy, and manage

SLIDE 26

Panasas - ActiveStor

  • Parallel scale-out NAS storage appliance
  • Complete hardware and software storage solution
  • Implements:
      • Parallel, object-based filesystem
      • Global namespace
      • Strict client cache coherency
  • Network Attached Storage (NAS) - Panasas DirectFlow (pNFS, CIFS, NFS) - rpm package for Linux (Mac OS also supported)
  • Scale-out NAS - serves parallel access to data (data is striped across multiple storage nodes, providing increased capacity and/or performance)
  • Parallel File System - concurrent reading and writing (data for a single file is striped across multiple storage nodes to provide scalable performance to individual files)
  • Global Namespace - all compute nodes accessing the storage see the same namespace (same name and pathname); management is done through one system only

SLIDE 27

Panasas

1 node of Panasas architecture ActiveStor16

Hybrid - Storage media is a combination of HDD and SDD:

  • Hard Drives - for larger files
  • SSD (Flash) - for small files or metadata (no moving parts - “solid” state)

Scalable data solution:
  • Redundant power modules (two)
  • Redundant battery module (power backup)
  • Redundant switch modules (connected to the network; provide access to storage)
  • Storage blade: 2 HDD, 1 Flash, CPU, RAM
  • Director blade: serves file system metadata and legacy protocols (NFS, CIFS)
  • 1 shelf = 1 Director Blade + 10 Storage Blades

SLIDE 28

Panasas

Storage blade: 2 HDD, 1 Flash, CPU, RAM

Director blade:
  1. Serves file system metadata (file's location, owner, permissions, size)
  2. Acts as the gateway to the storage for standard legacy protocols (NFS, CIFS)

DirectFlow - direct access between compute clients and storage

SLIDE 29

Panasas at CRC

ActiveStor16 - 7 Shelves:

  • 2 Director Blades per shelf
  • 9 Storage Blades per shelf

Total Director Blades: 14
Total Storage Blades: 63
Total Capacity: 771.12 TB
Data Space Used: 296.02 TB
Metadata Used: 8.59 TB
Free Space: 437.04 TB