SLIDE 1

nci.org.au

Parallel IO

These slides are possible thanks to these sources – Jonathan Dursi, SciNet Toronto Parallel I/O Tutorial; Argonne National Laboratory, HPC I/O for Computational Scientists; TACC/Cornell MPI-IO Tutorial; NERSC Lustre notes; Quincey Koziol, The HDF Group

slide-2
SLIDE 2

nci.org.au

References

  • eBook: High Performance Parallel I/O – Chapter 8: Lustre; Chapter 13: MPI-IO; Chapter 15: HDF5
  • HPC I/O for Computational Scientists (YouTube video and slides)
  • Parallel I/O Basics (paper)
  • eBook: Memory Systems – Cache, DRAM, Disk, by Bruce Jacob

slide-3
SLIDE 3

nci.org.au

The Advent of Big Data

  • Big Data refers to datasets and flows so large that they have outpaced our capability to store, process, analyse and understand them
    – Increases in computing power make simulations larger and more frequent
    – Increases in sensor resolution create larger observational data points
  • Data sizes that were once measured in MBs or GBs are now measured in TBs or PBs
  • It is easier to generate the data than to store it
slide-4
SLIDE 4

nci.org.au

The Four V’s

http://www.ibmbigdatahub.com/infographic/four-vs-big-data

slide-5
SLIDE 5

nci.org.au

BIG DATA PROJECTS AT THE NCI

slide-6
SLIDE 6

nci.org.au

ESA’s Sentinel Constellation

We care for a safer world
  • The Sentinel-1 systematic observation scenario, with one/two main high-rate modes of operation, will result in significantly large acquisition segments (data takes of a few minutes)
  • 25 min in high-rate modes leads to about 2.4 TB of compressed raw data per day for the 2 satellites
  • Wave Mode is operated continuously over ocean where the high-rate modes are not used

[Figure: Sentinel-1 observation scenario and impact on data volumes – IW data takes of 2, 6 and 15 minutes, producing roughly 4–12 GB for GRD-HR and 16–46 GB for SLC products]

slide-7
SLIDE 7

nci.org.au

ESA’s Sentinel Constellation

We care for a safer world
  • The Sentinel-1 systematic observation scenario, with one/two main high-rate modes of operation, will result in significantly large acquisition segments (data takes of a few minutes)
  • 25 min in high-rate modes leads to about 2.4 TB of compressed raw data per day for the 2 satellites
  • Wave Mode is operated continuously over ocean where the high-rate modes are not used

[Figure: Sentinel-1 observation scenario and impact on data volumes – IW data takes of 2, 6 and 15 minutes, producing roughly 4–12 GB for GRD-HR and 16–46 GB for SLC products]

Sentinel-1s: 2.4 TB/day; Sentinel-2s: 1.6 TB/day (high-resolution optical land monitoring); Sentinel-3s: 0.6 TB/day (land + marine observation)

slide-8
SLIDE 8

nci.org.au

Nepal Earthquake Interferogram using Sentinel SAR Data

slide-9
SLIDE 9

nci.org.au

Data Storage at NCI

slide-10
SLIDE 10

nci.org.au

Data Storage Subsystems at the NCI

  • The NCI compute and data environments allow researchers to work seamlessly with HPC and cloud-based compute cycles while having unified data storage
  • How is this done?
slide-11
SLIDE 11

nci.org.au

NCI’s integrated high-performance environment

[Diagram: NCI's integrated high-performance environment]
  • /g/data persistent global parallel filesystems on a 56 Gb FDR IB fabric: /g/data1 ~6.3 PB, /g/data2 ~3.1 PB
  • Raijin high-speed filesystems: /short 7.6 PB; /home, /system, /images, /apps
  • Massdata tape archive: 1.0 PB cache, 12.3 PB tape
  • Raijin HPC compute, Raijin login + data movers, VMware cloud and NCI data movers, connected by the Raijin 56 Gb FDR IB fabric, 10 GigE, a link to the Huxley DC, and the Internet

slide-12
SLIDE 12

nci.org.au

HARDWARE TRENDS

slide-13
SLIDE 13

nci.org.au

Disk and CPU Performance

[Chart (HPCS2012): disk bandwidth (MB/s) vs. CPU performance (MIPS) over time]

Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

slide-14
SLIDE 14

nci.org.au

Disk and CPU Performance

[Chart (HPCS2012): disk bandwidth (MB/s) vs. CPU performance (MIPS) over time – the gap has grown to roughly 1000x]

Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

slide-15
SLIDE 15

nci.org.au

Memory and Storage Latency

Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

slide-16
SLIDE 16

nci.org.au

Assessing Storage Performance

  • Data rate – MB/sec
    – Peak or sustained
    – Writes are faster than reads
  • IOPS – I/O operations per second
    – open(), close(), seek(), read(), write()

slide-17
SLIDE 17

nci.org.au

Assessing Storage Performance

  • Data rate – MB/sec
    – Peak or sustained
    – Writes are faster than reads
  • IOPS – I/O operations per second
    – open(), close(), seek(), read(), write()

Lab – measuring MB/s and IOPS (see the sketch below)
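Not part of the deck's lab tooling (FIO and ioping come later), but as a minimal illustration of what "effective MB/s and IOPS" means, here is a sketch in C that times a loop of open/write/close calls; file names and sizes are arbitrary choices. Note that without fsync() or O_DIRECT the page cache absorbs the writes, which is one reason the lab uses purpose-built tools instead.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Time N small open/write/close cycles and report effective MB/s and IOPS. */
int main(void)
{
    const int nfiles = 1000;
    const size_t blk = 1024;               /* 1 kB per file, as in the HDD example */
    char buf[1024];
    memset(buf, 'x', sizeof(buf));

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    for (int i = 0; i < nfiles; i++) {
        char name[64];
        snprintf(name, sizeof(name), "iotest.%04d", i);
        int fd = open(name, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }
        if (write(fd, buf, blk) != (ssize_t)blk) { perror("write"); return 1; }
        close(fd);                          /* 3 I/O operations per file: open, write, close */
    }

    clock_gettime(CLOCK_MONOTONIC, &t1);
    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    double mb   = nfiles * blk / 1e6;
    printf("wrote %.2f MB in %.2f s: %.3f MB/s, %.0f IOPS\n",
           mb, secs, mb / secs, 3.0 * nfiles / secs);
    return 0;
}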

slide-18
SLIDE 18

nci.org.au

Storage Performance

  • Data rate – MB/sec
    – Peak or sustained
    – Writes are faster than reads
  • IOPS – I/O operations per second
    – open(), close(), seek(), read(), write()

  Device     Bandwidth (MB/s)   IOPS
  SATA HDD   100                100
  SSD        250                10,000

HDD:
  Open, write, close 1000 x 1 kB files: 30.01 s (effective 0.033 MB/s)
  Open, write, close 1 x 1 MB file: 40 ms (effective 25 MB/s)

Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

slide-19
SLIDE 19

nci.org.au

Storage Performance

  • Data rate – MB/sec
    – Peak or sustained
    – Writes are faster than reads
  • IOPS – I/O operations per second
    – open(), close(), seek(), read(), write()

  Device     Bandwidth (MB/s)   IOPS
  SATA HDD   100                100
  SSD        250                10,000

SSD:
  Open, write, close 1000 x 1 kB files: 300 ms (effective 3.3 MB/s)
  Open, write, close 1 x 1 MB file: 4 ms (effective 232 MB/s)

SSDs are better at IOPS – no moving parts; the remaining latency is in the controller, system calls, etc. SSDs are still very expensive, so disk is here to stay!

Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

slide-20
SLIDE 20

nci.org.au

Storage Performance

  • Data rate – MB/sec
    – Peak or sustained
    – Writes are faster than reads
  • IOPS – I/O operations per second
    – open(), close(), seek(), read(), write()

  Device     Bandwidth (MB/s)   IOPS
  SATA HDD   100                100
  SSD        250                10,000

SSD:
  Open, write, close 1000 x 1 kB files: 300 ms (effective 3.3 MB/s)
  Open, write, close 1 x 1 MB file: 4 ms (effective 232 MB/s)

SSDs are better at IOPS – no moving parts; the remaining latency is in the controller, system calls, etc. SSDs are still very expensive, so disk is here to stay!

Raijin /short – aggregate 150 GB/sec (writes), 120 GB/sec (reads). The 5 DDN SFA12K arrays behind /short are each capable of 1.3M read IOPS and 700,000 write IOPS, yielding a total of 6.5M read IOPS and 3.5M write IOPS.

Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

slide-21
SLIDE 21

nci.org.au

[Diagram: The Linux Storage Stack Diagram, version 4.10 (2017-03-10) – outlines the Linux storage stack as of kernel 4.10. Applications issue system calls (open(2), read(2), write(2), stat(2), chmod(2), mmap, or direct I/O with O_DIRECT) through the VFS to block-based filesystems (ext2/3/4, btrfs, XFS, iso9660, ...), network filesystems (NFS, SMB, ceph, ...), pseudo and special-purpose filesystems (proc, sysfs, tmpfs, ...) and stackable filesystems (ecryptfs, overlayfs, FUSE). I/O passes through the page cache into the block layer (BIOs, blk-mq with software and hardware dispatch queues, I/O schedulers such as deadline, cfq and noop), optionally via stackable block devices (LVM, mdraid, drbd, device mapper targets such as dm-crypt/dm-mirror/dm-thin/dm-cache, bcache), then through the SCSI mid layer, transport classes and low-level drivers (or NVMe, virtio and other block drivers) down to the physical devices: HDD, SSD, DVD drive, PCIe flash cards, RAID controllers, and Fibre Channel/SAS HBAs. The LIO target stack exports storage over iSCSI, Fibre Channel, FireWire, USB, etc.]
slide-22
SLIDE 22

nci.org.au

[Diagram (detail): upper layers of the Linux Storage Stack Diagram, version 4.10 (2017-03-10) – applications, system calls (open(2), read(2), write(2), stat(2), chmod(2), mmap, direct I/O with O_DIRECT), the VFS and page cache, and the filesystem families: block-based (ext2/3/4, btrfs, XFS, iso9660, ...), network (NFS, coda, smbfs, gfs, ocfs, ceph, ...), pseudo (proc, sysfs, futexfs, usbfs), special-purpose (tmpfs, ramfs, devtmpfs, pipefs) and stackable (ecryptfs, overlayfs, unionfs, FUSE/userspace filesystems such as sshfs), sitting above stackable block devices (LVM, mdraid, drbd, dm-crypt, dm-mirror, dm-thin, dm-cache, bcache) and the LIO target stack.]
slide-23
SLIDE 23

nci.org.au

[Diagram (detail): the block layer of the Linux Storage Stack Diagram – BIOs from the filesystems and from stackable devices (LVM, mdraid, drbd, device mapper targets such as dm-crypt, dm-mirror, dm-thin, dm-cache, dm-raid, dm-delay, bcache) enter the multi-queue block layer (blk-mq) with per-CPU software queues and hardware dispatch queues; I/O schedulers (deadline, cfq, noop) map BIOs to requests, which are handed to request-based or BIO-based drivers.]
slide-24
SLIDE 24

nci.org.au

[Diagram (detail): lower layers of the Linux Storage Stack Diagram – request-based device mapper targets (dm-multipath), the SCSI mid layer and upper-level drivers (/dev/sd*, scsi-mq), transport classes (scsi_transport_fc, scsi_transport_sas, ...), SCSI low-level drivers (megaraid_sas, aacraid, qla2xxx, lpfc, mpt3sas, libata/ahci/ata_piix, vmw_pvscsi, virtio_scsi, ...), other block drivers (nvme, virtio_blk, mtip32xx, rsxx, skd, null_blk, mmc, rbd, zram, nbd, loop, ...), and the physical devices: HDD, SSD, DVD drive, Micron PCIe card, LSI/Adaptec RAID, Qlogic/Emulex HBAs, SD/MMC cards and IBM flash adapters.]

slide-25
SLIDE 25

nci.org.au

slide-26
SLIDE 26

nci.org.au

HPC IO

slide-27
SLIDE 27

nci.org.au

Scientific I/O

  • I/O is commonly used by scientific applications to achieve goals like:
    – Storing numerical output from simulations for later analysis
    – Implementing 'out-of-core' techniques for algorithms that process more data than can fit in system memory and must page data in from disk
    – Checkpointing to files that save the state of an application in case of system failure
  • In most cases, scientific applications write large amounts of data in a structured or sequential 'append-only' way that does not overwrite previously written data or require random seeks throughout the file
    – That said, there are seeky workloads, such as graph traversal and bioinformatics problems
  • Most HPC systems are equipped with a parallel file system such as Lustre or GPFS that abstracts away spinning disks, RAID arrays and I/O subservers to present the user with the simplified view of a single address space for reading and writing files

https://www.nersc.gov/users/training/online-tutorials/introduction-to-scientific-i-o/?show_all=1

slide-28
SLIDE 28

nci.org.au

Data Access in Current Large-Scale Systems

Current systems have greater support on the logical side and more complexity on the physical side.

[Diagram: logical (data model) view of data access – application, data model library, I/O transform layer(s), files (POSIX); physical (hardware) view – compute node memory, internal system network(s), I/O gateways, external system network(s), I/O servers, SAN and RAID enclosures; data movement connects the two views.]

http://press3.mcs.anl.gov/computingschool/files/2014/01/hpc-io-all-final.pdf

slide-29
SLIDE 29

nci.org.au

Common Methods for Accessing Data in Parallel

https://www.nersc.gov/users/training/online-tutorials/introduction-to-scientific-i-o/?show_all=1

slide-30
SLIDE 30

nci.org.au

Common Methods for Accessing Data in Parallel

  • Simplest to implement – each processor has its own file handle and works independently of the other nodes
  • Parallel file systems perform well on this type of IO, but it creates a metadata bottleneck
    – ls breaks, for example
  • Another downside is that program restarts now depend on getting the same processor layout

(A minimal sketch of this file-per-process pattern follows below.)

https://www.nersc.gov/users/training/online-tutorials/introduction-to-scientific-i-o/?show_all=1
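A minimal sketch of the file-per-process pattern described above, assuming an MPI program where each rank dumps its own buffer to its own file (the file-name pattern and sizes are illustrative only):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* File-per-process pattern: every rank opens and writes its own file.
 * Simple and fast, but produces nprocs files and ties restarts to the
 * same process count. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int n = 1 << 20;                      /* 1 Mi doubles per rank */
    double *data = malloc(n * sizeof(double));
    for (int i = 0; i < n; i++) data[i] = rank + i * 1e-6;

    char fname[64];
    snprintf(fname, sizeof(fname), "output.%05d.bin", rank);  /* one file per rank */
    FILE *fp = fopen(fname, "wb");
    fwrite(data, sizeof(double), n, fp);
    fclose(fp);

    free(data);
    MPI_Finalize();
    return 0;
}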

slide-31
SLIDE 31

nci.org.au

Common Methods for Accessing Data in Parallel

  • Many processors share the same file handle but write to their own distinct sections of a shared file
  • If there are shared regions of the file, a lock manager is used to serialize access
    – For large O(N), the locking is an impediment to performance
    – Even in ideal cases, where the file system is guaranteed that processors are writing to exclusive regions, shared-file performance can be lower than file-per-processor
  • The advantage of shared-file access lies in data management and portability, especially when a higher-level I/O format such as HDF5 or netCDF is used to encapsulate the data in the file

(A minimal MPI-IO sketch of this shared-file pattern follows below.)

https://www.nersc.gov/users/training/online-tutorials/introduction-to-scientific-i-o/?show_all=1
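A minimal sketch of the shared-file pattern, assuming each rank writes a contiguous block at an offset computed from its rank; this uses independent MPI-IO calls, so no collective coordination is involved (file name and sizes are illustrative):

#include <mpi.h>
#include <stdlib.h>

/* Shared-file pattern: all ranks open one file and write to disjoint
 * regions at offsets computed from their rank. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1 << 20;                      /* 1 Mi doubles per rank */
    double *data = malloc(n * sizeof(double));
    for (int i = 0; i < n; i++) data[i] = rank;

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "shared.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Each rank's block starts at rank * n * sizeof(double) bytes. */
    MPI_Offset offset = (MPI_Offset)rank * n * sizeof(double);
    MPI_File_write_at(fh, offset, data, n, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(data);
    MPI_Finalize();
    return 0;
}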

slide-32
SLIDE 32

nci.org.au

Common Methods for Accessing Data in Parallel

  • Collective buffering is a technique used to improve the performance of shared-file access by offloading some of the coordination work from the file system to the application
    – A subset of the processors is chosen to be the 'aggregators'
    – These collect data from the other processors and pack it into contiguous buffers in memory that are then written to the file system
  • Reducing the number of processors that interact with the I/O subservers reduces PFS contention
  • Collective buffering was originally developed to reduce the number of small, noncontiguous writes
    – Another benefit, important for file systems such as Lustre, is that the buffer size can be set to a multiple of the ideal transfer size preferred by the file system

(A hedged MPI-IO sketch of a collective write follows below.)

https://www.nersc.gov/users/training/online-tutorials/introduction-to-scientific-i-o/?show_all=1
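A hedged sketch of how an application can hand this coordination to the MPI-IO layer: the collective write call enables collective buffering, and the ROMIO-style hints shown (romio_cb_write, cb_buffer_size, cb_nodes) steer it on implementations that support them; the hint values here are illustrative, not recommended settings.

#include <mpi.h>
#include <stdlib.h>

/* Collective shared-file write: MPI_File_write_at_all lets the MPI-IO layer
 * aggregate data onto a subset of ranks (the collective-buffering
 * aggregators) before it touches the file system. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1 << 20;
    double *data = malloc(n * sizeof(double));
    for (int i = 0; i < n; i++) data[i] = rank;

    /* ROMIO hints steering collective buffering; defaults and support are
     * implementation- and system-dependent. */
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "romio_cb_write", "enable");
    MPI_Info_set(info, "cb_buffer_size", "4194304");   /* 4 MB aggregation buffers */
    MPI_Info_set(info, "cb_nodes", "8");                /* 8 aggregator ranks */

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "collective.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    MPI_Offset offset = (MPI_Offset)rank * n * sizeof(double);
    MPI_File_write_at_all(fh, offset, data, n, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    free(data);
    MPI_Finalize();
    return 0;
}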

slide-33
SLIDE 33

nci.org.au

HPC IO – How it works

HPC I/O systems provide a file system view of stored data:
  – File (i.e., POSIX) model of access
  – Shared view of data across the system
  – Access to the same data from the outside (e.g., login nodes, data movers)

Topics:
  – How is data stored and organized?
  – What support is there for application data models?
  – How does data move from clients to servers?
  – How is concurrent access managed?
  – What transformations are typically applied?

A file system view consists of directories (a.k.a. folders) and files. Files are broken up into regions called extents or blocks.

http://press3.mcs.anl.gov/computingschool/files/2014/01/hpc-io-all-final.pdf

slide-34
SLIDE 34

nci.org.au

Storing and Organizing Data: Storage Model

HPC I/O systems are built around a parallel file system that organizes storage and manages access.

Parallel file systems (PFSes) are distributed systems that provide a file data model (i.e., files and directories) to users. Multiple PFS servers manage access to storage, while PFS client systems run the applications that access storage. PFS clients can access storage resources in parallel!

http://press3.mcs.anl.gov/computingschool/files/2014/01/hpc-io-all-final.pdf

slide-35
SLIDE 35

nci.org.au

Reading and Writing Data to a PFS

PFS servers manage local storage and service incoming requests from clients. PFS client software requests operations on behalf of applications. Requests are sent as messages (RPC-like), often to multiple servers. Requests pass over the interconnect, so each request incurs some latency. RAID enclosures protect against individual disk failures and map regions of data onto specific devices.

http://press3.mcs.anl.gov/computingschool/files/2014/01/hpc-io-all-final.pdf

slide-36
SLIDE 36

nci.org.au

Data Distribution in Parallel File Systems

Distribution across multiple servers allows concurrent access.

http://press3.mcs.anl.gov/computingschool/files/2014/01/hpc-io-all-final.pdf

slide-37
SLIDE 37

nci.org.au

Request Size and IO Rate

Interconnect latency has a significant impact on the effective rate of I/O. Typically, I/Os should be in the O(MBytes) range.

Tests run on 2K processes of IBM Blue Gene/P at ANL.

http://press3.mcs.anl.gov/computingschool/files/2014/01/hpc-io-all-final.pdf

slide-38
SLIDE 38

nci.org.au

Request Size and IO Rate

Interconnect latency has a significant impact on the effective rate of I/O. Typically, I/Os should be in the O(MBytes) range.

Tests run on 2K processes of IBM Blue Gene/P at ANL.

Why are writes faster than reads?

http://press3.mcs.anl.gov/computingschool/files/2014/01/hpc-io-all-final.pdf

slide-39
SLIDE 39

nci.org.au

Where and how you do I/O matters.

  • Binary – smaller files, much faster to read/write.
  • You're not going to read GB/TB of data yourself; don't bother trying.
  • Write in one chunk, rather than a few numbers at a time (see the sketch below).

Timing data: writing 128M double-precision numbers
  Large parallel file system:   ASCII 173 s   binary 6 s
  Ramdisk:                      ASCII 174 s   binary 1 s
  Typical workstation disk:     ASCII 260 s   binary 20 s

Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf
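A small C sketch of the "binary, one chunk" point above: the same array written as formatted ASCII one value at a time versus a single binary fwrite. Sizes are illustrative and smaller than the slide's 128M doubles; the timings on this slide come from Dursi's measurements, not from this sketch.

#include <stdio.h>
#include <stdlib.h>

/* Two ways to dump the same array: formatted ASCII one value at a time,
 * vs. a single binary fwrite of the whole buffer. The binary file is
 * smaller, and the single large write avoids per-value library/syscall
 * overhead, which is the effect the slide's timings illustrate. */
int main(void)
{
    const size_t n = 1 << 24;                 /* 16 Mi doubles (~128 MB) */
    double *x = malloc(n * sizeof(double));
    for (size_t i = 0; i < n; i++) x[i] = (double)i;

    /* ASCII, one number per call: many small writes, a much larger file. */
    FILE *fa = fopen("data.txt", "w");
    for (size_t i = 0; i < n; i++) fprintf(fa, "%.17g\n", x[i]);
    fclose(fa);

    /* Binary, one call for the whole array: one large contiguous write. */
    FILE *fb = fopen("data.bin", "wb");
    fwrite(x, sizeof(double), n, fb);
    fclose(fb);

    free(x);
    return 0;
}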

slide-40
SLIDE 40

nci.org.au

Where and how you do I/O matters.

  • All disk systems do best when reading/writing large, contiguous chunks.
  • I/O operations (IOPS) are themselves expensive:
    – moving around within a file
    – opening/closing
  • Seeks take 3–15 ms – enough time to read 0.75 MB!

Timing data: reading 128M double-precision numbers (typical workstation disk)
  binary – one large read:             14 s
  binary – 8k at a time:               20 s
  binary – 8k chunks, lots of seeks:   150 s
  binary – seeky + opens and closes:   205 s

Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

slide-41
SLIDE 41

nci.org.au

Where and how you do I/O matters.

  • RAM is much better for random accesses.
  • Use the right storage medium for the job!
  • Where possible, read in large contiguous chunks and do the random access in memory.
  • Much better if you use most of the data you read in.

Timing data: writing 128M double-precision numbers
  Large parallel file system:   ASCII 173 s   binary 6 s
  Ramdisk:                      ASCII 174 s   binary 1 s
  Typical workstation disk:     ASCII 260 s   binary 20 s

Timing data: reading 128M double-precision numbers (ramdisk)
  binary – one large read:             1 s
  binary – 8k at a time:               1 s
  binary – 8k chunks, lots of seeks:   1 s
  binary – seeky + opens and closes:   1.5 s

Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

slide-42
SLIDE 42

nci.org.au

Where and how you do I/O matters.

  • Well-built parallel file systems can greatly increase bandwidth:
    – many pipes to the network (servers), many spinning disks (bandwidth off of the disks)
  • But there are typically even worse penalties for seeky/IOPSy operations (coordinating all those disks).
  • A parallel FS can help with big data in two ways.

Timing data: writing 128M double-precision numbers
  Large parallel file system:   ASCII 173 s   binary 6 s
  Ramdisk:                      ASCII 174 s   binary 1 s
  Typical workstation disk:     ASCII 260 s   binary 20 s

Timing data: reading 128M double-precision numbers (large parallel file system)
  binary – one large read:             7.5 s
  binary – 8k at a time:               62 s
  binary – 8k chunks, lots of seeks:   428 s
  binary – seeky + opens and closes:   2137 s

Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

slide-43
SLIDE 43

nci.org.au

Where and how you do I/O matters.

  • Well-built parallel file systems can greatly increase bandwidth:
    – many pipes to the network (servers), many spinning disks (bandwidth off of the disks)
  • But there are typically even worse penalties for seeky/IOPSy operations (coordinating all those disks).
  • A parallel FS can help with big data in two ways: striping, and multiple readers + writers.

Timing data: writing 128M double-precision numbers
  Large parallel file system:   ASCII 173 s   binary 6 s
  Ramdisk:                      ASCII 174 s   binary 1 s
  Typical workstation disk:     ASCII 260 s   binary 20 s

Timing data: reading 128M double-precision numbers (large parallel file system)
  binary – one large read:             7.5 s
  binary – 8k at a time:               62 s
  binary – 8k chunks, lots of seeks:   428 s
  binary – seeky + opens and closes:   2137 s

Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

slide-44
SLIDE 44

nci.org.au

LAB 1

slide-45
SLIDE 45

nci.org.au

Lab 1 - Measuring IO Performance

  • Specifically: measure IO performance (IOPS, streaming b/w and latency)
  • Objective 1: learn to use the FIO tool to measure IOPS and streaming bandwidth
  • Objective 2: learn to use the ioping tool to measure IO latency
  • Objective 3: data collection and analysis
slide-46
SLIDE 46

nci.org.au

Lab 1 – Background – Tools

  • What is FIO?
    fio is a tool that spawns a number of threads or processes doing a particular type of I/O action, as specified by the user.
  • What is ioping?
    A tool to monitor I/O latency in real time. It shows disk latency in the same way that ping shows network latency.

slide-47
SLIDE 47

nci.org.au

Lab 1 – ioping Tasks – 1/3

a) Log into Raijin and, in your /short/c04 area, create a directory called Lab1, then check out the following Git repos:
   a.1) FIO: https://github.com/axboe/fio.git
   a.2) ioping: https://github.com/koct9i/ioping.git
b) Build both FIO and ioping in your raijin:/short/c37 area

slide-48
SLIDE 48

nci.org.au

Lab 1 – ioping Tasks – 2/3

c) Raijin has three filesystems accessible to end-users: /home, /jobfs and /short

c.1) Measuring latency for single-threaded sequential workloads
Using ioping, measure the IO latency for /short. Construct PBS jobs (or use the express queue) to use 1 CPU, run the ioping executable, and record the latency (in milliseconds), IOPS and B/W for block sizes of 4 KB, 128 KB and 1 MB and working sets of 10 MB, 100 MB and 1024 MB. An example run:

% /short/z00/jxa900/ioping -c 20 -s 8KB -C -S 1024MB /home/900/jxa900
# run ioping for 20 requests, 8 KB block size and a working set of 1 GB on my home directory
...
8 KiB from /home/900/jxa900 (lustre 10.9.103.3@o2ib3:10.9.103.4@o2ib3:/homsys): request=20 time=29 us

--- /home/900/jxa900 (lustre 10.9.103.3@o2ib3:10.9.103.4@o2ib3:/homsys) ioping statistics ---
20 requests completed in 2.79 ms, 160 KiB read, 7.16 k iops, 55.9 MiB/s
min/avg/max/mdev = 28 us / 139 us / 289 us / 103 us

From the above experiment, the corresponding data values are:
  Working Set (MB), Block Size (KB), Time Taken (ms), Data Read (KB), IOPS, B/W (MB/sec)
  1024, 8, 2.79, 160, 7160, 55.9

slide-49
SLIDE 49

nci.org.au

Lab 1 – ioping Tasks – 3/3

From the previous experiment, note the results down in a table. Can you explain the observed trends for IOPS and B/W for the three working-set sizes as the request block size changes? Extra credit: run this lab on a Raijin compute node's /jobfs and compare.

  Working Set (MB)   Block Size (KB)   Time Taken (ms)   Data Read (KB)   IOPS   B/W (MB/sec)
  10                 4
  10                 128
  10                 1024
  100                4
  100                128
  100                1024
  1024               4
  1024               128
  1024               1024

slide-50
SLIDE 50

nci.org.au

Lab 1 – FIO Tasks – 1/2

Using FIO, measure read and write IOPS for two, four and eight threads of IO running on /short, for a block size of 1 MB and a file size of 1024 MB. These can be set when calling FIO using the --bs and --size flags. For a sequential write workload add --readwrite=write. Fill in the table below.

SEQUENTIAL WRITE IO PERFORMANCE:

  No. of Threads   Working Set (MB)   Block Size (KB)   Time Taken (ms)   Data Written (MB)   IOPS   B/W (MB/sec)
  2                1024               1024
  4                1024               1024
  8                1024               1024

slide-51
SLIDE 51

nci.org.au

Lab 1 – FIO Tasks – 2/2

  • Sample run for random IO:

# Random workload – set using --readwrite=randrw; the ratio of reads and writes is set using --rwmixread=50 (i.e. 50% reads, 50% writes)

% /short/z00/jxa900/fio-src/fio --randrepeat=1 --ioengine=libaio --gtod_reduce=1 --bs=4k --iodepth=8 --size=1024MB --readwrite=randrw --rwmixread=50 --thread --numjobs=2 --name=test --filename=/short/jxa900/test.out
test: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
...
fio-2.2.6-15-g3765
Starting 2 threads
Jobs: 2 (f=2): [m(2)] [100.0% done] [47712KB/0KB/0KB /s] [11.1K/0/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=2): err= 0: pid=7014: Sat Apr 25 18:46:09 2015
  mixed: io=2048.0MB, bw=43254KB/s, iops=10813, runt= 48485msec
  cpu : usr=1.42%, sys=20.51%, ctx=243452, majf=0, minf=5
  IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
  submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
  complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
  issued : total=r=524288/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
  latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
  MIXED: io=2048.0MB, aggrb=43253KB/s, minb=43253KB/s, maxb=43253KB/s, mint=48485msec, maxt=48485msec

In the previous table, note the data:
  Num Threads, Working Set (MB), Block Size (KB), Time Taken (ms), Data Read/Written (MB), IOPS, B/W (MB/sec)
  2, 1024, 4, 48485, 2048, 10813, 42.2 (= 43253/1024)

slide-52
SLIDE 52

nci.org.au

HPC IO

slide-53
SLIDE 53

nci.org.au

HPC IO Software Stack

The software used to provide data model support and to transform I/O to better perform on today's I/O systems is often referred to as the I/O stack.

  – Data model libraries map application abstractions onto storage abstractions and provide data portability: HDF5, Parallel netCDF, ADIOS
  – I/O middleware organizes accesses from many processes, especially those using collective I/O: MPI-IO, GLEAN, PLFS
  – I/O forwarding transforms I/O from many clients into fewer, larger requests, reduces lock contention, and bridges between the HPC system and external storage: IBM ciod, IOFSL, Cray DVS
  – The parallel file system maintains the logical file model and provides efficient access to data: PVFS, PanFS, GPFS, Lustre

[Stack: Application → Data Model Support → Transformations → Parallel File System → I/O Hardware]

http://press3.mcs.anl.gov/computingschool/files/2014/01/hpc-io-all-final.pdf

slide-54
SLIDE 54

nci.org.au

Lustre Components

  • All of Raijin's filesystems are Lustre, which is a distributed filesystem
  • The primary components are the MDSes (metadata servers) and the OSSes (object storage servers). The OSSes hold the data objects, whereas the MDSes map these objects into files

slide-55
SLIDE 55

nci.org.au

NCI’s integrated high-performance environment

[Diagram: NCI's integrated high-performance environment]
  • /g/data persistent global parallel filesystems on a 56 Gb FDR IB fabric: /g/data1 ~6.3 PB, /g/data2 ~3.1 PB
  • Raijin high-speed filesystems: /short 7.6 PB; /home, /system, /images, /apps
  • Massdata tape archive: 1.0 PB cache, 12.3 PB tape
  • Raijin HPC compute, Raijin login + data movers, VMware cloud and NCI data movers, connected by the Raijin 56 Gb FDR IB fabric, 10 GigE, a link to the Huxley DC, and the Internet

slide-56
SLIDE 56

nci.org.au

Lustre’s File System Architecture

slide-57
SLIDE 57

nci.org.au

Parts of the Lustre System

  • The client (you) must talk to both the MDS and the OSS servers in order to use the Lustre system.
  • File I/O goes to one or more OSSes. Opening files, listing directories, etc. go to the MDS.
  • The front end to the Lustre file system is a Logical Object Volume (LOV), which simply appears like any other large volume that would be mounted on a node.

http://www.cac.cornell.edu/education/training/ParallelMay2012/ParallelIOMay2012.pdf

slide-58
SLIDE 58

nci.org.au

[Diagram (detail): upper layers of the Linux Storage Stack Diagram, version 4.10 (2017-03-10) – applications, system calls (open(2), read(2), write(2), stat(2), chmod(2), mmap, direct I/O with O_DIRECT), the VFS and page cache, and the filesystem families: block-based (ext2/3/4, btrfs, XFS, iso9660, ...), network (NFS, coda, smbfs, gfs, ocfs, ceph, ...), pseudo (proc, sysfs, futexfs, usbfs), special-purpose (tmpfs, ramfs, devtmpfs, pipefs) and stackable (ecryptfs, overlayfs, unionfs, FUSE/userspace filesystems such as sshfs), sitting above stackable block devices (LVM, mdraid, drbd, dm-crypt, dm-mirror, dm-thin, dm-cache, bcache) and the LIO target stack.]

Where do you think the LOV would reside?

slide-59
SLIDE 59

nci.org.au

Lustre File System and Striping

  • Striping allows parts of files to be stored on different OSTs, in a RAID-0 pattern.
    – The number of objects is called the stripe_count.
    – Objects contain "chunks" of data that can be as large as stripe_size.

http://www.cac.cornell.edu/education/training/ParallelMay2012/ParallelIOMay2012.pdf

slide-60
SLIDE 60

nci.org.au

How does striping help?

  • Due to striping, the Lustre file system scales with the number of OSSes available
  • The capacity of a Lustre file system equals the sum of the capacities of the storage targets
    – Benefit #1: the maximum file size is not limited by the size of a single target.
    – Benefit #2: the I/O rate to a file is the aggregate of the I/O rates to the objects (see the worked example below).
  • Raijin provides 6 MDSes and 50 OSSes, capable of 150 GB/sec, but this speed is shared by all users of the system
  • Metadata access can be a bottleneck, so the MDS needs to have especially good performance (e.g., solid-state disks on some systems)
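A rough worked example of Benefit #2 (the per-OST rate here is an assumption for illustration, not an NCI figure): if a single OST sustains about 500 MB/s, a file striped over 4 OSTs can in principle be read or written at up to 4 x 500 MB/s = 2 GB/s, and its size is limited only by the combined free space on those 4 targets rather than by any single one.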

slide-61
SLIDE 61

nci.org.au

Striping data across disks

Parallel FS

  • A single client can make use of multiple disk systems simultaneously
  • "Stripe" the file across many drives
  • One drive can be finding the next block while another is sending the current block

http://www.cac.cornell.edu/education/training/ParallelMay2012/ParallelIOMay2012.pdf

slide-62
SLIDE 62

nci.org.au

Lustre on Raijin 1/2

  • Lustre is a scalable, POSIX-compliant parallel file system designed for large, distributed-memory systems, such as Raijin at NCI
  • It uses a server-client model with separate servers for file metadata and file content

https://www.nersc.gov/users/training/online-tutorials/introduction-to-scientific-i-o/?show_all=1

slide-63
SLIDE 63

nci.org.au

Lustre on Raijin 2/2

  • For example, on Raijin, the /short and /g/data{1,2} file systems each have a single metadata server (which can be a bottleneck when working with thousands of files) and 720, 520 and 240 Object Storage Targets (OSTs), respectively, that store the contents of files
  • Although Lustre is designed to correctly handle any POSIX-compliant I/O pattern, in practice it performs much better when the I/O accesses are aligned to Lustre's fundamental unit of storage, which is called a stripe and has a default size (on NCI systems) of 1 MB
  • Striping is a method of dividing up a shared file across many OSTs. Each stripe is stored on a different OST, and the assignment of stripes to OSTs is round-robin
  • Striping increases the available b/w by using several OSTs in parallel

https://www.nersc.gov/users/training/online-tutorials/introduction-to-scientific-i-o/?show_all=1

slide-64
SLIDE 64

nci.org.au

Invoking Striping

  • Among the various lfs commands are lfs getstripe and lfs setstripe.
  • The lfs setstripe command takes four arguments:

    lfs setstripe <file|dir> -s <bytes/OST> -o <start OST> -c <#OSTs>

    1. The file or directory for which to set the stripe.
    2. The number of bytes on each OST, with k, m, or g for KB, MB or GB.
    3. The OST index of the first stripe (-1 for the filesystem default).
    4. The number of OSTs to stripe over.
  • So to stripe across two OSTs, you would call:

    lfs setstripe bigfile -s 4m -o -1 -c 2

http://www.cac.cornell.edu/education/training/ParallelMay2012/ParallelIOMay2012.pdf

slide-65
SLIDE 65

nci.org.au

Invoking Striping

  • Among the various lfs commands are lfs getstripe and lfs setstripe.
  • The lfs setstripe command takes four arguments:

    lfs setstripe <file|dir> -s <bytes/OST> -o <start OST> -c <#OSTs>

    1. The file or directory for which to set the stripe.
    2. The number of bytes on each OST, with k, m, or g for KB, MB or GB.
    3. The OST index of the first stripe (-1 for the filesystem default).
    4. The number of OSTs to stripe over.
  • So to stripe across two OSTs, you would call:

    lfs setstripe bigfile -s 4m -o -1 -c 2

Lab – perform striping on Raijin:/short and measure IOPS and B/W (a hedged MPI-IO sketch of setting striping programmatically follows below)
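Besides lfs, striping can also be requested programmatically. A hedged sketch using the ROMIO hints striping_factor and striping_unit, which ROMIO-based MPI-IO implementations map onto Lustre striping at file-creation time (values and file name are illustrative; support depends on the MPI library and file system):

#include <mpi.h>
#include <stdio.h>

/* Request Lustre striping through MPI-IO hints at file creation, then read
 * back the values the implementation actually applied. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "striping_factor", "4");        /* stripe over 4 OSTs */
    MPI_Info_set(info, "striping_unit", "4194304");    /* 4 MB stripes */

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "striped.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* Query the hints in effect on the open file. */
    MPI_Info used;
    MPI_File_get_info(fh, &used);
    char value[MPI_MAX_INFO_VAL + 1];
    int flag;
    MPI_Info_get(used, "striping_factor", MPI_MAX_INFO_VAL, value, &flag);
    if (rank == 0 && flag) printf("striping_factor = %s\n", value);

    MPI_Info_free(&used);
    MPI_Info_free(&info);
    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}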

slide-66
SLIDE 66

nci.org.au

End-to-End View

[Stack diagram: Application → High-level library (HDF5, NetCDF, ADIOS) → I/O middleware (MPI-IO) → Parallel FS (GPFS, PVFS, ...)]

Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

slide-67
SLIDE 67

nci.org.au

Abstraction Layers

  • High-level libraries can simplify the programmer's task
    – Express IO in terms of the data structures of the code, not bytes and blocks
  • I/O middleware can coordinate accesses and improve performance
    – Data sieving
    – Two-phase I/O

[Stack diagram: Application → High-level library (HDF5, NetCDF, ADIOS) → I/O middleware (MPI-IO) → Parallel FS (GPFS, PVFS, ...)]

Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

slide-68
SLIDE 68

nci.org.au

Data Sieving

  • Combine many non-contiguous IO requests into fewer, bigger IO requests
  • "Sieve" the unwanted data out
  • Reduces IOPS and makes use of high bandwidth for sequential IO (see the sketch below)

Jonathan Dursi
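A hedged MPI-IO sketch of the kind of access pattern that benefits from data sieving: a noncontiguous file view serviced by one independent read call, which a ROMIO-based library may implement by reading a large contiguous region and sieving out the unwanted bytes (the ind_rd_buffer_size hint controls the sieve buffer; file name and sizes are illustrative):

#include <mpi.h>
#include <stdlib.h>

/* A noncontiguous independent read: the process wants every 4th block of
 * 256 doubles from a shared file. Rather than issuing many small reads,
 * the MPI-IO layer can service this with data sieving. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    const int blocklen = 256, nblocks = 1024, stride = 4 * 256;
    MPI_Datatype filetype;
    MPI_Type_vector(nblocks, blocklen, stride, MPI_DOUBLE, &filetype);
    MPI_Type_commit(&filetype);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "shared.bin", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &fh);

    /* The file view describes the noncontiguous layout once ... */
    MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native", MPI_INFO_NULL);

    /* ... so a single independent read call covers all the pieces. */
    double *buf = malloc((size_t)nblocks * blocklen * sizeof(double));
    MPI_File_read(fh, buf, nblocks * blocklen, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Type_free(&filetype);
    free(buf);
    MPI_Finalize();
    return 0;
}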

slide-69
SLIDE 69

nci.org.au

Two-Phase IO

  • Collect requests into larger chunks
  • Have individual nodes read big blocks
  • Then use network communications to exchange the pieces
  • Fewer IOPS, faster IO
  • Network communication is usually faster (see the sketch below)

Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf
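A hedged sketch of the classic case where two-phase collective I/O helps: each rank owns a block of a global 2D array and all ranks write it with one collective call, letting the library exchange pieces over the network and issue large contiguous writes (array sizes and file name are illustrative):

#include <mpi.h>
#include <stdlib.h>

/* Each rank owns one row-block of a global 2D array; MPI_File_write_all
 * lets the library collect the pieces into large chunks (phase 1: network
 * exchange) before writing them (phase 2: big contiguous I/O). */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int NX = 1024, NY = 1024;                 /* local block size */
    int gsizes[2] = {NX * nprocs, NY};              /* global array, ranks stacked along x */
    int lsizes[2] = {NX, NY};
    int starts[2] = {rank * NX, 0};

    MPI_Datatype filetype;
    MPI_Type_create_subarray(2, gsizes, lsizes, starts, MPI_ORDER_C,
                             MPI_DOUBLE, &filetype);
    MPI_Type_commit(&filetype);

    double *local = malloc((size_t)NX * NY * sizeof(double));
    for (int i = 0; i < NX * NY; i++) local[i] = rank;

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "array2d.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native", MPI_INFO_NULL);
    MPI_File_write_all(fh, local, NX * NY, MPI_DOUBLE, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Type_free(&filetype);
    free(local);
    MPI_Finalize();
    return 0;
}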

slide-70
SLIDE 70

nci.org.au

HDF5 – Beyond MPI-IO

slide-71
SLIDE 71

www.hdfgroup.org

OVERVIEW OF PARALLEL HDF5 DESIGN

March 4, 2015 HPC Oil & Gas Workshop 71

slide-72
SLIDE 72

www.hdfgroup.org

Parallel HDF5 Requirements

  • Parallel HDF5 should allow multiple processes to perform I/O to an HDF5 file at the same time
    • Single file image for all processes
    • Compare with a one-file-per-process design:
      • Expensive post-processing
      • Not usable by a different number of processes
      • Too many files produced for the file system
  • Parallel HDF5 should use a standard parallel I/O interface
    • Must be portable to different platforms

March 4, 2015 HPC Oil & Gas Workshop 72

slide-73
SLIDE 73

www.hdfgroup.org

Design requirements, cont

  • Support Message Passing Interface (MPI) programming
  • Parallel HDF5 files are compatible with serial HDF5 files
  • Shareable between different serial or parallel platforms

March 4, 2015 HPC Oil & Gas Workshop 73

slide-74
SLIDE 74

www.hdfgroup.org

Design Dependencies

  • MPI with MPI-IO
    • MPICH, OpenMPI
    • Vendor's MPI-IO
  • Parallel file system
    • IBM GPFS
    • Lustre
    • PVFS

March 4, 2015 HPC Oil & Gas Workshop 74

slide-75
SLIDE 75

www.hdfgroup.org

PHDF5 implementation layers

[Diagram: HDF5 application running on compute nodes → HDF5 library → MPI library → switch network + I/O servers → HDF5 file on a parallel file system (disk architecture and layout of data on disk)]

March 4, 2015 HPC Oil & Gas Workshop 75

slide-76
SLIDE 76

www.hdfgroup.org

MPI-IO VS. HDF5

March 4, 2015 HPC Oil & Gas Workshop 76

slide-77
SLIDE 77

www.hdfgroup.org

MPI-IO

  • MPI-IO is an input/output API
  • It treats the data file as a "linear byte stream", and each MPI application needs to provide its own file and data representations to interpret those bytes

March 4, 2015 HPC Oil & Gas Workshop 77

slide-78
SLIDE 78

www.hdfgroup.org

MPI-IO

  • All data stored is machine-dependent, except for the "external32" representation
    • External32 is defined as big-endian
    • Little-endian machines have to do the data conversion on both read and write operations
    • 64-bit sized data types may lose information

March 4, 2015 HPC Oil & Gas Workshop 78

slide-79
SLIDE 79

www.hdfgroup.org

MPI-IO vs. HDF5

  • HDF5 is data management software
  • It stores data and metadata according to the HDF5 data format definition
  • An HDF5 file is self-describing
    • Each machine can store the data in its own native representation for efficient I/O without loss of data precision
    • Any necessary data representation conversion is done by the HDF5 library automatically (a minimal parallel HDF5 sketch follows below)

March 4, 2015 HPC Oil & Gas Workshop 79
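To make the contrast concrete, here is a minimal parallel HDF5 sketch in C, assuming HDF5 built with MPI support: every rank writes its own hyperslab of one dataset in a single self-describing file, with the MPI-IO driver selected via H5Pset_fapl_mpio and a collective transfer via H5Pset_dxpl_mpio (dataset name and sizes are illustrative):

#include <hdf5.h>
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const hsize_t n = 1 << 20;                          /* doubles per rank */
    double *data = malloc(n * sizeof(double));
    for (hsize_t i = 0; i < n; i++) data[i] = rank;

    /* Open the file with the MPI-IO file driver. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("output.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* One global 1-D dataset, nprocs * n doubles long. */
    hsize_t gdim = n * nprocs;
    hid_t filespace = H5Screate_simple(1, &gdim, NULL);
    hid_t dset = H5Dcreate(file, "values", H5T_NATIVE_DOUBLE, filespace,
                           H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Select this rank's hyperslab and write it collectively. */
    hsize_t start = (hsize_t)rank * n;
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &start, NULL, &n, NULL);
    hid_t memspace = H5Screate_simple(1, &n, NULL);
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, data);

    H5Pclose(dxpl); H5Sclose(memspace); H5Dclose(dset);
    H5Sclose(filespace); H5Pclose(fapl); H5Fclose(file);
    free(data);
    MPI_Finalize();
    return 0;
}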

slide-80
SLIDE 80

nci.org.au

LAB 2

slide-81
SLIDE 81

nci.org.au

Lab 2 – Lustre Striping Parameters on Raijin

Lustre allows you to modify three striping parameters for a shared file:

  • the stripe count controls how many OSTs the file is striped over (for example, the stripe count is 4 for the file shown above);
  • the stripe size controls the number of bytes in each stripe; and
  • the start index chooses where in the list of OSTs to start the round-robin assignment (the default value -1 allows the system to choose the offset in order to load-balance the file system).

The default parameters on Raijin:/short are [count=2, size=1MB, index=-1], but these can be changed and viewed on a per-file or per-directory basis using the commands:

% lfs setstripe [file,dir] -c [count] -s [size] -i [index]
% lfs getstripe [file,dir]

A file automatically inherits the striping parameters of the directory it is created in, so changing the parameters of a directory is a convenient way to set the parameters for a collection of files you are about to create. For instance, if your application creates output files in a subdirectory called output/, you can set the striping parameters on that directory once before your application runs, and all of your output files will inherit those parameters.

slide-82
SLIDE 82

nci.org.au

Lab 2 – Tasks – 1/2

a) Create a directory under Raijin:/short and run lfs getstripe <myDir>. Explain the output of the command; use the man pages if required.
b) Re-run the Lab 1 exercise with FIO using the following stripe sizes and counts for a 1 GB file, for a sequential write workload:
   Stripe size: 1 MB, 4 MB
   Stripe count: -1, 2, 4

slide-83
SLIDE 83

nci.org.au

Lab 2 – Tasks – 2/2

  Stripe Size (MB)   Stripe Count   No. of Threads   Working Set (MB)   Block Size (KB)   Time Taken (ms)   Data Written (MB)   IOPS   B/W (MB/sec)
  1                  -1             4                1024               1024
  1                  2              4                1024               1024
  1                  4              4                1024               1024
  4                  -1             4                1024               1024
  4                  2              4                1024               1024
  4                  4              4                1024               1024

Bonus credit: using your ANU UniID and password, log onto the NeCTAR cloud and bring up a single-core VM using a CentOS or Ubuntu image on any NeCTAR node. Perform the same tests as in Lab 1.
URL for creating VMs: https://dashboard.rc.nectar.org.au/
Setting up SSH keys: http://darlinglab.org/tutorials/instance_startup/

slide-84
SLIDE 84

nci.org.au

CLOSING SLIDE

slide-85
SLIDE 85

nci.org.au

Homebrew HPC Deep-Packet Capture?

https://www.hdfgroup.org/2017/08/handling-ingesting-data-streams-500k-messs/