Data-Intensive Applications on Numerically-Intensive Supercomputers



SLIDE 1

Operated by the Los Alamos National Security, LLC for the DOE/NNSA

Data-Intensive Applications on Numerically-Intensive Supercomputers

David Daniel / James Ahrens
Los Alamos National Laboratory
July 2009

SLIDE 2

Interactive visualization of a billion-cell plasma physics simulation

SLIDE 3

VPIC – A case study of visualization on a supercomputing platform

  • Goal: Running simulation on 4096 ASC Roadrunner processors
    – Computing an 8192x512x512 problem = 2 billion-cell problem
    – VPIC per-processor dump files = 100s of GB per time step

  • Data reduction, prioritization, and visualization on the supercomputer will help this science team be successful
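The figures quoted above can be sanity-checked with a few lines of arithmetic. The grid dimensions come from the slide; the bytes-per-cell value below is a hypothetical assumption for illustration, not a VPIC number:

```python
# Back-of-envelope check of the VPIC problem size quoted on this slide.
nx, ny, nz = 8192, 512, 512
cells = nx * ny * nz
print(f"cells: {cells:,}")  # 2,147,483,648, i.e. ~2 billion cells

# Assume ~100 bytes of dumped state per cell (hypothetical figure).
bytes_per_cell = 100
dump_gb = cells * bytes_per_cell / 1e9
print(f"per-step dump: ~{dump_gb:.0f} GB")  # ~215 GB, i.e. 100s of GB per step
```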

SLIDE 4

The RoadRunner Universe project

  • First Petascale Cosmology Simulations
    – New scalable hybrid code designed for heterogeneous architectures
    – New algorithmic ideas for high performance
      • Domain overloading with particle caches
      • Digital filtering to reduce communication across the Opteron/Cell layer
      • >50 times speed-up over conventional codes
  • The RoadRunner Universe Data Challenge
    – Individual trillion-particle runs generate 100s of TB of raw data
      • Must carry out “on the fly” analysis
    – KD tree-based halo finder parallelized with particle overloading
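The halo finder mentioned above groups particles into gravitationally bound clumps. A minimal serial sketch of the underlying friends-of-friends idea is below; it substitutes a uniform grid for the KD tree, omits the particle-overloading parallelization, and all names and the linking length are illustrative, not the project's actual code:

```python
# Illustrative friends-of-friends (FOF) grouping: particles belong to the same
# halo if a chain of pairwise distances below the linking length connects them.
from collections import defaultdict

def fof_halos(points, link_len):
    """Return halos (lists of particle indices) for 3-D points."""
    parent = list(range(len(points)))

    def find(i):                      # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    # Hash particles into grid cells of side link_len: any linked pair must
    # lie in the same or an adjacent cell, avoiding an all-pairs scan.
    grid = defaultdict(list)
    for idx, (x, y, z) in enumerate(points):
        grid[(int(x // link_len), int(y // link_len), int(z // link_len))].append(idx)

    ll2 = link_len * link_len
    for (cx, cy, cz), members in grid.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
                        for i in members:
                            if i < j:
                                d2 = sum((a - b) ** 2
                                         for a, b in zip(points[i], points[j]))
                                if d2 < ll2:
                                    union(i, j)

    halos = defaultdict(list)
    for i in range(len(points)):
        halos[find(i)].append(i)
    return list(halos.values())

# Two tight pairs far apart should form two separate halos.
pts = [(0, 0, 0), (0.1, 0, 0), (5, 5, 5), (5.1, 5, 5)]
print(sorted(len(h) for h in fof_halos(pts, 0.2)))  # [2, 2]
```

In the parallel version on the slide, "particle overloading" replicates particles near subdomain boundaries so each processor can link halos that straddle its boundary without extra communication.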
SLIDE 5

Visualization tools provide a high-level data-oriented programming model

  • Visualization tools are programmable
    – Uses a data-flow program graph…
  • Visualization tools provide their own run-time system
  • Optimize access to numerically-intensive architecture
    – Multi-resolution out-of-core data visualization
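The data-flow program graph behind tools such as Paraview/VTK can be sketched in a few lines: filters are nodes, data flows along edges, and pulling on the sink executes the upstream stages on demand. The class and filter names here are illustrative, not a real tool's API:

```python
# Minimal demand-driven data-flow pipeline sketch.
class Filter:
    def __init__(self, fn, *inputs):
        self.fn = fn          # the transformation this node applies
        self.inputs = inputs  # upstream nodes in the graph

    def execute(self):
        # Pull-based execution: evaluate upstream nodes, then this stage.
        return self.fn(*(node.execute() for node in self.inputs))

class Source(Filter):
    def __init__(self, data):
        super().__init__(lambda: data)

# Tiny pipeline: read -> threshold -> count (stand-ins for the
# reader -> contour -> render stages of a real visualization pipeline).
reader = Source([0.2, 0.9, 0.5, 0.7, 0.1])
threshold = Filter(lambda vals: [v for v in vals if v > 0.5], reader)
count = Filter(len, threshold)

print(count.execute())  # 2
```

The run-time system the slide mentions sits underneath this graph, scheduling which stages execute and on which pieces of data.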

SLIDE 6

Multi-resolution out-of-core visualization

SLIDE 7

Multi-resolution out-of-core visualization
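The multi-resolution out-of-core strategy named on these slides can be sketched as streaming progressively finer versions of a dataset one chunk at a time, so the full-resolution data never has to fit in memory. The downsampling scheme, level factors, and chunk size below are illustrative assumptions:

```python
# Sketch of coarse-to-fine, out-of-core streaming of a 1-D dataset.
def coarsen(data, factor):
    """Downsample by averaging non-overlapping blocks of `factor` samples."""
    return [sum(data[i:i + factor]) / factor
            for i in range(0, len(data), factor)]

def stream_resolutions(data, levels=(8, 4, 2, 1), chunk=4):
    """Yield (factor, chunk) pairs coarse-to-fine, one chunk at a time."""
    for factor in levels:
        level = coarsen(data, factor) if factor > 1 else data
        for i in range(0, len(level), chunk):
            yield factor, level[i:i + chunk]  # only `chunk` values in memory

full = list(range(16))
for factor, piece in stream_resolutions(full):
    print(f"1/{factor} resolution chunk: {piece}")
```

This is what enables the interactive behavior on the next slide: a coarse rendering appears quickly, and image quality improves as finer levels stream in.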

SLIDE 8

Impact: Using a data-oriented programming model we can directly address the fundamental massive-scale visualization challenge

  • Supports an approach for the massive data visualization problem
  • Data reduction, prioritization, multi-resolution and out-of-core processing

[Chart: Performance comparison. Seconds spent in render, one-time mesh, and one-time read for full data (1 GB), low-res data (1 MB), and multi-res data (38 MB)]

[Chart: Time vs. image quality. Normalized RMS CIELUV error vs. seconds]

SLIDE 9

There is a “Middle Way” between numerically-intensive and data-intensive supercomputing

  • Numerically-intensive supercomputing approach
    – Massive FLOPS
  • Data-intensive supercomputing (DISC) approach
    – Massive data
  • We are exploring this “Middle Way” by necessity for interactive scientific analysis and visualization of massive data
  • DISC using a traditional HPC platform (compare Bryant)
    1. Data as first-class citizen
      – In-situ analysis for the Roadrunner Universe application
    2. High-level data-oriented programming model
      – Programmable visualization tools
      – Multi-resolution out-of-core visualization
    3. Interactive access – human in the loop
      – Visualization on the supercomputing platform
    4. Reliability
SLIDE 10

Los Alamos Computer Science Symposium (LACSS)
October 13–14, 2009, La Fonda Hotel, Santa Fe, NM
http://www.lanl.gov/conferences/lacss/2009/
Non-Traditional Programming Models for High-Performance Computing

This year's LACSS will focus on Data-Intensive Architectures and Applications. The similarities and differences with "traditional" HPC will be explored. Other non-traditional HPC themes may be added. Attendance is open to all. Speakers are likely by invitation only, but contact Al McPherson <mcpherson@lanl.gov> if you are interested.