
LA-UR-08-07337. Approved for public release; distribution is unlimited. Title: Petascale Visualization: Approaches and Initial Results. Author(s): James Ahrens, Li-Ta Lo, Boonthanome Nouanesengsy, John Patchett, Allen McPherson.


SLIDE 1

LA-UR-08-07337

Approved for public release; distribution is unlimited.

Title: Petascale Visualization: Approaches and Initial Results
Author(s): James Ahrens, Li-Ta Lo, Boonthanome Nouanesengsy, John Patchett, Allen McPherson
Intended for: Invited Presentation, SOS13 - Sandia, ORNL and the Swiss (SOS) supercomputing meeting, March 2009

Los Alamos National Laboratory, an affirmative action/equal opportunity employer, is operated by the Los Alamos National Security, LLC for the National Nuclear Security Administration of the U.S. Department of Energy under contract DE-AC52-06NA25396. By acceptance of this article, the publisher recognizes that the U.S. Government retains a nonexclusive, royalty-free license to publish or reproduce the published form of this contribution, or to allow others to do so, for U.S. Government purposes. Los Alamos National Laboratory requests that the publisher identify this article as work performed under the auspices of the U.S. Department of Energy. Los Alamos National Laboratory strongly supports academic freedom and a researcher’s right to publish; as an institution, however, the Laboratory does not endorse the viewpoint of a publication or guarantee its technical correctness.

Form 836 (7/06)

SLIDE 2

With the advent of the first petascale supercomputer, Los Alamos's Roadrunner, there is a pressing need to address how to visualize petascale data. The crux of the petascale visualization performance problem is interactive rendering, since it is the most computationally intensive portion of the visualization process. At the terascale, commodity clusters with GPUs have been used for interactive rendering. At the petascale, visualization and rendering may be able to run efficiently on the supercomputer platform. In addition to Cell-based supercomputers, such as Roadrunner, we also evaluated rendering performance on multi-core CPU and GPU-based processors. To achieve high performance on multi-core processors, we tested multi-core optimized ray-tracing engines for rendering. For real-world performance testing, and to prepare for petascale visualization tasks, we interfaced these rendering engines with VTK and ParaView. Initial results show that rendering software optimized for multi-core CPU and Cell processors provides performance competitive with GPU clusters for the parallel rendering of massive data. The current multi-core architectural trend suggests that multi-core based supercomputers are able to provide interactive visualization and rendering support now and in the future.

SLIDE 3

Petascale Visualization: Approaches and Initial Results

James Ahrens (Visualization Team Lead), Ollie Lo, Boonthanome Nouanesengsy, John Patchett, Allen McPherson
Los Alamos National Laboratory

Dave DeMarle
Kitware, Inc.

SLIDE 4

Trends in Petascale Supercomputing

  • Lots of compute cycles
  • Multi-core revolution
  • Increasing latency from processor to memory, disk and network
  • Many memory-only simulation results
  • Very expensive
  • Can compute significantly more data than can be saved to disk
      • For example, on RR (Roadrunner)
          • To disk: 1 Gbyte/sec
          • Compute: 100 Gbytes on a triblade from Cells to Cell memory

SLIDE 5

  • 1. What data should be saved from the simulation?
  • 2. What are our options for running our visualization software?
      • Can we run our visualization software on the supercomputer?

SLIDE 6

Can we efficiently run our visualization software on the supercomputer?

  • The data understanding process is composed of a number of activities:
      • Analysis and statistics
      • Visualization
          • Map simulation data to a visual representation (i.e., geometry)
      • Rendering
          • Map geometry to imagery on the screen
  • Already runs on the supercomputer
      • Analysis, statistics and visualization
SLIDE 7

Can we interactively render on the supercomputing platform?

  • Fast rendering for interactive exploration
      • 5-10 fps minimum
      • 24-30 fps - HDTV
      • 60 fps - stereo
  • Typically provided by commodity graphics in a visualization cluster
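
As a rough illustration of what these frame-rate targets demand, the time available per frame is the reciprocal of the target rate. The sketch below is not from the slides; the helper name `frame_budget_ms` and the labels are assumptions made for illustration.

```python
# Illustrative only: per-frame time budget for the frame-rate targets above.
# frame_budget_ms is a hypothetical helper name, not part of any library.


def frame_budget_ms(fps: float) -> float:
    """Return the time available to produce one frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Frame-rate targets taken from the slide; the labels are ours.
targets = {
    "minimum (low end)": 5,
    "minimum (high end)": 10,
    "HDTV (low end)": 24,
    "HDTV (high end)": 30,
    "stereo": 60,
}

for label, fps in targets.items():
    print(f"{label:>20}: {fps:>2} fps -> {frame_budget_ms(fps):6.1f} ms per frame")
```

Rendering and any image transfer both have to fit inside this budget, which is why the higher targets are so much harder to reach.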

SLIDE 8

Rendering on the supercomputer

Disadvantages
  • Cost to port rendering to the supercomputing platform
  • Allocate portion of supercomputer to analysis and visualization

Advantages
  • Scalable to supercomputer size
  • Access to "all" simulation results

Rendering on visualization cluster

Disadvantages
  • Cost of cluster and infrastructure to connect it
  • Less access to data - only data that is written to disk

Advantages
  • Independent resource devoted to visualization task
  • Very fast especially on smaller datasets

SLIDE 9

Sort-last Parallel Rendering of Large Data

  • Sort-last parallel rendering algorithms have two stages:
      • 1. Rendering stage
          • The processor renders its assigned geometry into a "distance/depth" buffer and image buffer
      • 2. Networking / Compositing stage
          • These image buffers are composited together to create a complete result
  • Given there are two stages, the performance is limited by the slower stage
      • Assuming pipelining of the stages
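
The two stages can be sketched in a few lines. This is a minimal illustration rather than the implementation used in the study: the names `composite`, `composite_all` and `pipelined_fps` are hypothetical, images are flat lists of pixels, and the frame rates in the last call are made-up numbers.

```python
# Illustrative sketch of sort-last depth compositing; all names are hypothetical.
from typing import List, Tuple

Pixel = Tuple[int, int, int]                # RGB colour
Buffers = Tuple[List[Pixel], List[float]]   # (image buffer, depth buffer)

BACKGROUND: Pixel = (0, 0, 0)
FAR = float("inf")  # depth value meaning "nothing was drawn here"


def composite(a: Buffers, b: Buffers) -> Buffers:
    """Merge two partial renderings: per pixel, keep the nearer fragment."""
    image_a, depth_a = a
    image_b, depth_b = b
    if not (len(image_a) == len(depth_a) == len(image_b) == len(depth_b)):
        raise ValueError("all buffers must have the same number of pixels")
    image: List[Pixel] = []
    depth: List[float] = []
    for colour_a, z_a, colour_b, z_b in zip(image_a, depth_a, image_b, depth_b):
        if z_a <= z_b:
            image.append(colour_a)
            depth.append(z_a)
        else:
            image.append(colour_b)
            depth.append(z_b)
    return image, depth


def composite_all(partials: List[Buffers]) -> Buffers:
    """Fold the per-processor results into one complete image."""
    result = partials[0]
    for partial in partials[1:]:
        result = composite(result, partial)
    return result


def pipelined_fps(render_fps: float, composite_fps: float) -> float:
    """With the two stages pipelined, throughput is set by the slower one."""
    return min(render_fps, composite_fps)


# Two processors, each rendering its own geometry into a 4-pixel image.
red, blue = (255, 0, 0), (0, 0, 255)
proc0 = ([red, red, BACKGROUND, BACKGROUND], [1.0, 3.0, FAR, FAR])
proc1 = ([BACKGROUND, blue, blue, BACKGROUND], [FAR, 2.0, 2.0, FAR])

image, depth = composite_all([proc0, proc1])
print(image)
print(depth)
print(pipelined_fps(render_fps=20.0, composite_fps=30.0))  # hypothetical rates
```

A production compositor would exchange image pieces between processors in parallel instead of folding them serially, but the per-pixel depth test is the same.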
SLIDE 10

Types of Rendering

  • 1. OpenGL Software
      • Mesa - open-source
  • 2. OpenGL Hardware
      • Graphics cards - Nvidia
  • Raytracing
      • Better physics model for the lighting equations
      • Fast multi-core ready implementations
  • 1. Manta Software
      • Multi-core, open-source (Univ. of Utah)
  • 2. iRT Software/Hardware
      • Cell processor
SLIDE 11

Results: Incorporate rendering approaches into ParaView

  • ParaView (PV) is an open-source parallel large-data visualization tool
  • 1. Run on two types of supercomputing nodes
      • Multi-core cluster - 1, 2, 4, 8, 16 way
      • Roadrunner - Cell processor
  • 2. Run with scan-conversion and ray-tracing
      • PV already uses OpenGL

Need to incorporate ray tracing into PV/VTK

  • Rendering interface
  • Have ray tracer implement rendering interface
      • Polygons, texturing, depth buffer
  • Then parallel rendering works as well!
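
The idea of placing a ray tracer behind the same rendering interface as the scan converter can be sketched as follows. This is a toy under stated assumptions, not the VTK or ParaView API: `Renderer`, `ScanConverter` and `RayCaster` are hypothetical names, the "polygons" are axis-aligned rectangles at constant depth, and texturing is left out.

```python
# Illustrative sketch only. Renderer, ScanConverter and RayCaster are
# hypothetical names; this is not the VTK or ParaView API.
from abc import ABC, abstractmethod
from typing import List, Tuple

# Toy "polygon": rectangle (x0, y0, x1, y1, depth, colour id).
# Pixel (x, y) is covered when x0 <= x < x1 and y0 <= y < y1.
Rect = Tuple[int, int, int, int, float, int]
Frame = Tuple[List[List[int]], List[List[float]]]  # (image, depth buffer)

FAR = float("inf")  # depth value for pixels nothing was drawn into


class Renderer(ABC):
    """The interface the rest of the system depends on."""

    @abstractmethod
    def render(self, rects: List[Rect], width: int, height: int) -> Frame:
        ...


class ScanConverter(Renderer):
    """Object order: loop over primitives, fill the pixels each one covers."""

    def render(self, rects: List[Rect], width: int, height: int) -> Frame:
        image = [[0] * width for _ in range(height)]
        depth = [[FAR] * width for _ in range(height)]
        for x0, y0, x1, y1, z, colour in rects:
            for y in range(max(y0, 0), min(y1, height)):
                for x in range(max(x0, 0), min(x1, width)):
                    if z < depth[y][x]:  # depth test
                        depth[y][x] = z
                        image[y][x] = colour
        return image, depth


class RayCaster(Renderer):
    """Image order: one ray per pixel, keep the nearest primitive it hits."""

    def render(self, rects: List[Rect], width: int, height: int) -> Frame:
        image = [[0] * width for _ in range(height)]
        depth = [[FAR] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                for x0, y0, x1, y1, z, colour in rects:
                    if x0 <= x < x1 and y0 <= y < y1 and z < depth[y][x]:
                        depth[y][x] = z
                        image[y][x] = colour
        return image, depth


scene: List[Rect] = [(0, 0, 3, 2, 5.0, 1), (1, 1, 4, 3, 2.0, 2)]
frames = [r.render(scene, width=4, height=3) for r in (ScanConverter(), RayCaster())]
print(frames[0] == frames[1])
for row in frames[0][0]:
    print(row)
```

Because both renderers return an image plus a depth buffer, a depth-based compositing stage can consume either one unchanged, which is the sense in which parallel rendering "works as well".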

SLIDE 12

PV/VTK Rendering Performance

  • 1 million polygons rendering to a 1K x 1K image

Rendering Type    Software   Architecture            Frames per second
Scan conversion   OpenGL     Nvidia Quadro FX 5600   18.6
Raytracing        iRT        Cell blade (16 SPUs)    42

  • 1. VTK GPU hardware rendering performance could be improved.
  • 2. iRT is not currently ported to run under PV/VTK.

Frames per second for # of cores

Rendering Type    Software      Architecture               1     2     4     8     16
Scan conversion   OpenGL Mesa   Multi-core (4 quad opt.)   0.7   1.2   2.0   3.2   4.6
Raytracing        Manta         Multi-core (4 quad opt.)   1.6   2.8   5.6   10.9  19.4
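
One way to read the multi-core figures is as speedup and parallel efficiency relative to the single-core rate. The sketch below uses only the frame rates reported above; the helper name `scaling` and the choice of metrics are ours, not the authors'.

```python
# Frames-per-second figures copied from the multi-core results above.
# The scaling() helper is a hypothetical name used for illustration.
cores = [1, 2, 4, 8, 16]
fps = {
    "OpenGL Mesa (scan conversion)": [0.7, 1.2, 2.0, 3.2, 4.6],
    "Manta (raytracing)": [1.6, 2.8, 5.6, 10.9, 19.4],
}


def scaling(rates, core_counts):
    """Return (speedup, efficiency) per core count, relative to the first entry."""
    base_rate = rates[0]
    base_cores = core_counts[0]
    results = []
    for rate, n in zip(rates, core_counts):
        speedup = rate / base_rate
        efficiency = speedup / (n / base_cores)
        results.append((speedup, efficiency))
    return results


for name, rates in fps.items():
    print(name)
    for n, (speedup, efficiency) in zip(cores, scaling(rates, cores)):
        print(f"  {n:>2} cores: speedup {speedup:5.2f}, efficiency {efficiency:4.0%}")
```

By this measure the ray tracer scales more evenly with core count than the software OpenGL path in these runs.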

SLIDE 13

Networking Performance

[Figure: two line charts of frames per second (fps axis, 0.00 to 50.00) versus number of processors (2, 4, 8, 16, 32, 64, 128). Each chart plots two series: "Network only - Frames per second" and "Frames per second".]

SLIDE 14

SLIDE 15

Future Work and Conclusions

Future work
  • Integration of IBM Cell-based ray-tracer into PV for visualization on RR platform
  • Advanced ray-tracing

Conclusions
  • This preliminary study suggests that:
      • Multi-core processors are starting to serve some of the roles of traditional GPUs, such as parallel rendering
      • Using fast software-based rendering methods may offer a path to utilizing our supercomputers for visualization