SLIDE 1

Assessing and Improving Large Scale Parallel Volume Rendering on the IBM Blue Gene/P

SC 2008 Ultrascale Visualization Workshop, November 16, 2008

Tom Peterka

  • tpeterka@mcs.anl.gov
  • Mathematics and Computer Science Division, Argonne National Laboratory
  • www.ultravis.org

Collaborators: Rob Ross (ANL), Hongfeng Yu (SNL California), Kwan-Liu Ma (UC Davis), Wesley Kendall (UTK), Jian Huang (UTK)

Image: entropy in a core-collapse supernova, time step 1354

SLIDE 2

  • Leadership Resources

Computation, communication, and storage

SLIDE 3

  • Ever-Increasing Scale of Data and Visualization

Problem sizes are data-dominated. Visualization is no exception.

Visualizations

  Dataset              Problem size (billion elements)   System size (CPUs)   Year   Reference (et al.)
  Rayleigh-Taylor      1.0                               128                  2001   Kniss
  Molecular dynamics   0.1                               256                  2006   Childs
  Earthquake           1.2                               2048                 2007   Ma
  Supernova            0.6                               4096                 2008   Peterka

Computations

  Dataset           Problem size (billion elements)   Year   PI
  Lifted H2-air     0.9                               2008   Grout
  Lifted C2H4-air   1.3                               2008   Grout
  Supernova         1.3                               2008   Blondin
  Turbulence        8.0                               2005   Yeung

2008 INCITE projects

  Domain         Data size (TB)   PI
  Fusion         54.0             Klasky
  Materials      100.0            Wolverton
  Astrophysics   300.0            Lamb
  Climate        345.0            Washington

SLIDE 4

  • Parallel Volume Rendering

Divide, conquer, and reunite
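
A minimal MPI sketch (not the presentation's actual code) of the divide-render-reunite structure: each rank renders its own block of the volume to a partial image, and the partials are then combined. A real compositor blends partials in visibility order (e.g., direct-send or binary swap) rather than summing them; render_block is a hypothetical stand-in for per-block ray casting.

```c
/* Sketch of divide, conquer, and reunite: every rank renders its own
 * block, then partial images are combined.  MPI_SUM is a placeholder
 * for depth-ordered blending. */
#include <mpi.h>
#include <stdlib.h>

enum { W = 64, H = 64 };              /* small image for illustration */

/* hypothetical stand-in for ray casting one rank's block */
static void render_block(int rank, float *img)
{
    for (int i = 0; i < W * H; i++)
        img[i] = (float)rank;         /* fake sample values */
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    float *partial = malloc(W * H * sizeof *partial);
    float *final   = malloc(W * H * sizeof *final);

    render_block(rank, partial);                 /* divide and conquer */
    MPI_Reduce(partial, final, W * H, MPI_FLOAT, /* reunite on rank 0 */
               MPI_SUM, 0, MPI_COMM_WORLD);

    free(partial);
    free(final);
    MPI_Finalize();
    return 0;
}
```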

SLIDE 5

  • Some Parallel Rendering Parameters

Knobs to turn, switches to flip, buttons to press

  Argument         Sample Values
  DataSize         1120x1120x1120
  ImageSize        1600x1600
  ImageType        ppm, rgb, rgba
  IP, port         137.72.15.10, 5000
  Stereo           y, n
  NumProcs         16384
  NumPipes         16
  BlockingFactor   8
  NumWriters       64
  NumThreads       1
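
A hypothetical C configuration struct collecting the knobs above; the real renderer's argument names, types, and defaults may differ.

```c
/* Hypothetical run configuration mirroring the sample values above. */
#include <stdio.h>

struct render_config {
    int         data_size[3];    /* voxels, e.g. 1120 x 1120 x 1120 */
    int         image_size[2];   /* pixels, e.g. 1600 x 1600        */
    const char *image_type;      /* "ppm", "rgb", or "rgba"         */
    const char *ip;              /* live-view host                  */
    int         port;            /* e.g. 5000                       */
    int         stereo;          /* 0 = n, 1 = y                    */
    int         num_procs;       /* e.g. 16384                      */
    int         num_pipes;       /* parallel pipelines, e.g. 16     */
    int         blocking_factor; /* blocks per process, e.g. 8      */
    int         num_writers;     /* output writers, e.g. 64         */
    int         num_threads;     /* pthreads per rank, e.g. 1       */
};

static const struct render_config defaults = {
    { 1120, 1120, 1120 }, { 1600, 1600 }, "rgba",
    "137.72.15.10", 5000, 0, 16384, 16, 8, 64, 1
};

int main(void)
{
    struct render_config cfg = defaults;      /* start from defaults */
    printf("%d^3 volume -> %dx%d image, %d writers\n",
           cfg.data_size[0], cfg.image_size[0], cfg.image_size[1],
           cfg.num_writers);
    return 0;
}
```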

SLIDE 6

  • Larger Datasets and Images

Another measure of scalability

SLIDE 7

  • Time Distribution

Reading the data from storage dominates the total cost of a time step.

The effect of raw rendering speed is minimal, so software rendering rates are acceptable compared to hardware rendering. The most critical factor is parallel I/O performance, followed by interconnect performance.

(1120³ volume, 1600² image)
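
Since parallel I/O dominates the time distribution, collective reads are the natural tool. Below is a sketch of a collective MPI-IO block read; the file name and flat float layout (one float per voxel, evenly divided among ranks) are assumptions, not the dataset's real format.

```c
/* Sketch of the collective read pattern the timing argues for. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    MPI_Offset total = 1120LL * 1120 * 1120;        /* voxels */
    MPI_Offset chunk = (total + nprocs - 1) / nprocs;
    MPI_Offset off   = (MPI_Offset)rank * chunk;
    if (off + chunk > total)
        chunk = off < total ? total - off : 0;

    float *buf = malloc((size_t)chunk * sizeof *buf);

    MPI_File fh;                   /* hypothetical input file name */
    MPI_File_open(MPI_COMM_WORLD, "supernova_1354.raw",
                  MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
    /* every rank reads in the same collective call, letting the MPI-IO
     * layer aggregate requests across the I/O nodes */
    MPI_File_read_at_all(fh, off * (MPI_Offset)sizeof(float), buf,
                         (int)chunk, MPI_FLOAT, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    free(buf);
    MPI_Finalize();
    return 0;
}
```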

SLIDE 8

  • Efficiency

Round-robin static block distribution is an inexpensive load-balancing scheme that is quite effective.

(1120³ volume, 1600² image; 864³ volume, 1024² image)
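
A sketch of the round-robin assignment: block b simply goes to rank b mod nprocs, interleaving dense and sparse regions of the volume across processes. The blocking factor of 8 matches the sample parameters; block geometry is omitted.

```c
/* Round-robin static distribution: block b goes to rank b mod nprocs. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int nblocks = 8 * nprocs;           /* blocking factor of 8 */
    for (int b = 0; b < nblocks; b++)
        if (b % nprocs == rank)         /* interleaves hot and cold blocks */
            printf("rank %d renders block %d\n", rank, b);

    MPI_Finalize();
    return 0;
}
```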

SLIDE 9

  • Multiple Writers Performance

Improve overall output time by selecting the optimal number of writers.

64 writers is best in most cases; the writers need to be distributed among the I/O nodes.

(2048 renderers, 2048 compositors, 2048² image)

Memory footprint per core = 70 MB + 2.5 KB × (image size / writing cores) + 4 × (volume size / rendering cores)
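
One way to restrict output to a writer subset is to split off a communicator for the writers and open the file only there. A sketch assuming 64 writers spread evenly across the ranks, matching the sweet spot above; the output file name is invented.

```c
/* Sketch of a writer subcommunicator: every stride-th rank writes. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int num_writers = 64;
    int stride = nprocs >= num_writers ? nprocs / num_writers : 1;
    int is_writer = (rank % stride == 0);  /* spread across I/O nodes */

    MPI_Comm writers;
    MPI_Comm_split(MPI_COMM_WORLD, is_writer ? 0 : MPI_UNDEFINED,
                   rank, &writers);

    if (is_writer) {
        MPI_File fh;                       /* hypothetical output file */
        MPI_File_open(writers, "frame.rgba",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh);
        /* ...collective image write among the writers would go here... */
        MPI_File_close(&fh);
        MPI_Comm_free(&writers);
    }
    MPI_Finalize();
    return 0;
}
```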

SLIDE 10

  • Multithread–MPI Hybrid Programming Model

MPI–pthread rendering; MPI-only I/O and compositing.

(1120³ volume, 1024² image; 4 threads on 1 node vs. 4 processes on 1 node)
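
A minimal sketch of the hybrid model: each MPI rank spawns pthreads that share the rendering work, while I/O and compositing stay MPI-only, so the FUNNELED thread level suffices. The thread count and rendering stub are illustrative.

```c
/* Hybrid MPI + pthreads: threads render, only the main thread uses MPI. */
#include <mpi.h>
#include <pthread.h>
#include <stdio.h>

enum { NTHREADS = 4 };

static void *render_part(void *arg)
{
    int t = *(int *)arg;
    printf("thread %d ray-casts its share of the local blocks\n", t);
    return NULL;
}

int main(int argc, char **argv)
{
    int provided;
    /* only the main thread calls MPI, so FUNNELED is enough */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    pthread_t tid[NTHREADS];
    int ids[NTHREADS];
    for (int t = 0; t < NTHREADS; t++) {
        ids[t] = t;
        pthread_create(&tid[t], NULL, render_part, &ids[t]);
    }
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(tid[t], NULL);

    /* compositing and output continue with ordinary MPI collectives */
    MPI_Finalize();
    return 0;
}
```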

SLIDE 11

  • Multiple Parallel Pipelines

Hide I/O latency by extending concurrency between time steps.

(864³ volume, 1024² image)

Rendering is 6X faster for the same total system size when 16 pipelines are used instead of one.
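
A sketch of splitting the machine into pipelines with MPI_Comm_split so that one pipeline's rendering overlaps another pipeline's reads: pipeline p takes time steps p, p+16, p+32, and so on. Group size and step count are illustrative.

```c
/* Parallel pipelines: 16 groups interleave time steps to hide I/O. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int npipes = 16;
    int group  = nprocs / npipes > 0 ? nprocs / npipes : 1;
    int pipe   = rank / group;            /* contiguous pipeline groups */

    MPI_Comm pipe_comm;
    MPI_Comm_split(MPI_COMM_WORLD, pipe, rank, &pipe_comm);

    /* pipeline p owns time steps p, p + 16, p + 32, ... */
    for (int step = pipe; step < 100; step += npipes)
        printf("pipeline %d handles time step %d\n", pipe, step);

    MPI_Comm_free(&pipe_comm);
    MPI_Finalize();
    return 0;
}
```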

SLIDE 12

  • Lessons Learned

and the road ahead

Challenges and to-dos

  • Other grid topologies
  • In situ visualization
  • Adoption into tools
  • Other architectures
  • Other vis algorithms

Successes

  • Demonstrated scaling
  • Large data and image sizes
  • Improved compositing
  • Improved and benchmarked I/O
  • Load balancing
  • Memory scalability
  • Hybrid programming model
  • Parallel pipelines

SLIDE 13

Assessing and Improving Large Scale Parallel Volume Rendering on the IBM Blue Gene/P

Tom Peterka

  • tpeterka@mcs.anl.gov
  • Mathematics and Computer Science Division, Argonne National Laboratory
  • www.ultravis.org

Collaborators: Rob Ross (ANL), Hongfeng Yu (SNL California), Kwan-Liu Ma (UC Davis), Wesley Kendall (UTK), Jian Huang (UTK)

Image: entropy in a core-collapse supernova, time step 1354