SDVis and In-Situ Visualization on TACC's Stampede-KNL
Paul A. Navrátil, Ph.D.


  1. SDVis and In-Situ Visualization on TACC's Stampede-KNL
     Paul A. Navrátil, Ph.D.
     Manager, Scalable Visualization Technologies
     pnav@tacc.utexas.edu

  2. High-Fidelity Visualization Natively on Xeon and Xeon Phi

  3. Outline
     - Stampede Architecture
       - Stampede: Sandy Bridge
       - Stampede-KNL
       - Stampede 2: KNL + Skylake
     - Software-Defined Visualization Stack
       - VNC
       - OpenSWR
       - OSPRay
     - Path to In-Situ
       - ParaView Catalyst
       - VisIt Libsim
       - Direct renderer integration

  4. Stampede Architecture

  5. Stampede Sandy Bridge
     - 6400 compute nodes, each with:
       - 2x Intel Xeon E5-2680 "Sandy Bridge"
       - 1x Intel Xeon Phi SE10P
       - 32 GB RAM / 8 GB Phi RAM
     - 16 Large Memory nodes, each with:
       - 4x Intel Xeon E5-4650 "Sandy Bridge"
       - 2x NVIDIA Quadro 2000 "Fermi"
       - 1 TB RAM
     - 128 GPU nodes, each with:
       - 2x Intel Xeon E5-2680
       - 1x Intel Xeon Phi SE10P
       - 1x NVIDIA Tesla K20 "Kepler"
       - 32 GB RAM
     - Mellanox FDR Interconnect

  6. Stampede KNL
     - Deployed in 2016 as planned upgrade to Stampede
     - First KNL-based system in Top500
     - Intel OmniPath interconnect
     - 508 nodes, each with:
       - 1x Intel Xeon Phi 7250
       - 96 GB RAM + 16 GB MCDRAM
     - Notes:
       - Shared $WORK and $SCRATCH
       - Separate $HOME directories
       - Separate login node: login-knl1.stampede.tacc.utexas.edu
       - Login is Intel Xeon E5-2695 "Haswell"
       - Compile on compute node, or use "-xMIC-AVX512" on login
       - "normal" and "development" queues are Cache-Quadrant
       - Other MCDRAM configs available by queue name

  8. Stampede 2 (coming 2017)
     - ~18 PF Dell Intel Xeon + Intel Xeon Phi system
     - Combine KNL + Skylake + OmniPath + 3D XPoint
     - Phase 1: Spring 2017
       - Stampede KNL + 4200 new KNL nodes + new filesystem
       - 60% of Stampede Sandy Bridge to remain operational during this phase
     - Phase 2: Fall 2017
       - 1736 Intel Skylake nodes
     - Phase 3: Spring 2018
       - Add 3D XPoint memory to subset of nodes

  9. Key Architectural Take-Away
     - Current and near-future cyberinfrastructure will use processors with many cores
     - Each core contains wide vector units: use them for max utilization (e.g., AVX512)
     - Fortunately the Software-Defined Visualization stack is optimized for such processors!
     - Use your preferred rendering method independent of the underlying hardware
       - Performant rasterization
       - Performant ray tracing
     - Visualization and analysis on the simulation machine

  10. Software-Defined Visualization: Why?

      File Size   100 Gbps   10 Gbps    1 Gbps      300 Mbps    54 Mbps
      1 GB        < 1 sec    1 sec      10 sec      35 sec      2.5 min
      1 TB        ~100 sec   ~17 min    ~3 hours    ~10 hours   ~43 hours
      1 PB        ~1 day     ~12 days   ~121 days   > 1 year    ~5 years

      Increasingly difficult to move data from the simulation machine
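
The transfer times on slide 10 can be approximated with simple arithmetic. The following is a minimal sketch, assuming decimal units (1 GB = 10^9 bytes, 1 Gbps = 10^9 bits per second) and a link with no protocol overhead; the function names are illustrative and do not come from the slides. Under those assumptions the computed values are ideal lower bounds, and the slide's rounded figures sit at or somewhat above them.

```python
# Ideal (zero-overhead) transfer time: bits to move divided by link rate.
# Assumption: decimal units (1 GB = 10**9 bytes, 1 Gbps = 10**9 bit/s).

SIZES_BYTES = {"1 GB": 10**9, "1 TB": 10**12, "1 PB": 10**15}
LINKS_BPS = {
    "100 Gbps": 100e9,
    "10 Gbps": 10e9,
    "1 Gbps": 1e9,
    "300 Mbps": 300e6,
    "54 Mbps": 54e6,
}


def transfer_seconds(size_bytes, bits_per_second):
    """Return the ideal time in seconds to move size_bytes over the link."""
    return size_bytes * 8 / bits_per_second


def humanize(seconds):
    """Format a duration using the largest unit that keeps the value >= 1."""
    for unit, length in (("years", 365 * 86400), ("days", 86400),
                         ("hours", 3600), ("min", 60)):
        if seconds >= length:
            return f"{seconds / length:.1f} {unit}"
    return f"{seconds:.1f} sec"


if __name__ == "__main__":
    # Print one row per file size, one column per link speed.
    for size_name, size in SIZES_BYTES.items():
        row = [humanize(transfer_seconds(size, bps)) for bps in LINKS_BPS.values()]
        print(f"{size_name:>5}: " + ", ".join(row))
```

Even in this best case, a petabyte takes most of a day on a 100 Gbps link, which is the slide's argument for visualizing on the machine where the data already resides.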

  11. Software-Defined Visualization

  12. Software-Defined Visualization: Why?

  13. Software-Defined Visualization: Why?

  14. Software-Defined Visualization: Why?

  15. Software-Defined Visualization: Why?

  16. Software-Defined Visualization Stack
      - OpenSWR Software Rasterizer
        - openswr.org
        - Performant rasterization for Xeon and Xeon Phi
        - Thread-parallel vector processing (previous parallel Mesa3D only has threaded fragments)
        - Support for wide vector instruction sets, particularly AVX2 (and soon AVX512)
        - Integrated into Mesa3D 12.0 as optional driver (mesa3d.org)
      - Best Uses
        - Lines
        - Graphs
        - User Interfaces

  17. Software-Defined Visualization Stack
      - OSPRay Ray Tracer
        - ospray.org
        - Performant ray tracing for Xeon and Xeon Phi incorporating Embree kernels
        - Thread- and wide-vector parallel using Intel ISPC (including AVX512 support)
        - Parallel rendering support via distributed framebuffer
      - Best Uses
        - Photorealistic rendering
        - Realistic lighting
        - Realistic material effects
        - Large geometry
        - Implicit geometry (e.g., molecular "ball and stick" models)
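
Slide 17 lists implicit geometry such as molecular "ball and stick" models among the best uses of a ray tracer. One reason is that a ray tracer can intersect a sphere analytically instead of tessellating it into triangles. The following is a generic textbook sketch of that test in plain Python; it is not OSPRay's or Embree's implementation, and the function name and argument layout are assumptions made for illustration.

```python
import math


def ray_sphere_hit(origin, direction, center, radius):
    """Return the nearest positive hit distance t along the ray, or None.

    Solves |origin + t*direction - center|^2 = radius^2 for t.
    Assumption: direction is normalized (unit length).
    """
    # Vector from the sphere center to the ray origin.
    oc = [o - c for o, c in zip(origin, center)]
    # Quadratic t^2 + 2*b*t + c = 0 (leading coefficient is 1 for a unit direction).
    b = sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None  # the ray misses the sphere
    root = math.sqrt(disc)
    # Try the near root first, then the far root (origin inside the sphere).
    for t in (-b - root, -b + root):
        if t > 0.0:
            return t
    return None  # the sphere is entirely behind the ray origin


if __name__ == "__main__":
    # A ray along +z from the origin toward a unit sphere centered at z = 5.
    print(ray_sphere_hit((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 5.0), 1.0))
```

Each atom in a molecular model then costs one center and one radius rather than a triangle mesh, which keeps memory use low for very large scenes.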

  18. Software-Defined Visualization Stack
      - GraviT Scheduling Framework
        - tacc.github.io/GraviT/
        - Large-scale, data-distributed ray tracing (uses OSPRay for rendering engine target)
        - Parallel rendering support via distributed ray scheduling
      - Best Uses
        - Large distributed data
        - Data outside of renderer control
        - Incoherent ray-intensive sampling (e.g., global illumination approximations)

  19. OSPRay Test Suite: Sample Images (panels: Test 0 through Test 8)

  20. OSPRay Test Suite: MCDRAM Performance Results

  21. ParaView Test Suite: Many Spheres

  22. Likely VNC limited

  23. Likely VNC limited

  24. Definitely VNC limited!

  25. FIU Core Sample: Sample Image

  26. Likely VNC limited

  27. Likely VNC limited

  28. Definitely VNC limited! Likely hitting VNC desktop limits

  29. Path to In-Situ Visualization

  30. Why In-Situ Visualization?
      - Processors (like KNL) are enabling larger, more detailed simulations
      - File system technologies are not scaling at the same rate (if at all)
      - Touching disk is expensive:
        - During simulation: time checkpointing is (often) not time computing
        - During analysis: loading the data is (often) the overwhelming majority of runtime
      - In-situ capabilities overcome this data bottleneck
        - Render directly from resident simulation data
        - Tightly coupled vis opens doors for online analysis, computational steering, etc.

  31. Current In-Situ Options
      - Simulation developer
        - Implement visualization API (ParaView Catalyst, VisIt Libsim, VTK)
        - Implement data framework (ADIOS, etc.)
        - Implement direct rendering calls (OSPRay API, etc.)
      - Simulation user
        - Hope the developers do one of the above :)
        - Do one of the above yourself :(
        - Hope technology keeps post-hoc analysis viable (3D XPoint NVRAM might help)

  32. In-Situ Visualization APIs
      - ParaView Catalyst (and Cinema) (www.paraview.org/in-situ/)
      - VisIt Libsim (www.visitusers.org/index.php?title=Libsim_Batch)
      - Direct VTK integration (www.vtk.org)
      - Visualization ops already implemented
      - Need coordination between teams to ensure simulation and vis performance
      Image courtesy of Kitware Inc.

  33. In-Situ-Compatible Data Frameworks
      - ADIOS: https://www.olcf.ornl.gov/center-projects/adios/
      - Damaris: https://hal.inria.fr/hal-00859603/en
      - DIY: http://www.mcs.anl.gov/~tpeterka/software.html
      - GLEAN: http://www.mcs.anl.gov/project/glean-situ-visualization-and-analysis
      - SCIRun: http://www.sci.utah.edu/cibc-software/scirun.html
      - (Possibly) more invasive implementation effort
      - (Possibly) broader benefits beyond visualization (framework now controls data)
      - Requires engagement from simulation team to ensure performance and accuracy

  34. In-Situ Direct Rendering
      - Render directly within simulation using API (e.g., OSPRay, OpenGL, etc.)
      - Must implement visualization operations within simulation code
      - Lightest weight, lowest overhead
      - Requires visualization algorithm knowledge for efficient implementation
      - Locks in particular rendering and visualization modes
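
The tightly coupled pattern behind the in-situ options above can be sketched as a simulation loop that hands its resident data to a hook every few steps instead of writing it to disk. This is a toy illustration only: the 1D diffusion solver, the `in_situ_summary` hook, and the `run` driver are hypothetical names invented here, and none of this is the ParaView Catalyst, VisIt Libsim, or OSPRay API. A real integration would replace the hook body with calls into one of those libraries.

```python
def step(field, alpha=0.1):
    """Advance a toy 1D diffusion problem by one explicit time step."""
    new = field[:]
    # Boundary cells are left fixed; interior cells diffuse toward neighbors.
    for i in range(1, len(field) - 1):
        new[i] = field[i] + alpha * (field[i - 1] - 2.0 * field[i] + field[i + 1])
    return new


def in_situ_summary(step_index, field):
    """Hypothetical in-situ hook: reduce resident data to a small summary."""
    return {
        "step": step_index,
        "min": min(field),
        "max": max(field),
        "mean": sum(field) / len(field),
    }


def run(num_steps, every, field):
    """Run the simulation, invoking the hook every `every` steps."""
    summaries = []
    for n in range(1, num_steps + 1):
        field = step(field)
        if n % every == 0:
            # The hook reads the live array; nothing is written to disk.
            summaries.append(in_situ_summary(n, field))
    return field, summaries


if __name__ == "__main__":
    initial = [0.0] * 10 + [1.0] + [0.0] * 10  # single hot cell in the middle
    _, results = run(num_steps=100, every=25, field=initial)
    for summary in results:
        print(summary)
```

The hook here returns a few numbers per invocation, standing in for a rendered image or extracted feature. The design trade-off matches the slide: the hook is cheap and has no I/O cost, but whatever analysis it performs is fixed when the simulation is built.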

  35. In-Situ Future?
      Useful perspectives at ISAV: http://conferences.computer.org/isav/2016/

  36. TACC/Kitware IPCC: Unimpeded In Situ Visualization on Intel Xeon and Intel Xeon Phi
      - Optimize ParaView Catalyst for current and near-future Intel architectures
        - KNL, Skylake, OmniPath, 3D XPoint NVRAM
      - Use Stampede-KNL as testbed to target TACC Stampede 2, NERSC Cori, LANL Trinity
      - Focus on data and rendering paths for OpenSWR and OSPRay
        - Parallelize VTK data processing filters
        - Catalyst integration with simulation
        - Targeted algorithm improvements
        - Increase processor and memory utilization

  37. Thank you!
      pnav@tacc.utexas.edu
