Introduction to Scientific Visualization with ParaView, July 07, 2016

SLIDE 1

July 07, 2016 Slide 1

Introduction to Scientific Visualization with ParaView

July 07, 2016 | Dr. Herwig Zilken

SLIDE 2

Overview

  • Introduction to ParaView
  • Introduction to remote rendering with VNC (on JURECA)
  • Hands On: visualizing sample dataset, using some important filters
  • Hands On: integrating HDF5/XDMF writer in coulfrog

  • Hands On: parallel visualization setup
SLIDE 3

What is ParaView?

  • open-source data analysis and visualization application (two- and three-dimensional data sets), built on top of VTK, which in turn is built on top of OpenGL
  • provides a comprehensive suite of visualization algorithms
  • supports many different file formats for both loading and exporting data sets
  • supported platforms: Linux, Windows, Mac
  • processing modes:
      • stand-alone mode
      • client-server configuration
      • batch
      • in-situ
  • locally installed at JSC:
      • Linux Workstation Group
      • JURECA Visualization Partition
SLIDE 4

The History of ParaView

  • 2000: collaboration between Kitware Inc. and Los Alamos National Laboratory, funding provided by the US Department of Energy ASCI program
  • 2002: first release of ParaView
  • September 2005 - May 2007: development of ParaView 3 by Kitware, Sandia National Lab. and other partners
      • more user-friendly user interface
      • quantitative analysis framework
  • June 2013: ParaView 4.0
      • more cohesive GUI controls
      • better multiblock interaction
      • in-situ integration into simulation and other applications (Catalyst)
  • Recent releases (currently ParaView 5.1)
      • new VR backend (multi-screen 3D projection, VR devices)

SLIDE 5

Visualization Pipeline

  • concept of a visualization pipeline, like VTK
  • ParaView's modular architecture: separate processes for
      • data processing
      • rendering
      • user interface (not shown on this slide)

Pipeline stages:

  • Sources: initial data, read as input or generated
  • Filters: modify the data in some way
  • Mappers: convert data into visible “objects”
  • Actors: adjust the visible properties
  • Renderers: create the image

SLIDE 6

ParaView's Architecture

[Architecture diagram. Components shown: ParaView Client, pvpython, ParaWeb, Catalyst, Custom App, … more …; UI (Qt Widgets, Python Wrappings); ParaView Server; VTK; OpenGL, MPI, IceT]

SLIDE 7

Scientific Data

  • the data describes WHAT values are located WHERE
  • example: what temperature is at location (x, y, z) = (1, 3, 5)?
  • WHERE: the structure of the data
      1. Geometry: defines the (3D) location
      2. Topology: describes the connectivity (neighborhood) of points
      Example: three points at (x1,y1,z1), (x2,y2,z2), (x3,y3,z3) forming a triangle
  • WHAT: the attributes (values) of the data, e.g. temperature, pressure, …
  • typically the data is discrete (not continuous), given at a set of points in 3D space
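
The split into geometry, topology and attributes can be sketched in a few lines of plain Python. This is only an illustration of the concept, not ParaView or VTK code; the temperature values and the helper `point_to_cell_average` are made up for this example.

```python
# Geometry: WHERE the points are located in 3D space.
points = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
]

# Topology: how the points are connected. One triangle, given by point indices.
cells = [(0, 1, 2)]

# Attributes: WHAT values live on the points.
# The temperature values are hypothetical.
point_data = {"temperature": [20.0, 25.0, 30.0]}


def point_to_cell_average(cells, values):
    """Average a point attribute over the corner points of each cell."""
    return [sum(values[i] for i in cell) / len(cell) for cell in cells]


# Derive a per-cell attribute from the per-point attribute.
cell_data = {"temperature": point_to_cell_average(cells, point_data["temperature"])}
print(cell_data)
```

The same three ingredients (point coordinates, connectivity, value arrays) reappear in every data type on the following slides; the types differ mainly in how much of the geometry and topology is stored explicitly.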

SLIDE 8

Data Types: Structured Data

  • Uniform Rectilinear Grid (Image Data): vtkImageData
  • Non-uniform Rectilinear Grid (Rectilinear Grid): vtkRectilinearGrid
  • Structured (Curvilinear) Grid: vtkStructuredGrid
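
For a uniform rectilinear grid, neither point coordinates nor connectivity need to be stored: both follow from the grid dimensions, an origin and a spacing. A minimal sketch of that idea in plain Python (the function names and the example grid are assumptions for illustration; the x-fastest ordering matches the usual VTK convention):

```python
def point_coordinates(origin, spacing, ijk):
    """Location of grid point (i, j, k), computed instead of stored."""
    return tuple(o + s * i for o, s, i in zip(origin, spacing, ijk))


def flat_index(dims, ijk):
    """Position of point (i, j, k) in a flat data array, x varying fastest."""
    nx, ny, _nz = dims
    i, j, k = ijk
    return i + nx * (j + ny * k)


dims = (4, 3, 2)            # number of points in x, y, z
origin = (0.0, 0.0, 0.0)
spacing = (0.5, 1.0, 2.0)

last_point = (3, 2, 1)
print(point_coordinates(origin, spacing, last_point))
print(flat_index(dims, last_point))
```

A curvilinear grid keeps the implicit (i, j, k) topology but stores every point location explicitly, which is why it needs more memory than image data of the same dimensions.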

SLIDE 9

Data Types: Unstructured Data

  • Polygonal Mesh (Poly Data): vtkPolyData
  • Unstructured Grid: vtkUnstructuredGrid
  • Multi-Block (several data sets grouped together)
  • Hierarchical Adaptive Mesh Refinement (AMR)

SLIDE 10

Data Attributes

  • data attributes at grid points (grid data, node data) or on cells (cell data, element data)
  • general data attributes:
      • Scalar
      • Vector
      • Tensor (n x n matrix)
  • attributes with special meaning for the visualization process:
      • Normals (3D vector)
      • 2D or 3D texture coordinates
  • any number of variables can be defined at points and cells
SLIDE 11

Getting Started

  • ParaView can be downloaded at www.paraview.org
  • precompiled versions available for Mac OS, Windows (32-bit and 64-bit) and Linux (64-bit)
  • sources for individual installations (e.g. using the MPI version tailored to your system)
  • ParaView client launches like most applications:
      • Windows: launcher located in the Start menu
      • Linux: execute ParaView from a command prompt
      • Mac OS: open the application bundle that you installed
SLIDE 12

User Interface

[Screenshot of the ParaView main window with the following labels]

  • Menu Bar: access to the majority of features
  • Toolbars: quick access to the most commonly used features
  • Pipeline Browser: overview of the data processing pipeline
  • Properties & Information Panel: parameters of the selected object in the pipeline
  • Advanced toggle: shows/hides advanced controls
  • 3D View

SLIDE 13

The Pipeline Browser

  • shows data readers/sources and filters
  • shows current visualization pipeline structure
  • allows you to
      • select objects within the current visualization pipeline
      • edit pipeline objects via the Properties Panel
  • concept of active objects:
      • the selected object in the pipeline browser is the active object
      • all changes within the properties panel refer to the currently active object
      • the active object is used as the default input for operations like adding a filter

SLIDE 14

Properties Panel

  • depends on the active object in the pipeline
  • properties section: various parameters of the active object
  • display section: appearance of the active object (Representation, Color, Annotation, …)
  • view section: adjust global visualization parameters, e.g. background color

SLIDE 15

Apply/Reset

  • changes to the object properties are not immediately applied
  • press the Apply button to synchronize your view with the parameters
  • the same applies to opening a file:
      • File -> Open just reads the header information of the file
      • to load the data and make it visible in the view, hit the Apply button!
  • the Reset button will undo parameter changes, but only if they have not already been applied

SLIDE 16

Information Tab:

  • shows information about the data structure of the active object
      • dataset type
      • number of cells, points
      • data arrays
SLIDE 17

Active Variable Controls, Representation Toolbar

  • gives fast access to the most important display properties of objects
  • controls shown: Toggle Color Legend, Edit Color Map, Reset Scalar Range, Mapped Variable, Vector Component, Representation (Surface, Wireframe, Surface with Edges, Points, …)
  • can also be found in the display group of the properties panel!
SLIDE 18

Display Area

  • probably the most important part
  • shows output of the current visualization pipeline
  • various possible formats (2D and 3D views)
  • each display can be split vertically/horizontally

SLIDE 19

Controlling the camera

  • mouse interaction is mapped to camera movements as follows: [mapping table shown on the slide]
  • can be customized in the Settings menu
  • Camera Controls toolbar:
      • Reset camera (all visible objects): maintains the view direction but repositions the camera so that all objects are entirely visible
      • Zoom to data: the camera is placed to look at the data selected in the pipeline browser
      • Zoom to box
      • Align view to coordinate axis

SLIDE 20

Controlling the camera (continued)

  • within the toolbar of each view: Camera Undo, Camera Redo, Adjust Camera, Toggle 3D/2D Interaction Mode
  • adjust camera:
      • 4 custom views can be stored
      • camera parameters (position, view direction, view angle) can be set manually

SLIDE 21

Getting Data into ParaView

There are several methods of loading data into ParaView:

  • generate data from the Sources menu
  • read data from a file
      • menu “File -> Open”
      • command line argument --data=datafile
  • load a previously saved state file (File menu, Load State)
      • this will return ParaView to its state at the time the file was saved (loading data, applying filters, setting preferences etc.)
  • connect ParaView to a running simulation (in-situ visualization)
SLIDE 22

Applying Filters

  • achievements so far:
      • read in the data, glean some information about it
      • see the structure of the mesh, map some attribute data on its surface
  • But: many interesting features cannot be determined by just looking at the raw data. Discover much more about the data by applying filters!
  • filters generate, extract, or derive features from the data
  • filters are attached to readers, sources, or other filters to modify their data in some way
  • the most common filters are available in the filters toolbar
SLIDE 23

Filter Menu:

The Filters menu contains many filters:

  • lists all filters available in ParaView (alphabetical)
  • if a filter cannot be connected to the current active object, its entry is disabled

SLIDE 24

Common Filters: Calculator

  • evaluates a user-defined expression on a per-point or per-cell basis
  • generates a new data attribute as a function of given attributes
  • available attributes are listed in the “Scalars” and “Vectors” combo boxes
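
What “evaluating an expression per point” means can be sketched in plain Python. This mimics the idea behind the Calculator filter only; it is not ParaView's API, and the array names, values and the helper `calculator` are assumptions made for this example.

```python
import math


def calculator(point_data, expression):
    """Apply `expression` to the attribute values of every point in turn."""
    n_points = len(next(iter(point_data.values())))
    return [
        expression({name: array[i] for name, array in point_data.items()})
        for i in range(n_points)
    ]


# Hypothetical input attributes: one scalar and one vector per point.
point_data = {
    "pressure": [1.0, 2.0, 4.0],
    "velocity": [(3.0, 4.0, 0.0), (0.0, 0.0, 2.0), (1.0, 2.0, 2.0)],
}

# New attribute: the magnitude of the velocity vector at each point.
speed = calculator(
    point_data,
    lambda p: math.sqrt(sum(c * c for c in p["velocity"])),
)
print(speed)
```

The result has exactly one value per point, so it can be attached to the same mesh as a new attribute without changing geometry or topology.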

SLIDE 25

Common Filters: Contour

  • extracts the curves or surfaces where a scalar field is equal to a user-defined value
  • such a surface is often called an isosurface

SLIDE 26

Common Filters: Clip

  • intersects the geometry with a user-defined plane, box, sphere or cylinder
  • removes all the geometry on one side of this plane (box, sphere, …)

SLIDE 27

Common Filters: Slice

  • intersects the geometry with a user-defined plane, box, sphere or cylinder
  • similar to clipping, except that all that remains is the geometry where the plane is located

SLIDE 28

Common Filters: Threshold

  • extracts cells that lie within a specified range of values
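
The selection rule can be sketched as a simple range test over a per-cell value array. This is a plain-Python illustration of the idea, not ParaView code; the `threshold` helper and the temperature values are hypothetical.

```python
def threshold(cell_values, lower, upper):
    """Return the indices of all cells whose value lies in [lower, upper]."""
    return [i for i, value in enumerate(cell_values) if lower <= value <= upper]


# Hypothetical per-cell temperatures.
temperature = [12.0, 25.5, 31.0, 18.2, 27.9]

kept_cells = threshold(temperature, 18.0, 28.0)
print(kept_cells)
```

Because an arbitrary subset of cells survives, the result generally has no regular (i, j, k) structure any more, which is relevant for the memory footprint discussed later.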

SLIDE 29

Common Filters: Extract Subset

  • extracts a subset of a grid by defining a volume of interest and a sampling rate
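
A volume of interest (VOI) plus a sampling rate amounts to strided index ranges on a structured grid. A minimal sketch in plain Python, assuming an inclusive VOI given as (imin, imax, jmin, jmax, kmin, kmax); the function name and the example numbers are made up for illustration.

```python
def extract_subset(dims, voi, rate):
    """List the (i, j, k) indices selected by a VOI and a per-axis sampling rate."""
    nx, ny, nz = dims
    imin, imax, jmin, jmax, kmin, kmax = voi
    # The VOI must lie inside the grid.
    if not (0 <= imin <= imax < nx and 0 <= jmin <= jmax < ny and 0 <= kmin <= kmax < nz):
        raise ValueError("volume of interest exceeds grid dimensions")
    ri, rj, rk = rate
    return [
        (i, j, k)
        for k in range(kmin, kmax + 1, rk)
        for j in range(jmin, jmax + 1, rj)
        for i in range(imin, imax + 1, ri)
    ]


# 10 x 10 x 10 grid, keep every second point in x and y inside the VOI.
subset = extract_subset((10, 10, 10), (2, 8, 0, 4, 5, 5), (2, 2, 1))
print(len(subset), subset[0], subset[-1])
```

The output is again a regular grid, only smaller, so this is a cheap way to reduce a large structured data set before applying more expensive filters.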

SLIDE 30

Common Filters: Glyph

  • places a glyph (a simple shape) at each point in a mesh
  • glyphs may be oriented by a vector and scaled by a vector or scalar

SLIDE 31

Common Filters: Stream Tracer

  • seeds points in a vector field and then traces those seed points through the (steady-state) vector field, creating streamlines

SLIDE 32

Common Filters: Warp (vector)

  • displaces each point in a mesh by a given vector field

SLIDE 33

Common Filters: (Un)Group Datasets

  • (un)combines the output of several pipeline objects into a single multi-block data set

SLIDE 34

Visualizing Large Models

  • ParaView is designed for processing extremely large data (although it is a little bit unstable)
  • examples from Sandia National Lab on the web: about 1 billion cells on 256 nodes
  • visualizing extremely large data requires a large amount of resources (especially memory)
  • use the parallel visualization capabilities of ParaView
SLIDE 35

Remote Data Visualization on JURECA

Scenario:

  • Data is stored on central GPFS
  • Visualization app is running on a server (JURECA)
  • Output is shown on your local desktop by means of one of:
      • Redirection of X and OpenGL commands to your local X server: slow, maybe incompatible -> bad idea
      • Using the server components of “remote aware” visualization apps (ParaView, VisIt) on JURECA and their client components on your local machine
      • Running a virtual desktop (VNC server) on JURECA, sending the image of this desktop to your local VNC viewer. All vis software (client + server) is installed on JURECA: our recommendation.
  • In-situ visualization is also possible (covered this afternoon)
SLIDE 36

JURECA Visualization-Resources Hardware

12 Visualization Nodes

  • 2 GPUs Nvidia Tesla K40 per node, 12 GB RAM on each card
  • 2 Login Visualization Nodes
      • jurecavis01.fz-juelich.de (jrc1383)
      • jurecavis02.fz-juelich.de (jrc1384)
      • jurecavis.fz-juelich.de (jrc1383 or jrc1384 in round-robin fashion)
  • 10 Batch Visualization Nodes
      • 8 nodes with 512 GB RAM (jrc[1385-1392])
      • 2 nodes with 1024 GB RAM (jrc139[3,4])
      • special partition named “vis”
      • connection to vis batch nodes via login nodes and ssh tunnels (for security reasons)
  • Visualization is also possible on nodes without GPUs (software rendering)

SLIDE 37

Why are Vis Nodes needed?

  • 1. Special software stack on vis nodes:
      • Base software: X server, X client (window manager), OpenGL (libGL.so, libGLU.so, libglx.so), Nvidia
      • Middleware: Virtual Network Computing (VNC server, VNC client), VirtualGL, Strudel
      • Parallel and remote rendering apps, in-situ visualization: ParaView, VisIt
  • 2. Usage model: vis nodes are available for JURECA, JUQUEEN and non-project users

SLIDE 38

JURECA Visualization Related Documentation

Entry point is https://trac.version.fz-juelich.de/vis/

  • Documentation related to VisIt: https://trac.version.fz-juelich.de/vis/wiki/VisIt/Jureca
  • Documentation related to ParaView: https://trac.version.fz-juelich.de/vis/wiki/ParaView/Jureca

SLIDE 39

Remote Visualization: General Setup

[Diagram: the user's workstation (user interface, display of images) connects through the firewall to JURECA (vis login node, vis batch node, compute node), which does data access and image generation on data stored on GPFS]

  • vis login node:
      - direct user access
      - no accounting
      - shared with other users
      - no parallel jobs (no srun)
  • vis batch node:
      - access via batch system
      - accounting
      - exclusive usage
      - parallel jobs possible

SLIDE 40

Remote Rendering with VNC/VirtualGL

Our recommendation is to use VNC for remote rendering on JURECA

  • Hardware rendering (GPU acceleration) with VirtualGL
  • VNC/VirtualGL is a good solution for many common OpenGL applications, e.g. IDL, Vapor, …
  • Can also be used for the frontend of “remote aware” applications like ParaView and VisIt
  • User only has to install a local VNC viewer
  • Desktop sharing possible
  • Cumbersome tunneling, job submission and manual starting of the VNC server can be avoided by using Strudel to establish the connection

SLIDE 41

Strudel (ScienTific Remote Desktop Launcher)

Developed for the Multi-modal Australian ScienceS Imaging and Visualisation Environment (https://www.massive.org.au) at Monash University, Melbourne (Australia).

Complex VNC scenarios become easy to use for any user:

  1. ssh to HPC system
  2. authenticate via ssh key pair
  3. submit job via slurm
  4. wait for job to start (keep user informed)
  5. start VNC server on node
  6. establish ssh tunnel
  7. start TurboVNC client and connect

The complete configuration is stored in a JSON file.

Download the JSC version of Strudel here: https://trac.version.fz-juelich.de/vis/wiki/vnc3d/strudel

SLIDE 42

VNC Desktop with JSC Extension (“Profiles”)

JSC extension to VNC: vncserver -profile vis

Your benefits:

  • nice blue JSC background
  • clock counting up/down
  • MOTD window
  • CPU, memory utilization
  • GPU utilization
  • VNC utilization
  • desktop symbols for vis apps, LLview, …

SLIDE 43

Scenario 1: Visualization on Vis Login Node with VNC

[Diagram: user's workstation -> ssh + VNC tunnel (port 590<d>) through the firewall -> vis login node on JURECA (data access + image generation), with the data on GPFS; vis batch node and compute node are also shown]

  + user does not need to install the vis app (ParaView, VisIt, …)
  + rendering on GPU
  + large memory
  + no batch job needed
  + no accounting
  - resources shared between users
  - tunnel has to be established

SLIDE 44

Scenario 1: How to Setup Visualization on Vis Login Node with VNC

  • 1. Start VNC server on vis login node
      1.a. Using Strudel (very easy): https://trac.version.fz-juelich.de/vis/wiki/vnc3d/strudel
      1.b. Manually: https://trac.version.fz-juelich.de/vis/wiki/vnc3d/manual
           Necessary steps:
           - log in to jurecavis
           - create VNC password (if not already done)
           - start vncserver, note the display number
           - establish an ssh tunnel with the correct port to the vis login node
           - start local vncviewer with proper connection information
  • 2. Start your vis app
      - load necessary modules (see documentation or use “module spider”)
      - start “vglrun paraview” or “vglrun visit -hw-accel”
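
The “correct port” in the manual steps follows from the display number: the slides give it as 590<d>, i.e. 5900 plus the display number reported by vncserver. The sketch below only assembles the corresponding ssh command line; the user name `user1` is a placeholder, and the exact tunnel options recommended for JURECA are described in the linked documentation, so treat this as an assumption-laden illustration.

```python
VNC_BASE_PORT = 5900  # display :d is served on port 5900 + d


def vnc_port(display):
    """TCP port of VNC display number `display` (e.g. :1 -> 5901)."""
    if display < 0:
        raise ValueError("display number must be non-negative")
    return VNC_BASE_PORT + display


def ssh_tunnel_command(user, host, display):
    """Build an ssh command forwarding the local VNC port to the same port on `host`."""
    port = vnc_port(display)
    # -N: do not run a remote command, -L: local port forwarding
    return ["ssh", "-N", "-L", f"{port}:localhost:{port}", f"{user}@{host}"]


# Hypothetical user name; host name taken from the hardware slide.
command = ssh_tunnel_command("user1", "jurecavis.fz-juelich.de", display=1)
print(" ".join(command))
```

With such a tunnel in place, the local VNC viewer would connect to localhost and the same display number.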

SLIDE 45

Scenario 2: Visualization on Vis Batch Node with VNC

[Diagram: user's workstation -> ssh + VNC tunnel (port 590<d>) through the firewall -> vis login node or regular login node -> vis batch node on JURECA (data access + image generation), with the data on GPFS; a compute node is also shown]

  + user does not need to install the vis app (ParaView, VisIt, …)
  + rendering on GPU, large memory
  + vis server can be run in parallel (but number of vis nodes limited to 4)
  - batch job needed, accounting
  - tunnel has to be established

SLIDE 46

Scenario 2: How to Setup Visualization on Vis Batch Node(s) with VNC

  • 1. Start VNC server on vis batch node
      1.a. Using Strudel (very easy): https://trac.version.fz-juelich.de/vis/wiki/vnc3d/strudel
      1.b. Manually: https://trac.version.fz-juelich.de/vis/wiki/vnc3d/manual
           Necessary steps:
           - log in to any JURECA login node (vis or non-vis)
           - create VNC password (if not already done)
           - generate a small batch script for the VNC server (if not already done)
             Note: you can allocate more than one vis node (max. 4)
           - start vncserver with “sbatch --start-xserver name_of_jobscript”, note the node name and the display number in the slurm output
           - establish an ssh tunnel with the correct node name and port
           - start local vncviewer with proper connection information
  • 2. Start your vis app GUI (ParaView, VisIt)
      - load necessary modules (see documentation or use “module spider”)
      - start “vglrun paraview” or “vglrun visit -hw-accel”

SLIDE 47

Scenario 2: Start parallel ParaView on Vis Batch Node

Once you have established a VNC session on one (or more) vis batch nodes, you can start ParaView in parallel. Note: all resources (nodes) are already allocated after starting the VNC server with sbatch (see step 1.a., 1.b.), so just use “srun”.

  • 1. Start ParaView servers:
      - open a command shell, load necessary modules
      - export DISPLAY=:0.0
      - start the servers, e.g. with
            srun --cpu_bind=none --ntasks=24 --gres=gpu:0 vglrun pvserver --use-offscreen-rendering
  • 2. Open the ParaView GUI (load modules, start “vglrun paraview”)
  • 3. Connect the GUI to the pvserver (localhost, port 11111)
SLIDE 48

Scenario 2: Start parallel VisIt on Vis Batch Node

Once you have established a VNC session on one (or more) vis batch nodes, you can start VisIt in parallel. Note: all resources (nodes) are already allocated after starting the VNC server with sbatch (see step 1.a., 1.b.).

  • 1. Open the VisIt GUI (load modules, start “vglrun visit -hw-accel”)
  • 2. Inside the VisIt GUI, select the proper host profile for the JURECA Vis Batch Node (documentation and download link for predefined host profiles here: https://trac.version.fz-juelich.de/vis/wiki/VisIt/Jureca)
  • 3. Select “File -> Open”; in the file browser choose “JURECA Vis Batch Node” as host
  • 4. Select a file, choose “localhost” as launch profile, choose the number of processors
SLIDE 49

Many other visualization scenarios are possible in general. Some (but not all one can think of) are covered in the documentation: https://trac.version.fz-juelich.de/vis

Some examples on the next slides …

SLIDE 50

GUI on vis login node, server on vis batch node

[Diagram: user's workstation -> ssh + VNC tunnel (port 590<d>) through the firewall -> vis login node (GUI) and vis batch node (server) on JURECA (data access + image generation), with the data on GPFS; a compute node is also shown]

  + user does not need to install the vis app
  + rendering on GPU
  + vis app server can be run in parallel (but number of vis nodes limited)
  - batch job needed

SLIDE 51

GUI on vis login node, server on compute batch nodes

[Diagram: user's workstation -> ssh + VNC tunnel (port 590<d>) through the firewall -> vis login node (GUI) and compute nodes (server) on JURECA (data access + image generation), with the data on GPFS; a vis batch node is also shown]

  + user does not need to install the vis app
  + vis app server can be run in parallel on a really huge number of nodes
  + in-situ visualization possible
  - batch job needed
  - only software rendering (but probably not the bottleneck)

SLIDE 52

Remote Rendering (ParaView or VisIt) without VNC

[Diagram: user's workstation -> ssh + tunnel (ParaView: port 11111) through the firewall -> login node -> vis batch node on JURECA (data access + image generation), with the data on GPFS; a compute node is also shown]

Example: ParaView

  + rendering on GPU
  + ParaView server can be run in parallel (but number of vis nodes limited)
  - user has to install ParaView or VisIt on their workstation
  - batch job needed
  - ssh tunnel needed

SLIDE 53

Parallel ParaView

  • typically ParaView is started in this parallel mode: local client and parallel server (data server and render server in one job)
      • command on local client: paraview
      • command on remote cluster (depends on batch system):
            mpiexec -n <num_processes> pvserver
            srun --ntasks=<num_processes> pvserver
      • connect client and server

SLIDE 54

Data Distribution

  • data needs to be distributed on a cluster (load balancing, memory usage)
  • some visualization algorithms work directly on distributed data, e.g. clipping
  • other algorithms need ghost cells, e.g. external faces
  • structured data is handled automatically (partitioning and ghost cells)
  • for unstructured data use the D3 filter
  • example: extract surface filter with and without D3:
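
Why ghost cells matter can be illustrated with a one-dimensional partitioning: each process gets a contiguous piece plus a layer of neighbouring entries, so that algorithms looking at neighbours (such as external face detection) see across piece boundaries. This is a simplified sketch of the principle only; it is not how ParaView or the D3 filter partitions data, and the function name is made up.

```python
def partition_with_ghosts(n_items, n_pieces, ghost_levels=1):
    """Split range(n_items) into contiguous pieces, each extended by ghost layers.

    Returns a list of (start, stop) index pairs, with `stop` exclusive.
    """
    pieces = []
    for p in range(n_pieces):
        start = p * n_items // n_pieces
        stop = (p + 1) * n_items // n_pieces
        # Extend by the ghost layers, but never beyond the data set itself.
        pieces.append((max(0, start - ghost_levels), min(n_items, stop + ghost_levels)))
    return pieces


print(partition_with_ghosts(10, 3))
print(partition_with_ghosts(10, 3, ghost_levels=0))
```

Without ghost layers the pieces do not overlap, and every piece boundary would look like an outer surface of the data set to a filter that only sees its local piece.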

SLIDE 55

Memory Footprint

  • the memory footprint of structured data is MUCH lower than that of unstructured data, because topology (and maybe geometry) is given implicitly
  • many filters need to convert structured data to unstructured data: BEWARE OF DATA EXPLOSION!
      • examples: clip, threshold, … and many, many others!
      • some filters reduce the dimensionality -> less dangerous; examples: contour, slice, stream tracer
  • “Safe” filters don't change the data structure but only add new data values (attributes); examples: calculator, gradient, curvature

SLIDE 56

Remote Rendering (Parallel Rendering)

  • depth compositing: all nodes render their own part of the data (local data)
  • nodes keep their color and depth buffers and cooperate to decide what the color of each pixel is, based on the depth value
  • implemented with the IceT compositing library
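
The per-pixel decision in depth compositing can be sketched in plain Python: for every pixel, the color with the smallest depth value (closest to the camera) wins. This shows the principle for opaque geometry only; it does not use IceT, and the tiny three-pixel “images” are made up for illustration.

```python
INF = float("inf")  # depth of a background pixel (nothing rendered)


def depth_composite(images):
    """Merge per-node images, each a list of (color, depth) pixels.

    For every pixel position the entry with the smallest depth is kept.
    """
    return [
        min(candidates, key=lambda pixel: pixel[1])
        for candidates in zip(*images)
    ]


# Two nodes rendered their local data into images of three pixels each.
node_a = [("red", 0.2), ("red", 0.9), (None, INF)]
node_b = [("blue", 0.5), ("blue", 0.4), (None, INF)]

final_image = depth_composite([node_a, node_b])
print(final_image)
```

Since only images and depth buffers are exchanged, the communication cost depends on the image size and the number of nodes rather than on the size of the data.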
SLIDE 57

Parallel ParaView: Test of Memory Footprint

  • in a first step, we did tests with the VTK legacy and XDMF file formats
  • simple synthetic test data set: structured grid of size 1024^3 with one float value per grid point -> 4 GByte of data
  • the data is just the distance from the grid center
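
The 4 GByte figure follows directly from the grid size, assuming a 4-byte (single precision) float per point. A short check of that arithmetic in plain Python (the function name is made up for this example):

```python
def raw_size_bytes(dims, bytes_per_value=4, values_per_point=1):
    """Size of the attribute data of a structured grid, without any overhead."""
    nx, ny, nz = dims
    return nx * ny * nz * values_per_point * bytes_per_value


GIB = 2 ** 30

scalar_size = raw_size_bytes((1024, 1024, 1024))
print(scalar_size / GIB)  # attribute data in GiB

# For comparison: storing explicit (x, y, z) float coordinates for every point
# would add three more values per point.
coordinate_size = raw_size_bytes((1024, 1024, 1024), values_per_point=3)
print(coordinate_size / GIB)
```

The second number indicates why implicit geometry matters: explicit coordinates alone would be three times as large as the data itself.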

SLIDE 58

Input File Formats – VTK as an Example

Simple Legacy Format -> polygonal, uniform, rectilinear, curvilinear, unstructured

    # vtk DataFile Version 2.0    Header
    Really cool data              Title
    ASCII | BINARY                Data type, either ASCII or BINARY
    DATASET type                  Geometry/Topology; type is one of STRUCTURED_POINTS,
    …                             STRUCTURED_GRID, UNSTRUCTURED_GRID, POLYDATA,
                                  RECTILINEAR_GRID, FIELD
    POINT_DATA n                  Dataset attributes; the number of data items n of each
    …                             type must match the number of points or cells in the
    CELL_DATA n                   dataset
    …
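
As a concrete instance of this layout, the following sketch generates a small ASCII legacy file of type STRUCTURED_POINTS whose single scalar is the distance from the grid center, i.e. a miniature version of the test data set described earlier. The function name, the array name `distance` and the tiny grid size are choices made for this example.

```python
import math


def legacy_vtk_structured_points(n, title="distance from grid center"):
    """Return the text of an ASCII legacy VTK file for an n x n x n uniform grid."""
    center = (n - 1) / 2.0
    lines = [
        "# vtk DataFile Version 2.0",   # header
        title,                          # title
        "ASCII",                        # data type
        "DATASET STRUCTURED_POINTS",    # geometry/topology
        f"DIMENSIONS {n} {n} {n}",
        "ORIGIN 0 0 0",
        "SPACING 1 1 1",
        f"POINT_DATA {n ** 3}",         # dataset attributes, one value per point
        "SCALARS distance float 1",
        "LOOKUP_TABLE default",
    ]
    for k in range(n):
        for j in range(n):
            for i in range(n):          # x varies fastest
                d = math.sqrt((i - center) ** 2 + (j - center) ** 2 + (k - center) ** 2)
                lines.append(f"{d:.6f}")
    return "\n".join(lines) + "\n"


text = legacy_vtk_structured_points(3)
print(text.splitlines()[:10])
```

Writing `text` to a file with the extension `.vtk` should give a file that ParaView's legacy reader can open; for realistic sizes the BINARY variant is preferable to ASCII.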

SLIDE 59

VTK Legacy File Format

[Plot annotated with the steps: load data, rotate image, create isosurface, rotate image, delete pipeline]

Example VTK legacy, structured data: all processes read all data, and the data is cropped after reading -> high intermediate memory consumption, poor I/O performance.

SLIDE 60

HDF5

  • https://www.hdfgroup.org/HDF5
  • hierarchical data format: HDF5
  • efficient, portable IO library for storing scientific data
  • high level of abstraction
  • HDF5 is becoming a standard (even NetCDF is based on HDF5)
  • HDF5 is a container for your specific data, not a standard data format
  • BUT: HDF5 files do not contain the meta-information needed for visualization (semantics of datasets)
  • solution: eXtensible Data Model and Format (XDMF)
SLIDE 61

HDF5 General Usage

  • 1. Create File

    hid_t file_id = H5Fcreate("dset.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

  • 2. Create Dataspace (Memory Layout)

    hsize_t dims[2];
    dims[0] = n;
    dims[1] = 3;
    hid_t dataspace_id = H5Screate_simple(2, dims, NULL);

  • 3. Create Dataset

    hid_t dataset_id = H5Dcreate2(file_id, "/vx", H5T_NATIVE_DOUBLE, dataspace_id,
                                  H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

  • 4. Write Data to Dataset

    herr_t status = H5Dwrite(dataset_id, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL,
                             H5P_DEFAULT, &(vx[0]));

  • 5. Close Dataset, Dataspace and File

    status = H5Dclose(dataset_id);
    status = H5Sclose(dataspace_id);
    status = H5Fclose(file_id);

SLIDE 62

XDMF

  • http://www.xdmf.org
  • description of the structure of HDF5 datasets, e.g. what data defines coordinates, what data defines attributes, …
  • small XML files (light data) in addition to (large) data files (heavy data, typically HDF5)
  • light data: XML file containing XDMF language statements and references to datasets in heavy data
  • heavy data: HDF5 files (or binary)
SLIDE 63

XDMF: Example for Structured Grid

<?xml version="1.0" ?>
<!DOCTYPE Xdmf SYSTEM "Xdmf.dtd" []>
<Xdmf Version="2.0">
  <Domain>
    <Grid Name="sphere" GridType="Uniform">
      <Topology TopologyType="3DCoRectMesh" NumberOfElements="1024 1024 1024"/>
      <Geometry GeometryType="ORIGIN_DXDYDZ">
        <DataItem Dimensions="3" NumberType="Float" Precision="4" Format="XML">
          -512.000000 -512.000000 -512.000000
        </DataItem>
        <DataItem Dimensions="3" NumberType="Float" Precision="4" Format="XML">
          1.000000 1.000000 1.000000
        </DataItem>
      </Geometry>
      <Attribute Name="distance" AttributeType="Scalar" Center="Node">
        <DataItem Dimensions="1024 1024 1024" NumberType="Float" Precision="4" Format="HDF">
          /viswork/testDataZilken/test_4GB.h5:/sphere
        </DataItem>
      </Attribute>
    </Grid>
  </Domain>
</Xdmf>

SLIDE 64

XDMF: Example for Structured Grid

domini@node07:/viswork/testDataZilken> h5dump --header test_4GB.h5
HDF5 "test_4GB.h5" {
GROUP "/" {
   DATASET "sphere" {
      DATATYPE  H5T_IEEE_F32LE
      DATASPACE  SIMPLE { ( 1024, 1024, 1024 ) / ( 1024, 1024, 1024 ) }
   }
}
}
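
Because the light data is just a small XML file, it is easy to generate next to the HDF5 file. The sketch below builds the same structure as the example with Python's standard `xml.etree.ElementTree`; the function name and its parameters are assumptions for illustration, and the DOCTYPE line of the original is omitted.

```python
import xml.etree.ElementTree as ET


def build_xdmf(dims, origin, spacing, hdf5_path, dataset, attribute_name="distance"):
    """Return XDMF light data for one scalar on a uniform grid stored in HDF5."""
    dim_str = " ".join(str(d) for d in dims)
    xdmf = ET.Element("Xdmf", Version="2.0")
    domain = ET.SubElement(xdmf, "Domain")
    grid = ET.SubElement(domain, "Grid", Name=dataset, GridType="Uniform")
    ET.SubElement(grid, "Topology", TopologyType="3DCoRectMesh",
                  NumberOfElements=dim_str)
    geometry = ET.SubElement(grid, "Geometry", GeometryType="ORIGIN_DXDYDZ")
    for values in (origin, spacing):
        item = ET.SubElement(geometry, "DataItem", Dimensions="3",
                             NumberType="Float", Precision="4", Format="XML")
        item.text = " ".join(f"{v:.6f}" for v in values)
    attribute = ET.SubElement(grid, "Attribute", Name=attribute_name,
                              AttributeType="Scalar", Center="Node")
    heavy = ET.SubElement(attribute, "DataItem", Dimensions=dim_str,
                          NumberType="Float", Precision="4", Format="HDF")
    # Reference to the heavy data: <file>:/<dataset inside the file>
    heavy.text = f"{hdf5_path}:/{dataset}"
    return ET.tostring(xdmf, encoding="unicode")


xml_text = build_xdmf((1024, 1024, 1024), (-512.0, -512.0, -512.0),
                      (1.0, 1.0, 1.0), "test_4GB.h5", "sphere")
print(xml_text)
```

The dimensions and number type written here have to agree with what `h5dump --header` reports for the referenced dataset, otherwise the reader cannot interpret the heavy data correctly.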

SLIDE 65

Parallel data distribution: lessons learned so far

  • distribution of data in parallel ParaView depends strongly on the reader and on the data type (structured/unstructured)
  • for some readers, there is a description of the data parallelism: http://www.paraview.org/Wiki/ParaView/ParaView_Readers_and_Parallel_Data_Distribution
  • use a D3 filter when dealing with unstructured data to uniformly distribute it across the processors into spatially contiguous regions
  • it's hard to find comprehensive information on this topic