

SLIDE 1

DESIGN OF COOPERATIVE VISUALIZATION ENVIRONMENT WITH INTENSIVE DATA MANAGEMENT IN PROJECT LIFECYCLE

Kenji ONO and Jorji Nonaka
High-performance Computing Team
Integrated Simulation of Living Matter Group
Computational Science Research Program
RIKEN

Ultrascale Vis Workshop 2008

SLIDE 2

TOC

 Supercomputer development project
 Computer hardware development
 Grand Challenge in Life Science
 Post-process system
 Current status of system design

SLIDE 3

NEXT-GENERATION SUPERCOMPUTER

 Development
  • Under the initiative of MEXT
  • RIKEN is responsible for development in collaboration with vendors
 Hybrid system composed of Scalar and Vector Units
  • LINPACK 10 PFLOPS

Scalar Unit
  • Database, homology search; similarity with a PC
  • High-speed, low-power CPU
  • New strong network for enormous parallelism

Vector Unit
  • Vector processor based on the Earth Simulator
  • New-generation low-power vector processor with optical network

[Diagram: front-end computer (integrated OS) connected to the Scalar Unit and Vector Unit; shared, global, and local files; control unit, processing unit, and disk unit; target applications include nano devices, solar cells, plasma physics, climate change, CFD, and homology search]

SLIDE 4

EXPECTED MAJOR APPLICATIONS

Grand challenges
  • Applications
  • Benchmark
  • Grand challenge
  • Focusing on life science
SLIDE 5

ISSUES TO BE CONSIDERED

 Large-scale nature of data
  • Computational space, time-varying, multivariate
  • Expected data size is on the order of 1 PB (peak): a data tsunami
 How to do visualization, data processing, and analysis?
  • Depends on each researcher: various scenarios
 Complex hardware
  • Various configurations

SLIDE 6

CURRENT STATUS AND GOAL

 System design
  • To develop a user-friendly post-processing system
  • Derive useful information from large-scale datasets
  • Easy to use; enhances productivity
  • To assist scientific knowledge and understanding of physical phenomena
 Operating this post-process system to assist research

SLIDE 7

ISSUES ON POST-PROCESSING OF LARGE-SCALE DATASETS

 Size of dataset
  • Spatial, temporal, multivariate data
 Distribution of dataset
  • Distributed parallel, GRID
 Complexity of the HW system
  • Heterogeneous environments, file systems, networks
 High cost of data copy and movement
  • MMU/HDD capacity: cannot move very large-scale data
  • Need appropriate tools to access large-scale data
 Many cumbersome procedures
  • File handling, preparation of processing
  • Need an environment that lets researchers focus on “THINKING”

SLIDE 8

ORIENTATION OF POST-PROCESS

 Do not move data; beyond that…
 Sharing data, collaboration
  • Remote, virtual organizations
 Data-intensive services
  • Data access, visualization, analysis, processing
  • Sharing data, results, knowledge, resources
 Data repository for a group
  • Data browsing, search, breakdown
 Comprehension of phenomena
  • Sharing information and knowledge

SLIDE 9

WHAT WE WANT TO DO

SLIDE 10

SWEETS: SCIENTIFIC KNOWLEDGE DISCOVERY TOOLS

 Script-based loose coupling
  • A script glues the modules in the sub-systems
  • Scalability, flexibility, and sustainability
 Sub-systems
  • Visualization
  • Data / project management
  • Automation of routine tasks using a workflow
  • Data sharing
  • ...
  • Each module works independently, but they provide full capability when working together
 Efficiently extract useful information from simulated results


Tools to obtain knowledge, and then to magnify the value of simulations
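The script-based loose coupling above can be sketched as follows. This is a minimal illustration, not the actual SWEETS API: the module and function names (`run_visualization`, `register_result`) are hypothetical stand-ins, chosen only because the deck names Python as the glue language.

```python
# Hypothetical sketch of script-based loose coupling: each sub-system is an
# independently usable module, and the glue script only chains them.

def run_visualization(dataset):
    # Stand-in for the visualization sub-system (e.g. batch rendering).
    return "image(%s)" % dataset

def register_result(repository, name, artifact):
    # Stand-in for the data/project-management sub-system.
    repository[name] = artifact

def pipeline(datasets):
    # Workflow automation: the routine task expressed as plain scripting.
    repository = {}
    for dataset in datasets:
        image = run_visualization(dataset)
        register_result(repository, dataset, image)
    return repository
```

Because each module sits behind a plain function boundary, any piece can be swapped out (for instance by a workflow-engine step) without touching the others, which is the scalability and sustainability argument made above.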

SLIDE 11

REQUIRED TASKS FOR RESEARCH

[Diagram: the researcher’s “Thinking” tasks (visualization, data analysis) rest on services (sub-systems: visualization, data/project management, automation, data sharing, analysis), which hide cumbersome low-level procedures (file access, parameter editing, data copy/move, batch-job submission, program launch, ...)]

SLIDE 12

AN EXAMPLE OF HARDWARE CONFIGURATION

[Diagram: Supercomputer (analysis, visualization), file server, agent, SWEETS servers, and users A/B/C with a visualization app, statistics app, and job management, running on the server or on the client through SWEETS clients]

SLIDE 13

A CONCEPT MODEL OF DATA ENTITY

[ER diagram: Project, Job, File, User, and Host entities linked by 1-to-n relations]

SLIDE 14

DATA ENTITY OF SWEETS SYSTEM

[Diagram: entity tables of the SWEETS system and their fields]

Project table: project name, comment, start of the day, end of the day, Unix group, path, meta data
Job table: job name, comment, path, status, job command, job directory, real ID of a job, meta data
File table: file name, real name of a file, directory, file type, size, inode, time of creation, time of update, permission, group, owner, link, comment, flag, preview path, meta data, related files, content of a file
User table: user name, domain of expertise, mail address, comment, authority, login ID, personal setting
Meta data table: meta data name, target, category, type, min value, max value, alternative, project ID, linkable, link program
Tag: extension for automatic registration, key for automatic registration
Host table: host name, cluster name, IP address, login ID, password, home directory
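One natural realization of such a data-entity model is a relational database (slide 22 later lists an RDB as a candidate back end). The SQLite sketch below is an illustrative assumption, not the SWEETS schema: the tables are reduced to a Project/File subset, and the column names and the sample file name are invented for the example.

```python
import sqlite3

# Assumed, simplified subset of a project/file data-entity model.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE project (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    comment TEXT,
    path    TEXT
);
CREATE TABLE file (
    id         INTEGER PRIMARY KEY,
    project_id INTEGER REFERENCES project(id),
    name       TEXT NOT NULL,
    size       INTEGER,
    file_type  TEXT
);
""")
con.execute("INSERT INTO project (name, comment, path) VALUES (?, ?, ?)",
            ("demo", "sample project", "/work/demo"))
pid = con.execute("SELECT id FROM project WHERE name = 'demo'").fetchone()[0]
con.execute("INSERT INTO file (project_id, name, size, file_type) VALUES (?, ?, ?, ?)",
            (pid, "field_0001.dat", 1024, "volume"))

# A data-repository operation from slide 8: browse the files of a project.
rows = con.execute("""
    SELECT f.name, f.size FROM file f
    JOIN project p ON f.project_id = p.id
    WHERE p.name = ?""", ("demo",)).fetchall()
```

Keeping the entities in one database is what lets the browse/search/break-down operations of the data repository work across all sub-systems.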

SLIDE 15

PREREQUISITE OF VIS. SUB-SYSTEM

 Remote or local visualization
  • Provides a unified environment through a common client
 Real-time or post visualization
  • Basically, file-based visualization
 Interactive or batch visualization
 Software or hardware rendering
 Large-scale data handling
  • Parallel rendering
 Platform
  • Linux, Windows, Mac OS X, PC clusters, supercomputers

Various scenarios for each researcher’s approach

SLIDE 16

2-WAY SCENARIO

 1st step: visualization on the supercomputer
  • Server/client system with basic visualization functions
  • Data reduction, ROI
  • Operation is batch jobs only, by its policy
 2nd step: visualization on a CPU/GPU cluster
  • Capability of interactive visualization by GPU
  • Relatively small MMU (data reduction is necessary)
 Reduces the risk of development
 Reduces utility time of the supercomputer
  • Can use existing software, COTS

SLIDE 17

STRATEGY: INTERACTIVITY AND SCALABILITY

[Chart: time vs. number of cores, broken down into file I/O, rendering, and image compositing]

Approaches: multicore rendering / HW rendering, LOD, data reduction, Multi-Step Image Composition
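The data-reduction leg of this strategy can be illustrated with a naive level-of-detail pass that keeps every stride-th sample of a volume. The nested-list representation and the stride parameter are illustrative assumptions only; real LOD schemes for simulation output are more sophisticated (multi-resolution hierarchies, error-guided subsampling).

```python
def reduce_volume(volume, stride):
    """Naive level-of-detail reduction: keep every `stride`-th sample
    along each axis of a 3D volume given as nested lists."""
    return [[[volume[i][j][k]
              for k in range(0, len(volume[i][j]), stride)]
             for j in range(0, len(volume[i]), stride)]
            for i in range(0, len(volume), stride)]
```

A stride of 2 cuts the data volume by roughly a factor of eight, which is the kind of reduction that makes interactive work on a smaller-memory GPU cluster feasible in the 2nd-step scenario above.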

SLIDE 18

BINARY-SWAP IMAGE COMPOSITION

[Diagram: binary-swap composition over n nodes in log2(n) stages (Stage 1, Stage 2, ...), followed by an image-collect step; network contention appears at the later stages]
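A single-process sketch of binary-swap may help here. The pixel "compositing" operator below is a stand-in `max()`, where a real renderer would use depth-aware or alpha ("over") blending; everything else follows the standard scheme: at stage s each node pairs with the node whose rank differs in bit s, and its owned region halves, so after log2(n) stages each node holds 1/n of the final image.

```python
from math import log2

def composite(a, b):
    # Stand-in pixel operator; a real renderer uses depth-aware
    # or alpha ("over") blending here.
    return [max(x, y) for x, y in zip(a, b)]

def binary_swap(images):
    """Single-process simulation of binary-swap compositing.

    images[i] is node i's full locally rendered image (a flat pixel list);
    the node count must be a power of two and divide the image width.
    In a real implementation each node sends half of its current region
    to its partner, so traffic halves every stage; here that is only
    mimicked by shrinking the region each node is responsible for.
    """
    n = len(images)
    assert n & (n - 1) == 0, "binary-swap needs a power-of-two node count"
    width = len(images[0])
    region = [(0, width)] * n              # (lo, hi) slice owned by each node
    stage_imgs = [img[:] for img in images]
    for s in range(int(log2(n))):
        bit = 1 << s
        new_imgs, new_region = [None] * n, [None] * n
        for i in range(n):
            j = i ^ bit                    # exchange partner at stage s
            lo, hi = region[i]
            mid = (lo + hi) // 2
            # The lower-ranked node keeps the first half of the region.
            new_region[i] = (lo, mid) if i < j else (mid, hi)
            new_imgs[i] = composite(stage_imgs[i], stage_imgs[j])
        stage_imgs, region = new_imgs, new_region
    # Final image-collect step: node i contributes its 1/n-wide strip.
    final = [0.0] * width
    for i in range(n):
        lo, hi = region[i]
        final[lo:hi] = stage_imgs[i][lo:hi]
    return final
```

Because partners at each stage always own identical regions, every node ends with a disjoint 1/n strip of the fully composited image, which the collect step simply concatenates.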

SLIDE 19

MULTI-STEP IMAGE COMPOSITION

[Diagram: Multi-Step composition with n = p * m nodes. Step 1: Binary-Swap runs inside each of the m groups of p nodes. Step 2: the m local root nodes combine their partial images into the final composited image at a global root node, via Binary-Swap, Binary-Tree, or Direct Send]
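The node layout of the multi-step scheme can be sketched as follows; treating node 0 of each group as the local root, and node 0 overall as the global root, is an assumption for illustration, not a statement about the actual implementation.

```python
def multi_step_layout(n, p):
    """Layout for Multi-Step Image Composition with n = p * m nodes.

    Step 1: each group of p nodes runs binary-swap internally, so p is
    chosen to match a node count where binary-swap still scales well.
    Step 2: the m local root nodes combine their partial images at the
    global root (binary-swap, binary-tree, or direct send).
    """
    assert n % p == 0, "n must be a multiple of the group size p"
    m = n // p
    groups = [list(range(g * p, (g + 1) * p)) for g in range(m)]
    local_roots = [group[0] for group in groups]  # one root per group
    global_root = local_roots[0]
    return groups, local_roots, global_root
```

Splitting the composition this way keeps each binary-swap instance small, which limits the network contention noted on the previous slide.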

SLIDE 20

MSIC - BG/L (RIKEN)

BlueGene/L: dual core x 1024 nodes

SLIDE 21

MSIC - T2K (UNIV. OF TOKYO)

AMD Opteron: quad core x 4 sockets

SLIDE 22

OTHER SUB-SYSTEMS

 Workflow
  • Kepler
 Database
  • RDB? XML features?
 Scripting
  • Python plays the glue role
 Analysis
  • R
  • User programs
 Other useful existing tools…


SLIDE 23

CONCLUDING REMARKS

 Design and current development status
  • To assist discovery and understanding
 Database-centered structure
  • Sub-systems take part in the system: visualization, workflow, other programs
  • Resource management and access control provide data sharing
  • Reuse of existing useful software; script based
 The supercomputer will be fully operational in 2011
 Combining existing software, enhancing capability