
SLIDE 1

Introduction to Grid Computing

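Tokyo Institute of Technology, Global Scientific Information and Computing Center / Department of Mathematical and Computing Sciences / JST

Satoshi Matsuoka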

matsu@is.titech.ac.jp

SLIDE 2

Conventional High-Performance Computing (HPC)

Vector (parallel) supercomputers
  A single machine shared by many people
  A very small number of users
  Specialized technology and architecture
  Very expensive (hundreds of millions to tens of billions of yen per machine)
  Hard to use (though recent machines also run Unix)
  Batch systems
Physical simulation is the mainstream workload
  Structural analysis, fluids, other PDEs, QCD
  Sparse CG, Monte Carlo, etc.

SLIDE 3

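New HPC Paradigms

"Commoditization"
  Cluster Computing
  Global Computing (Grid Computing)
Both make HPC far cheaper and more general-purpose, with the potential for wide adoption
  They will become mainstream HPC in the 21st century, leading to broad adoption of HPC
Achieving world records on large-scale problems and solving unsolved problems
  Example: the NUG30 problem in operations research (OR)
Applying complex algorithms to real applications and to science and technology
  Economic Simulation, Biochemistry, Architecture, Control Theory, Architecture Simulation, Planning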

SLIDE 4

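Uses of the Internet

Communication medium
  E-mail, teleconferencing, CSCW, etc.
Wide-area database
  Web, Gopher, ftp archives, netnews, etc.
Source of computing power (?)
  Applets, etc.
There is not yet a standard infrastructure for "computation" ⇒ Global Computing, "The Grid"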

SLIDE 5

Toward Global Computing (Grid)

Sharing computing resources through a massively distributed "virtual parallel computer" on the Internet
Computing infrastructure, DB infrastructure, UI infrastructure

[Figure: progression from vector/MPP machines (massively parallel computers), to workstation clusters on high-speed networks (Fast Ethernet, ATM, Myrinet; MPI_SEND(...), MPI_RECEIVE(...), MPI_ISEND(...); multiple network boards, large-capacity disks), to Intranet + Internet, with end users at each stage]

SLIDE 6

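Toward Global Computing (2)

High-performance distributed computing over wide-area networks
  High-Performance Distributed Computing, "wide-area high-performance computing"
  A virtual large-scale parallel computer
  Metacomputing [Smarr87], "The GRID" [Foster et al. 98]
Software infrastructure for computation on the next-generation Internet
  An upper layer on top of existing software infrastructure
  Provision and standardization of services and protocols

(Grid book picture here)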

SLIDE 7

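Application Areas of Global Computing

Every field of science and technology
  Computational physics, computational chemistry, bioinformatics, medical informatics, space science, materials, nuclear energy, mechanical engineering, operations research, and many more
Fusion of large-scale parallel computing, large-volume remote data processing, and remote sensing
Replacement of dedicated parallel computers for engineering
"E-Science", "TeleScience"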

SLIDE 8

The Grid: The Web on Steroids

Web: Uniform access to HTML documents
Grid: Flexible, high-performance access to all significant resources
  Sensor nets, data archives, computers, software catalogs, colleagues
On-demand creation of virtual computing systems
Medium for virtual organizations

SLIDE 9

App. Example: Large-Scale Quadratic Assignment Problem w/MW

Exact solution of the "nug30" quadratic assignment problem on June 16, 2000
  14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23
Used the "MW" framework, which maps a branch-and-bound problem to a master-worker structure
Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1009 processors), using parallel computers, workstations, and clusters
MetaNEOS: Argonne, Northwestern, Wisconsin
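To make the nug30 numbers on this slide concrete, here is a minimal sketch in C. It assumes the standard Koopmans-Beckmann form of the quadratic assignment objective (flow times distance under a permutation); the slide does not state the formulation, and the function names are illustrative only. It also shows the average processor count implied by 3.46E8 CPU seconds over 7 days.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if p[0..n-1] is a permutation of 1..n, else 0 (n <= 64 assumed). */
int is_permutation(const int *p, size_t n)
{
    unsigned char seen[65] = {0};
    assert(n <= 64);
    for (size_t i = 0; i < n; i++) {
        if (p[i] < 1 || (size_t)p[i] > n || seen[p[i]])
            return 0;
        seen[p[i]] = 1;
    }
    return 1;
}

/* Assumed QAP objective: sum over i,j of flow[i][j] * dist[p[i]-1][p[j]-1].
 * flow and dist are n x n matrices stored row-major; p is 1-based. */
long qap_cost(const int *flow, const int *dist, const int *p, size_t n)
{
    long cost = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            cost += (long)flow[i * n + j]
                  * dist[(size_t)(p[i] - 1) * n + (size_t)(p[j] - 1)];
    return cost;
}

/* Average number of busy processors implied by a CPU-second total
 * accumulated over a wall-clock span given in days. */
double average_processors(double cpu_seconds, double wall_days)
{
    return cpu_seconds / (wall_days * 86400.0);
}

/* The permutation reported on the slide as the exact nug30 solution. */
const int NUG30_SOLUTION[30] = {
    14, 5, 28, 24, 1, 3, 16, 15, 10, 9, 21, 2, 4, 29, 25,
    22, 13, 26, 17, 30, 6, 20, 19, 8, 18, 7, 27, 12, 11, 23
};
```

Calling average_processors(3.46e8, 7.0) gives roughly 572, a little over half of the reported peak of 1009 processors, which is consistent with a pool of opportunistically harvested machines that come and go.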

SLIDE 10

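Running NUG30 on the Grid

Condor system
  Provides idle CPU resources on the Grid
MW library
  Uses Condor to run large-scale Master-Worker, Branch-and-Bound computations
Exploits resources worldwide, from MPP supercomputers to clusters to unused desktop workstations
Fault tolerance and recovery are the key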

SLIDE 11

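Conventional Distributed Computing Technologies and the Grid

Conventional distributed computing technologies
  telnet, rlogin, and other distributed-OS technologies
  Message passing: PVM, MPI
  WWW technologies: HTTP/HTML/CGI
  ORB (Object Request Broker): CORBA, DCOM, Java RMI
  Agent technologies (Plangent, Aglets, Voyager, etc.)
  and so on
None of them is suited to the Grid (global computing) as it stands
  Although each can be used for parts of it...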

SLIDE 12

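Technical Requirements of Global Computing

Extensibility
Security
  Unspecified users, mobile code
Fault tolerance
Heterogeneity of every kind
  Languages, OSes, hardware, administrative policies
High performance
  HPC, HTC
  Coping with high bandwidth and high latency
Site autonomy
  No root privileges
Scalability
  To worldwide scale
Resource allocation
  Computing resources, remote sensors, etc.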

SLIDE 13

State of the World: Grid Projects

1980s: distributed computing
Early 1990s: gigabit testbeds
  Mainly networking research
I-Way 1995
  Mainly application feasibility
Alliance (NCSA) Virtual Machine Room
PACIs (NCSA/SDSC NSF National Technology Grid) 1998~
NASA Information Power Grid 1999~
ASCI DISCOM 1999~
GriPhyN (Grid Physics Network), PPDG, 2000~
eGrid (European Grid), (EU/CERN) DataGrid, 2000~
ApGrid (Asia-Pacific Grid) 2000~
NCSA-SDSC Distributed Terascale System 2001~, IDVGL (Distributed Virtual Grid Lab) 2001~

SLIDE 14

GUSTO Computational Grid

SLIDE 15

Emerging Production Grids

NASA Information Power Grid
NSF National Technology Grid: NPACI (SDSC), Alliance (NCSA)

SLIDE 16

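NPACI Project

Mainly universities, plus NCSA and SDSC
  30+ universities
A large-scale project lasting 5 to 10 years
  50% of SDSC's resources go to NPACI
CS + applications
  Building real, working systems
The network is the ordinary research network
  NGI - vBNS (Internet2), Avelin (OC192)

[Photo: San Diego Supercomputing Center]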

SLIDE 17

NPACI Project (2)

Software infrastructure: Globus + Legion + NWS, etc.
Applications
  neuroscience: Mark Ellisman (UCSD)
  earth systems: Bernard Minster (SIO)
  molecular structure: Russ Altman (Stanford)
  engineering: Tinsley Oden (UTexas)

[Photo: Reagan Moore of SDSC, one of NPACI's technical VIPs, with the HPSS storage panel]

SLIDE 18

NPACI Project (3)

Digital Sky Survey
metacomputing, parallel tools, data-intensive computing, remote interaction
  Globus, Legion
  KeLP (Scott Baden, UCSD)
  Scalable dynamic data structures (UTexas, Jim Brown)
  Chaos (Joel Saltz, Maryland)
  Parallelizing Compiler (Kathy Yelick, UCB)

SLIDE 19

Emerging User Communities

NSF Network for Earthquake Engineering Simulation (NEES)
  Integrated instrumentation, collaboration, simulation
Grid Physics Network (GriPhyN)
  ATLAS, CMS, LIGO, SDSS
  World-wide distributed analysis of petascale data
Access Grid: supporting group-based collaboration

SLIDE 20

Software and Tools for the Grid

Toolkits, frameworks: Globus, Legion, AppLeS
Message passing: MPICH-G2
Distributed collaboration: CAVERNsoft, Access Grid
High-throughput computing: Condor-G, Nimrod-G
Distributed data management & analysis: Data Grid toolkits
GridRPC: Ninf, Netsolve, Nimrod
Desktop access to Grid resources: Commodity Grid Toolkits (CoG Kits)
Performance monitoring: NWS
Commercial: United Devices, Entropia, Platform Computing, Parabon

SLIDE 21

The Need for Grid Services

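[Figure: applications such as resource management, collaboration tools, data management tools, distributed simulation, ... built on shared Grid services such as remote access, remote monitoring, information services, fault detection, ..., all running over the network]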

SLIDE 22

The Globus Project: Developing a Grid Services Architecture

Argonne National Lab / USC-ISI, www.globus.org
The most popular Grid "Toolkit"
Developed the Grid Security Infrastructure as a key enabling mechanism
  Uniform authentication & authorization mechanisms in a multi-institutional setting
  Single sign-on, delegation, identity mapping
  Public key technology, SSL, X.509, GSS-API
Used to construct Grid resource managers that provide secure remote access to
  Computers: GRAM server (HTTP), secure shell
  Storage: storage resource managers, GSIFTP
  Networks: differentiated service mechanisms

[Photo: Globus project Co-PI Carl Kesselman]

SLIDE 23

Globus Application Example: Online Instrumentation

DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago

[Figure: Advanced Photon Source pipeline with real-time collection, tomographic reconstruction, archival storage, wide-area dissemination, and desktop & VR clients with shared controls]

SLIDE 24

The Global Grid Forum

http://www.gridforum.org

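Promotes research, dissemination, and standardization of Grid technology
  Founded in 1999 as the Grid Forum (modeled on the IETF)
  Merged with the eGrid Forum in late 2000 to become the GGF
Chair: Charlie Catlett (Argonne National Laboratory)
13 Working Groups
Three GGF meetings per year
From Japan: Muraoka (Advisory), Sekiguchi, Matsuoka (Steering)
2001: GGF1 in Amsterdam (w/EU DataGrid)
  More than 400 participants
  Next meeting: July 2001, Washington D.C.
  Possibly in Japan in 2002?
A critical mass of participants from Asia-Pacific is needed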

SLIDE 25

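Our Ninf (GridRPC) Project

A hardware- and OS-independent high-performance RPC system on the Grid
  Fortran, C/C++, Java, Mathematica, COM (Excel)
From the user's point of view: an ordinary library
  A Ninf RPC IDL and protocol that are dynamic and specialized for numerical libraries
Automatic resource allocation
  A metaserver assigns each computation to an appropriate Ninf server
Support for parallel processing
  Client side: task parallelism, transactions
  Server side: data parallelism (and task parallelism)
Feeding WWW and distributed-DB data directly into computation
  NinfDB, WebAccess, Matrix Workshop
Security for both users within an organization and unspecified users
  From campus-wide to global computing
www.ninf.org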

SLIDE 26

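Ninf Project Members

Agency of Industrial Science and Technology / AIST / University of Tsukuba / 物質研
  Satoshi Sekiguchi (AIST)
  Mitsuhisa Sato (U. Tsukuba)
  Hidemoto Nakada (AIST)
  Hiromitsu Takagi (AIST)
  Yoshio Tanaka (AIST)
  Osamu Tatebe (AIST)
Tokyo Institute of Technology
  Satoshi Matsuoka
  Kento Aida
  Hirotaka Ogawa
  Other students
Others: Kyoto University, etc.
Collaborations
  Netsolve, NWS (U. Tennessee): Jack Dongarra, Rich Wolski
  AppLeS (UCSD): Fran Berman, Henri Casanova
  Globus: Ian Foster (Argonne), Carl Kesselman (USC/ISI), etc.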

SLIDE 27

Ninf GridRPC API

Presents a shared-memory image for the arguments
Interface specified by dynamic IDL, with inter-argument dependency analysis, etc.
Asynchronous calls, transactions
Security and other Grid services are handled automatically

Ninf_call(FUNC_NAME, ....);
FUNC_NAME = ninf://HOST:PORT/ENTRY_NAME
C, C++, Fortran, Java, Lisp, COM, Mathematica, ...

    double A[n][n], B[n][n], C[n][n];   /* Data Decl. */
    dmmul(n, A, B, C);                  /* Call local function */
    Ninf_call("dmmul", n, A, B, C);     /* Call Ninf Func */

"Ninfy" via IDL descriptions

SLIDE 28

Examples of GridRPC/NES systems

Netsolve (UTK)
Ninf (AIST/TITECH)
(Nimrod (/G)) (Monash)
Punch (Purdue)
RCS
CORBA-based systems (several)
...etc.

[Figure: software layers as listed on the slide: Clusters; Network Layer; Condor, Globus; higher-level Grid middleware (e.g. GFarm, NWS); NES; Application]

SLIDE 29

Architectural Overview Ninf (GridRPC) System

[Figure: Ninf (GridRPC) system architecture.
 Client side: a client program calls Ninf_call("linpack", ..) through the Ninf client library; a monitor/client proxy sits beside it.
 Server side: Ninf servers exec Ninf libraries (Ninf executables); a monitor/server proxy sits beside them.
 MetaServer (agent): directory service, DB, scheduler, and monitor/probe, fed by throughput and load measurements.
 Client and server communicate via GridRPC.]
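The scheduler in the MetaServer picks a server using the throughput and load measurements shown in the architecture figure. The slides do not give the actual policy, so the following C sketch is only an assumed cost model: estimated time is transfer time plus compute time on the idle share of the CPU. The struct fields and function names are hypothetical and are not part of Ninf.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of one server as a monitor/probe might report it. */
struct server_info {
    double peak_flops;      /* peak floating-point rate, flop/s */
    double load;            /* fraction of CPU already busy, 0.0 <= load < 1.0 */
    double throughput_bps;  /* measured client-to-server throughput, bytes/s */
};

/* Assumed cost model: transfer time plus compute time on the idle CPU share. */
double estimate_seconds(const struct server_info *s,
                        double flop_count, double bytes)
{
    assert(s->load >= 0.0 && s->load < 1.0);
    assert(s->peak_flops > 0.0 && s->throughput_bps > 0.0);
    return bytes / s->throughput_bps
         + flop_count / (s->peak_flops * (1.0 - s->load));
}

/* Returns the index of the server with the smallest estimate,
 * or -1 if there are no servers. */
int pick_server(const struct server_info *servers, size_t count,
                double flop_count, double bytes)
{
    int best = -1;
    double best_time = 0.0;
    for (size_t i = 0; i < count; i++) {
        double t = estimate_seconds(&servers[i], flop_count, bytes);
        if (best < 0 || t < best_time) {
            best = (int)i;
            best_time = t;
        }
    }
    return best;
}
```

Under this model a call that moves a lot of data but computes little goes to the server with the best link, while a compute-heavy call with small arguments goes to the fastest idle CPU.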

SLIDE 30

Basic Ninf Client API

Ninf_call(FUNC_NAME, ....);
FUNC_NAME = NAME | ninf://HOST:PORT/ENTRY_NAME
API for C, C++, Fortran, Java, Lisp, COM, Mathematica, Jini (JiPANG)...
No client stub generation (cf. CORBA)

    double A[n][n], B[n][n], C[n][n];   /* Data Decl. */
    dmmul(n, A, B, C);                  /* Call local function */
    Ninf_call("dmmul", n, A, B, C);     /* Call Ninf Func */

"Gridify" via IDL descriptions
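The FUNC_NAME grammar above (a bare NAME, or ninf://HOST:PORT/ENTRY_NAME) can be illustrated with a small parser. This is a sketch only: the struct, the function name, and the buffer limits are assumptions made for illustration and are not taken from the Ninf client library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical parsed form of a FUNC_NAME string. */
struct ninf_name {
    char host[128];   /* empty when only a bare NAME was given */
    int  port;        /* 0 when only a bare NAME was given */
    char entry[128];  /* entry (function) name */
};

/* Parses "NAME" or "ninf://HOST:PORT/ENTRY_NAME".
 * Returns 0 on success, -1 on a malformed name. */
int parse_func_name(const char *s, struct ninf_name *out)
{
    static const char prefix[] = "ninf://";
    const size_t plen = sizeof prefix - 1;
    int consumed = 0;

    assert(s != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    if (strncmp(s, prefix, plen) != 0) {
        /* Bare NAME: the entry is the whole string. */
        if (*s == '\0' || strlen(s) >= sizeof out->entry)
            return -1;
        strcpy(out->entry, s);
        return 0;
    }
    /* HOST up to ':' or '/', then ":PORT", then "/ENTRY_NAME". */
    if (sscanf(s + plen, "%127[^:/]:%d/%127[^/]%n",
               out->host, &out->port, out->entry, &consumed) != 3)
        return -1;
    /* Reject trailing characters and out-of-range ports. */
    if (s[plen + (size_t)consumed] != '\0' || out->port <= 0 || out->port > 65535)
        return -1;
    return 0;
}
```

For example, "ninf://host.example:3000/dmmul" would split into host "host.example", port 3000, and entry "dmmul", while a bare "dmmul" leaves host and port empty so that a metaserver could choose the server.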

SLIDE 31

Ninf Interface Description (Ninf IDL)

IDL information:
  library function's name and its alias (Define)
  arguments' access mode and data type (mode_in, out, inout, ...)
  required library for the routine (Required)
  computation order (CalcOrder)
  source language (Calls)

    Define dmmul(long mode_in int n,
                 mode_in double A[n][n],
                 mode_in double B[n][n],
                 mode_out double C[n][n])
    "description"
    Required "libXXX.o"
    CalcOrder n^3
    Calls "C" dmmul(n,A,B,C);
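The IDL above describes a routine dmmul whose body is not shown on the slides. As an illustration of what such a server-side library routine could look like, here is a plain C matrix multiply. The row-major flat-array layout and the const qualifiers are assumptions; the triple loop performs n^3 multiply-adds, which is what CalcOrder n^3 declares.

```c
#include <assert.h>
#include <stddef.h>

/* C = A * B for n x n matrices stored row-major in flat arrays.
 * A and B are inputs (mode_in), C is the output (mode_out). */
void dmmul(int n, const double *A, const double *B, double *C)
{
    assert(n > 0);
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++)
                sum += A[(size_t)i * n + k] * B[(size_t)k * n + j];
            C[(size_t)i * n + j] = sum;
        }
    }
}
```

Because the arithmetic grows as n^3 while the data shipped grows as n^2, larger matrices make a remote call increasingly worthwhile, which is presumably why the IDL records the computation order.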

SLIDE 32

"Thin Client" Management in Netsolve, Ninf

Two-phase, runtime exchange of interface info
  No client stub routines (cf. CORBA, more like Java Jini)
  No modification of the client program when the server's libs are updated
  Client library stays relatively static

[Figure: the client program, through the client library, sends an interface request to the Ninf server; the stub program of the Ninf library program returns the interface info; the client then sends the arguments and receives the result]
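One consequence of fetching interface info at run time is that the client library can work out, without any generated stub, which arguments to ship and which to expect back. The sketch below assumes a much simplified interface record (access mode, element size, and a power of n for the element count); the real Ninf interface format is not given on the slides, so all names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum arg_mode { MODE_IN, MODE_OUT, MODE_INOUT };

/* Hypothetical interface record for one argument: element size in bytes
 * and the power of n giving its element count (0 = scalar, 2 = n x n). */
struct arg_info {
    enum arg_mode mode;
    size_t elem_size;
    unsigned dim_power;
};

/* Integer power by repeated multiplication. */
static size_t ipow(size_t base, unsigned exp)
{
    size_t r = 1;
    while (exp-- > 0)
        r *= base;
    return r;
}

/* Computes the bytes the client sends (IN, INOUT) and
 * receives (OUT, INOUT) for one call with problem size n. */
void transfer_sizes(const struct arg_info *args, size_t nargs, size_t n,
                    size_t *send_bytes, size_t *recv_bytes)
{
    assert(send_bytes != NULL && recv_bytes != NULL);
    *send_bytes = 0;
    *recv_bytes = 0;
    for (size_t i = 0; i < nargs; i++) {
        size_t bytes = args[i].elem_size * ipow(n, args[i].dim_power);
        if (args[i].mode != MODE_OUT)
            *send_bytes += bytes;
        if (args[i].mode != MODE_IN)
            *recv_bytes += bytes;
    }
}
```

For the dmmul interface (one int and two input matrices sent, one output matrix returned), n = 100 would mean sending two 10,000-element matrices plus the scalar and receiving one 10,000-element matrix.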

SLIDE 33

Using Ninf to “Gridify” a Library/Application

(1) Write a (Ninf) IDL for the library/app
(2) Run the Ninf interface generator on the server
      → stub programs and Makefile
(3) Compile the library program and link it with the Ninf stub
      → Ninf executable
(4) Register the Ninf executables with the Ninf server
(5) Your app/lib is now Gridified. Away you go!

SLIDE 34

A Simple Programming Example: NAS Parallel Benchmark EP

Ninf EP (asynchronous calls):

    for (i = 0; i < PU; i++) {
        Ninf_call_async(buffer, i, NPP, ktmp+i, xsumtmp+i, ysumtmp+i, qtmp[i]);
    }
    Ninf_wait_all();

Ninf EP (transaction):

    Ninf_transaction_begin();
    for (i = 0; i < PU; i++) {
        Ninf_call(buffer, i, NPP, ktmp+i, xsumtmp+i, ysumtmp+i, qtmp[i]);
    }
    Ninf_transaction_end();

PVM EP:

    if (ptid != PvmNoParent) {
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&k, 1, 1);
        pvm_pkdouble(&xsum, 1, 1);
        pvm_pkdouble(&ysum, 1, 1);
        pvm_pkint(q, 10, 1);
        pvm_send(ptid, M_RES);
    } else {
        for (i = 1; i < PU; i++) {
            pvm_recv(tids[i], M_RES);
            pvm_upkint(&ktmp, 1, 1);
            pvm_upkdouble(&xsumtmp, 1, 1);
            pvm_upkdouble(&ysumtmp, 1, 1);
            pvm_upkint(qtmp, 10, 1);
            :
        }

SLIDE 35

Why not just use CORBA, then?

Why define GridRPC/NES for Grids?

Are general-purpose ORBs such as CORBA sufficient?

↓

Our paper "Are Global Computing Systems Useful?" [IEEE IPDPS2000]
Compares qualitative usability as well as performance
  Ease of writing and maintaining client programs
  Ease of installing and managing the whole system
  Performance for Grid-enabling a library/application

SLIDE 36

GridRPC/NES Applications and their Programming Models

Portals and Application Servers (ASP-like)
  Gridified numerical libs (Gridified ScaLAPACK, etc.)
  Gridifying application services as portals (NetCFD)
Parameter Sweep (Parallel Fork-Join)
  Netsolve/AppLeS: MCell (Neuroscience)
  Ninf: DOS, N-cyclic polynomial satisfaction (Operations Research)
Complex
  Netsolve: various apps on the SinRG project (Pipeline)
  Ninf: OR SCRM optimizer (Iterative + PS), BMI optimizer (Branch&Bound)
  Gfarm DataGrid Project (new version of Ninf/GridRPC, massively data intensive)

SLIDE 37

Portal Applications and Tools : NetCFD

Ninf computational component for CFD
  A parallelized CFD program is "ninfied" on the Ninf server, providing an interface to a parallel CFD program running on an MPP, PC cluster, ...
Uses a callback function to drive the browser GUI
http://pdplab.trc.rwcp.or.jp/netCFD/

SLIDE 38

PRESTO Grid Clusters at Matsuoka Lab (2Q2001)

Presto I
  64 Celeron 500MHz, 384MB/node
  Linux + RWC SCore + our stuff
  Semi-production, parallel OR algorithm on the Grid
Prospero (Presto II)
  256 procs (64-node PIII-800 x2 SMP 640MB + 128-node Celeron-800 256MB), 2-trunked 100Base-T, 3TB storage
  General-purpose cluster research, Grid simulation, Grid app. runs (incl. over the Pacific)
Presto III
  Athlon, 78 procs, 1.33GHz, >768MB, Myrinet 2K, 15TB storage
  "Gfarm" prototype
Pom
  Heterogeneous development cluster
  12-node, PIII-500MHz x2 or Celeron 300 x2, etc.
Parakeet
  Plug & Play clustering
  20 high-performance notebooks (600MHz Mobile Celeron)
Leverage commodity PCs: ~440 procs, ~500 GFLOPS, ~40 KVA
Greater than Titech GSIC (CC) @ 1/50 cost
700 procs, 1.5 TeraFlops by 1-2Q2002
> 1.5 Teraflops by 1Q 2002

SLIDE 39

TITECH Campus Grid Proposal (“Field of Dreams”)

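On-campus Titech Grid users
  Priority use of their own department's Grid cluster (hundreds of GFLOPS, tens of TeraBytes)
  Use of idle resources on the other on-campus Grid clusters
  Storage of enormous volumes of observational and computed data, and easy sharing with researchers on and off campus
  In the future, use of computing and storage resources at other campuses and on Grids worldwide

A collection of departmental clusters hosted virtually at the information center
Departmental cluster computers
  50-100 times the cost/performance of a supercomputer thanks to commodity technology
  128 processors, hundreds of GFLOPS, tens of TeraBytes
  Centrally managed by the information center
  Various cluster-parallel and Grid software
Super TITANET

Worldwide, the Grid is rapidly being researched and developed as the new infrastructure for E-Science
  NSF Alliance/NPACI (figure above)
  NASA PDG 1999
  DOE ASCI DISCOM 1999
  NSF GriPhyN (Grid Physics Network) 2000~
  CERN/EU DataGrid 2000~
  eGrid (European Grid), ApGrid (Asia-Pacific Grid) 2000~
  Each project has a budget in the billions of yen over 3-10 years
International forum: Global Grid Forum (1999)
  14 Working Groups
  Dissemination and standardization of Grid technology
  More than 400 people from around the world attended the last meeting in Amsterdam
  From Japan, Tokyo Tech, Waseda University, Osaka University, and AIST take part; Tokyo Tech is a member of the Steering Group
Development of Grid infrastructure software
  Globus (USC/ISI)
  Legion (U. Virginia)
  Netsolve, NWS (U. Tennessee)
  Nimrod (U. Melbourne)
  Ninf (AIST (formerly ETL) / Tokyo Tech)
  DataFarm (KEK / AIST (formerly ETL) / Tokyo Tech / U. Tokyo)
Campus Grid construction projects
  Millennium (UC Berkeley)
  SinRG (U. Tennessee)
  → Similar in that they build a campus Grid from clusters distributed across campus, but this proposal provides tens of times more computing and storage to E-Science

The Titech Grid as a whole provides on-campus E-Science with 100 times the resources of the campus supercomputer, using Grid and PC cluster technology
  30 TeraFlops-class computing power, Petabyte-class storage
  → Just as Japan's Internet began at this university, become an international COE in Grid infrastructure as well
  → Make this university a new hub for E-Science
Links to off-campus Grids via SuperSINET and similar networks

New-generation E-Science applications that demand TeraFlops, Tera~PetaBytes, and Giga~Tera bps
  High-energy physics (petabyte data analysis)
  Astronomy, geophysics, planetary physics
  Genome informatics, protein structure determination, molecular-level drug discovery
  Nanotechnology, molecular computer design
  Disaster simulation
  → Cannot be handled by modest incremental upgrades to conventional supercomputers (supercomputers are poorly suited to them, and the cost/performance is unrealistic)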

SLIDE 40

PRESTO and Titanium Clusters Peak Performance Scaling

[Chart: peak performance in GigaFLOPS on a log scale from 1.E-01 to 1.E+04.
 Categories, in slide order: Sparc Cluster (1996), RWC Cluster (1997), Presto I (1998), Presto II (1999), Presto II upg. 1H2K, Presto II upg. 2H2K, Presto Total, Presto IV, Campus Grid.
 Data labels, in slide order: 0.7, 2, 22, 64, 104, 205, 500, 1000, 10000]

SLIDE 41

Application Examples: Distributed Supercomputing

SF-Express Distributed Interactive Simulation: Caltech, USC/ISI

Starting point: SF-Express parallel simulation code
Globus mechanisms for
  Resource allocation
  Distributed startup
  I/O and configuration
  Fault detection
100K vehicles using 13 computers, 1386 nodes, 9 sites (5 years early!)

[Figure: participating machines include NCSA Origin, Caltech Exemplar, CEWES SP, Maui SP]

SLIDE 42

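Application Example: Operations Research

OR problems ⇒ high-order, non-polynomial complexity, large-scale combinatorial optimization
Cluster and Grid computing make OR easier to access and make it possible to take on large-scale, hard problems
NEOS Project (Argonne National Lab)
Joint research between our lab and the Kojima Lab at Tokyo Tech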

SLIDE 43

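NEOS System

Applied Math Group at Argonne National Laboratory
A "portal" site for optimization problems
Users select a solver, submit the problem by e-mail or a submission tool, have it run remotely, and get the results back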

SLIDE 44

Ninf Applications to OR (joint research with the Kojima Lab, Tokyo Tech)

Use the Ninf system and Presto clusters on the Grid testbed
SCRM (Generalized Quadratic Optimization Algorithm)
  Solves non-convex optimization problems

[Figure: ApGrid Testbed. Sites: APAN Tokyo, RWCP, TITECH, AIST/TACC, Kyoto-U, linked via TransPAC (100 Mbps); each site runs NWS sensors and virtual/real clients]

SLIDE 45

Ninf Grid Applications to OR(2)

SCRM (Generalized Quadratic Optimization Algorithm)
Iterative execution of multiple Semi-Definite Programming solvers w/Ninf via Master-Worker
Some problems show a 100-fold speedup on 128 procs (execution-time world record)

[Chart: parallel execution of non-convex quadratic programming problems with the SCRM method on the PRESTO cluster. X axis: # Processors (1, 2, 4, 8, 16, 32, 64). Y axis: execution time (seconds). Series: NQP15_1.dat, NQP12_1.dat]
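To put the reported 100-fold speedup on 128 processors in context, the following C sketch computes two standard derived quantities: parallel efficiency (speedup divided by processor count) and the Karp-Flatt experimentally determined serial fraction. These metrics are not mentioned on the slide; they are added here only as a reading aid, and the function names are illustrative.

```c
#include <assert.h>

/* Parallel efficiency: achieved speedup as a fraction of the ideal. */
double parallel_efficiency(double speedup, int procs)
{
    assert(procs > 0);
    return speedup / procs;
}

/* Karp-Flatt metric: e = (1/S - 1/p) / (1 - 1/p) for p > 1.
 * Estimates the serial (non-parallelizable) fraction from a measurement. */
double karp_flatt(double speedup, int procs)
{
    assert(procs > 1 && speedup > 0.0);
    double inv_p = 1.0 / procs;
    return (1.0 / speedup - inv_p) / (1.0 - inv_p);
}
```

With speedup 100 and 128 processors, the efficiency is 100/128 = 0.78125 and the Karp-Flatt fraction is about 0.0022, that is, roughly 0.2% of the work behaves as serial overhead.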

SLIDE 46

Ninf Application to OR - Results

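Solving large-scale non-convex quadratic programming problems with the SCRM method (Kojima-Tuncel-Takeda): large-scale cluster (128 processors) ⇒ world record achieved
Solving Cyclic Polynomial Equations with the homotopy method: large-scale cluster (128 processors), order 13 ⇒ world record achieved
Practical OR problems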

SLIDE 47

MCell: Collaboration with UCSD

Salk Institute and CMU
General simulator for cellular microphysiology
Revolutionary disciplinary results
3-D Monte-Carlo simulation
Embarrassingly parallel with file sharing
Small-scale runs
Medium-scale runs
Large-scale runs: > 500,000 tasks, terabytes of data

SLIDE 48

Mcell (Cont’d)

[Figure: input files (e.g. 150MB) → tasks → output file → post-processed output → 3-D rendering]

Use Ninf-Netsolve for a large application run over the Pacific (SDSC ASCI Blue Horizon, ORNL cluster, PRESTO clusters)

SLIDE 49

Mcell APST Validation Experiments

[Figure: sites on both sides of the Pacific Ocean, driven by an APST client and APST daemon
 University of Tennessee, Knoxville: NetSolve + IBP
 University of California, San Diego: GRAM + GASS
 Tokyo Institute of Technology (PRESTO Cluster): NetSolve + NFS, NetSolve + IBP]

SLIDE 50

TELESCIENCE: Richly Integrated, End-to-End System

SLIDE 51

NCMIR Telemicroscopy
Remote Access for Data Acquisition and Analysis

[Figure: imaging instrument (scanning, acquisition), supercomputing (reconstruction, segmentation), and large-scale databases (visualization, measurement), linked by the network and GLOBUS - GRID]

Data sizes - now
  M-JPEG Video: 80 Kbps
  Digital Video: 36 Mbps
  HiRes Image: 16 MB
  Tilt Series: 1936 MB
  Raw Volume: 8 GB
  Refined Volume: ???

SLIDE 52

APAN Trans-Pacific Telemicroscopy Collaboration, Osaka-U, UCSD, ISI

[Figure: network map. Labels: UHVEM (Osaka, Japan), CRL/MPT, Tokyo XP, TransPAC, STAR TAP (Chicago), vBNS, SDSC (San Diego), UCSD, NCMIR (San Diego); 1st and 2nd configurations between NCMIR (San Diego) and UHVEM (Osaka, Japan); Globus]

(slide courtesy of Mark Ellisman@UCSD)

SLIDE 53

Computer meets Medicine

Osaka-U / iHPC Singapore example
MEG (Magnetoencephalography)
Dr. Nogawa (Osaka-U)

SLIDE 54

System Overview

[Figure: data acquisition module: MEG (CTF Systems Inc.) at a hospital in your city. Computation module: supercomputer at the Cybermedia Center, Osaka University & Singapore iHPC. Visualization module: neurologist (specialist in brain disease)]

Globus: networked virtual supercomputer
MPI: parallel computing
Wavelet analysis: advanced signal processing

SLIDE 55

Grid Collaboration Environment for MEG

iHPC
  Dual Pentium III 500MHz x 7 nodes
Osaka University
  EV56 500MHz x 8
  Pentium III 550 MHz x 1
Globus

SLIDE 56

Petascale Data Intensive Computing / Large-scale Data Analysis

Data intensive computing, large-scale data analysis, data mining
  High Energy Physics (e.g. CERN LHC)
  Astronomical observation, Earth science
  Bioinformatics...
  Good support still needed
Large-scale database search, data mining
  E-Government, E-Commerce, data warehouses
  Search engines
  Other commercial stuff

SLIDE 57

Example: Large Hadron Collider Accelerator at CERN and the ATLAS Detector

[Figure labels]
ATLAS Detector: 40m x 20m, 7000 tons (a truck is drawn for scale)
LHC perimeter: 26.7km
Other detectors, e.g. CMS (4 total)
~2000 physicists from 35 countries

SLIDE 58

CERN/ATLAS High Energy Physics Data Analysis – PetaScale Online Data Processing

[Figure: per-event data flow through four stages; sizes are listed in slide order]
RAW (~1 PB/year; 1 MB/event, 30 MB/sec)
  tracker-1 digits, tracker-2 digits, calorimeter-1 digits, calorimeter-2 digits, magnet-1 digits
  → tracker-1 and tracker-2 reconstruction, calorimeter-1 and calorimeter-2 reconstruction, magnetic-field reconstruction algorithms
REC (~1 PB/year)
  tracker-1 position info, tracker-2 position info, magnet-1 field, calorimeter-1 energy, calorimeter-2 energy
  → track reconstruction algorithm, cluster reconstruction algorithm
ESD (~300 TB/year; 100 KB/event)
  track-1 to track-5, cluster-1 to cluster-3
  → jet identification, electron identification, Et miss identification algorithms
AOD (~10 TB/year; 10 KB/event)
  jet 1, jet 2, electron1, electron2, photon1, Et miss
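The RAW figures on this slide (1 MB/event, 30 MB/sec, ~1 PB/year) can be checked with simple arithmetic. The C sketch below assumes decimal units (1 MB = 10^6 bytes) and continuous data taking for a 365-day year; both are assumptions, since the slide does not state the duty cycle.

```c
#include <assert.h>

/* Event rate implied by a byte rate and a fixed event size. */
double events_per_second(double bytes_per_second, double bytes_per_event)
{
    assert(bytes_per_event > 0.0);
    return bytes_per_second / bytes_per_event;
}

/* Bytes accumulated in one 365-day year at a constant byte rate. */
double bytes_per_year(double bytes_per_second)
{
    return bytes_per_second * 365.0 * 86400.0;
}
```

At 30 MB/sec and 1 MB/event this gives 30 events per second and about 9.5e14 bytes per year, which rounds to the slide's ~1 PB/year.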

SLIDE 59

Grid-based Peta/Exascale Data Intensive Computing Requirements

Peta/Exabyte-scale files
Scalable parallel I/O throughput
World-wide group-oriented authentication and access control
Resource management and scheduling
Data sharing and efficient access
Program sharing
System monitoring and administration
Fault tolerance / dynamic re-configuration
Global computing environment

SLIDE 60

Design of the Grid Data Farm (KEK, AIST, U. Tokyo, Tokyo Tech, etc.)

Based on Grid technology & PC cluster technology
Peta-to-Exascale global filesystem
Parallel I/O and parallel processing

SLIDE 61

Gfarm: Grid Testbed

[Figure: network image. Sites: CRL, Tokyo Inst. Tech., Osaka-U, NAIST, CERN, KEK, AIST, ANL, UIC. Networks: JGN (Tokyo, Osaka, Tsukuba; JGN ATM links), STARTAP, IMnet, Internet]

SLIDE 62

Gfarm Development Cluster (Prototypes) at Titech

Prototype 1 – Presto III Cluster (1/50th scale) @ Titech
  AMD Athlon 1.33 GHz x 128 nodes
  768MB mem, 200GB HDD/node
  100GB mem, 25TB HDD total
  300 Gflops peak, Myrinet 2K
  78 nodes currently operational
  AMD press release today
Prototype 2 – Presto IV (1/20th scale)
  Design mostly done
  0.13 micron Athlon, 2 GHz x 128 dual nodes
  400GB disk/node, 50 TB total
  1 TFlops peak, Myrinet 2K
  Operational by 1Q2002 @ Titech (our lab)
Domestic and international data challenges

SLIDE 63

Initial Design of the Production Gfarm Cluster in 2005

Assume Infiniband or similar technology
In-box CPUs and disks locally interconnected via Infiniband
Inter-box connection via Infiniband, direct interconnect into local fabric
High-density disk packaging required; cooling is a big problem; engineering development needed
Somewhat different from business web server technology because of the high computing and I/O capacity requirements

SLIDE 64

Initial Design of the Production Gfarm Cluster (2)

Design of Gfarm Node
  Commodity technology circa 2005
  300 GByte low-power HD drives, RAID 5, 25 drives/box => 6 Terabytes/box (Plug&Play, active cooling)
  > 10 GigaFlop SMT 64-bit CPU x 4-8, > 20GB RAM
  Multi-channel, multi-gigabit LAN, > 10 GigaBps
  4U box, 600W power/box, active cooling
Design of Gfarm System (2004-5 production)
  60 TeraBytes @ 250 disks, 40 CPUs per 40U chassis, 5 KWatts
  20 chassis, 1.2 PetaBytes @ 5000 HDDs, 8-16 Teraflops @ 800 CPUs, 100 KWatts, 3 Petabyte tape storage
  Direct Infiniband link into the WAN fabric
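The capacity figures in this design follow from a few multiplications, sketched below in C. Two inputs are assumptions not stated on the slide: the RAID 5 group size (5 drives per group, which reproduces the quoted 6 TB from 25 drives of 300 GB) and 4 CPUs per box (implied by 40 CPUs per 40U chassis of 4U boxes). All type and function names are illustrative.

```c
#include <assert.h>

struct box_spec {
    int drives;        /* drives per box (25 on the slide) */
    double drive_gb;   /* capacity per drive in GB (300 on the slide) */
    int raid_group;    /* drives per RAID 5 group; 5 is an assumption */
    int cpus;          /* CPUs per box; 4 is inferred from 40 CPUs/chassis */
    int rack_units;    /* box height in U (4 on the slide) */
};

struct system_totals {
    double usable_tb;  /* usable disk capacity in TB */
    int drives;        /* total drive count */
    int cpus;          /* total CPU count */
};

/* Usable capacity of one box: each RAID 5 group loses one drive to parity. */
double box_usable_tb(const struct box_spec *b)
{
    assert(b->raid_group >= 2 && b->drives % b->raid_group == 0);
    int groups = b->drives / b->raid_group;
    int data_drives = groups * (b->raid_group - 1);
    return data_drives * b->drive_gb / 1000.0;
}

/* Number of boxes that fit into a chassis of the given height. */
int boxes_per_chassis(const struct box_spec *b, int chassis_units)
{
    assert(b->rack_units > 0);
    return chassis_units / b->rack_units;
}

/* Totals for a system of chassis_count chassis, each chassis_units high. */
struct system_totals gfarm_totals(const struct box_spec *b,
                                  int chassis_units, int chassis_count)
{
    int boxes = boxes_per_chassis(b, chassis_units) * chassis_count;
    struct system_totals t;
    t.usable_tb = box_usable_tb(b) * boxes;
    t.drives = b->drives * boxes;
    t.cpus = b->cpus * boxes;
    return t;
}
```

With the slide's values this yields 6 TB per box, 10 boxes per 40U chassis, and for 20 chassis 1200 TB (1.2 PB), 5000 drives, and 800 CPUs, matching the quoted system figures; at 10 to 20 GFlops per CPU, 800 CPUs also give the quoted 8-16 Teraflops.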

SLIDE 65

Grid Data Farm Development Schedule

Initial Prototype 2000-2001
  Gfarm filesystem, metadata management, data streaming, and GridRPC
  Mock Data Challenge (Monte Carlo)
  Deploy on the development Gfarm cluster
Second Prototype 2002(-2003)
  Load balance, fault tolerance, scalability
  Accelerate through the national "Broadband Computing Initiative" proposal
Full Production Development (2004-2005 and beyond)
  Deploy on the production Gfarm cluster
  Petascale online storage
Synchronize with the ATLAS schedule
  ATLAS-Japan Tier-1 RC is the "prime customer"

[Figure: Tsukuba WAN and Super SINET. KEK and ETL/TACC are 5km apart; U-Tokyo (60km), TITECH (80km). Today 135 Mbps; planned 10xN Gbps]

SLIDE 66

Summary of GFarm

Petabyte-scale data-intensive computing is a new wave of computational science
It poses extreme challenges to HPC; current HPC solutions are not well applicable
Grid and commodity cluster technology are viable solutions
Existing Grid infrastructure can be utilized, but further research and development is required
Grid Data Farm builds on the success of Ninf (and other Grid projects such as Globus) to cope with this challenge

SLIDE 67

Growing Interest for the Grid in Japan and Asia-Pacific

Various Collaboration Successes
  HE Electron Microscope (Osaka-U/UCSD)
  Remote Magnetoencephalography (Osaka-U/iHPC Singapore)
  Operations Research (TITECH/Kyoto-U)
  ATLAS-Gfarm
Interest Growing Rapidly
  Astronomy, Subaru and Bisei Telescopes (NAO, CRL)
  Lunar Exploration, SELENE (NASDA)
  Earthquake Measurement (Bosai)
  Genome Informatics (Riken, JAIST, etc.)
Tsukuba WAN/ONE Meeting
  Over 250 participants
The Next Big National Project...

[Photos: observatories "Mecca" in Hawaii; VLBI: Kashima 34m telescope; 3D earthquake simulator in Miki]

SLIDE 68

Grid Networking Infrastructure: Tsukuba WAN

Tens of national labs in Tsukuba
Six supercomputers ranked in the top 100 of the TOP500
330 Gbps DWDM ring
Testbed for Grid infrastructure and applications

[Figure: Tsukuba WAN route map; total perimeter 46.4km. Sites: STA/IMnet, KEK, NTT AS lab, NRED, Tsukuba Univ., STACI, Material, AIST, NIES, Maffine, ULIS, TAO/JGN]

SLIDE 69

National Broadband Computing Initiative

Distributed processing of petabytes of data over WAN

[Figure labels]
Cluster component: CPU + disk (64-bit CPU, massive HDD)
Interconnection: Giga Ethernet, Myrinet, Infiniband
Federation of multiple petascale clusters over the Grid
Internet at 10 Gbps-Tbps (Super SINET), DWDM
Automated distribution
"Post-clusters"

SLIDE 70

Asia-Pacific Grid ApGrid (www.apgrid.org)

Common framework for Asia-Pacific Grid researchers
Represent AP interests to GGF
Collaborate with APAN/TransPAC
Voluntary framework: not a project funded from a single source

[Figure: map. Labels: North America (STARTAP), Europe, Latin America, South Korea, Japan, China, Hong Kong, Thailand, Malaysia, Singapore, Indonesia, Philippines, Australia; TransPAC (100 Mbps)]

SLIDE 71

ApGrid Grid Services and Testbed

Provide "free" Grid resources as a testbed
Port and provide various Grid services
  Ninf/GridRPC/Netsolve, Gfarm, Globus, NWS, PACX, Stampi, RealGrid, APST, Legion, Cactus...
Collaboration from Japanese SC companies
Would like to collaborate w/other Grid partners

[Figure: ApGrid Testbed. ApGrid nodes in Japan: APAN Tokyo, KEK, U-Tokyo, TITECH, ETL/TACC, Kyoto-U, each with NWS sensors and virtual/real clients; TransPAC (100 Mbps) to STAR TAP Chicago and US and European partners; ApGrid nodes in Korea, Singapore, Australia, etc.]

SLIDE 72

ApGrid Locations/Potential Partners

Japan
  AIST / Tsukuba Advanced Computing Center
  Universities: TIT, Kyushu, Kyoto, Waseda, Osaka, Tsukuba
    Computing Center, labs
  KEK (Gfarm)
  Real World Computing Partnership
  Other govt. labs, universities
  NEC, Fujitsu, Hitachi, ...
Australia
  ANU/APAC, Monash U
United States
  PNNL (other labs and centers?)
Korea
  KISTI: Korea Institute of Science and Technology Information
  Planning to have a link to Europe
Thailand
  NECTEC: National Electronics and Computer Technology Center
Taiwan
  NCHC: National Center for High-Performance Computing
Potential Asian partners
  Singapore (NUS)
  Malaysia
  ROC
  Hong Kong
  Other APAN members

SLIDE 73

APGrid System resources

From various computing centers, labs
  Vector supercomputers
    SR8000, SR2201, VPP, SX
  SMP/NUMAs/MPPs
    IBM SPs, Origin 2K, Sun Enterprise Servers
Large-scale clusters
  Several 100 procs each
  I will be submitting 256 procs
Federation of Grid clusters
  Multiple 1000s-node, terascale clusters in 2001-2002

SLIDE 74

APGrid Locations/Potential Partners

Japan
  AIST / Tsukuba Advanced Computing Center / ETL
  Kyushu University
    Computing Center, labs
  Kyoto-U (several labs)
  Waseda U, Osaka-U, KEK (DataGrid)
  Other govt. labs, universities
Australia
  ANU, Monash U
United States
  PNNL (other labs and centers?)
Potential Asian partners
  Korea (KORDIC)
  Singapore (NUS)
  Malaysia
  Thailand
  ROC, Hong Kong, Taiwan
  Other APAN members

SLIDE 75

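Summary (1)

HPC is entering a period of transformation
  Grid Computing and Cluster Computing
  Fusion with high-performance networking
More universal, general-purpose technology
  "Commoditization", the end of the supercomputer
Enormous computing and data resources become usable over the network in many fields
  In science and technology, problems that used to be intractable can be solved
  Networked high-performance computational science becomes even more universal
However, the information technology side must master advanced technologies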

SLIDE 76

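Summary (2)

Grid technology: wide-area high-performance computing infrastructure on the network
  Cutting-edge wide-area computing infrastructure technology
  Fusion of network technology and computing technology
  New advances in science and technology through computational science; a paradigm shift
The need for collaboration
  HPC, language implementation, OS, distributed systems
  Networking, the Internet
  Application scientists

http://matsu-www.is.titech.ac.jp, http://ninf.apgrid.org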
