Development of e-Science Application Framework


SLIDE 1

Development of e-Science Application Framework

Eric Yen, Simon C. Lin & Hurng-Chun Lee ASGC Academia Sinica Taiwan 24 Jan. 2006

SLIDE 2

LCG and EGEE Grid Sites in the Asia-Pacific Region

4 LCG sites in Taiwan; 12 LCG sites in Asia/Pacific

Academia Sinica Grid Computing Centre (ASGC)

  • Tier-1 Centre for the LHC Computing Grid (LCG)
  • Asian Operations Centre for LCG and EGEE
  • Coordinator of the Asia/Pacific Federation in EGEE

[Map: LCG and other Grid sites in the region, including PAEC and NCP Islamabad, IHEP Beijing, KNU Daegu, Univ. Melbourne, GOG Singapore, KEK Tsukuba, ICEPP Tokyo, Taipei (ASGC, IPAS, NTU, NCU), VECC Kolkata, and Tata Inst. Mumbai]

AP Federation now shares the e-Infrastructure with WLCG


SLIDE 3

SC Workshop, Taipei, 30-31 Oct 2005 Jason Shih, ASGC

Perspectives of ASGC as Tier-1

  • Help Tier-2s troubleshoot various services and functionality (IS, SRM, SE, etc.) before they join the Service Challenge
  • Reach a persistent data transfer rate
  • Increase the reliability and availability of T2 computing facilities through stress testing
  • Maintain close communication with T2s, T0 and other T1s
  • Gain experience before the LHC experiments begin

SLIDE 4

2005/12/16 Simon C. Lin / ASGC

Plan of AP Federation

  • VO services: deployed from April 2005 in Taiwan (APROC)
    – LCG: ATLAS, CMS
    – Bioinformatics, BioMed
    – Geant4
    – APeSci: for general e-Science collaboration services in the Asia-Pacific area
    – APDG: for testing and testbed use only
    – TWGRID: established for local services in Taiwan
  • Potential applications
    – LCG, Belle, nano, biomed, digital archive, earthquake, GeoGrid, astronomy, atmospheric science

SLIDE 5

SC Workshop, Taipei, 30-31 Oct 2005 Jason Shih, ASGC

Plans for T1/T2

  • T1-T2 test plan
    – Which services/functionality need to be tested
    – Recommendations and a checklist for T2 sites
    – What has to be done before joining the Service Challenge
    – Communication methods, and how to improve them if needed
    – Scheduling of the plans and candidate sites
    – Timeline for the testing
    – SRM + FTS functionality testing
    – Network performance tuning (jumbo frames?)
  • T1 expansion plan
    – Computing power/storage
    – Storage management, e.g. CASTOR2 + SRM
    – Network improvement

SLIDE 6

Enabling Grids for E-sciencE (EGEE)

  • >200 sites
  • >15 000 CPUs (with peaks >20 000 CPUs)
  • ~14 000 jobs successfully completed per day
  • 20 Virtual Organisations
  • >800 registered users, representing 1000s of scientists

A Worldwide Science Grid

SLIDE 7

EGEE Asia Pacific Services by Taiwan

  • Production CA Services
  • AP CIC/ROC
  • VO Support
  • Pre-production site
  • User Support
  • Middleware and technology development
  • Application Development
  • Education and Training
  • Promotion and Outreach
  • Scientific Linux Mirroring and Services

SLIDE 8

APROC

  • Taiwan acts as the Asia Pacific CIC and ROC in EGEE
  • APROC established in early 2005
  • Supports EGEE sites in the Asia Pacific region
    – Australia, Japan, India, Korea, Singapore, Taiwan
    – 8 sites, 6 countries
  • Provides global and regional services

SLIDE 9

APROC EGEE-Wide Services

  • GStat
    – Monitoring application that checks the health of the Grid Information System (a minimal sketch of such a check follows this list)
    – http://goc.grid.sinica.edu.tw/gstat/
  • GGUS Search
    – Performs a Google search targeted at key Grid knowledge bases
  • GOCWiki
    – Hosted wiki for user- and operations-related FAQs and guides
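To make the GStat idea concrete, here is a minimal sketch of an information-system health check: it queries a site BDII over LDAP and reports the computing elements it publishes. The hostname is a placeholder, and the port, base DN and use of the Python ldap3 library are assumptions about a typical LCG/EGEE setup rather than part of GStat itself.

```python
# Minimal sketch of a GStat-style information-system health check.
# Assumptions: a site BDII listening on LDAP port 2170 with base DN
# "mds-vo-name=local,o=grid" (GLUE 1.x schema); the hostname is made up.
from ldap3 import Server, Connection, ALL

BDII_HOST = "bdii.example-site.org"   # hypothetical endpoint
BASE_DN = "mds-vo-name=local,o=grid"

def check_information_system(host: str) -> bool:
    """Return True if the BDII answers and publishes at least one CE entry."""
    server = Server(host, port=2170, get_info=ALL)
    conn = Connection(server, auto_bind=True)   # anonymous bind, as BDIIs usually allow
    conn.search(BASE_DN,
                "(objectClass=GlueCE)",
                attributes=["GlueCEUniqueID", "GlueCEStateStatus"])
    for entry in conn.entries:
        print(entry.GlueCEUniqueID, entry.GlueCEStateStatus)
    return len(conn.entries) > 0

if __name__ == "__main__":
    print("information system healthy:", check_information_system(BDII_HOST))
```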

SLIDE 10

APROC EGEE Regional Services

  • Site registration and certification
  • Monitoring and daily operations
    – Problem diagnosis, tracking and troubleshooting
  • Middleware certification test-bed
    – New-release testing, supplemental documentation
  • Release support and coordination
    – Updates, upgrades and installation
  • Security coordination
    – With the Operational Security Coordination Team (OSCT)
  • VO services
    – CA for collaborators in the Asia-Pacific region
    – VOMS, LFC, RB, BDII support for new VOs in the region
  • Support services
    – Web portal and documentation
    – User and operations ticketing system

SLIDE 11

Education and Training

Note: gLite and the development of EGEE were introduced at all of the events below, which were run by ASGC.

Event                   | Date            | Attendees | Venue
China Grid LCG Training | 16-18 May 2004  | 40        | Beijing, China
ISGC 2004 Tutorial      | 26 July 2004    | 50        | AS, Taiwan
Grid Workshop           | 16-18 Aug. 2004 | 50        | Shang-Dong, China
NTHU                    | 22-23 Dec. 2004 | 110       | Hsin-Chu, Taiwan
NCKU                    | 9-10 Mar. 2005  | 80        | Tainan, Taiwan
ISGC 2005 Tutorial      | 25 Apr. 2005    | 80        | AS, Taiwan
Tung-Hai Univ.          | June 2005       | 100       | Tai-chung, Taiwan
EGEE Workshop           | Aug. 2005       | 80        | 20th APAN, Taiwan

SLIDE 12

24 Jan. 2006 21st APAN, Japan

Service Challenge

SLIDE 13

2005/11/20 Min-Hong Tsai / ASGC

ASGC Usage

  • GOC APEL accounting
  • Excluding non-LHC VOs (Biomed)

SLIDE 14

2005/11/20 Min-Hong Tsai / ASGC

SRM Services

  • Increased to four pool nodes for more parallel GridFTP transfers
  • srmcp's stream and TCP buffer options did not function (see the sketch below)
    – Worked around by configuring the SRM server instead
  • Transfer rate can reach 80 MB/s; the average is 50 MB/s

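As an illustration of the client-side tuning mentioned above, the sketch below drives srmcp from Python with explicit stream-count and TCP-buffer options. The option names follow the dCache SRM client as commonly documented and the URLs are placeholders; verify both against the srmcp version actually deployed.

```python
# Minimal sketch of invoking srmcp with transfer-tuning options.
# Assumptions: srmcp is on PATH; -streams_num and -tcp_buffer_size are the
# dCache SRM client option names; source/destination URLs are placeholders.
import subprocess

def srm_copy(src: str, dst: str, streams: int = 10, tcp_buffer: int = 1048576) -> None:
    """Copy one file with srmcp, requesting multiple streams and a larger TCP buffer."""
    cmd = [
        "srmcp",
        f"-streams_num={streams}",
        f"-tcp_buffer_size={tcp_buffer}",
        src,
        dst,
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    srm_copy(
        "srm://t1-se.example.org:8443/castor/data/file001",   # placeholder source
        "file:////tmp/file001",                                # local destination
    )
```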

SLIDE 15

ATLAS SC3 DDM - ASGC VOBOX

Average throughput per day (01/18, 2006)

Latest update can be found at: http://atlas-ddm-monitoring.web.cern.ch/atlas-ddm-monitoring/all.php

Total cumulative data transferred (01/18, 2006)

SLIDE 16

Tasks/deliverables

Batch services

  • Deliver production-quality batch services
  • Frontline consultancy and support for the batch scheduler
  • Customized tool suites for secure and consistent management

Manage hierarchical storage

  • Production-quality data management services, including planning, procurement and operation of both software and hardware
  • Meet the data transfer rate requirement declared in the MoU
  • Share operational experience and procedures with other Tier-1s and Tier-2s
  • High availability and load balancing (HA + L/B)

Middleware support

  • Frontline consultancy and support for other tiers in tweaking configurations, troubleshooting, and maintenance procedures
  • Certification testing for pre-release LCG tags
  • Installation guides/notes where the official release lacks them
  • Training courses

SLIDE 17

ARDA

  • Goal: coordinate the prototyping of distributed analysis systems for the LHC experiments using a grid
  • ARDA-ASGC collaboration: since mid-2003
    – Building a push/pull model prototype (2003)
    – Integrating ATLAS/LHCb analysis tools into gLite (2004)
    – Providing the first integration testing and usage document for the ATLAS analysis tool DIAL (2004)
    – CMS monitoring system development (2005): a monitoring system integrating R-GMA and MonALISA; ARDA/CMS analysis prototype: Dashboard
  • ARDA Taiwan team: http://lcg.web.cern.ch/LCG/activities/arda/team.html
    – 4 FTEs participate: 2 at CERN and 2 in Taiwan

SLIDE 18


mpiBLAST-g2

SLIDE 19

mpiBLAST-g2 (ASGC, Taiwan and PRAGMA): http://bits.sinica.edu.tw/mpiBlast/index_en.php

A GT2-enabled parallel BLAST that runs on the Grid, built on the GT2 GASS copy API and MPICH-G2

Enhancements to mpiBLAST by ASGC:

  • Cross-cluster job execution
  • Remote database sharing
  • Helper tools for
    – database replication
    – automatic resource specification and job submission (with a static resource table)
    – multi-query job splitting and result merging (see the sketch after this list)
  • Close link with the mpiBLAST development team
    – New mpiBLAST patches can be quickly applied to mpiBLAST-g2
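To illustrate the multi-query splitting and merging helper idea, here is a minimal sketch that splits a multi-sequence FASTA query file into per-job chunks and later concatenates the per-chunk BLAST reports. File names, the chunk size and the function names are hypothetical; this is not the actual mpiBLAST-g2 helper code.

```python
# Minimal sketch of multi-query job splitting and result merging, assuming
# plain multi-FASTA input and per-chunk BLAST report files. Names and the
# chunk size are illustrative only.
from pathlib import Path

def split_fasta(query_file: str, seqs_per_job: int) -> list[Path]:
    """Split a multi-FASTA file into chunk files of at most seqs_per_job sequences."""
    records: list[list[str]] = []
    for line in Path(query_file).read_text().splitlines():
        if line.startswith(">"):
            records.append([line])          # start a new sequence record
        elif records:
            records[-1].append(line)        # continuation of the current record
    chunks = []
    for i in range(0, len(records), seqs_per_job):
        chunk = Path(f"{query_file}.part{i // seqs_per_job:03d}")
        chunk.write_text("\n".join("\n".join(r) for r in records[i:i + seqs_per_job]) + "\n")
        chunks.append(chunk)
    return chunks

def merge_results(result_files: list[Path], merged: str) -> None:
    """Concatenate per-chunk BLAST reports, in chunk order, into one result file."""
    with open(merged, "w") as out:
        for part in sorted(result_files):
            out.write(Path(part).read_text())

# Hypothetical usage: each chunk becomes one grid job.
#   chunks = split_fasta("queries.fasta", 50)
#   ... submit one BLAST job per chunk, each producing "<chunk>.out" ...
#   merge_results([Path(str(c) + ".out") for c in chunks], "queries.blast.out")
```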
SLIDE 20

28 April, 2005 ISGC 2005, Taiwan

SC2004 mpiBLAST-g2 demonstration

[Demonstration map: participating sites including KISTI]

SLIDE 21

mpiBLAST-g2 current deployment

  • From the PRAGMA GOC: http://pragma-goc.rocksclusters.org
SLIDE 22

mpiBLAST-g2 Performance Evaluation (perfect case)

[Charts: elapsed time and speedup, with curves for Searching + Merging, BioSeq fetching, and Overall]

Database: est_human, ~3.5 GB; Queries: 441 test sequences, ~300 KB

  • Overall speedup is approximately linear

SLIDE 23

mpiBLAST-g2 Performance Evaluation (worst case)

[Charts: elapsed time and speedup, with curves for Searching + Merging, BioSeq fetching, and Overall]

Database: drosophila NT, ~122 MB; Queries: 441 test sequences, ~300 KB

  • The overall speedup is limited by the unscalable BioSeq fetching

SLIDE 24

Summary

  • Two grid-enabled BLAST implementations (mpiBLAST-g2 and DIANE-BLAST) were introduced for efficient handling of BLAST jobs on the Grid
  • Both implementations are based on the master-worker model for distributing BLAST jobs on the Grid
  • mpiBLAST-g2 shows good scalability and speedup in some cases
    – It requires a fault-tolerant MPI implementation for error recovery
    – In the unscalable cases, BioSeq fetching is the bottleneck
  • DIANE-BLAST provides a flexible mechanism for error recovery
    – Any master-worker workflow can easily be plugged into this framework
    – Job thread control should be improved to achieve good performance and scalability

SLIDE 25

SLIDE 26


DataGrid for Digital Archives

SLIDE 27

Data Grid for Digital Archives


SLIDE 28

Long-Term Archives for AS NDAP Contents

Digital contents by project (project | total files | total size in MB):

Treasured Historical Artifacts (珍藏歷史文物) | 3,353 | 4,495,853.22
Administrator (管理員) | 1,095 | 981.33
Taiwan Malacofauna (台灣貝類相) | 3,878 | 21,869.78
Modern Chinese Historical Maps and Remote Sensing Imagery Archive (近代中國歷史地圖與遙測影像資訊典藏計畫) | 33,671 | 364,554.69
Language Archives Project (語言典藏計畫) | 1 | 7.05
Technology R&D Subproject (技術研發分項計畫) | 39,315 | 98,246.45
Fish Database (魚類資料庫) | 32,070 | 4,199.32
Institute of Taiwan History (台史所) | 34,040 | 44,157.20
Native Plants of Taiwan (台灣本土植物) | 31,027 | 1,578,654.76
Important Archives of Modern Diplomacy and Economy (近代外交經濟重要檔案計畫) | 603,997 | 20,601,428.38
Taiwan Indigenous Peoples (台灣原住民) | 601,715 | 1,516,242.05
Total | 1,384,162 | 28,726,194.23

Table I. Size of digital contents of NDAP

                     | 2002      | 2003      | 2004      | 2005      | Total
Total data size (GB) | 22,810.00 | 38,550.00 | 63,480.00 | 70,216.02 | 195,056.02
AS production (GB)   | 22,800.68 | 31,622.17 | 47,430.79 | 55,757.47 | 157,611.11

Table II. Details of NDAP production in 2005

          | Metadata size (MB) | Metadata records | Data size (GB)
All Inst. | 56,204.40          | 1,035,538.00     | 70,216.02
AS        | 53,434.13          | 763,431.00       | 55,757.47

SLIDE 29


Atmospheric Sciences

SLIDE 30

Atmosphere Databank Architecture

[Architecture diagram: an SRB MCAT server (srb001) mediating between users (Linux and Windows clients via command-line tools, applications, and a portal/web client) and storage resources at NCU (databank), NTNU (dms), ASCC (srb002, lcg00104, lcg00105, gis252, gis212) and NTU (monsoon, dbar_rs1, dbar_rs2)]

SLIDE 31
  • Use LAS (Live Access Server) to access datasets from the SRB system, and integrate with Google Earth

SLIDE 32

Industrial Program

  • NSC-Quanta Collaboration

    – To help the Quanta blade system achieve the best performance for HPC and Grid computing
    – Quanta is the largest notebook manufacturer in the world
    – Participants: AS, NTU, NCTS, NTHU, NCHC
    – Scientific research disciplines: material science, nano-technology, computational chemistry, bioinformatics, engineering, etc.
    – Performance tuning, Grid benchmarking

SLIDE 33


e-Science Application Framework

SLIDE 34

Grid Application Architecture

  • Layered architecture to improve the usability of Grid applications
  • Two frameworks built on top of current Grid middleware to
    – provide a friendly graphical user interface
    – handle Grid application logic in an efficient way
    – reduce the effort of application gridification

SLIDE 35

Grid Application Logic Framework

  • An integration of tools for handling different Grid application computing models, i.e. application logics (a minimal sketch of a pluggable logic follows this list)
  • Through this framework, different application logics are handled efficiently on the Grid, with attention to
    – stability
    – scalability
    – performance
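As an illustration of what "plugging an application logic into the framework" could look like, here is a minimal sketch of a split/execute/merge interface that a driver can run uniformly. The class and method names are hypothetical; they are not the actual ASGC framework API.

```python
# Minimal sketch of a pluggable "application logic" interface. The names
# (ApplicationLogic, split/execute/merge, run_on_grid) are illustrative only.
from abc import ABC, abstractmethod
from typing import Any, Iterable

class ApplicationLogic(ABC):
    """One Grid computing model (e.g. master-worker, parameter sweep)."""

    @abstractmethod
    def split(self, job_input: Any) -> Iterable[Any]:
        """Break the overall job into independent work units."""

    @abstractmethod
    def execute(self, work_unit: Any) -> Any:
        """Run one work unit (in reality: submit it to a Grid resource)."""

    @abstractmethod
    def merge(self, partial_results: Iterable[Any]) -> Any:
        """Combine partial results into the final answer."""

def run_on_grid(logic: ApplicationLogic, job_input: Any) -> Any:
    """The framework drives any plugged-in logic the same way."""
    units = list(logic.split(job_input))
    results = [logic.execute(u) for u in units]   # placeholder for real scheduling
    return logic.merge(results)
```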

SLIDE 36

Grid Application User Interface Framework

  • A container-like framework in which one can easily build an application-oriented graphical user interface to interact with the application logics running on the Grid
  • A set of APIs for graphical Grid application user interface development

SLIDE 37

Grid Application Portal (GAPortal)

SLIDE 38

Grid Application Portal (cont.)

  • A prototype of the Grid application user interface framework, based on the NRPGM BioPortal
  • A web-based portal of Grid applications
  • Grid-enabled components
    – Grid proxy management: proxy delegation (from the web-browser side to the Grid worker node) and proxy renewal (integration with MyProxy); a minimal renewal sketch follows this list
    – Grid platform adapter: LCG/EGEE platform, GT2-based Grids
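To make the proxy-renewal component concrete, the sketch below shows how a portal could retrieve a fresh short-lived proxy from a MyProxy server by calling the standard myproxy-logon client. The server name, account, passphrase handling and output path are placeholders, and the option flags should be checked against the MyProxy client actually installed.

```python
# Minimal sketch of portal-side proxy renewal against a MyProxy server via
# the myproxy-logon client. Server name, account and lifetime are placeholders;
# verify the option flags against your installed MyProxy version.
import subprocess

def renew_proxy(myproxy_server: str, username: str, passphrase: str,
                proxy_path: str = "/tmp/x509up_portal", hours: int = 12) -> None:
    """Fetch a fresh short-lived proxy for the portal to use on the user's behalf."""
    cmd = [
        "myproxy-logon",
        "-s", myproxy_server,   # MyProxy server holding the long-lived credential
        "-l", username,         # account under which the credential was stored
        "-t", str(hours),       # lifetime of the retrieved proxy
        "-o", proxy_path,       # where the renewed proxy certificate is written
        "-S",                   # read the passphrase from stdin instead of a TTY
    ]
    subprocess.run(cmd, input=passphrase.encode(), check=True)

if __name__ == "__main__":
    # Hypothetical values for illustration only.
    renew_proxy("myproxy.example-grid.org", "portal_user", "secret-passphrase")
```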
SLIDE 39

DIANE & GANGA

 DIANE is a “skeleton” of efficient handling master-worker type

application logic on distributed computing environment

 GANGA is a light-weight framework aiming to distribute HEP

analysis tasks on heterogeneous computing environments

 DIANE + GANGA = a framework of efficient handling master-

worker type application on the Grid

 A collaboration with LCG-ARDA team at CERN

SLIDE 40

DIANE/GANGA framework

Key features

  • Automatic load balancing via a task-pull model (see the sketch below)
  • Worker health detection
  • User-defined failure recovery mechanism
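The sketch below illustrates the task-pull idea in plain Python threads: workers repeatedly pull the next task from a shared queue, so faster workers naturally take on more work, and a failed task can be re-queued by a simple retry policy. It is an illustration of the model only, not the DIANE/GANGA API.

```python
# Minimal sketch of the task-pull (master-worker) model with simple retry-based
# failure recovery. Function names and the retry policy are illustrative only.
import queue
import threading

def run_master_worker(tasks, do_task, n_workers=4, max_retries=1):
    """Run tasks with a pull model; retry failed tasks up to max_retries times."""
    todo = queue.Queue()
    for t in tasks:
        todo.put((t, 0))                         # (task, attempts so far)
    results, lock = [], threading.Lock()

    def worker():
        while True:
            try:
                task, attempts = todo.get_nowait()   # pull the next task
            except queue.Empty:
                return                                # nothing left: worker exits
            try:
                r = do_task(task)
                with lock:
                    results.append(r)
            except Exception:
                if attempts < max_retries:            # recovery policy: re-queue
                    todo.put((task, attempts + 1))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results

# Example: "tasks" could be query chunks, with do_task wrapping a BLAST run.
print(run_master_worker(range(10), lambda x: x * x))
```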

SLIDE 41

Summary

  • A production Grid application environment is now available
    – ~95% system reliability
    – Test job success rate > 90%
  • Being application-driven is the best policy for e-Science infrastructure development
  • The success of e-Science rests on worldwide collaboration on e-infrastructure and applications, active participation, and mutual benefit