Mark Bartelt
Center for Advanced Computing Research
California Institute of Technology
mark@cacr.caltech.edu
http://www.cacr.caltech.edu/~mark
Grid Computing: Hype? Or Buzzword?
– History
– Current Status
– Future Directions
(SDSC)
Applications (NCSA)
Alpha nodes)
– NCSA
– SDSC
– Argonne National Laboratory (ANL)
– Caltech
(10 Gbit between every pair of sites)
– Provide New Capabilities through:
– To enable:
a grid
– Design principles assume heterogeneity and > 4 sites
– multiple types, with smaller number of “tightly coupled” and large number of “loosely coupled”
– Formally documented design: protocols and specifications
– Support evolutionary path
– Provide examples, tools, training to exploit grid capabilities
– User support, user support, and user support
– open source software and community
– “McKinley” processors for commodity leverage
– bandwidth for rich interaction and tight coupling
– hundreds of terabytes for secondary storage
– Globus, data management, …
– breakthrough versions of today’s applications
– But also, reaching beyond “traditional” supercomputing
[Diagram labels: HPSS (×3); UniTree; Myrinet (×4); 574p IA-32 Chiba City; 128p Origin; HR Display & VR Facilities; 1176p IBM SP Blue Horizon; Sun E10K; 1500p Origin; 1024p IA-32; 320p IA-64; 256p HP X-Class; 128p HP V2500; 92p IA-32]
NCSA: Compute-Intensive
ANL: Visualization
Caltech: Data collection analysis
SDSC: Data-Intensive
– Alpha-based cluster at PSC
– Power4-based cluster at SDSC
[Diagram labels: Chicago & LA DTF Core Switch/Routers; sites: SDSC, NCSA, Caltech, Argonne, PSC; Myrinet (×4); Quadrics; Federation; Fibre Channel (×2); Sun Server; Datawulf IA-32; 7.8 TF Power4; 1 TF Itanium2; 2 TF Itanium2; 9.2 TF Madison; 0.5 TF Itanium2, 90 TB; 1.5 TF Itanium2/Madison, 20 TB; 6 TF Alpha EV68; 1.1 TF Alpha EV7; 300 TB; 300 TB; 160 TB]
– Networking
– Clusters
– Performance evaluation
– Etc. etc. etc …
Site Coordination Committee
Site Leads
Project Director
Rick Stevens (UC/ANL)
Technical Coordination Committee
Project-wide Technical Area Leads
Clusters
Pennington (NCSA)
Networking
Winkler (ANL)
Grid Software
Kesselman (ISI) Butler (NCSA)
Data
Baru (SDSC)
Applications
Williams (Caltech)
Visualization
Papka (ANL)
Performance Eval
Brunett (Caltech)
…
Chief Architect
Dan Reed (NCSA)
Executive Committee
Fran Berman, SDSC (Chair)
Ian Foster, UC/ANL
Paul Messina, CIT
Dan Reed, NCSA
Rick Stevens, UC/ANL
Charlie Catlett, ANL
Technical Working Group
cyberinfrastructure?
External Advisory Committee
Institutional Oversight Committee
Robert Conn, UCSD
Richard Herman, UIUC
Dan Meiron, CIT (Chair)
Robert Zimmer, UC/ANL
User Advisory Committee
good science? NSF MRE Projects Internet-2
McRobbie
Alliance UAC
Sugar, Chair
NPACI UAC
Kupperman, Chair
NSF ACIR
NSF Review Panels
Policy Oversight
Objectives
Architecture
Currently being formed
Executive Director / Project Manager
Charlie Catlett (UC/ANL)
ANL
Evard
CIT
Bartelt
NCSA
Pennington
SDSC
Andrews
PSC
NCAR
Operations
Sherwin (SDSC)
User Services
Wilkins-Diehr (SDSC) Towns (NCSA)
Implementation
– Sep 2002: Initial delivery of phase-1 systems
– Mar 2003: Friendly users
– July 2003: Production
– This should have come as a surprise?
– Bug in floating-point software assist code.
– Kernel bug: Floating-point registers sometimes not being saved/restored properly on context switch.
– SUPERB support from IBM + Intel on first problem, and from IBM + SuSE on second.
– Problem with Itanium2 processor (hit the trade press just yesterday)
– {Intel|IBM} working on a solution
– Even at 10 Gbit/second, 100 TBytes takes approximately one day to move
– Possible solutions: Datacutter and similar tools
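
The one-day figure can be checked with a back-of-the-envelope calculation. The sketch below assumes decimal units (1 TByte = 10^12 bytes, 10 Gbit/second = 10^10 bits/second) and an ideal link with no protocol overhead or contention. The function and constant names are illustrative and do not come from the slides.

```python
# Back-of-the-envelope transfer time. Assumptions: decimal units and a
# perfectly utilized link. Real transfers would be slower.

TERABYTE = 10**12        # bytes in one decimal terabyte (assumed unit)
TEN_GBIT = 10 * 10**9    # link speed in bits per second


def transfer_time_hours(data_bytes: float, link_bits_per_second: float) -> float:
    """Ideal time in hours to move data_bytes over a link of the given speed."""
    seconds = data_bytes * 8 / link_bits_per_second
    return seconds / 3600


hours = transfer_time_hours(100 * TERABYTE, TEN_GBIT)
print(f"{hours:.1f} hours")  # prints "22.2 hours"
```

Under these idealized assumptions the transfer takes a little under 24 hours, which is consistent with "approximately one day". Any real overhead pushes it past that.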