SPARC 2 Consultations, January-February 2016



SLIDE 1

SPARC 2 Consultations

January-February 2016

SLIDE 2

Outline

  • Introduction to Compute Canada
  • SPARC 2 Consultation Context
  • Capital Deployment Plan
  • Services Plan
  • Access and Allocation Policies (RAC, etc.)
  • Discussion

SLIDE 3

Introduction to Compute Canada

SLIDE 4

Compute Canada (CC)

An Effective Provider of Essential Digital Research Infrastructure

Compute Canada, working through a federated partnership with regional organizations ACENET, Calcul Québec, Compute Ontario and WestGrid, leads the acceleration of research and innovation by deploying advanced research computing (ARC) systems, storage and software solutions. CC is a not-for-profit corporation. The membership includes most of Canada’s major research universities. CC acts as a steward of Canada’s ARC platform:

  • Compute and storage resources, data centres
  • Team of ~200 experts in utilization of ARC for research
  • 100s of research software packages
  • Cloud compute and storage (OpenStack, ownCloud)
  • National services

CC is a proud ambassador for Canadian excellence in advanced research computing nationally and internationally.

SLIDE 5

Canada’s ARC Platform Today & Tomorrow

A Distributed Partnership Across Canada Today

  • 50 systems
  • 27 data centres
  • 200,000 cores, 2 Pflops, 20 PB
  • 200 experts

Consolidation & Concentration by 2018

  • 5-10 data centres
  • 300,000 cores, 12 Pflops, 50+ PB (Challenge 2)
  • 200 experts
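One way to read the two sets of figures above is as average performance per core. The short Python sketch below does that arithmetic. It assumes the Pflops values are aggregate figures for the whole platform and that dividing by core count is a fair summary, which accelerators would skew. The function name and constant are illustrative only.

```python
# Illustrative arithmetic only; the input figures come from the slide above.
PFLOPS_TO_GFLOPS = 1_000_000  # 1 Pflops = 10^6 Gflops


def gflops_per_core(pflops: float, cores: int) -> float:
    """Average Gflops per core for an aggregate performance figure."""
    return pflops * PFLOPS_TO_GFLOPS / cores


today = gflops_per_core(2, 200_000)      # platform today
by_2018 = gflops_per_core(12, 300_000)   # planned platform by 2018

print(f"today: {today:.0f} Gflops/core, by 2018: {by_2018:.0f} Gflops/core")
print(f"cores grow {300_000 / 200_000:.1f}x, performance grows {12 / 2:.0f}x")
```

Under these assumptions the planned growth comes mostly from faster cores and accelerators rather than from a larger core count.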

Continued Investment Required For Canadian Science to Compete Globally

CANARIE and regional Networks


Services

SLIDE 6

Member locations and new national hosting sites

SLIDE 7

Services Too...

SLIDE 8


Access and Allocations

  • All Canadian faculty members have access to Compute Canada systems and can sponsor others in their name.
  • Each system has resources set aside for users with “default priority”. No special vetting or application process required.
  • Researchers with larger needs can apply to two different resource allocation competitions:
      ○ RAC: 1-year, mostly individual faculty members
      ○ RPP: up to 3-years, platforms and portals, shared datasets
  • Storage is a dedicated allocation. Compute is a priority allocation.
  • Allocation decisions made based on peer review.

SLIDE 9


Serving Researchers in all Disciplines

SLIDE 10


The Funding Model

Figures for 2014-2015

Roughly $30M/year operating in 2014/15. Partner funding model ensures alignment of objectives. Capital and operating funded with the same model:

  • 40% funded through the Canada Foundation for Innovation (MSI programme for operations, Cyberinfrastructure for capital)
  • 60% from universities, provinces and other sources

National leadership ensures strategic focus and accountability

SLIDE 11

SPARC 2 Consultation Context

SLIDE 12

Current Status - New Systems Coming

  • Compute Canada received good news from CFI in July 2015: $30M in new infrastructure investments ($75M total project cost)!
  • Some RFPs are already issued and new equipment is coming. New major systems are to be deployed this year.
  • However, many existing systems are nearing (or past!) end-of-life.
  • 2016-17 is about commissioning new systems while decommissioning old systems.
  • Systems will be more powerful, but the number of cores will not rise significantly.
  • Storage capacity will increase dramatically.

SLIDE 13

Current Status - Times are Tight

  • Demand continues to grow. 2016 competition just completed:
      ○ 366 applications
      ○ 16% increase in CPU ask (after correction)
      ○ 34% increase in storage ask (after correction)
      ○ 123% increase in GPU ask (after correction)
  • New storage is coming soon, so some delayed allocations were granted.
  • 42 projects (13%) that requested compute allocations were not awarded any compute allocation, up from 4% last year. (Note: all are funded researchers.)
  • Average award:
      ○ 57% of compute request (65% last year, 84% in 2012)
      ○ 82% of storage request
      ○ 19% of GPU request
  • The 2017 competition will also be tough.

SLIDE 14

Funding Opportunities - 2016 and beyond

  • Operating:
      ○ Current operations funding (CFI MSI) expires March 31, 2017
      ○ CC (through Western University) has submitted an NOI for the next competition, 2017-2022
      ○ Full CC MSI proposal due May 20, 2016
  • Capital:
      ○ Currently purchasing infrastructure through CFI Cyberinfrastructure Initiative - Challenge-2, Stage-1. Expect to be fully deployed by end of 2017.
      ○ Expect to be given opportunity to apply for additional capital funds in conjunction with MSI renewal proposal - May 20, 2016.
      ○ Expect additional capital funding opportunities in connection with mid-term report on next MSI (likely required by spring 2020)

The next 3-4 months are critical for planning Canada’s ARC future through 2022!

SLIDE 15

Ways to Provide Feedback

www.computecanada.ca/sparc2/

  • In person:
      ○ Speak up in this meeting!
      ○ Virtual - video conferenced consultations (Feb. 3, 22 in English)
  • Via a White Paper
  • Via a brief (5 minute) survey:
      ○ www.surveymonkey.com/r/V59ZDGV
  • Via email (any time):
      ○ sparc@computecanada.ca

Note: 2014 White Paper responses from 20+ disciplinary organizations, universities and individuals had a strong influence on the current technology plan.

SLIDE 16

White Papers

  • Updates to 2014 SPARC v1 White Papers welcome!
  • Introduction to your disciplinary use of ARC
  • Status quo for utilization of current resources
  • What challenges have you encountered with your use of the ARC that Compute Canada provides?
  • What are your anticipated resource needs into the future (ideally, through 2022):
      ○ Computation
      ○ Storage
      ○ Services
      ○ Support
  • What are some of the new technologies, services, support, etc., that you would like Compute Canada to investigate or provide? On what timeline?

SLIDE 17

White Papers - Guide Included on Website

SLIDE 18

SPARC Survey

www.surveymonkey.com/r/V59ZDGV

SLIDE 19

Technology Deployment Plan

SLIDE 20

Capital Planning Timeline

  • CFI Challenge-2 Stage-1 (announced)
      ○ $30M CFI investment announced, July 2015
      ○ 2015: National Data Infrastructure RFP launched; deployment in 2016
      ○ 2016: 3 new systems to be deployed
      ○ 2017: 1 new system to be deployed, potentially 2 systems upgraded
      ○ April 1, 2018 - spending complete
  • CFI Challenge-2 Stage-2 (assumed for planning purposes)
      ○ Deadline May 20, 2016. Decision September 2016
      ○ Site selection process underway now.
      ○ 2017: first purchases
      ○ April 1, 2020 - spending complete
  • CFI Challenge-2 Stage-3 (assumed for planning purposes)
      ○ Coincident with MSI mid-term review - 2019/2020
      ○ First spend in 2020/2021 (roughly replacement timeline for stage-1 purchases)

SLIDE 21

Capital Deployment Plan 2016-17

www.computecanada.ca/wp-content/uploads/2015/11/Compute-Canada-Technology-Briefing-2015.pdf

  • CC submitted a capital proposal to CFI in April 2015, including an investment plan for four national sites.
  • Key components:
      ○ Addresses pressing and urgent needs as older systems are defunded
      ○ Concentrated investment in 4 large sites, national procurement process
      ○ National Storage Architecture (60+PB of new storage)
      ○ Greatly expanded cloud (OpenStack) capacity
      ○ Greatly expanded accelerator (GPU) capacity
      ○ Some heterogeneous systems with large memory (1TB+) nodes

Note: In parallel, CFI has run a Challenge-1 competition. The investments in the CC capital deployment plan include infrastructure and tool development designed to support those projects.

SLIDE 22

Capital Deployment Plan 2016-17

www.computecanada.ca/wp-content/uploads/2015/11/Compute-Canada-Technology-Briefing-2015.pdf

Note: over the same time period we will be decommissioning an existing 82,000 CPU cores and a large fraction of existing disk storage.

SLIDE 23

Capital Deployment Plan 2016-17

www.computecanada.ca/wp-content/uploads/2015/11/Compute-Canada-Technology-Briefing-2015.pdf

SLIDE 24

Capital Plan 2017-19 (Stage 2)


  • The capital plan for Stage 2 will be built between now and May 20, 2016.
  • CFI expected to require CC to propose 3 different technology options, with science justifications for each.
  • Expectations:
      ○ Addition of 1-3 new national sites
      ○ Expansion of some existing national sites
      ○ Expansion of national storage infrastructure
  • Decisions need to be made:
      ○ Balance of Large Parallel, General Purpose and Cloud?
      ○ Emphasis on new architectures?
      ○ Emphasis on accelerators?
      ○ Memory per node?
      ○ Services - Databases, storage platforms, private networks?

SLIDE 25

Services Plan

SLIDE 26

Compute Canada Services - Middleware


  • We are service providers, not just infrastructure providers.
  • The CC user base is broadening, bringing a broader set of needs.
  • We have seen tremendous interest in services enabling Research Data Management (RDM)
  • Through Challenge-1 and our Research Platforms and Portals competitions we have identified an additional list of middleware services CC will implement in common across our sites:
      ○ Authentication and ID Management
      ○ Data Transfer
      ○ Software Distribution
      ○ Monitoring (system status)
      ○ Resource publishing (capacity available)

SLIDE 27

CC Services - Disciplinary Support


  • Compute Canada expert research support is built around excellent local services - experts on your campus.
  • In 2015 we augmented this through creation of our first national disciplinary support team - in digital humanities.
  • Disciplinary support teams:
      ○ encourage sharing of best practices across the country
      ○ work on discipline-specific documentation
      ○ perform outreach to Canadian practitioners
      ○ identify weaknesses in the support model or infrastructure plan with respect to each disciplinary group
  • We are happy to take feedback on where you think more support is needed:
      ○ Should we create a new team in a certain area?
      ○ Should the list of responsibilities above (per team) be expanded?

SLIDE 28

CC Services - Research Support


  • Currently, expert support is generally:
      ○ local (on campus)
      ○ short-term (days, not months)
  • We get requests for long term (embedded) research support.
  • Currently offered on a competitive basis in some regions but not a national service.
  • Should we offer embedded (long term) support? On what basis? Paid? Competitive?

SLIDE 29

CC Services - Training


  • Compute Canada currently offers training across the country:
      ○ Code optimization
      ○ Use of specific hardware platforms or software services
      ○ Basic and advanced HPC techniques
  • Most training is local/regional. Local courses offered by local staff.
  • National initiatives include:
      ○ National partnership with Software Carpentry
      ○ International partner in International HPC Summer School
      ○ Discussions with Data Carpentry
  • We welcome feedback on training emphasis. Where are the gaps today?

SLIDE 30

CC Services - Security and Privacy


  • More and more ARC is being used to do research involving personal info (e.g., health, social sciences, industry data).
  • Policies must be in place to protect personal information.
  • Physical and network security must be in place to protect data held on CC systems.
  • Data isolation has to be assured for special projects that require it.
  • CC has adopted a new security framework - the ISMS follows ISO/IEC 27001 (operations and standards ISO/IEC 27002).
  • Minimum standard in all CC data centres, some will be designated for higher security data sets.
  • New network, storage design to support data isolation.

SLIDE 31

Access and Allocation Policies

SLIDE 32

CC Access Policy


  • The current access policy is organized by “sponsor.” CC approves the sponsor, the sponsor approves any and all group members.
  • Group members can be students, postdocs, external collaborators, etc. All usage “charged” to the sponsor.
  • There is no fee for usage charged to Canadian university faculty.
  • When the sponsor is from private industry, all usage is subject to a fee.
  • When the sponsor is from a federal laboratory or other not-for-profit, a reduced fee applies.
  • Teaching is not an eligible use (though training is).
  • Has this policy ever been an impediment to your research? Suggestions?

SLIDE 33

CC Resource Allocation Policies


  • All users have “default access” to each CC system (compute, storage).
  • Users can apply for special resource allocations for:
      ○ Compute (priority in shared system, in core-years)
      ○ Storage (dedicated, short-term or long term)
      ○ Cloud resources (virtual machines, public IP addresses, etc.)
  • CC allocates about 80% of the available core-years each year through competitive processes. This leaves up to 20% for default access.
  • Two categories of competition, one competition period per year:
      ○ RAC: generally single investigator projects
      ○ Research Platforms and Portals (RPP): shared datasets, possible multi-year allocations
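To make the 80/20 split concrete, the Python sketch below converts a core count into core-years and divides it between the competitions and default access. It assumes the roughly 200,000 cores quoted earlier in the deck are all allocatable for a full year, which the slides do not state. The function name is illustrative.

```python
def split_core_years(total_cores: int, competitive_share: float = 0.80) -> dict[str, float]:
    """Split one year of capacity (in core-years) between competitions and default access."""
    total_core_years = float(total_cores)  # one core running for one year = 1 core-year
    competitive = total_core_years * competitive_share
    return {"competitive": competitive, "default": total_core_years - competitive}


# Assumption: every one of the ~200,000 cores is available for allocation all year.
shares = split_core_years(200_000)
print(f"competitive: {shares['competitive']:,.0f} core-years")
print(f"default access: {shares['default']:,.0f} core-years")
```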

SLIDE 34

CC Resource Allocation Policies


  • Competition is based on peer review:
      ○ Technical review to correct “asks”
      ○ 7 disciplinary panels (78 panelists this year)
      ○ multiple independent reviews per proposal
      ○ panel review meeting to set science score
      ○ multidisciplinary panel review of (about 30) largest proposals
  • If panel process does not result in a “balanced budget”, CC applies scaling function based on science score from panel process. 2016 example (compute):

[Figure: 2016 compute scaling example; chart label: Default Priority]
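The deck does not give the form of the scaling function, so the Python sketch below is a hypothetical illustration of the idea only, not Compute Canada's actual method. It assumes each project receives `request * min(1, k * score / max_score)`, that all science scores are positive, and that the constant `k` is found by bisection so total awards fit the available capacity. The function name, the 5-point score scale and the sample numbers are all invented for the example.

```python
def scale_awards(requests, scores, capacity, max_score=5.0):
    """Scale requests by science score so that total awards fit within capacity.

    Hypothetical shape: award = request * min(1, k * score / max_score),
    with k chosen by bisection. Assumes every score is positive.
    """
    if sum(requests) <= capacity:
        return list(requests)  # budget already balanced, no scaling needed

    def total(k):
        return sum(r * min(1.0, k * s / max_score) for r, s in zip(requests, scores))

    lo, hi = 0.0, max_score / min(scores)  # at k = hi every project is fully funded
    for _ in range(60):  # bisection on the scaling constant k
        mid = (lo + hi) / 2
        if total(mid) > capacity:
            hi = mid
        else:
            lo = mid
    return [r * min(1.0, lo * s / max_score) for r, s in zip(requests, scores)]


# Invented example: three requests (core-years) competing for 2,000 core-years.
requests = [1000, 500, 2000]
scores = [5.0, 4.0, 2.0]
awards = scale_awards(requests, scores, capacity=2000)
for r, s, a in zip(requests, scores, awards):
    print(f"score {s}: requested {r}, awarded {a:.0f} ({a / r:.0%})")
```

In this shape, higher-scored proposals keep a larger fraction of their request. A rank-and-cut scheme would instead fund proposals in full down a ranked list until capacity runs out.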

SLIDE 35

CC Resource Allocation Questions


  • Competition frequency: once per year plus ad-hoc out-of-round enough?
  • Award duration: single year with fast track long enough? Note that CC must report every year, so progress report always needed.
  • CCV introduced for 2016 competition. How can we improve the CCV experience?
  • Compute scaling based on science score. Alternatives: rank-and-cut, different function shape?
  • The connection between tri-council research grants and CC resource allocations means that successful grant recipients still need to apply for Compute Canada resources. Double jeopardy unavoidable?
  • We are sometimes asked if users can contribute additional resources. Is there significant demand to provide a price-list?

SLIDE 36

Ways to Provide Feedback

www.computecanada.ca/research-portal/feedback/sparc2/

  • In person:
      ○ Speak up in this meeting!
      ○ Virtual - video conferenced consultations (Feb. 3, 22 in English)
  • Via a White Paper
  • Via a brief (5 minute) survey:
      ○ www.surveymonkey.com/r/V59ZDGV
  • Via email (any time):
      ○ sparc@computecanada.ca

An aside: account renewal is coming in March. We intend to collect CCVs.

SLIDE 37

Thank You!

SLIDE 38

At the Limit of Our Capacity

SLIDE 39

Projecting Increased Compute Demand

Based on SPARC whitepaper projections (research roadmaps)

White Paper                      Predicted Increase from Current to 2020
Numerical Relativity             3x
Subatomic Physics                3x
Materials Research               5x
Canadian Genome Centres          8x
Canadian Astronomical Society    10x
Theoretical Chemistry            12x

  • Also projected:
      ○ Clear need for accelerators.
      ○ Clear need for mix of memory sizes.
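As a rough check on how these projections combine, the Python sketch below takes a simple unweighted mean of the six listed multipliers. That mean is about 6.8x, close to the 7x headline used on the next slide. It assumes only these six white papers are averaged and that each counts equally; the deck's own figure may draw on more submissions or use different weights.

```python
from statistics import mean

# Multipliers listed above (current usage to 2020), one per white paper.
projected_increase = {
    "Numerical Relativity": 3,
    "Subatomic Physics": 3,
    "Materials Research": 5,
    "Canadian Genome Centres": 8,
    "Canadian Astronomical Society": 10,
    "Theoretical Chemistry": 12,
}

# Unweighted mean: every discipline counts equally, regardless of its current usage.
average = mean(list(projected_increase.values()))
print(f"mean projected increase: {average:.1f}x")
```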

SLIDE 40

Projecting Compute Demand: 7x / 5 years

Averaging SPARC whitepaper projections (research roadmaps)

SLIDE 41

Projecting Storage Demand: 15x / 5 years

Averaging SPARC whitepaper projections (research roadmaps)
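The two five-year multipliers (7x for compute, 15x for storage) can also be expressed as annual growth rates. The Python sketch below assumes steady compound growth over the five years, which the white papers do not necessarily claim. The function name is illustrative.

```python
def annual_growth_rate(multiple: float, years: float) -> float:
    """Compound annual growth rate implied by growing `multiple`-fold over `years`."""
    return multiple ** (1.0 / years) - 1.0


compute_rate = annual_growth_rate(7, 5)    # compute demand: 7x over 5 years
storage_rate = annual_growth_rate(15, 5)   # storage demand: 15x over 5 years

print(f"compute: about {compute_rate:.0%} per year")
print(f"storage: about {storage_rate:.0%} per year")
```

Under that assumption, compute demand grows by a little under half each year and storage demand by roughly 70% each year.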
