Kate Keahey, Computation Institute, University of Chicago; Argonne National Laboratory



SLIDE 1

CHAMELEON: A LARGE SCALE, RECONFIGURABLE EXPERIMENTAL INSTRUMENT FOR COMPUTER SCIENCE

Kate Keahey

Computation Institute, University of Chicago
Argonne National Laboratory
keahey@anl.gov

www.chameleoncloud.org

APRIL 23, 2018

SLIDE 2

INTRODUCING CHAMELEON

  • Deeply Reconfigurable
      • Instrument for Computer Science Research
      • Support for isolation, bare metal reconfiguration, custom kernel reboot, console access, etc.
  • Large-scale Experimental Infrastructure
      • Total of ~650 nodes (~14,500 cores), 5 PB of storage distributed over 2 sites connected with a 100G network
      • Large-scale homogeneous partition
      • Heterogeneous hardware: InfiniBand, FPGAs, GPUs, ARMs, Atoms, etc.
      • Support for large scale in capabilities and policies
  • Developed primarily on top of a commodity open source system
      • Leverages community investment in the project
      • Contributes to development: revival of Blazar, contributions to Ironic and Nova, implementation of snapshotting, dynamic VLANs, etc.
      • Interacts with the community via the scientific working group
  • www.chameleoncloud.org

SLIDE 3

CHAMELEON HARDWARE

[Diagram: two sites, Chicago and Austin, joined by the Chameleon Core Network]

  • Standard Cloud Unit (SCU): 42 compute and 4 storage nodes behind a switch; one group of x10 and one group of x2
  • SCUs connect to the core and are fully connected to each other
  • Heterogeneous Cloud Units: alternate processors and networks
  • Chameleon Core Network: 100 Gbps uplink to the public network (each site); links to UTSA, GENI, future partners
  • Core Services (one site): 3.6 PB central file systems, front end and data movers
  • Core Services (other site): front end and data mover nodes
  • Totals: 504 x86 compute servers, 48 distributed storage servers, 102 heterogeneous servers, 16 management and storage nodes
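The per-unit figures on this slide can be checked against the quoted totals with a little arithmetic. The sketch below uses only numbers from the slide; the constant and function names are made up for illustration, and reading "x10" plus "x2" as 12 Standard Cloud Units overall is an assumption about the diagram.

```python
# Figures copied from the hardware slide; names are illustrative only.
SCU_COMPUTE_NODES = 42       # compute nodes per Standard Cloud Unit
SCU_STORAGE_NODES = 4        # storage nodes per Standard Cloud Unit
SCU_COUNT = 10 + 2           # assumption: "x10" and "x2" are SCU counts per site
HETEROGENEOUS_SERVERS = 102
MGMT_AND_STORAGE_NODES = 16


def scu_totals(scu_count=SCU_COUNT):
    """Return compute and storage node totals across all SCUs."""
    return {
        "compute": scu_count * SCU_COMPUTE_NODES,
        "storage": scu_count * SCU_STORAGE_NODES,
    }


totals = scu_totals()
# Sum of every server category listed on the slide.
all_listed = (
    totals["compute"]
    + totals["storage"]
    + HETEROGENEOUS_SERVERS
    + MGMT_AND_STORAGE_NODES
)
print(totals)      # {'compute': 504, 'storage': 48}
print(all_listed)  # 670
```

The SCU products match the 504 compute and 48 distributed storage servers listed on the slide. Leaving out the 16 management and storage nodes gives 654, which is in line with the "~650 nodes" quoted earlier, although the slides do not say which categories that figure counts.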

SLIDE 4

EXPERIMENTAL WORKFLOW REQUIREMENTS

discover resources
  • Fine-grained
  • Complete
  • Up-to-date
  • Versioned
  • Verifiable

provision resources
  • Advance reservations & on-demand
  • Isolation
  • Fine-grained allocations

configure and interact
  • Deeply reconfigurable
  • Appliance catalog
  • Snapshotting
  • Complex Appliances
  • Network Isolation

monitor
  • Hardware metrics
  • Fine-grained information
  • Aggregate and archive
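The "advance reservations & on-demand" requirement can be illustrated with a small interval check. This is a minimal sketch under stated assumptions, not Chameleon's or Blazar's actual implementation: the `Lease` type, the `free_nodes` and `reserve` helpers, and the node names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Lease:
    """Hypothetical record: one node held for the window [start, end)."""
    node: str
    start: datetime
    end: datetime


def overlaps(a_start, a_end, b_start, b_end):
    """True when two half-open time windows share any instant."""
    return a_start < b_end and b_start < a_end


def free_nodes(nodes, leases, start, end):
    """Nodes with no existing lease overlapping the requested window."""
    busy = {l.node for l in leases if overlaps(l.start, l.end, start, end)}
    return [n for n in nodes if n not in busy]


def reserve(nodes, leases, count, start, end):
    """Return new leases for `count` free nodes, or raise if too few are free."""
    if start >= end:
        raise ValueError("reservation window must have positive length")
    candidates = free_nodes(nodes, leases, start, end)
    if len(candidates) < count:
        raise RuntimeError("not enough free nodes in that window")
    return [Lease(n, start, end) for n in candidates[:count]]


nodes = ["n1", "n2", "n3"]
existing = [Lease("n1", datetime(2018, 4, 23, 9), datetime(2018, 4, 23, 11))]
new = reserve(nodes, existing, 2, datetime(2018, 4, 23, 10), datetime(2018, 4, 23, 12))
print([l.node for l in new])  # ['n2', 'n3']
```

In this model an on-demand request is the same call with `start` set to the current time, so both modes share one availability check.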

SLIDE 5

CHAMELEON IMPLEMENTATION

[Architecture diagram]

  • Components: Appliance Catalog, Allocation Management, Grid’5000 Resource Discovery, Blazar, Ironic, Ceilometer, Keystone, Web portal, Horizon, TAS (TACC), Request Tracker, Nova, Neutron, Swift, Glance, Heat
  • Other labels: Chameleon Instance, Utilities, Agents, Clients
  • Service layers: Monitoring Services, Configuration Services, Resource Management Services, Discovery Services, Appliance Catalog, User Services

SLIDE 6

CHALLENGES AND LESSONS LEARNED

  • Building on top of a commodity open source project
      • Significant advantages in terms of direct and indirect community investment
      • Advantages for long-term maintenance
  • We need more than a testbed to support CS research
      • Traces and workloads, research data
      • Tools for repeatability
  • New concept: myths and misperceptions
      • Not true: “only available to users with NSF allocation”
      • Not true: “they use OpenStack so it is VMs”
      • Not true: “can’t get an experiment with 100s of nodes”
  • Managing incentives
      • Balancing individual versus community needs: allocations and lease limits
      • Resource scarcity

SLIDE 7

DISCUSSION

  • Focus on specific users/scenarios/benefits
      • Specific benefits drive involvement
  • Balancing incentives to federate
      • Science has no borders
      • But: resources are finite, and stakeholder interests need to be respected
  • Building on common needs
      • Common data sharing services?
      • Traces and workloads

SLIDE 8

www.chameleoncloud.org
keahey@anl.gov