www.chameleoncloud.org

CHAMELEON: A LARGE SCALE, RECONFIGURABLE EXPERIMENTAL INSTRUMENT FOR COMPUTER SCIENCE
Kate Keahey, Computation Institute, University of Chicago / Argonne National Laboratory, keahey@anl.gov
April 23, 2018

INTRODUCING CHAMELEON
Deeply Reconfigurable
An instrument for Computer Science research: support for isolation, bare-metal reconfiguration, and custom kernel reboot
Large-scale Experimental Infrastructure
Total of ~650 nodes (~14,500 cores) and 5 PB of storage distributed over 2 sites
Large-scale homogeneous partition plus heterogeneous hardware: InfiniBand, FPGAs, GPUs, ARMs, Atoms, etc.; support for large scale in both capabilities and policies
Developed Primarily on Top of a Commodity Open Source System
Leverages community investment in the project and contributes to its development: revival of Blazar, contributions to Ironic, Nova, and others
Interacts with the community via the scientific working group
HARDWARE (Chicago and Austin sites)
- Standard Cloud Units (SCUs), each with 42 compute and 4 storage nodes: x10 at one site, x2 at the other; SCUs connect to the core and are fully connected to each other
- Heterogeneous Cloud Units with alternate processors and networks
- Core services: 3.6 PB central file systems, front-end and data mover nodes
- Totals: 504 x86 compute servers, 48 distributed storage servers, 102 heterogeneous servers, 16 management and storage nodes
- 100 Gbps uplink to the public network at each site; connectivity to UTSA, GENI, and future partners
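As a sanity check on the counts above, the twelve Standard Cloud Units (ten at one site, two at the other) account exactly for the x86 compute and distributed storage server totals:

```python
# Standard Cloud Unit (SCU) composition, as given on the slide.
COMPUTE_PER_SCU = 42
STORAGE_PER_SCU = 4

scus = 10 + 2  # x10 at one site, x2 at the other

compute_total = scus * COMPUTE_PER_SCU  # x86 compute servers
storage_total = scus * STORAGE_PER_SCU  # distributed storage servers

print(compute_total, storage_total)  # 504 48
```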
SOFTWARE
A Chameleon instance builds on OpenStack services (Nova, Neutron, Swift, Glance, Heat, Keystone, Horizon, Ironic, Blazar, Ceilometer) plus Chameleon-specific components: Appliance Catalog, Allocation Management (TAS, from TACC), Resource Discovery (from Grid'5000), the web portal, and Request Tracker.
Together these provide monitoring services, configuration services, resource management services, discovery services, the appliance catalog, and user services, along with supporting utilities, agents, and clients.
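Since bare-metal reservations go through Blazar, an experiment starts with a lease request. Below is a minimal sketch of the JSON body such a request might carry, assuming Blazar's physical:host reservation type; the exact field names are an assumption here and should be checked against the Blazar API documentation:

```python
import json
from datetime import datetime, timedelta

# Hypothetical lease body for Blazar's physical:host reservation type.
# Field names are assumptions -- verify against the Blazar API reference.
start = datetime(2018, 4, 23, 12, 0)
lease = {
    "name": "scale-experiment",  # hypothetical lease name
    "start_date": start.strftime("%Y-%m-%d %H:%M"),
    "end_date": (start + timedelta(days=1)).strftime("%Y-%m-%d %H:%M"),
    "reservations": [{
        "resource_type": "physical:host",
        "min": 2,  # reserve at least 2 bare-metal nodes
        "max": 4,  # and at most 4
        "hypervisor_properties": "",
        "resource_properties": "",
    }],
    "events": [],
}
print(json.dumps(lease, indent=2))
```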
Building on top of a commodity open source project
Significant advantages in terms of direct and indirect community investment, as well as for long-term maintenance
We need more than a testbed to support CS research
Traces and workloads, research data, and tools for repeatability
New concept: myths and misperceptions
Not true: "only available to users with NSF allocation"
Not true: "they use OpenStack so it is VMs"
Not true: "can't get an experiment with 100s of nodes"
Managing incentives
Balancing individual versus community needs under resource scarcity: allocations and lease limits
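The allocation-and-lease-limit balance can be pictured as a simple admission check on incoming lease requests. The numbers below (a 7-day lease cap and a per-user node cap) are purely illustrative, not Chameleon's actual policy:

```python
from dataclasses import dataclass

# Illustrative policy knobs -- NOT Chameleon's real limits.
MAX_LEASE_DAYS = 7
MAX_NODES_PER_USER = 128

@dataclass
class LeaseRequest:
    user: str
    nodes: int
    days: int

def admit(req: LeaseRequest, nodes_free: int) -> bool:
    """Reject leases that would starve the community of scarce nodes."""
    if req.days > MAX_LEASE_DAYS:
        return False  # cap individual holding time to keep turnover high
    if req.nodes > MAX_NODES_PER_USER:
        return False  # no single user can drain the testbed
    return req.nodes <= nodes_free

print(admit(LeaseRequest("alice", 100, 3), nodes_free=200))  # True
print(admit(LeaseRequest("bob", 100, 30), nodes_free=200))   # False: lease too long
```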