The Scalable Commutativity Rule: Designing Scalable Software for Multicore Processors
Austin T. Clements
Thesis advisors: M. Frans Kaashoek Nickolai Zeldovich Robert Morris Eddie Kohler

x86 CPU trends
[Figure: clock speed (MHz), power (watts), cores per socket, and total megacycles/sec, plotted on a scale from 1 to 100,000 for the years 1985 to 2015. Sources: Stanford CPUDB, Intel ARK]

Parallelize or perish
Software must be increasingly parallel to keep up with hardware, but scaling with parallelism is notoriously hard
[Figure: Exim mail server, messages/second (0 to 10k) vs. cores (1 to 48)]
Problem lies in the OS kernel

OS kernel scalability
Kernel scalability is important
• Many applications depend on the OS kernel
• If the kernel doesn't scale, many applications won't scale
And hard
• |kernel threads| > ∑ |application threads|
• Diverse and unknown workloads

Current approach to scalable software development
[Timeline, 2008 to 2014: Corey (OSDI '08), Linux scalability (OSDI '10), Bonsai VM (ASPLOS '12), RadixVM (EuroSys '13)]
[Cycle: Workload, Plot scalability, Differential profile, Fix top bottleneck]

Current approach to scalable software development
Successful in practice because it focuses developer effort
Disadvantages
• Requires huge amounts of effort
• New workloads expose new bottlenecks
• More cores expose new bottlenecks
• The real bottlenecks may be in the interface design

Interface scalability example
creat("x") creat("y") creat("z")
[Diagram: stdin, stdout, stderr]
Solution: Change the interface?

Approach: Interface-driven scalability
The scalable commutativity rule
Whenever interface operations commute, they can be implemented in a way that scales.
• creat with lowest FD (creat → 3, creat → 4): commutes ✗
• creat with any FD (creat → 42, creat → 17): commutes ✓, so by the rule a scalable implementation exists ✓
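The contrast between the two creat interfaces can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the kernel implementation discussed in the talk: the classes `LowestFD` and `AnyFD`, the per-core range size, and the helper `results_by_order` are all hypothetical names introduced here. The sketch runs the same three calls in every order and counts how many distinct sets of return values appear.

```python
from itertools import permutations


class LowestFD:
    """POSIX-style allocation: return the lowest unused descriptor."""

    def __init__(self):
        self.used = {0, 1, 2}  # stdin, stdout, stderr

    def creat(self, core, name):
        fd = 0
        while fd in self.used:  # every caller scans the same shared set
            fd += 1
        self.used.add(fd)
        return fd


class AnyFD:
    """Hypothetical relaxed interface: any unused descriptor is acceptable.

    Assumption for this sketch: each core allocates from its own disjoint
    range, so calls on different cores touch no shared state.
    """

    RANGE = 1000  # illustrative size of each per-core range

    def __init__(self, ncores):
        self.next_free = [core * self.RANGE + 3 for core in range(ncores)]

    def creat(self, core, name):
        fd = self.next_free[core]
        self.next_free[core] += 1
        return fd


def results_by_order(make_table, calls):
    """Run the calls in every order; collect the distinct per-call results."""
    outcomes = set()
    for order in permutations(calls):
        table = make_table()
        result = {call: table.creat(*call) for call in order}
        outcomes.add(tuple(sorted(result.items())))
    return outcomes


calls = [(0, "x"), (1, "y"), (2, "z")]  # (core, file name)
lowest = results_by_order(LowestFD, calls)
any_fd = results_by_order(lambda: AnyFD(3), calls)
print(len(lowest), len(any_fd))
```

With lowest-FD allocation each call's return value depends on which calls ran before it, so the order is observable; with any-FD allocation every order yields the same results, which is what leaves room for an implementation without shared state.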

Advantages of interface-driven scalability
The rule enables reasoning about scalability throughout the software design process
• Design: Guides design of scalable interfaces
• Implement: Sets a clear implementation target
• Test: Systematic, workload-independent scalability testing

Contributions
The scalable commutativity rule
• Formalization of the rule and proof of its correctness
• State-dependent, interface-based commutativity
Commuter: An automated scalability testing tool
sv6: A scalable POSIX-like kernel

Outline
Defining the rule
• Definition of scalability
• Intuition
• Formalization
Applying the rule
• Commuter
• Evaluation

A scalability bottleneck
[Figure: normalized throughput (0 to 40) vs. cores (1 to 48) for gmake and Exim; annotation: one contended cache line]
A single contended cache line can wreck scalability

Cost of a contended cache line
[Figure: cycles to read (0 to 3.5k) vs. N (1 to 80) for 1 writer + N readers; annotation: open]

What scales on today's multicores?
              Core X
              W   R   -
Core Y   W    ✗   ✗   ✓
         R    ✗   ✓   ✓
         -    ✓   ✓   -
We say two or more operations are scalable if they are conflict-free. Good approximation of current hardware.
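The table can be read as a predicate over memory accesses: a set of accesses is conflict-free when no cache line written by one core is read or written by another. A minimal sketch of that check follows; the function name `conflict_free` and the `(core, line, mode)` tuple encoding are assumptions made for this illustration, with `"W"` for a write and `"R"` for a read.

```python
def conflict_free(accesses):
    """Return True if no cache line written by one core is touched by another.

    accesses: iterable of (core, line, mode) tuples, mode being "R" or "W".
    """
    writers = {}  # line -> set of cores that write it
    readers = {}  # line -> set of cores that read it
    for core, line, mode in accesses:
        table = writers if mode == "W" else readers
        table.setdefault(line, set()).add(core)

    for line, writing_cores in writers.items():
        # A written line is fine only if a single core is its sole user.
        users = writing_cores | readers.get(line, set())
        if len(users) > 1:
            return False
    return True


print(conflict_free([(0, "a", "R"), (1, "a", "R")]))  # read/read
print(conflict_free([(0, "a", "W"), (1, "a", "R")]))  # write/read
print(conflict_free([(0, "a", "W"), (1, "a", "W")]))  # write/write
print(conflict_free([(0, "a", "W"), (1, "b", "W")]))  # different lines
```

Only the write/read and write/write cases on the same line are rejected, matching the ✗ cells of the table.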

The intuition behind the rule
Whenever interface operations commute, they can be implemented in a way that scales.
Operations commute
⇒ results independent of order
⇒ communication is unnecessary
⇒ without communication, no conflicts

Example: Reference counter
T1: iszero() → F
T2: iszero() → F
T3: dec() → 2
T4: dec() → 1
T5: dec() → 0
Regions: R1, R2
✓ R1 commutes; conflict-free implementation: shared counter
✗ R2 does not commute because dec() returns counter value

Example: Reference counter
T1: iszero() → F
T2: iszero() → F
T3: dec() → ok (was 2)
T4: dec() → ok (was 1)
T5: dec() → ok (was 0)
Regions: R1, R2'
✓ R1 commutes; conflict-free implementation: shared counter
✗ R2 does not commute because dec() returns counter value
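The effect of changing what dec() returns can be checked mechanically. The sketch below is a simplified stand-in for the formal definition: it treats a region as commuting when every reordering gives each thread the same return value and leaves the same final state. The function names (`dec_value`, `dec_ok`, `commutes`), the choice of which operations make up each region, and the starting count of 3 (implied by the first dec() returning 2) are assumptions made for this illustration.

```python
from itertools import permutations


def iszero(state):
    return state, state == 0  # (new state, return value)


def dec_value(state):
    return state - 1, state - 1  # dec() that returns the new counter value


def dec_ok(state):
    return state - 1, "ok"  # dec() that only reports success


def commutes(start, region):
    """True if every order of the region gives each thread the same result
    and ends in the same state. region is a list of (thread, operation)."""

    def run(order):
        state, results = start, {}
        for thread, op in order:
            state, results[thread] = op(state)
        return state, results

    reference = run(region)
    return all(run(order) == reference for order in permutations(region))


r1 = [("T1", iszero), ("T2", iszero)]
r2 = [("T3", dec_value), ("T4", dec_value)]
r2_prime = [("T3", dec_ok), ("T4", dec_ok)]

print(commutes(3, r1), commutes(3, r2), commutes(3, r2_prime))
```

Under these assumptions the two iszero() calls commute, the value-returning dec() calls do not (swapping them swaps which thread sees 2 and which sees 1), and the dec() calls that return ok commute.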
