A New Architecture for Building Software Daniel Dunbar Overview - - PowerPoint PPT Presentation



slide-1
SLIDE 1

A New Architecture for Building Software

Daniel Dunbar

slide-2
SLIDE 2

Overview

  • Compile time
  • How software is built
  • llbuild
  • A new architecture
slide-3
SLIDE 3

Compile Time

slide-4
SLIDE 4

Clang & Compile Times

  • Designed to be a fast compiler
  • Tuned lex & parse
  • Low-overhead -O0 path
  • Redesigned PCH implementation
  • Integrated assembler
  • Very successful
slide-5
SLIDE 5

Keeping Up With Compile Time

  • Performance regresses
  • Features are added & tuning can break
  • Optimizing Clang is hard
  • Occasional big wins
  • Bootstrap with link-time optimization
  • Enable order files
  • Modules
  • Fewer architectural wins

[Chart: Arm64 -O0 results for clang-700, clang-800, and clang TOT; axis range 0.8 to 1.5]

slide-6
SLIDE 6

Improving Compile Time

  • Distributed compilation
  • Fancy caching
  • Ideally distributed & shared
  • Do less work
  • … a lot less work
  • … ideally, O(N) less work

(this talk)

Clang calls stat() an average of 324 times for each input file during the course of a Clang build.

slide-7
SLIDE 7

What If I Told You…

  • 15% faster at type checking…
  • … without any work!
slide-8
SLIDE 8

Frontend Source Sharing

  • Clang frontend can process multiple TUs
  • Shares file & source managers
  • Works today
  • … 85% faster with modules on

[Chart: Cocoa Type Check, without modules vs. with modules]

clang -fsyntax-only -x objective-c /dev/null \
  -Xclang t.m -Xclang t.m -Xclang t.m -Xclang t.m -Xclang t.m \
  -Xclang t.m -Xclang t.m -Xclang t.m -Xclang t.m -Xclang t.m
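One way to picture why sharing a file manager across translation units helps: the first lookup of a path pays for the filesystem query, and every later lookup is answered from memory. The sketch below is only an illustration of that idea. `StatCache` and its injected lookup callback are hypothetical names and are not Clang's actual FileManager API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper (not Clang's FileManager): remembers one existence
// check per path so later translation units reuse the cached answer.
class StatCache {
public:
  using Lookup = std::function<bool(const std::string&)>;

  // `lookup` stands in for the real filesystem query, e.g. a stat() wrapper.
  explicit StatCache(Lookup lookup) : lookup_(std::move(lookup)) {
    assert(lookup_ && "a lookup callback is required");
  }

  // Returns whether `path` exists, calling the lookup at most once per path.
  bool exists(const std::string& path) {
    auto it = cache_.find(path);
    if (it != cache_.end()) return it->second;
    ++lookups_;
    bool result = lookup_(path);
    cache_.emplace(path, result);
    return result;
  }

  // Number of times the underlying lookup was actually invoked.
  std::size_t lookups() const { return lookups_; }

private:
  Lookup lookup_;
  std::unordered_map<std::string, bool> cache_;
  std::size_t lookups_ = 0;
};
```

If ten translation units each ask about the same two headers through one shared `StatCache`, the underlying lookup runs twice rather than twenty times.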
slide-9
SLIDE 9

Precompiled Preamble

  • Used in libclang for interactive editing
  • Automatically build PCH for “preamble”
  • Automatically reuse preamble when unchanged

[Chart: CGCleanup Compile, without modules vs. with modules]

slide-10
SLIDE 10

Let’s Do It!

  • Seems easy…
  • Shared compile flags? Reuse frontend!
  • Hotly edited file? Cache preamble!
  • Uh oh!
  • No control over compiler invocation
  • Maybe if there was a compiler service…
  • There must be a better way!
slide-11
SLIDE 11

How Software Is Built

slide-12
SLIDE 12

How Software Is Built

  • Traditional UNIX compiler/build system model
  • Compiler runs as separate process
  • Primitive mechanisms for communicating dependencies
  • Fixed input/output pipeline defined by command line
  • This is an API …
  • … and we haven’t changed it in decades
  • We ❤ breaking APIs

Did I hear API???

slide-13
SLIDE 13

How Software Could Be Built

  • Earlier examples are only the tip of the iceberg
  • Ad hoc lookup tables
  • Early exit via output signatures
  • Redundant template instantiations
  • Need ability to evolve build system/compiler API
  • These changes need to be easy
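As a rough illustration of "early exit via output signatures": if a step reruns but produces byte-identical output, everything downstream can be skipped. The sketch below assumes a hypothetical `SignatureTracker` that hashes output contents with `std::hash`; a real build system would likely use a stronger content hash and persist the signatures between builds.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical bookkeeping: stores a signature of each output's contents.
class SignatureTracker {
public:
  // Records the new contents of `output`. Returns true when the signature
  // changed (dependents must rebuild) and false when it is unchanged
  // (dependents can exit early).
  bool update(const std::string& output, const std::string& contents) {
    assert(!output.empty() && "outputs need a name");
    std::size_t signature = std::hash<std::string>{}(contents);
    auto it = signatures_.find(output);
    if (it != signatures_.end() && it->second == signature) return false;
    signatures_[output] = signature;
    return true;
  }

private:
  std::unordered_map<std::string, std::size_t> signatures_;
};
```

A caller would invoke `update` after each step finishes and schedule the step's dependents only when it returns true.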
slide-14
SLIDE 14

What About The Module Cache?

  • Clang’s module cache solves this problem
  • Automatically builds modules when needed
  • Shares result across build
  • No build system changes required
slide-15
SLIDE 15

A Nonexample: Module Cache

  • Significant implementation complexity
  • File locking for coordination
  • Custom cache consistency management, few debugging tools
  • Custom cache eviction implementation (automatic pruning, tuning parameters)

  • Opaque to build system scheduler
slide-16
SLIDE 16

Ideal Model for Building Software

  • Support a flexible API between the compiler & build system
  • Goals:
  • Easy to share redundant work
  • Compiler can optimize for entire build
  • Build system can optimize via rich compiler API
  • Consistent incremental builds & debuggable architecture
slide-17
SLIDE 17

Ideal Model for Building Software

  • Need ability to integrate build system and compiler
  • Requires:
  • Library-based compiler ✅
  • Extensible build system ❌
  • Compiler plugin ❌

slide-18
SLIDE 18

llbuild

slide-19
SLIDE 19

Introducing llbuild

  • llbuild is a new C++ library for building build systems
  • Uses LLVM ADT/Support & a library-based design philosophy
  • Open sourced as part of Swift project
  • Used in the Swift Package Manager
  • … and Swift Playgrounds
  • Contains a Ninja implementation
slide-20
SLIDE 20

llbuild Goals

  • Ignore build description / input language
  • Focus on building a powerful engine
  • Support work being discovered on the fly
  • Scale to millions of tasks
  • Sophisticated scheduling
  • Powerful debugging tools
  • Support a pluggable task API
slide-21
SLIDE 21

llbuild Architecture

  • Flexible underlying core engine
  • Library for persistent, incremental computation
  • Heavily inspired by a Haskell build system called Shake
  • Low-level
  • Inputs & outputs are byte-strings
  • Functions are abstract
  • Use C++ API between tasks
  • Higher-level build systems are built on the core
slide-22
SLIDE 22

llbuild Engine

  • Minimal, functional model
  • Key: Unambiguous name for a computation
  • Value: The result of a computation
  • Rule: How to produce a Value for a Key
  • Task: A running instance of a Rule
  • A task can request other input Keys as part of its work

llbuild    make/ninja
Key        /a/b.o
Value      stat("/a/b.o")
Rule       /a/b.o: /a/b.c
Task       fork/exec
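The Key/Value/Rule vocabulary above can be sketched as a tiny memoizing engine. This is a toy under stated assumptions and not llbuild's API: `ToyEngine` and `fibRule` are hypothetical names, keys and values are plain strings, work runs synchronously, nothing is persisted, and cycles are not detected.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

using Key = std::string;    // unambiguous name for a computation
using Value = std::string;  // result of a computation

// Toy stand-in for a build engine: one rule, results memoized by key.
class ToyEngine {
public:
  // A rule produces the Value for a Key and may request other Keys.
  using Rule = std::function<Value(ToyEngine&, const Key&)>;

  explicit ToyEngine(Rule rule) : rule_(std::move(rule)) {
    assert(rule_ && "a rule is required");
  }

  // Returns the Value for `key`, running the rule only on a cache miss.
  const Value& build(const Key& key) {
    auto it = results_.find(key);
    if (it == results_.end()) {
      ++rulesRun_;
      Value value = rule_(*this, key);
      it = results_.emplace(key, std::move(value)).first;
    }
    return it->second;
  }

  // Number of rule executions so far.
  int rulesRun() const { return rulesRun_; }

private:
  Rule rule_;
  std::map<Key, Value> results_;
  int rulesRun_ = 0;
};

// Example rule: the key "n" maps to the n-th Fibonacci number, discovering
// its two inputs on the fly by asking the engine for them.
Value fibRule(ToyEngine& engine, const Key& key) {
  int n = std::stoi(key);
  if (n < 2) return key;
  int a = std::stoi(engine.build(std::to_string(n - 1)));
  int b = std::stoi(engine.build(std::to_string(n - 2)));
  return std::to_string(a + b);
}
```

Building the key "10" with `ToyEngine engine(fibRule)` runs the rule once for each of the keys "0" through "10"; asking for "10" again is answered from the stored results without running any rule.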

slide-23
SLIDE 23

auto ack(int m, int n) -> int {
  if (m == 0) {
    return n + 1;
  } else if (n == 0) {
    return ack(m - 1, 1);
  } else {
    return ack(m - 1, ack(m, n - 1));
  }
}

An Example: Recursive Functions

  • Core engine can be used directly for general computation
  • Recursive functions form a natural graph
  • Each result depends on the recursive inputs
  • Let’s build Ackermann!


slide-24
SLIDE 24

“Building” Ackermann

  • Computing Ackermann with llbuild:
  • Encode function invocation as key: ack(3,14)
  • Encode integer result as value
  • Rules map keys like ack(3,14) to a task
  • Tasks implement the Ackermann function
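Since the engine treats keys as byte-strings, "encode function invocation as key" could look roughly like the sketch below. The helper names `encodeAckKey` and `decodeAckKey` and the exact text format are assumptions made for illustration; the slides do not show how llbuild's own example encodes its keys.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <utility>

// Hypothetical encoding: the key is the textual invocation, e.g. "ack(3,14)".
std::string encodeAckKey(int m, int n) {
  return "ack(" + std::to_string(m) + "," + std::to_string(n) + ")";
}

// Parses a key produced by encodeAckKey back into the pair (m, n).
std::pair<int, int> decodeAckKey(const std::string& key) {
  int m = 0, n = 0;
  int matched = std::sscanf(key.c_str(), "ack(%d,%d)", &m, &n);
  assert(matched == 2 && "not an Ackermann key");
  (void)matched;  // silence the unused-variable warning when NDEBUG is set
  return {m, n};
}
```

A readable text encoding like this keeps keys easy to inspect while debugging; a binary encoding would be more compact.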
slide-25
SLIDE 25

Ackermann: Keys

#include "llbuild/Core/BuildEngine.h"

using namespace llbuild;

/// Key representation used in Ackermann build.
struct AckermannKey {
  /// The Ackermann number this key represents.
  int m, n;

  /// Create a key representing the given Ackermann number.
  AckermannKey(int m, int n) : m(m), n(n) {}

  /// Create an Ackermann key from the encoded representation.
  AckermannKey(const core::KeyType& key) { … }

  /// Convert an Ackermann key to its encoded representation.
  operator core::KeyType() const { … }
};

slide-26
SLIDE 26

Ackermann: Values

/// Value representation used in Ackermann build.
struct AckermannValue {
  /// The wrapped value.
  int value;

  /// Create a value from an integer.
  AckermannValue(int value) : value(value) { }

  /// Create a value from the encoded representation.
  AckermannValue(const core::ValueType& value) : value(intFromValue(value)) { }

  /// Convert a value to its encoded representation.
  operator core::ValueType() const { … }
};

slide-27
SLIDE 27

Ackermann: Rules

/// An Ackermann delegate which dynamically constructs rules like "ack(m,n)".
class AckermannDelegate : public core::BuildEngineDelegate {
public:
  /// Get the rule to use for the given Key.
  virtual core::Rule lookupRule(const core::KeyType& keyData) override {
    auto key = AckermannKey(keyData);
    return core::Rule{key, [key] (core::BuildEngine& engine) {
      return new AckermannTask(engine, key.m, key.n);
    } };
  }

  /// Called when a cycle is detected by the build engine and it cannot make
  /// forward progress.
  virtual void cycleDetected(const std::vector<core::Rule*>& items) override { … }
};

slide-28
SLIDE 28

Ackermann: Tasks

/// Compute the result for an individual Ackermann number.
struct AckermannTask : core::Task {
  int m, n;
  /// Initialized explicitly: AckermannValue as shown has no default constructor.
  AckermannValue recursiveResultA{0}, recursiveResultB{0};

  AckermannTask(core::BuildEngine& engine, int m, int n) : m(m), n(n) {
    engine.registerTask(this);
  }

  /// Called when the task is started.
  virtual void start(…) override { … }

  /// Called when a task’s requested input is available.
  virtual void provideValue(…) override { … }

  /// Called when all inputs are available.
  virtual void inputsAvailable(…) override { … }
};

slide-29
SLIDE 29

Ackermann: Tasks

/// Compute the result for an individual Ackermann number.
struct AckermannTask : core::Task {
  …

  /// Called when the task is started.
  virtual void start(core::BuildEngine& engine) override {
    // Request the first recursive result, if necessary.
    if (m == 0) {
      ;
    } else if (n == 0) {
      engine.taskNeedsInput(this, AckermannKey(m-1, 1), 0);
    } else {
      engine.taskNeedsInput(this, AckermannKey(m, n-1), 0);
    }
  }

  …
};

\[
A(m, n) =
\begin{cases}
n + 1 & \text{if } m = 0 \\
A(m - 1, 1) & \text{if } m > 0 \text{ and } n = 0 \\
A(m - 1, A(m, n - 1)) & \text{if } m > 0 \text{ and } n > 0
\end{cases}
\]

slide-30
SLIDE 30

Ackermann: Tasks

/// Compute the result for an individual Ackermann number.
struct AckermannTask : core::Task {
  …

  /// Called when a task’s requested input is available.
  virtual void provideValue(core::BuildEngine& engine, uintptr_t inputID,
                            const core::ValueType& value) override {
    if (inputID == 0) {
      recursiveResultA = value;

      // Request the second recursive result, if needed.
      if (m > 0 && n > 0) {
        // Pass the wrapped integer; AckermannKey takes (int, int).
        engine.taskNeedsInput(this, AckermannKey(m-1, recursiveResultA.value), 1);
      }
    } else {
      recursiveResultB = value;
    }
  }

  …
};


slide-31
SLIDE 31

Ackermann: Tasks

/// Compute the result for an individual Ackermann number.
struct AckermannTask : core::Task {
  …

  /// Called when all inputs are available.
  virtual void inputsAvailable(core::BuildEngine& engine) override {
    if (m == 0) {
      engine.taskIsComplete(this, AckermannValue(n + 1));
      return;
    }
    if (n == 0) {
      engine.taskIsComplete(this, recursiveResultA);
      return;
    }
    engine.taskIsComplete(this, recursiveResultB);
  }
};

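To see how the three callbacks fit together, the following self-contained sketch replays the same start / provideValue / inputsAvailable flow against a toy engine. It is an assumption-laden stand-in and not llbuild: `MiniEngine` and `AckTask` are hypothetical, values are plain ints, and `taskNeedsInput` computes the requested input immediately by recursion, whereas the real engine schedules work asynchronously. Because of that recursion it is only suitable for small inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

using Key = std::pair<int, int>;  // (m, n), standing in for an encoded key

class MiniEngine;

// Mirrors the three task callbacks from the slides, with ints as values.
struct AckTask {
  int m, n;
  int resultA = 0, resultB = 0;
  AckTask(int m, int n) : m(m), n(n) {}
  void start(MiniEngine& engine);
  void provideValue(MiniEngine& engine, int inputID, int value);
  int inputsAvailable() const;
};

class MiniEngine {
public:
  // Returns the value for `key`, running a task only on a cache miss.
  int build(const Key& key) {
    assert(key.first >= 0 && key.second >= 0);
    auto it = results_.find(key);
    if (it != results_.end()) return it->second;
    AckTask task(key.first, key.second);
    task.start(*this);
    int value = task.inputsAvailable();
    results_.emplace(key, value);
    return value;
  }

  // Synchronous stand-in: computes the input now and delivers it at once.
  void taskNeedsInput(AckTask& task, const Key& key, int inputID) {
    task.provideValue(*this, inputID, build(key));
  }

  // Number of distinct keys computed so far.
  std::size_t keysComputed() const { return results_.size(); }

private:
  std::map<Key, int> results_;
};

void AckTask::start(MiniEngine& engine) {
  if (m == 0) return;  // base case needs no inputs
  if (n == 0) engine.taskNeedsInput(*this, {m - 1, 1}, 0);
  else engine.taskNeedsInput(*this, {m, n - 1}, 0);
}

void AckTask::provideValue(MiniEngine& engine, int inputID, int value) {
  if (inputID == 0) {
    resultA = value;
    // The second input depends on the first: a dependency discovered on the fly.
    if (m > 0 && n > 0) engine.taskNeedsInput(*this, {m - 1, resultA}, 1);
  } else {
    resultB = value;
  }
}

int AckTask::inputsAvailable() const {
  if (m == 0) return n + 1;
  if (n == 0) return resultA;
  return resultB;
}
```

The point to notice is in `provideValue`: the key for the second input cannot be named until the first input's value has arrived, which is the "work being discovered on the fly" that the engine is designed to support.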

slide-32
SLIDE 32

/// Compute an Ackermann number using llbuild.
void runAckermannBuild(int m, int n) {
  /// Create the build engine delegate.
  AckermannDelegate delegate;

  /// Create the engine.
  core::BuildEngine engine(delegate);

  /// Build and report the result.
  auto result = AckermannValue(engine.build(AckermannKey(m, n)));
  llvm::errs() << "ack(" << m << ", " << n << ") = " << result.value << "\n";
}

$ time llbuild buildengine ack 3 14
ack(3, 14) = 131069
... computed using 327685 rules

real 0m1.056s
user 0m0.925s
sys 0m0.116s

42 times more rules than LLVM + Clang

slide-33
SLIDE 33

llbuild Performance

  • Wall times for full parallel build
  • Two test projects:
  • llbuild self-host
  • LLVM (x86 only)

[Chart: full parallel build wall times, ninja vs. llbuild, for the Self-host and LLVM projects; axis ticks 75 to 300]

slide-34
SLIDE 34

llbuild Performance

  • Wall times for null build
  • Two test projects:
  • llbuild itself
  • LLVM (x86 only)

[Chart: null build wall times, ninja vs. llbuild, for the Self-host and LLVM projects; axis ticks 0.1 to 0.5]

slide-35
SLIDE 35

llbuild Scalability

  • Designed to scale to large graphs
  • Validate by looking for linear performance vs size
  • Experiments done using the Ackermann function
slide-36
SLIDE 36

llbuild Scalability

[Chart: Initial Build (s), Null Build (s), and Memory Use (100 MiBs); vertical axis 0.0 to 5.0]

slide-37
SLIDE 37

A New Architecture

slide-38
SLIDE 38

A New Architecture

  • Requires:
  • Library-based compiler ✅
  • Extensible build system ✅
  • Compiler plugin ❌

slide-39
SLIDE 39

Clang Compiler Plugin

  • A straw man proposal
  • Focus on easiest path to vet concept
  • Add a minimal new protocol for controllable compiler subprocess
  • Use JSON (etc.) to send & receive commands
  • Share subprocesses when available
  • Dispatch individual compile requests as they arrive
  • Restart subprocess on crashes, etc.
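As a rough idea of what a JSON command sent to a controllable compiler subprocess might look like, the sketch below serializes one compile request. The message shape (`command`, `id`, `args`) and the function names are purely hypothetical, since the proposal does not define a wire format, and the escaping covers only the most common characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Escapes the most common characters that cannot appear raw in a JSON
// string. Sketch only: other control characters are passed through as-is.
std::string jsonEscape(const std::string& s) {
  std::string out;
  for (char c : s) {
    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n";  break;
      case '\t': out += "\\t";  break;
      default:   out += c;
    }
  }
  return out;
}

// Hypothetical request shape: {"command":"compile","id":N,"args":[...]}.
// The id would let the build system match replies to requests when several
// compiles share one subprocess.
std::string makeCompileRequest(int id, const std::vector<std::string>& args) {
  assert(id >= 0 && "request ids are assumed to be non-negative");
  std::string msg =
      "{\"command\":\"compile\",\"id\":" + std::to_string(id) + ",\"args\":[";
  for (std::size_t i = 0; i < args.size(); ++i) {
    if (i != 0) msg += ",";
    msg += "\"" + jsonEscape(args[i]) + "\"";
  }
  msg += "]}";
  return msg;
}
```

Writing one such message per line to the subprocess's standard input would be one simple framing choice; the proposal leaves that open.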
slide-40
SLIDE 40

Current Model

[Diagram: llbuild dispatching cc jobs across four lanes, Lane 1 to Lane 4]

slide-41
SLIDE 41

Proposed Shared Frontend

[Diagram: llbuild dispatching cc jobs across four lanes, Lane 1 to Lane 4]

slide-42
SLIDE 42

Proposed Shared Frontend

  • Enables file & source manager sharing
  • Amortizes module validation time
  • Avoids need to make full compiler thread safe
  • Gives us a new API to break!

slide-43
SLIDE 43

Summary

  • The current compiler / build system split is a legacy API
  • Potentially large compile time wins by evolving
  • llbuild: https://github.com/apple/swift-llbuild
  • As Ninja: llbuild ninja build (or ln -s llbuild ninja)
  • Docs: https://github.com/apple/swift-llbuild/tree/master/docs
  • Ackermann: lib/Commands/BuildEngineCommand.cpp

slide-44
SLIDE 44

This Slide Intentionally Left Blank