Modular Codegen: Further Benefits of Explicit Modularization



slide-1
SLIDE 1

Modular Codegen

Further Benefits of Explicit Modularization

slide-2
SLIDE 2

Module Flavours

slide-3
SLIDE 3

Motivating Example

foo.h:

#ifndef FOO_H
#define FOO_H
inline void foo() { ... }
#endif

bar.cpp:

#include "foo.h"
void bar() { foo(); }

baz.cpp:

#include "foo.h"
void baz() { foo(); }

slide-4
SLIDE 4

Implicit Modules

  • User writes .modulemap files

module foo {
  header "foo.h"
  export *
}

slide-5
SLIDE 5

Implicit Modules Build Process

[Diagram: the build system runs clang++ on bar.cpp and clang++ on baz.cpp.]

slide-6
SLIDE 6

Implicit Modules Build Process

[Diagram: the build system runs clang++ on bar.cpp and on baz.cpp; a further clang++ process compiles foo.mm into foo.pcm.]

slide-7
SLIDE 7

Implicit Modules Build Process

[Diagram: the build system runs clang++ on bar.cpp and on baz.cpp; a further clang++ process compiles foo.mm into foo.pcm, and the two compiles produce bar.o and baz.o.]

slide-8
SLIDE 8

Implicit Modules

  • User writes .modulemap files
  • Compiler finds them and implicitly builds module descriptions in a filesystem cache

  • Build system agnostic
  • Difficult to parallelize - build system isn’t aware of the dependencies
  • Doesn’t distribute (clang doesn’t know about distribution scheme)
slide-9
SLIDE 9

Explicit Modules

  • Build system explicitly invokes the compiler on .modulemap files
  • Passes resulting .pcm files when compiling .cpp files for use
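
The ordering that an explicit-modules build system has to enforce can be sketched as a small dependency graph. This is only an illustration: the graph type, the function names, and the spelling "foo.modulemap" for the module map file are assumptions made for the sketch, not anything the build system or clang defines. It assumes the graph has no cycles.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical build graph: each target lists the inputs it must wait for.
using BuildGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first visit that appends a target only after all of its inputs.
void visit(const BuildGraph &graph, const std::string &target,
           std::map<std::string, bool> &done,
           std::vector<std::string> &order) {
  if (done[target]) return;
  done[target] = true;
  auto it = graph.find(target);
  if (it != graph.end())
    for (const std::string &input : it->second)
      visit(graph, input, done, order);
  order.push_back(target);
}

// Returns the targets in an order where every input precedes its users.
std::vector<std::string> buildOrder(const BuildGraph &graph) {
  std::map<std::string, bool> done;
  std::vector<std::string> order;
  for (const auto &entry : graph)
    visit(graph, entry.first, done, order);
  return order;
}

// The example from the slides: foo.pcm is an explicit input of both objects.
BuildGraph exampleGraph() {
  return {
      {"foo.pcm", {"foo.modulemap"}},
      {"bar.o", {"bar.cpp", "foo.pcm"}},
      {"baz.o", {"baz.cpp", "foo.pcm"}},
  };
}
```

Calling buildOrder(exampleGraph()) places foo.pcm before bar.o and baz.o. Because the dependency on foo.pcm is visible in the graph, the two object compiles can be scheduled in parallel once it exists, which is what the implicit scheme hides from the build system.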
slide-10
SLIDE 10

Explicit Modules Build Process

[Diagram: the build system runs clang++ on foo.mm to produce foo.pcm.]

slide-11
SLIDE 11

Explicit Modules Build Process

[Diagram: the build system runs clang++ on foo.mm to produce foo.pcm, then runs clang++ on bar.cpp and on baz.cpp with foo.pcm to produce bar.o and baz.o.]

slide-12
SLIDE 12

Modules TS (Technical Specification)

  • New file type (C++ with some new syntax - .cppm?)
  • New import syntax
  • Also needs build system support
slide-13
SLIDE 13

Modular Codegen

slide-14
SLIDE 14

Duplication in Object Files

Each object file contains independent definitions of:

  • Uninlined ‘inline’ functions (& some other bits)
  • Debug information descriptions of classes

[Diagram: bar.cpp and baz.cpp each use foo.pcm, built from foo.mm.]

bar.o

  • bar()
  • f1()

baz.o

  • baz()
  • f1()

a.out

  • bar()
  • baz()
  • f1()
slide-15
SLIDE 15

Modular Objects

The module can be used as a ‘home’ for these entities so they don’t need to be carried by every user.

[Diagram: bar.cpp and baz.cpp each use foo.pcm, built from foo.mm; the module also produces foo.o.]

bar.o

  • bar()

baz.o

  • baz()

foo.o

  • f1()

a.out

  • bar()
  • baz()
  • f1()
slide-16
SLIDE 16

Risks

Unused entities may increase linker inputs.

[Diagram: bar.cpp and baz.cpp each use foo.pcm, built from foo.mm; the module also produces foo.o.]

bar.o

  • bar()

baz.o

  • baz()

foo.o

  • f1()
  • f2()
  • f3()

a.out

  • bar()
  • baz()
  • f1()
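
The trade-off between duplicated and homed definitions can be checked with a toy count. The sketch below models each object file as a list of definition names taken from the slides; the type and function names are hypothetical, and counting names is only a stand-in for real section sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// An object file, modelled as the names of the definitions it carries.
using ObjectFile = std::vector<std::string>;

// Total number of definitions the linker has to read across all inputs.
std::size_t linkerInputDefinitions(const std::vector<ObjectFile> &objects) {
  std::size_t total = 0;
  for (const ObjectFile &object : objects) total += object.size();
  return total;
}

// Number of distinct names across the inputs (duplicates count once).
std::size_t distinctDefinitions(const std::vector<ObjectFile> &objects) {
  std::set<std::string> names;
  for (const ObjectFile &object : objects)
    names.insert(object.begin(), object.end());
  return names.size();
}

// Without modular codegen: every user carries its own copy of f1().
std::vector<ObjectFile> duplicatedLayout() {
  return {ObjectFile{"bar", "f1"}, ObjectFile{"baz", "f1"}};
}

// With modular codegen: foo.o is the home for f1(), and it also carries
// f2() and f3() even though no user calls them.
std::vector<ObjectFile> homedLayout() {
  return {ObjectFile{"bar"}, ObjectFile{"baz"},
          ObjectFile{"f1", "f2", "f3"}};
}
```

In this toy case the duplicated layout hands the linker 4 definitions and the homed layout hands it 5, which is the risk named on the slide. With N users of f1() the duplicated layout carries N copies while the homed layout carries one, plus whatever else the module defines, so the balance depends on how many users there are and how much of the module goes unused.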
slide-17
SLIDE 17

Constraints

  • Headers are compiled separately (& only once) from uses
  • Dependencies must be well formed

    ○ A header cannot be implemented by a different library: that creates a circular dependency, which is no longer broken by duplicating the definitions at every use.

slide-18
SLIDE 18

Diversion: ‘How Unix Linkers Work (lite)’

void a1() { b(); }
void a2() { … }
void b() { a2(); }

slide-22
SLIDE 22

Diversion: ‘How Unix Linkers Work (lite)’

void a1() { b(); }
void a2() { … }
void b() { a2(); }

Resolution steps:

  a1()?
  a1()✓ b()?
  a1()✓ b()✓ a2()?
  a1()✓ b()✓ a2()❌
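
The resolution sequence above can be reproduced with a small simulation of a traditional single-pass archive search. This is a sketch under stated assumptions: the slides do not say which library holds which function, so the placement used in the usage note (a1() and a2() as separate members of one archive, b() in a second archive) is a guess, and the types and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// One object file inside an archive.
struct Member {
  std::set<std::string> defines;
  std::set<std::string> references;
};
using Archive = std::vector<Member>;

// Visits the archives once, left to right, pulling in only members that
// define a currently undefined symbol. Returns what is still undefined.
std::set<std::string> linkOnce(const std::set<std::string> &roots,
                               const std::vector<Archive> &archives) {
  std::set<std::string> defined;
  std::set<std::string> undefined = roots;
  for (const Archive &archive : archives) {
    std::vector<bool> loaded(archive.size(), false);
    bool progress = true;
    while (progress) {  // rescan this archive until nothing new is pulled in
      progress = false;
      for (std::size_t i = 0; i < archive.size(); ++i) {
        if (loaded[i]) continue;
        bool wanted = false;
        for (const std::string &symbol : archive[i].defines)
          if (undefined.count(symbol)) wanted = true;
        if (!wanted) continue;
        loaded[i] = true;
        progress = true;
        for (const std::string &symbol : archive[i].defines) {
          defined.insert(symbol);
          undefined.erase(symbol);
        }
        for (const std::string &symbol : archive[i].references)
          if (!defined.count(symbol)) undefined.insert(symbol);
      }
    }
  }
  return undefined;
}

// Assumed placement: a1() and a2() are separate members of one archive.
Archive libA() {
  return {Member{{"a1"}, {"b"}}, Member{{"a2"}, {}}};
}

// Assumed placement: b() lives in a second archive and calls back into a2().
Archive libB() {
  return {Member{{"b"}, {"a2"}}};
}
```

With roots {"a1"} and the order libA, libB, the pass resolves a1(), then b(), and finishes with a2() undefined because the first archive is never revisited. Listing libA a second time after libB resolves everything, which is the usual workaround for circular library dependencies.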

slide-23
SLIDE 23

Clang/LLVM Codebase

  • *.def files are textual/non-modular
  • lib/Support/regc* are non-modular
  • MCTargetOptionsCommandFlags.h non-modular
  • CommandFlags.h non-modular
  • Target ASM Parsers depend on MC Target Description
  • static namespace-scope functions in headers -> inline, non-static
  • Missing #includes
  • No idea what to do with abi-breaking.h
  • Weird things in Hexagon (non-modular headers that are included exactly once…)
  • ASTMatchers defining global variables in headers… no idea how this isn’t causing link errors, maybe they’ve got implicit internal linkage.
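
Two of the items above are about linkage, and a short header-style sketch may make them concrete. The names below are hypothetical and are not taken from the LLVM tree. The last declaration shows one standard C++ rule that could explain the ASTMatchers observation; whether it is the actual reason is not confirmed here.

```cpp
#include <cassert>

// Before: 'static' gives every including translation unit its own private
// copy, so there is no single entity that a module could act as a home for.
static int clampToByteStatic(int value) {
  return value < 0 ? 0 : (value > 255 ? 255 : value);
}

// After: 'inline' with external linkage names one function program-wide,
// so the definitions seen by different users are the same entity.
inline int clampToByte(int value) {
  return value < 0 ? 0 : (value > 255 ? 255 : value);
}

// A const namespace-scope variable that is not declared 'extern' has
// internal linkage by default in C++, so defining it in a header does not
// produce duplicate-symbol errors at link time.
const int kMaxByte = 255;
```

Both functions behave identically when called; the difference only shows up across translation units, where the static version yields one copy per includer.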

slide-24
SLIDE 24

Results

slide-25
SLIDE 25

Object Section Sizes

  • -O0 -fmodules-codegen -gsplit-dwarf
slide-29
SLIDE 29
  • -O3
slide-30
SLIDE 30

Further Work

  • Other aspects needed for Modules TS

    ○ Variables (implemented - could be backported to non-TS style, may not be needed)
    ○ ???

  • Avoid homing alwaysinline functions (maybe other reasonable inlining heuristics to avoid homing functions unlikely to remain uninlined)

  • Avoid type units when a home is likely to be unique (not an implicit template instantiation, or has a strong vtable, etc)

slide-31
SLIDE 31

Thanks!

David Blaikie
Email/etc: dblaikie@gmail.com
Twitter: @dwblaikie
