TOY PROGRAMMING: Demo of a repository for statically compiled programs




SLIDE 1

Sony Interactive Entertainment

PUBLIC

TOY PROGRAMMING

“Demo of a repository for statically compiled programs”
2016 US LLVM Developers’ Meeting

Paul Bowen-Huggett paul.huggett@sony.com

SLIDE 2

Agenda

Background

SLIDE 3

RFC

  • Is the idea generally sound? Obvious improvements?
  • Is it something we should think about for LLVM?
  • There are several potentially related projects (C++ modules IFC, compilation database, ThinLTO, etc.). Views from respective owners?

SLIDE 4
SLIDE 5

Chromium Browser Build Ratios

[Chart: front-/back-end time ratio (0% to 100%) across source files. Series: Back-end, Front-end (Release), Front-end (Debug)]

Median Time    Front-end    Back-end
Release        19%          81%
Debug          40%          60%
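
The median ratios above bound how much compile time could be saved by avoiding back-end work. The following is a minimal sketch of that arithmetic. The assumption, not stated on this slide, is that the front-end always runs and the back-end is skipped entirely in the best case; `best_case_saving` and `MEDIAN_RATIOS` are hypothetical names used only for illustration.

```python
# Median front-end/back-end time shares from the slide, in percent.
MEDIAN_RATIOS = {"Release": (19, 81), "Debug": (40, 60)}


def best_case_saving(front_end, back_end):
    """Fraction of compile time saved if the back-end is skipped entirely."""
    return back_end / (front_end + back_end)


for build, (front_end, back_end) in MEDIAN_RATIOS.items():
    saving = best_case_saving(front_end, back_end)
    print(f"{build}: at most {saving:.0%} of compile time saved")
```

Under this assumption the Debug figure is the source of the "~60%" upper bound quoted in the conclusion.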

SLIDE 6

Chromium Browser COMDAT Groups

[Chart axes: Size (bytes), 10 to 10,000; Number of instances, 100 to 500]

Number of instances
Generated    Discarded
577,397      576,233 (99.8%)

SLIDE 7

Toy Tools

  • Toy programming language
  • Available on GitHub: https://github.com/SNSystems/Toy-tools
  • Command line tools:

Role                 Name
Compiler             toycc
Linker               toyld
Debugger             toydb
Runtime              toyvm
Garbage Collector    toygc
Strip                toystrip
Merge                toymerge

SLIDE 8

[Diagram: Toy tool workflow. Labels: a.toy, b.toy, toycc, a.o, b.o, toyld, main.x, toyvm, toydb, toygc]

SLIDE 9

Limitations

  1. It’s just a toy!
  2. Written in Python (3.5)
  3. Output files are YAML
  4. No concurrency
  5. No backward compatibility
  6. Says nothing about performance
  7. The Toy language is nothing like C++:
      • VM has no registers, 3 stacks
      • Dynamic language, no user-defined types, no vague linkage…

SLIDE 10

Demo

  1. “Hello, World”
  2. “Modules”
  3. “Distributed”

SLIDE 11

[Diagram: repository layout, with one item marked "entry point"]

“ticket” files
  File            Ticket
  main.o          UUID1
  sieve.o         UUID2
  factorial.o     UUID3
  factorial.oʹ    UUID4

“tickets” table
  Key      Value (name, digest)
  UUID1    “main” d(f1)
  UUID2    “sieve” d(f2)
  UUID3    “fact3” d(f4), “factorial” d(f3)
  UUID4    “fact3” d(f4), “factorial” d(f3)

“fragments” table
  Keys: d(f1), d(f2), d(f3), …, d(f4), …, plus a further entry d(f5)
  Value fields: type, binary, internal fixups, external fixups

  Fragment values shown:
  • .eh_frame: binary 01 7a 52 00 01 78 10 01; .text: binary 55 48 89 e5 48 83; internal fixups .text+0x2f, .text+0x19; external fixups “fact3”+0, “sieve”+0
  • .text: binary 55 48 89 e5 48 83; internal fixups []; external fixups “factorial”+0
  • d(f5): .text: binary 66 4e 89 e5 48 83; internal fixups []; external fixups “factorial”+0
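
The two tables on this slide can be modelled as a content-addressed store. What follows is a minimal sketch under stated assumptions, not the Toy tools' actual implementation: the `Repository` class, its method names, the use of SHA-256 over a JSON encoding, and the one-section-per-fragment simplification are all hypothetical. It also assumes, as the d(f5) entry suggests, that rebuilding factorial.oʹ adds one new fragment and reuses the unchanged one.

```python
import hashlib
import json
import uuid


class Repository:
    """Hypothetical in-memory model of the "fragments" and "tickets" tables."""

    def __init__(self):
        self.fragments = {}  # digest -> fragment contents
        self.tickets = {}    # ticket UUID -> {name: digest}

    def store_fragment(self, fragment):
        # Content-addressed key: identical fragments share a single entry.
        encoded = json.dumps(fragment, sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(encoded).hexdigest()
        self.fragments.setdefault(digest, fragment)
        return digest

    def compile_unit(self, definitions):
        """Store each definition's fragment and record a ticket for the unit."""
        ticket_id = str(uuid.uuid4())
        self.tickets[ticket_id] = {
            name: self.store_fragment(fragment)
            for name, fragment in definitions.items()
        }
        return ticket_id  # a "ticket" file would only need to hold this UUID


def text_fragment(binary, external_fixups):
    # Field names follow the slide; a real fragment may hold several sections.
    return {
        "type": ".text",
        "binary": binary,
        "internal fixups": [],
        "external fixups": external_fixups,
    }


repo = Repository()
fact3 = text_fragment("55 48 89 e5 48 83", ['"factorial"+0'])
# Placeholder contents: the slide elides the original factorial fragment.
factorial_old = text_fragment("<original factorial bytes>", [])
factorial_new = text_fragment("66 4e 89 e5 48 83", ['"factorial"+0'])

ticket_old = repo.compile_unit({"fact3": fact3, "factorial": factorial_old})
ticket_new = repo.compile_unit({"fact3": fact3, "factorial": factorial_new})

print(len(repo.tickets), "tickets,", len(repo.fragments), "fragments")
```

In this sketch the second compilation stores only one new fragment, because the unchanged `fact3` hashes to the digest that is already present.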

SLIDE 12

[Diagram: the “fragments” table, with one item marked "entry point"]

“fragments” table
  Keys: d(f1), d(f2), d(f3), …, d(f4), …, plus a further entry d(f5)
  Value fields: type, binary, internal fixups, external fixups

  Fragment values shown:
  • .eh_frame: binary 01 7a 52 00 01 78 10 01; .text: binary 55 48 89 e5 48 83; internal fixups .text+0x2f, .text+0x19; external fixups “fact3”+0, “sieve”+0
  • .text: binary 55 48 89 e5 48 83; internal fixups []; external fixups “factorial”+0
  • d(f5): .text: binary 66 4e 89 e5 48 83; internal fixups []; external fixups “factorial”+0

SLIDE 13

Distributed Builds

Should the repository be a network resource?

SLIDE 14

Distributed Build

[Diagram: two build agents. Agent 1 holds S1, T1, S2, T2 with compile steps for T1 and T2; Agent 2 holds S3, T3, S4, T4 with compile steps for T3 and T4. Other labels: strip, merge, link, binary]

SLIDE 15

Challenges?

  • Remember, it’s just a toy… Need a production-ready C++ implementation
  • Understand real-world growth rates and GC strategy
  • Doesn’t show solutions to:
      • Fast storage with efficient indices
      • LLVM IR hashing
      • Efficient debug type references

SLIDE 16

Conclusion

  • Potential to reduce re-compile times by ~60% (“speed-of-light” based on Chrome Debug)
  • Small code changes benefit the most
  • No source code changes
  • Eradicate duplication and redundancy at source: minimize link-time processing and copying
  • (Almost) No change to workflows
  • Next steps:
      • Data store (in-process, memory-mapped hash tables)
      • Prototype:
          • IR hashing
          • MC back-end
          • Repository-based linker

SLIDE 17

Q&A