Building LLVM-IR Johannes Doerfert and Christoph Mallon Saarbr - - PowerPoint PPT Presentation

building llvm ir
SMART_READER_LITE
LIVE PREVIEW

Building LLVM-IR Johannes Doerfert and Christoph Mallon Saarbr - - PowerPoint PPT Presentation

Building LLVM-IR Johannes Doerfert and Christoph Mallon Saarbr ucken Graduate School of Computer Science Saarland University Saarbr ucken, Germany December 18, 2013 C4 2 / 14 C4 Frontend Lexer Parser C-Code Token Stream AST 2 /


slide-1
SLIDE 1

Building LLVM-IR

Johannes Doerfert and Christoph Mallon

Saarbr¨ ucken Graduate School of Computer Science Saarland University Saarbr¨ ucken, Germany

December 18, 2013

slide-2
SLIDE 2

C4

2 / 14

slide-3
SLIDE 3

C4

Frontend C-Code Token Stream Lexer AST Parser

2 / 14

slide-4
SLIDE 4

C4

Frontend C-Code Token Stream Lexer AST Parser Middle-end LLVM-IR LLVM-IR

Optimization

LLVM-IR

Optimization

2 / 14

slide-5
SLIDE 5

C4

Frontend C-Code Token Stream Lexer AST Parser Middle-end LLVM-IR LLVM-IR

Optimization

LLVM-IR

Optimization

Backend X86 ARM ...

2 / 14

slide-6
SLIDE 6

LLVM-IR

3 / 14

slide-7
SLIDE 7

LLVM-IR

◮ Low-level, “typed”, SSA-based, assembly language ◮ Types are sign agnostic

3 / 14

slide-8
SLIDE 8

LLVM-IR

◮ Low-level, “typed”, SSA-based, assembly language ◮ Types are sign agnostic ◮ Hierarchical structured

3 / 14

slide-9
SLIDE 9

LLVM-IR

◮ Low-level, “typed”, SSA-based, assembly language ◮ Types are sign agnostic ◮ Hierarchical structured

◮ Module ◮ Global Variables ◮ Composite Types (“structs”) ◮ Function Declarations ◮ Function Definitions 3 / 14

slide-10
SLIDE 10

LLVM-IR

◮ Low-level, “typed”, SSA-based, assembly language ◮ Types are sign agnostic ◮ Hierarchical structured

◮ Module ◮ Global Variables ◮ Composite Types (“structs”) ◮ Function Declarations ◮ Function Definitions ◮ Function Definition ◮ Basic Blocks 3 / 14

slide-11
SLIDE 11

LLVM-IR

◮ Low-level, “typed”, SSA-based, assembly language ◮ Types are sign agnostic ◮ Hierarchical structured

◮ Module ◮ Global Variables ◮ Composite Types (“structs”) ◮ Function Declarations ◮ Function Definitions ◮ Function Definition ◮ Basic Blocks ◮ Basic Blocks ◮ Instructions 3 / 14

slide-12
SLIDE 12

LLVM-IR

Instructions

4 / 14

slide-13
SLIDE 13

LLVM-IR

Instructions

%sum = add i32 4, %var ; Binary operations %cmp = icmp sge i32 %a, %b

4 / 14

slide-14
SLIDE 14

LLVM-IR

Instructions

%sum = add i32 4, %var ; Binary operations %cmp = icmp sge i32 %a, %b %value = load i32* %location ; Memory operations store i32 %value, i32* %location

4 / 14

slide-15
SLIDE 15

LLVM-IR

Instructions

%sum = add i32 4, %var ; Binary operations %cmp = icmp sge i32 %a, %b %value = load i32* %location ; Memory operations store i32 %value, i32* %location br label %next-block ; Terminator Instructions br i1 %cmp, label %then-block, label %else-block ret i32 %a

4 / 14

slide-16
SLIDE 16

LLVM-IR

Instructions

%sum = add i32 4, %var ; Binary operations %cmp = icmp sge i32 %a, %b %value = load i32* %location ; Memory operations store i32 %value, i32* %location br label %next-block ; Terminator Instructions br i1 %cmp, label %then-block, label %else-block ret i32 %a %X = trunc i32 257 to i8 ; Cast Instructions %Y = sext i32 %V to i64 %Z = bitcast i8* %x to i32*

4 / 14

slide-17
SLIDE 17

LLVM-IR

Instructions

%sum = add i32 4, %var ; Binary operations %cmp = icmp sge i32 %a, %b %value = load i32* %location ; Memory operations store i32 %value, i32* %location br label %next-block ; Terminator Instructions br i1 %cmp, label %then-block, label %else-block ret i32 %a %X = trunc i32 257 to i8 ; Cast Instructions %Y = sext i32 %V to i64 %Z = bitcast i8* %x to i32* %ptr = alloca i32 ; Other Instructions %ret = call i32 @foo(i8* %fmt, i32 %val) %phi = phi i32 [ %value-a, %block-a ], [ %value-b, %block-b ] %I-th-element-addr = getelementptr i8** %Array, i64 %I

4 / 14

slide-18
SLIDE 18

LLVM-IR

Basic Blocks

5 / 14

slide-19
SLIDE 19

LLVM-IR

Basic Blocks

int max(int a, int b) { if (b < a) { return a; } else { return b; } }

5 / 14

slide-20
SLIDE 20

LLVM-IR

Basic Blocks

if-header: %A = load i32* %Aptr %B = load i32* %Bptr %if-condition = icmp slt i32 %B, %A br i1 %if-condition, label %if-cons, label %if-alt if-cons: ; preds = %if-header %A2 = load i32* %Aptr ret i32 %A2 if-alt: ; preds = %if-header %B2 = load i32* %Bptr ret i32 %B2

5 / 14

slide-21
SLIDE 21

LLVM-IR

Functions

6 / 14

slide-22
SLIDE 22

LLVM-IR

Functions

define i32 @max(i32 %a, i32 %b) { entry: %0 = alloca i32 %1 = alloca i32 store i32 %a, i32* %1 store i32 %b, i32* %0 br label %if-header if-header: ; preds = %entry %2 = load i32* %0 %3 = load i32* %1 %if-condition = icmp slt i32 %2, %3 br i1 %if-condition, label %if-cons, label %if-alt if-con: ; preds = %if-header %4 = load i32* %1 ret i32 %4 if-alt: ; preds = %if-header %5 = load i32* %0 ret i32 %5 }

6 / 14

slide-23
SLIDE 23

LLVM-IR

The human readable form

◮ Globals start with an “@” ◮ Local values start with an “%” ◮ Basic block names start with an “%” when used ◮ Basic block names end with an “:” when definied ◮ There is always “label” written before the branch target ◮ Indention is not important (but nice)

7 / 14

slide-24
SLIDE 24

LLVM-IR

Ill-formed Examples

8 / 14

slide-25
SLIDE 25

LLVM-IR

Ill-formed Examples

; Declaration needs to strictly dominate use %x = add i32 1, %x

8 / 14

slide-26
SLIDE 26

LLVM-IR

Ill-formed Examples

; Declaration needs to strictly dominate use %x = add i32 1, %x ; Binary operations need equal types %valI32 = load i32 *P %valI64 = load i64 *Q %val = add i32 %valI32, %valI64

8 / 14

slide-27
SLIDE 27

LLVM-IR

Ill-formed Examples

; Declaration needs to strictly dominate use %x = add i32 1, %x ; Binary operations need equal types %valI32 = load i32 *P %valI64 = load i64 *Q %val = add i32 %valI32, %valI64 ; A basic block *needs* a terminator (branch or return) block: %VAL = add i32 %A, %B

8 / 14

slide-28
SLIDE 28

LLVM-IR

Ill-formed Examples

; Declaration needs to strictly dominate use %x = add i32 1, %x ; Binary operations need equal types %valI32 = load i32 *P %valI64 = load i64 *Q %val = add i32 %valI32, %valI64 ; A basic block *needs* a terminator (branch or return) block: %VAL = add i32 %A, %B ; A terminator *ends* a basic block block: ret i32 %A %VAL = add i32 %A, %B

8 / 14

slide-29
SLIDE 29

LLVM-IR

Command line

Generate (human readable) LLVM-IR from C/C++ input: clang -emit-llvm -c -S -o OUT.ll IN.c Apply an optimization on LLVM-IR:

  • pt -S -o OUT.ll IN.ll

Execute (via JIT) an LLVM-IR module: lli IN.ll <argv arguments> Create a binary from an LLVM-IR module: clang -o OUT IN.ll Create architecture specific assembly: llc -o OUT.s IN.ll Create a binary from architecture specific assembly: cc -o OUT IN.s Create API calls for an LLVM-IR module: llc -march=cpp -o OUT.cpp IN.ll Get more help: <TOOL> --help

9 / 14

slide-30
SLIDE 30

LLVM-IR

Documentation

General language reference manual: http://llvm.org/docs/LangRef.html Doxygen code documentation (auto generated): http://llvm.org/docs/doxygen/html/index.html Stable binaries and source code (llvm + clang): http://llvm.org/releases/download.html#3.3 Full command line tools guide: http://llvm.org/docs/CommandGuide/

10 / 14

slide-31
SLIDE 31

LLVM-IR

API classes Value Argument Constant Global Var Constant Int/FP Functions Instruction

  • Bin. Inst.

Load Inst. ...

11 / 14

slide-32
SLIDE 32

LLVM-IR

API classes Type Integer Type Function Type Composite Type Struct Type Pointer Type

11 / 14

slide-33
SLIDE 33

LLVM

Final Notes

12 / 14

slide-34
SLIDE 34

LLVM

Final Notes

◮ LLVM is HUGE ◮ LLVM is grown, the interface is not consitent

12 / 14

slide-35
SLIDE 35

LLVM

Final Notes

◮ LLVM is HUGE ◮ LLVM is grown, the interface is not consitent

Keep it Simple and Straightforward

12 / 14

slide-36
SLIDE 36

Doxygen + Examples

13 / 14

slide-37
SLIDE 37

int max(int a, int b) { if (b < a) { return a; } else { return b; } }

14 / 14

slide-38
SLIDE 38

; ModuleID = ’max.c’

14 / 14

slide-39
SLIDE 39

; ModuleID = ’max.c’ @X = global i32 0

14 / 14

slide-40
SLIDE 40

; ModuleID = ’max.c’ @X = global i32 0 define i32 @max(i32 %a, i32 %b) #0 { }

14 / 14

slide-41
SLIDE 41

; ModuleID = ’max.c’ @X = global i32 0 define i32 @max(i32 %a, i32 %b) #0 { } define i32 @main(i32 %argc, i8** %argv) #0 { }

14 / 14

slide-42
SLIDE 42

; ModuleID = ’max.c’ @X = global i32 0 define i32 @max(i32 %a, i32 %b) #0 { } define i32 @main(i32 %argc, i8** %argv) #0 { } attributes #0 = { nounwind uwtable ...

14 / 14

slide-43
SLIDE 43

; ModuleID = ’max.ll’ target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64" target triple = "x86_64-unknown-linux-gnu" @X = global i32 0 ; Function Attrs: nounwind uwtable define i32 @max(i32 %a, i32 %b) # 0 { %1 = icmp slt i32 %b, %a br i1 %1, label %2, label %3 ; <label>:2 ; preds = %0 ret i32 %a ; <label>:3 ; preds = %0 ret i32 %b } ; Function Attrs: nounwind uwtable define i32 @main(i32 %argc, i8** %argv) # 0 { %1 = load i32* @X, align 4 %2 = call i32 @max(i32 %argc, i32 %1) ret i32 %2 } a t t r i b u t e s # 0 = { nounwind u w t a b l e "no-frame-pointer-elim"="true" } 14 / 14