TALx86: A Realistic Typed Assembly Language

SLIDE 1

TALx86: A Realistic Typed Assembly Language

Dan Grossman, Fred Smith

Cornell University

Joint work with: Greg Morrisett, Karl Crary (CMU), Neal Glew, Richard Samuels, Dave Walker, Stephanie Weirich, Steve Zdancewic

SLIDE 2

1 May 1999 www.cs.cornell.edu/talc 2

Everyone wants extensibility:

  • Web browser: applets, plug-ins
  • OS Kernel: packet filters, device drivers
  • “Active” networks: service routines
  • Databases: extensible ADTs
SLIDE 3

The Language Approach

Extension is written in a “safe” language:

  • Java, Modula-3, ML, Scheme
  • Key point: language provides abstractions
  • ADTs, closures, objects, modules, etc.
  • Can be used to build fine-grained capabilities

Host ensures code respects abstractions:

  • Static checking (verification)
  • Inserting dynamic checks (code-rewriting)
SLIDE 4

Example: Your Web Browser

[Figure: pipeline diagram. Labels: Java Source, javac, JVM bytecode, JVM verifier, System Interface, Binary, Optimizer, Low-Level IL, System Binary; enclosing box: Browser]

SLIDE 5

JVM Pros & Cons

Pros:

  • Portability
  • Hype: $, tools, libraries, books, training

Cons:

  • Performance
  • unsatisfying, even with off-line compilation
  • Only really suitable for Java (or slight variants):
  • relatively high-level instructions tailored to Java
  • type system is Java specific
  • and...
SLIDE 6

Large Trusted Computing Base

[Figure: diagram. Labels: JVM verifier, System Interface, Binary, Optimizer, Low-Level IL, System Binary; enclosing box: Browser]

  • Good code ⇒ big optimizer ⇒ bugs in optimizer
  • Lots of “native” methods (i.e., not-safe code)
  • No complete formal model
  • Must insert right checks

SLIDE 7

Ideally:

[Figure: pipeline diagram. Labels: Your favorite language, Low-Level IL, Optimizer, machine code, verifier, System Interface, System Binary; marked region: trusted computing base]

SLIDE 8

The Types Way

[Figure: pipeline diagram. Labels: Your safe language, Typed Low-Level IL, Typed Optimizer, TAL, verifier, System Interface, System Binary; marked region: trusted computing base]

  • Verifier is a type-checker
  • Type system flexible and expressive
  • A useful instantiation of the “proof carrying code” framework

SLIDE 9

TALx86 in a Nutshell

  • Most of the IA32 80x86 flat model assembly language

  • Memory management primitives
  • Sound type system
  • Types for code, stacks, structs
  • Other advanced features
  • Future work (what we can’t do yet)
SLIDE 10

TALx86 Basics

Primitive types: (e.g., int)

Code types: {r1:τ1,…,rn:τn}

  • “I’m code that requires register ri to have type τi before you can jump to me.”

  • Code blocks are annotated with their types
  • Think pre-condition
  • Verify block assuming pre-condition
SLIDE 11

Sample Loop

TAL sketch:

<n and retn addr as input>
sum: <type>
  <initialize s>
loop: <type>
  <add to s, decrement n>
test: <type>
  <return if n is 0>

C:

int sum(int n){
  int s=0;
  while(n != 0) {
    s+=n;
    --n;
  }
  return s;
}

SLIDE 12

Verification

sum:  {ecx:int, ebx:{edx:int}}
      mov eax,0      ; {ecx:int, ebx:{edx:int}, eax:int}
      jmp test       ; OK: sub-type of type labeling test
loop: {ecx:int, ebx:{edx:int}, eax:int}
      add eax,ecx    ; {ecx:int, ebx:{edx:int}, eax:int}
      dec ecx        ; {ecx:int, ebx:{edx:int}, eax:int}
                     ; OK: sub-type of type labeling next block
test: {ecx:int, ebx:{edx:int}, eax:int}
      cmp ecx,0
      jne loop       ; {ecx:int, ebx:{edx:int}, eax:int}
                     ; OK: sub-type of type labeling loop
      mov edx,eax    ; {ecx:int, ebx:{edx:int}, eax:int, edx:int}
      jmp ebx        ; OK: sub-type of {edx:int} -- type of ebx
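
The checks on this slide can be mimicked with a very small block-by-block checker. The following is a minimal Python sketch under strong assumptions: it is not the TALx86 verifier, every name in it (`satisfies`, `check_block`, `verify`, `VerifyError`) is hypothetical, register types are compared by plain equality rather than real subtyping, and only the five instructions used by `sum` are handled.

```python
# Illustrative sketch only: names and structure are assumptions, not TALx86's checker.
INT = "int"

class VerifyError(Exception):
    """Raised when a block does not type-check."""

def satisfies(regs, pre):
    """A register file may enter code with precondition `pre` if it supplies
    every register `pre` names, at exactly the required type."""
    return all(regs.get(r) == t for r, t in pre.items())

def type_of(regs, reg):
    if reg not in regs:
        raise VerifyError(f"{reg} has no known type")
    return regs[reg]

def need_int(regs, operand):
    # Immediates are Python ints; registers must currently hold an int.
    if not isinstance(operand, int) and type_of(regs, operand) != INT:
        raise VerifyError(f"{operand} is not an int")

def jump_ok(regs, pre, where):
    if not isinstance(pre, dict):
        raise VerifyError(f"{where}: target is not code")
    if not satisfies(regs, pre):
        raise VerifyError(f"{where}: precondition {pre} not met by {regs}")

def check_block(pre, body, labels, next_label):
    """Check one block, starting from its declared precondition."""
    regs = dict(pre)
    for op, *args in body:
        if op == "mov":
            dst, src = args
            regs[dst] = INT if isinstance(src, int) else type_of(regs, src)
        elif op in ("add", "cmp"):
            need_int(regs, args[0])
            need_int(regs, args[1])
        elif op == "dec":
            need_int(regs, args[0])
        elif op == "jne":            # conditional: checking continues afterwards
            jump_ok(regs, labels[args[0]], "jne")
        elif op == "jmp":            # target is a label or a register holding code
            target = args[0]
            pre_t = labels[target] if target in labels else type_of(regs, target)
            jump_ok(regs, pre_t, "jmp")
            return
        else:
            raise VerifyError(f"unsupported instruction {op}")
    if next_label is None:
        raise VerifyError("control falls off the end of the program")
    jump_ok(regs, labels[next_label], "fall-through")

def verify(blocks, labels):
    for i, (name, body) in enumerate(blocks):
        nxt = blocks[i + 1][0] if i + 1 < len(blocks) else None
        check_block(labels[name], body, labels, nxt)
    return True

RET = {"edx": INT}                              # return address: code expecting edx:int
LOOP = {"ecx": INT, "ebx": RET, "eax": INT}
labels = {"sum": {"ecx": INT, "ebx": RET}, "loop": LOOP, "test": LOOP}
blocks = [
    ("sum",  [("mov", "eax", 0), ("jmp", "test")]),
    ("loop", [("add", "eax", "ecx"), ("dec", "ecx")]),
    ("test", [("cmp", "ecx", 0), ("jne", "loop"),
              ("mov", "edx", "eax"), ("jmp", "ebx")]),
]
print(verify(blocks, labels))
```

Each block is checked once, in isolation, against the annotations on the labels; no fixed-point iteration over the loop is needed because the label types act as loop invariants. Dropping `mov eax,0` from `sum` makes the `jmp test` check fail, since `eax` then has no type.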

SLIDE 13

Stacks & Procedures

Stack Types (lists): σ ::= nil | τ::σ | ρ where ρ is a stack type variable.

Examples using C calling convention:

int square(int);
  ∀ρ1 {esp: τ1::int::ρ1}
  where τ1={eax: int, esp: int::ρ1}

int mult(int,int);
  ∀ρ2 {esp: τ2::int::int::ρ2}
  where τ2={eax: int, esp: int::int::ρ2}

SLIDE 14

Stacks & Verification

square: ∀ρ1 {esp: τ1::int::ρ1} where τ1={eax: int, esp: int::ρ1}
      push [esp+4]
      push [esp+8]
      call mult[with ρ2 = τ1::int::ρ1]
      add esp,8
      retn

mult: ∀ρ2 {esp: τ2::int::int::ρ2} where τ2={eax: int, esp: int::int::ρ2}

At the call:
  τaft={eax:int, esp: int::int::τ1::int::ρ1}
  mult instantiated: {esp: τ2::int::int::τ1::int::ρ1} where τ2={eax: int, esp: int::int::τ1::int::ρ1}

[Figure: stack layout at entry to mult, top first: τaft, int, int, τ1, int, ρ1]
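
The stack bookkeeping for `square` can be replayed step by step. The sketch below is a rough Python model, not TALx86 itself: `Stack`, `ret_addr`, `c_entry`, and `check_square` are hypothetical names, words are assumed to be 4 bytes, and the `eax: int` component of return-address types is left implicit. Polymorphism over ρ is modelled by passing the caller's stack as an ordinary argument.

```python
from dataclasses import dataclass

INT = "int"

class StackError(Exception):
    """Raised when code touches the abstract part of the stack."""

@dataclass(frozen=True)
class Stack:
    known: tuple   # types of the known slots, top of stack first
    tail: str      # "nil" or a stack type variable such as "rho1"

    def push(self, t):
        return Stack((t,) + self.known, self.tail)

    def pop(self, n=1):
        if n > len(self.known):
            raise StackError("cannot pop into the abstract tail")
        return Stack(self.known[n:], self.tail)

    def slot(self, byte_offset):
        i = byte_offset // 4   # assumes 4-byte words
        if i >= len(self.known):
            raise StackError("cannot read the abstract tail")
        return self.known[i]

def ret_addr(expected):
    """Type {eax:int, esp: expected}; eax is left implicit in this sketch."""
    return ("code", expected)

def c_entry(n_args, rest):
    """Entry and return stacks of a C function taking n int arguments.
    `rest` plays the role of the quantified variable rho: any caller stack."""
    on_return = rest
    for _ in range(n_args):
        on_return = on_return.push(INT)
    return on_return.push(ret_addr(on_return)), on_return

def check_square():
    rho1 = Stack((), "rho1")
    esp, _ = c_entry(1, rho1)                  # tau1::int::rho1
    esp = esp.push(esp.slot(4))                # push [esp+4]
    esp = esp.push(esp.slot(8))                # push [esp+8]
    # call mult[with rho2 = tau1::int::rho1]
    mult_entry, mult_return = c_entry(2, esp.pop(2))
    if esp.push(ret_addr(esp)) != mult_entry:  # call pushes tau_aft
        raise StackError("stack does not match mult's precondition")
    esp = mult_return.pop(2)                   # add esp,8
    # retn: the top slot must be a return address that expects
    # exactly the stack left after popping it
    if esp.slot(0) != ret_addr(esp.pop()):
        raise StackError("bad return")
    return True

print(check_square())
```

Because the tail is only a name, any attempt to read or pop past the known slots raises `StackError`, which is the sketch's stand-in for the caller's frame being hidden from the callee.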

SLIDE 15

Important Properties

  • Abstraction: “Because the type of the rest of the stack is abstract, the callee cannot read/write this portion of the stack.”
  • Flexibility: can encode and enforce many calling conventions (stack shape on return, callee-save, tail calls, etc.)

SLIDE 16

Callee-Save Example

mult: ∀α ∀ρ2 {ebp: α, esp: τ2::int::int::ρ2}
      where τ2={ebp: α, eax: int, esp: int::int::ρ2}

SLIDE 17

Structs

  • Goals:
  • Prevent reading uninitialized fields
  • Permit flexible scheduling of initialization
  • MALLOC “instruction” returns uninitialized record
  • Type of struct tracks initialization of fields
  • Example:

{ecx: int}
MALLOC eax,8 [int,int]   ; eax : ^*[int^u, int^u]
mov [eax+0], ecx         ; eax : ^*[int^rw, int^u]
mov ecx, [eax+4]         ; type error!
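
The initialization tracking in this example can be pictured as a per-field flag carried in the record's type. The following Python sketch is an assumption-laden illustration, not TALx86's representation: `malloc`, `store`, `load`, and `InitError` are hypothetical names, fields are assumed to be 4 bytes wide, and the flags "u" and "rw" are simply the annotations shown on the slide.

```python
class InitError(Exception):
    """Raised on a badly typed store or a read of an unwritten field."""

def malloc(field_types):
    """Type of a fresh record: every field starts uninitialized ("u")."""
    return tuple((t, "u") for t in field_types)

def store(record, byte_offset, value_type):
    """Type of the record after `mov [r+byte_offset], v`."""
    i = byte_offset // 4   # assumes 4-byte fields
    field_type, _ = record[i]
    if field_type != value_type:
        raise InitError(f"field {i} expects {field_type}, got {value_type}")
    return record[:i] + ((field_type, "rw"),) + record[i + 1:]

def load(record, byte_offset):
    """Type of the value read by `mov v, [r+byte_offset]`."""
    i = byte_offset // 4
    field_type, flag = record[i]
    if flag == "u":
        raise InitError(f"field {i} is uninitialized")
    return field_type

eax = malloc(["int", "int"])     # MALLOC eax,8 [int,int]
eax = store(eax, 0, "int")       # mov [eax+0], ecx   (ecx : int)
print(load(eax, 0))              # reading the initialized field is fine
try:
    load(eax, 4)                 # mov ecx, [eax+4]
except InitError as err:
    print("type error:", err)
```

Since `store` returns a new record type instead of mutating the old one, fields can be initialized in any order, which matches the slide's goal of flexible scheduling of initialization.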

SLIDE 18

Much, much more

  • Arrays (see next slide)
  • Tagged Unions
  • Displays, Exceptions [TIC’98]
  • Static Data
  • Modules and Interfaces [POPL’99]
  • Run-time code generation [PEPM’99 Jim, Hornof]

SLIDE 19

Mis-features

  • MALLOC and garbage collection in trusted computing base [POPL’99]

  • No way to express aliasing
  • No array bounds check elimination [Walker]
  • Object/class support too primitive [Glew]
SLIDE 20

Summary and Conclusions

  • We can type real machine code: potential for performance + flexibility + safety
  • Challenge: finding generally useful abstractions
  • Lots of work remains

http://www.cs.cornell.edu/talc