Dynamically diagnosing type errors in unsafe code Stephen Kell - PowerPoint PPT Presentation

Dynamically diagnosing type errors in unsafe code

Stephen Kell
stephen.kell@cl.cam.ac.uk
Computer Laboratory, University of Cambridge

A definition

“... dynamically type-safe [means] the behavior of any program, correct or not, can be easily understood in terms of the source-level language semantics.”
(Ungar, Spitz and Ausch, Constructing a Metacircular Virtual Machine in an Exploratory Programming Environment)

“Type safety” [at run time] is really about debugging!
• clean error reports are better than corrupting errors
• ... would be nice even in unsafe languages, like C

Tool wanted

    if (obj->type == OBJ_COMMIT) {
        if (process_commit(walker, (struct commit *)obj))   /* <-- CHECK this cast (at run time) */
            return -1;
        return 0;
    }

But also wanted:
• binary-compatible
• source-compatible
• ... for real, idiomatic code in (say) C
• reasonable performance

Enter libcrunch, which does the above.

The user’s-eye view

• $ crunchcc -o myprog ...   # + other front-ends
• $ ./myprog   # runs normally
• $ LD_PRELOAD=libcrunch.so ./myprog   # does checks
• myprog: Failed __is_a_internal(0x5a1220, 0x413560 a.k.a. "uint$32") at 0x40dade, allocation was a heap block of int$32 originating at 0x40daa1

Reminiscent of Valgrind (Memcheck), but different...
• not checking memory definedness, in-boundsness, etc.
• ... in fact, assume correct w.r.t. these!
• provide & exploit run-time type information

Sketch of the instrumentation for C

    if (obj->type == OBJ_COMMIT) {
        if (process_commit(walker,
                (CHECK(__is_a(obj, "struct commit")), (struct commit *)obj)))
            return -1;
        return 0;
    }

Need a runtime which
• provides a fast __is_a() function
• ... and a few other flavours of check
• by efficiently tracking allocations
• ... and attaching reified type info

Reified, unique data types (see my Onward! 2015 paper about liballocs)

    struct ellipse {
        double maj, min;
        struct point { double x, y; } ctr;
    };

[Diagram: one unique descriptor per type, linked by containment edges]

    descriptor            name        size   members   member offsets
    __uniqtype__int       “int”       4      0
    __uniqtype__double    “double”    8      0
    __uniqtype__point     0           16     2         0, 8
    __uniqtype__ellipse   “ellipse”   32     3         0, 8, 16
    ...

• also model: stack frames, functions, pointers, arrays, ...
• unique → “exact type” test is a pointer comparison
• __is_a() is a short search over containment edges

Is it really that simple? What about...?
• untyped malloc() et al.
• opaque pointers, a.k.a. void*
• conversion of pointers to integers and back
• function pointers
• pointers to pointers
• “simulated subtyping”
• {custom, nested} heap allocators
• alloca()
• “sloppy” (non-standard-compliant) code
• unions, varargs, memcpy()

What data type is being malloc()’d?

Use intraprocedural “sizeofness” analysis

    size_t sz = sizeof (struct Foo);
    /* ... */
    malloc(sz);

Sizeofness propagates, a bit like dimensional analysis.

    malloc(sizeof (Blah) + n * sizeof (struct Foo))

Dump typed allocation sites from compiler, for later pick-up

[Diagram: source tree (... main.c, widget.c, util.c ...); main.i, widget.i, util.i; one .allocs file per source file]

Polymorphism via multiply-indirected void

    void sort_eight_special(void **pt) {
        void *tt[8];
        register int i;
        for (i = 0; i < 8; i++) tt[i] = pt[i];
        for (i = XUP; i <= TUP; i++) {
            pt[i] = tt[2 * i];
            pt[OPP_DIR(i)] = tt[2 * i + 1];
        }
    }

    neighbor = (int **)calloc(NDIRS, sizeof (int *));
    sort_eight_special((void **)neighbor);   // <-- must allow!

• solution: tolerate casts from T** to void** ...
• and check writes through void**
• ... against the underlying object type (here int *[])

Performance data: C-language SPEC CPU2006 benchmarks

    bench       normal/s   crunch %   nopreload
    bzip2       4.95       +6.8 %     +1.4 %
    gcc         0.983      +160 %     – %
    gobmk       14.6       +11 %      +2.0 %
    h264ref     10.1       +3.9 %     +2.9 %
    hmmer       2.16       +8.3 %     +3.7 %
    lbm         3.42       +9.6 %     +1.7 %
    mcf         2.48       +12 %      (−0.5 %)
    milc        8.78       +38 %      +5.4 %
    sjeng       3.33       +1.5 %     (−1.3 %)
    sphinx3     1.60       +13 %      +0.0 %
    perlbench

Experience on “correct” code

                                run-time false positives
    benchmark   compile fixes   instances   unique (total)   unique (of which unhelpful)
    bzip2       0               48          3                3
    gcc         1               3 × 10^5    14               3
    gobmk       0               0           0                0
    h264ref     2               27          2                0
    hmmer       0               0           0                0
    lbm         0               5 × 10^7    8                0
    mcf         0               0           0                0
    milc        0               0           0                0
    sjeng       0               0           0                0
    sphinx3     0               0           0                0

A “helpful” false positive?

    typedef double LBM_Grid[SIZE_Z * SIZE_Y * SIZE_X * N_CELL_ENTRIES];
    typedef LBM_Grid *LBM_GridPtr;

    #define MAGIC_CAST(v) ((unsigned int *) ((void *) (&(v))))
    #define FLAG_VAR(v) unsigned int * const _aux_ = MAGIC_CAST(v)
    // ...
    #define TEST_FLAG(g,x,y,z,f) \
        ((*MAGIC_CAST(GRID_ENTRY(g, x, y, z, FLAGS))) & (f))
    #define SET_FLAG(g,x,y,z,f) \
        { FLAG_VAR(GRID_ENTRY(g, x, y, z, FLAGS)); (*_aux_) |= (f); }

Future work: shopping list for a safe implementation of C − ε
• check memcpy(), realloc(), etc.
• add a bounds checker (improve on SoftBound)
• add a GC (precise! improve on Boehm)
• check unions and varargs
• always initialize pointers
• check unsafe writes through char*
• safely address-takeable union members (!)

Good prospects for all of the above! (ask me)

Conclusions

Checking pointer casts can be made efficient and helpful
• source- and binary-compatible
• low overhead, convenient to use (e.g. no rebuilds)
• good prospects for extension

Code is here: http://github.com/stephenrkell/libcrunch/

Thanks for your attention. Questions?
