SAFE SOFTWARE UPDATES VIA MULTI-VERSION EXECUTION


slide-1
SLIDE 1

SAFE SOFTWARE UPDATES VIA MULTI-VERSION EXECUTION

PETR HOSEK CRISTIAN CADAR

Petr Hosek is a recipient of the Google European Fellowship in Software Engineering and this research is supported in part by this Google Fellowship

slide-2
SLIDE 2

[Timeline figure: monthly ticks spanning 2009 and 2010]

slide-3
SLIDE 3

HTTP ETag hash value computation in etag_mutate

for (h = 0, i = 0; i < etag->used; ++i)
    h = (h << 5) ^ (h >> 27) ^ (etag->ptr[i]);

[Timeline figure: monthly ticks spanning 2009 and 2010]

slide-4
SLIDE 4

HTTP ETag hash value computation in etag_mutate

for (h = 0, i = 0; i < etag->used - 1; ++i)
    h = (h << 5) ^ (h >> 27) ^ (etag->ptr[i]);

[Timeline figure: monthly ticks spanning 2009 and 2010]

slide-5
SLIDE 5

HTTP ETag hash value computation in etag_mutate

for (h = 0, i = 0; i < etag->used - 1; ++i)
    h = (h << 5) ^ (h >> 27) ^ (etag->ptr[i]);

Bug diagnosed in issue tracker

[Timeline figure: monthly ticks spanning 2009 and 2010]

slide-6
SLIDE 6

HTTP ETag hash value computation in etag_mutate

for (h = 0, i = 0; i < etag->used - 1; ++i)
    h = (h << 5) ^ (h >> 27) ^ (etag->ptr[i]);

File (re)compression in mod_compress_physical

etag_mutate(con->physical.etag, srv->tmp_buf);

Bug diagnosed in issue tracker

[Timeline figure: monthly ticks spanning 2009 and 2010]

slide-7
SLIDE 7

HTTP ETag hash value computation in etag_mutate

for (h = 0, i = 0; i < etag->used - 1; ++i)
    h = (h << 5) ^ (h >> 27) ^ (etag->ptr[i]);

File (re)compression in mod_compress_physical

if (use_etag) {
    etag_mutate(con->physical.etag, srv->tmp_buf);
}

Bug diagnosed in issue tracker

[Timeline figure: monthly ticks spanning 2009 and 2010]

slide-8
SLIDE 8

HTTP ETag hash value computation in etag_mutate

for (h = 0, i = 0; i < etag->used - 1; ++i)
    h = (h << 5) ^ (h >> 27) ^ (etag->ptr[i]);

File (re)compression in mod_compress_physical

if (use_etag) {
    etag_mutate(con->physical.etag, srv->tmp_buf);
}

Bug diagnosed in issue tracker

[Timeline figure: monthly ticks spanning 2009 and 2010, extended by further months]

slide-9
SLIDE 9

A year ago in a city far far away...

Introducing a novel approach for improving software updates:
    Multi-version execution based approach
    Relying on abundance of resources to improve reliability
    Run the new version in parallel with the existing one
    Synchronise the execution of the versions
    Use output of correctly executing version

slide-10
SLIDE 10

Synchronisation and fail-recovery mechanism

LIGHTTPD 1.4.22    LIGHTTPD 1.4.23

slide-11
SLIDE 11

Synchronisation and fail-recovery mechanism

LIGHTTPD 1.4.22    LIGHTTPD 1.4.23

Synchronisation

Compare individual system calls and their arguments

slide-12
SLIDE 12

Synchronisation and fail-recovery mechanism

LIGHTTPD 1.4.22    LIGHTTPD 1.4.23

GET /index.html HTTP/1.1
Host: srg.doc.ic.ac.uk
Accept-Encoding: gzip

Synchronisation

Compare individual system calls and their arguments

slide-13
SLIDE 13

Synchronisation and fail-recovery mechanism

LIGHTTPD 1.4.22    LIGHTTPD 1.4.23

GET /index.html HTTP/1.1
Host: srg.doc.ic.ac.uk
Accept-Encoding: gzip

Synchronisation

Compare individual system calls and their arguments

Checkpointing

Use clone to take a snapshot of a process

slide-14
SLIDE 14

Synchronisation and fail-recovery mechanism

LIGHTTPD 1.4.22    LIGHTTPD 1.4.23

GET /index.html HTTP/1.1
Host: srg.doc.ic.ac.uk
Accept-Encoding: gzip

for (h = 0, i = 0; i < etag->used; ++i)
    h = (h << 5) ^ (h >> 27) ^ (etag->ptr[i]);

Synchronisation

Compare individual system calls and their arguments

Checkpointing

Use clone to take a snapshot of a process

slide-15
SLIDE 15

Synchronisation and fail-recovery mechanism

LIGHTTPD 1.4.22    LIGHTTPD 1.4.23

GET /index.html HTTP/1.1
Host: srg.doc.ic.ac.uk
Accept-Encoding: gzip

for (h = 0, i = 0; i < etag->used; ++i)
    h = (h << 5) ^ (h >> 27) ^ (etag->ptr[i]);

for (h = 0, i = 0; i < etag->used - 1; ++i)
    h = (h << 5) ^ (h >> 27) ^ (etag->ptr[i]);

Crash

Segmentation fault

Synchronisation

Compare individual system calls and their arguments

Checkpointing

Use clone to take a snapshot of a process

slide-16
SLIDE 16

Synchronisation and fail-recovery mechanism

LIGHTTPD 1.4.22    LIGHTTPD 1.4.23

GET /index.html HTTP/1.1
Host: srg.doc.ic.ac.uk
Accept-Encoding: gzip

for (h = 0, i = 0; i < etag->used; ++i)
    h = (h << 5) ^ (h >> 27) ^ (etag->ptr[i]);

for (h = 0, i = 0; i < etag->used - 1; ++i)
    h = (h << 5) ^ (h >> 27) ^ (etag->ptr[i]);

Crash

Segmentation fault

Synchronisation

Compare individual system calls and their arguments

Checkpointing

Use clone to take a snapshot of a process

Failure recovery

Restart the snapshot and replace the code with the code of the new version

slide-17
SLIDE 17

Synchronisation and fail-recovery mechanism

LIGHTTPD 1.4.22    LIGHTTPD 1.4.23

GET /index.html HTTP/1.1
Host: srg.doc.ic.ac.uk
Accept-Encoding: gzip

for (h = 0, i = 0; i < etag->used; ++i)
    h = (h << 5) ^ (h >> 27) ^ (etag->ptr[i]);

for (h = 0, i = 0; i < etag->used - 1; ++i)
    h = (h << 5) ^ (h >> 27) ^ (etag->ptr[i]);

for (h = 0, i = 0; i < etag->used; ++i)
    h = (h << 5) ^ (h >> 27) ^ (etag->ptr[i]);

Crash

Segmentation fault

Synchronisation

Compare individual system calls and their arguments

Checkpointing

Use clone to take a snapshot of a process

Failure recovery

Restart the snapshot and replace the code with the code of the new version

Reconvergence

Return to the original code and continue execution

slide-19
SLIDE 19

Assumptions

Recovery considered successful if versions exhibit the same externally observable behaviour after recovery:
    Assumes small bug propagation distance
    Crashes are the only type of observable divergences
    The non-crashing version used as an oracle
    If unrecoverable, continue with the non-crashing version

slide-20
SLIDE 20

Synchronisation possible at multiple levels of abstraction

Total Synchronisation Uncoordinated Execution

slide-21
SLIDE 21

System Calls

Synchronisation possible at multiple levels of abstraction

Total Synchronisation Uncoordinated Execution

slide-22
SLIDE 22

Function Calls System Calls

Synchronisation possible at multiple levels of abstraction

Total Synchronisation Uncoordinated Execution

slide-23
SLIDE 23

Synchronisation possible at multiple levels of abstraction

[Spectrum figure between Total Synchronisation and Uncoordinated Execution, with levels: Function Calls, System Calls, Inputs/Outputs]

slide-24
SLIDE 24

System calls define external behaviour

VERSION 1

void fib(int n) {
    int f[n+1];
    f[1] = f[2] = 1;
    for (int i = 3; i <= n; ++i)
        f[i] = f[i-1] + f[i-2];
    printf("%d\n", f[n]);
}

VERSION 2

void fib(int n) {
    int a = 1, b = 1;
    for (int i = 3; i <= n; ++i) {
        int c = a + b;
        a = b, b = c;
    }
    printf("%d\n", b);
}

Example testing code (tested with both implementations)

int main(int argc, char **argv) {
    fib(5);
    fib(6);
}

slide-25
SLIDE 25

System calls define external behaviour

VERSION 1

void fib(int n) {
    int f[n+1];
    f[1] = f[2] = 1;
    for (int i = 3; i <= n; ++i)
        f[i] = f[i-1] + f[i-2];
    printf("%d\n", f[n]);
}

VERSION 2

void fib(int n) {
    int a = 1, b = 1;
    for (int i = 3; i <= n; ++i) {
        int c = a + b;
        a = b, b = c;
    }
    printf("%d\n", b);
}

Example testing code (tested with both implementations)

int main(int argc, char **argv) {
    fib(5);
    fib(6);
}

Snippet of system call trace, VERSION 1 (obtained using the strace tool)

write(1, "5\n", 2) = 2
write(1, "8\n", 2) = 2

Snippet of system call trace, VERSION 2 (obtained using the strace tool)

write(1, "5\n", 2) = 2
write(1, "8\n", 2) = 2

slide-26
SLIDE 26

External behaviour evolves sporadically
95% of revisions introduce no change

[Figure: difference (normalised) per lighttpd Subversion revision, 2379 to 2635, plotted for traces and source code]

Measured using lighttpd regression suite on 164 revisions
Taken on Linux kernel 2.6.40 and glibc 2.14 using strace tool and custom post-processing (details in the paper)

slide-27
SLIDE 27

Mx architecture

[Architecture diagram: MULTI-VERSION APPLICATION and CONVENTIONAL APPLICATION; Mx Execution Environment (SYSTEM CALL INTERPOSITION, STATIC ANALYSIS, RUNTIME MANIPULATION); OPERATING SYSTEM (LINUX KERNEL)]

slide-28
SLIDE 28

Implementation for x86 and x86-64 Linux

    Combines binary static analysis, lightweight checkpointing and runtime code patching
    Completely transparent, runs on unmodified binaries
    Runs two versions with small differences in behaviour
    Focus on application crashes and recovery

slide-29
SLIDE 29

Multi-eXecution Monitor

Execute and monitor multi-version applications:
    Intercepting system calls (via ptrace interface)
    Semantically comparing system call arguments
    Environment virtualisation (e.g. files and sockets)

[Architecture diagram: MULTI-VERSION APPLICATION on the Mx Execution Environment (SYSTEM CALL INTERPOSITION, STATIC ANALYSIS, RUNTIME MANIPULATION)]

slide-30
SLIDE 30

Runtime Execution Manipulator

Runtime code patching and fault recovery:
    OS-level checkpointing (using clone syscall)
    Runtime stack rewriting (libunwind)
    Breakpoint insertion and handling

[Architecture diagram: MULTI-VERSION APPLICATION on the Mx Execution Environment (SYSTEM CALL INTERPOSITION, STATIC ANALYSIS, RUNTIME MANIPULATION)]

slide-31
SLIDE 31

Static Executable Analyser

Create various mappings between the two version binaries:
    Extracting function symbols from binaries (libbfd)
    Machine code disassembling and analysis (libopcodes)
    Binary call graph reconstruction and matching

[Architecture diagram: MULTI-VERSION APPLICATION on the Mx Execution Environment (SYSTEM CALL INTERPOSITION, STATIC ANALYSIS, RUNTIME MANIPULATION)]

slide-32
SLIDE 32

VERSION 1 (snippet of instruction code)

0xdeadbeef <foo>:
    f59: callq 0xdeadcafe <bar>
0xdeadcafe <bar>:
    b07: mov -0x40(%rbp),%rax
    b0a: callq *%rax

Execution stack: %rsp 0xdeadbf5e

VERSION 2 (snippet of instruction code)

0xdeadbef3 <foo>:
    f5e: callq 0xdeadcaff <bar>
0xdeadcaff <bar>:
    b07: mov -0x40(%rbp),%rax
    b0a: callq *%rax

Execution stack: %rsp 0xdeadbf64

slide-33
SLIDE 33

VERSION 1 (snippet of instruction code)

0xdeadbeef <foo>:
    f59: callq 0xdeadcafe <bar>
0xdeadcafe <bar>:
    b07: mov -0x40(%rbp),%rax
    b0a: callq *%rax

Execution stack: %rsp 0xdeadbf5e

VERSION 2’ (snippet of instruction code)

0xdeadbeef <foo>:
    f59: callq 0xdeadcafe <bar>
0xdeadcafe <bar>:
    b07: mov -0x40(%rbp),%rax
    b0a: callq *%rax

Execution stack: %rsp 0xdeadbf64

slide-34
SLIDE 34

VERSION 1 (snippet of instruction code)

0xdeadbeef <foo>:
    f59: callq 0xdeadcafe <bar>
0xdeadcafe <bar>:
    b07: mov -0x40(%rbp),%rax
    b0a: callq *%rax

Execution stack: %rsp 0xdeadbf5e

VERSION 2’ (snippet of instruction code)

0xdeadbeef <foo>:
    f59: callq 0xdeadcafe <bar>
0xdeadcafe <bar>:
    b07: mov -0x40(%rbp),%rax
    b0a: callq *%rax

Execution stack: %rsp 0xdeadbf5e

slide-35
SLIDE 35

VERSION 1 (snippet of instruction code)

0xdeadbeef <foo>:
    f59: callq 0xdeadcafe <bar>
0xdeadcafe <bar>:
    b07: mov -0x40(%rbp),%rax
    b0a: callq *%rax

Execution stack: %rsp 0xdeadbf5e

VERSION 2’ (snippet of instruction code)

0xdeadbeef <foo>:
    f59: callq 0xdeadcafe <bar>
0xdeadcafe <bar>:
    b07: mov -0x40(%rbp),%rax
    b0a: callq *%rax

Execution stack: %rsp 0xdeadbf5e

slide-36
SLIDE 36

VERSION 1 (snippet of instruction code)

0xdeadbeef <foo>:
    f59: callq 0xdeadcafe <bar>
0xdeadcafe <bar>:
    b07: mov -0x40(%rbp),%rax
    b0a: callq *%rax

Execution stack: %rsp 0xdeadbf5e

VERSION 2’ (snippet of instruction code)

0xdeadbeef <foo>:
    f59: callq 0xdeadcafe <bar>
0xdeadcafe <bar>:
    aff: int $3
    b07: mov -0x40(%rbp),%rax
    b0a: callq *%rax

Execution stack: %rsp 0xdeadbf5e

slide-37
SLIDE 37

Suitable for type of changes and applications:
    Changes which do not affect memory layout
        e.g., refactorings, security patches
    Applications which provide synchronisation points
        e.g., servers structured around the main dispatch loop
    Where reliability is more important than performance
        e.g., interactive apps, some server scenarios

slide-38
SLIDE 38

Survived a number of crash bugs in several popular server applications

Redis regression bug #344 introduced during refactoring
In-memory NoSQL database

HMGET command implementation in hmgetCommand function

robj *o = lookupKeyRead(c->db, c->argv[1]);
if (o == NULL) {
    addReplySds(c,sdscatprintf(sdsempty(), "*%d\r\n",c->argc-2));
    for (i = 2; i < c->argc; i++) {
        addReply(c,shared.nullbulk);
    }
    return;
} else {
    if (o->type != REDIS_HASH) {
        addReply(c,shared.wrongtypeerr);
        return;
    }
}
addReplySds(c,sdscatprintf(sdsempty(), "*%d\r\n",c->argc-2));

slide-39
SLIDE 39

Survived a number of crash bugs in several popular server applications

Redis regression bug #344 introduced during refactoring
In-memory NoSQL database

HMGET command implementation in hmgetCommand function

robj *o, *value;
o = lookupKeyRead(c->db,c->argv[1]);
if (o != NULL && o->type != REDIS_HASH) {
    addReply(c,shared.wrongtypeerr);
    return;
}
addReplySds(c,sdscatprintf(sdsempty(), "*%d\r\n",c->argc-2));
for (i = 2; i < c->argc; i++) {
    if (o != NULL && (value = hashGet(o,c->argv[i])) != NULL) {
        addReplyBulk(c,value);
        decrRefCount(value);
    } else {
        addReply(c,shared.nullbulk);
    }
}

Missing return statement

slide-40
SLIDE 40

Interactive applications:

UTILITY               BUG                        TIME SPAN
md5sum sha1sum        Buffer underflow           1,124 revs. (1 year 7 months)
mkdir mkfifo mknod    NULL-pointer dereference   2,937 revs. (over 4 years)
cut                   Buffer overflow            1,201 revs. (2 years 3 months)

Server applications:

APPLICATION/ISSUE     BUG                        TIME SPAN
lighttpd #2169        Loop index underflow       87 revs. (2 months 2 days)
lighttpd #2140        Off-by-one error           12 revs. (2 months 1 day)
redis #344            Missing return statement   27 revs. (6 days)

slide-41
SLIDE 41

17.91% overhead on SPEC CPU2006
over single version despite 2x utilisation cost

[Figure: execution time (normalised) per SPEC CPU2006 benchmark, Native vs Mx]

Measured using SPEC CPU2006 1.2
Taken on 3.50 GHz Intel Xeon E3 1280 with 16 GB of RAM, Linux kernel 3.1.9

slide-42
SLIDE 42

Interactive applications:

UTILITY               INPUT SIZE                OVERHEAD
md5sum sha1sum        <1.25MB                   <100ms (imperceptible)
mkdir mkfifo mknod    <115 nested directories   <100ms (imperceptible)
cut                   <1.10MB                   <100ms (imperceptible)

Measured using Coreutils 6.10

Server applications:

APPLICATION   SCENARIO            OVERHEAD
lighttpd      localhost/network   2.60x – 3.49x
lighttpd      distant networks    1.01x – 1.04x
redis         localhost/network   3.74x – 16.72x
redis         distant networks    1.00x – 1.05x

Measured using redis-benchmark and http_load

Taken on 3.50 GHz Intel Xeon E3 1280 with 16 GB of RAM, Linux kernel 3.1.9

slide-43
SLIDE 43

“The New Mx”

Better performance overhead:
    System call binary rewriting
Tolerance to system call divergences:
    Event streaming

slide-44
SLIDE 44

REDIS (snippet of system call trace)

read(6, "PING\r\n", 1024)

slide-45
SLIDE 45

MX (snippet of system call trace)

ptrace(PTRACE_GETREGS, 7, {...}, NULL)
ptrace(PTRACE_SETREGS, 7, {...}, {...})
ptrace(PTRACE_SYSCALL, 7, {...}, NULL)
read(8, "PING\r\n", 1024)
ptrace(PTRACE_GETREGS, 7, {...}, NULL)
process_vm_writev(7, {?}, 1, {?}, 1, 0)
ptrace(PTRACE_SETREGS, 7, {...}, {...})
ptrace(PTRACE_SYSCALL, 7, {...}, NULL)

REDIS (snippet of system call trace)

--- SIGTRAP ---
getpid()
--- SIGTRAP ---

slide-46
SLIDE 46

VMA    VMA

MX (snippet of system call trace)

ptrace(PTRACE_GETREGS, 7, {...}, NULL)
ptrace(PTRACE_SETREGS, 7, {...}, {...})
ptrace(PTRACE_SYSCALL, 7, {...}, NULL)
read(8, "PING\r\n", 1024)
ptrace(PTRACE_GETREGS, 7, {...}, NULL)
process_vm_writev(7, {?}, 1, {?}, 1, 0)
ptrace(PTRACE_SETREGS, 7, {...}, {...})
ptrace(PTRACE_SYSCALL, 7, {...}, NULL)

REDIS (snippet of system call trace)

--- SIGTRAP ---
getpid()
--- SIGTRAP ---

slide-47
SLIDE 47

VMA

REDIS (snippet of instruction code)

0x4050f0 <anetRead>:
    405130: callq <read@plt>

GLIBC (snippet of instruction code)

0xdeadbeef <__libc_read>:
    2a: mov $0x0,%eax
    2f: syscall

slide-48
SLIDE 48

VMA

REDIS (snippet of instruction code)

0x4050f0 <anetRead>:
    405130: callq <read@plt>

GLIBC (snippet of instruction code)

0xdeadbeef <__libc_read>:
    2a: mov $0x0,%eax
    2f: syscall

GLIBC (snippet of instruction code)

0xdeadbeef <__libc_read>:
    2a: jmpq $0x13cd0

NX (snippet of instruction code)

0x13cd0 <syscall_enter>:
    13d31: cmp $0x1,%r10
    13d3a: callq *%r10

slide-49
SLIDE 49

System call synchronisation possible at different phases

slide-50
SLIDE 50

System call synchronisation possible at different phases

Every system call

Mx

slide-51
SLIDE 51

Record-replay

System call synchronisation possible at different phases

Every system call

Mx

slide-52
SLIDE 52

Record-replay

System call synchronisation possible at different phases

Every system call

Mx

Event streaming

Nx

slide-53
SLIDE 53

[Diagram: event log holding events En-14 to En, with one APPLICATION LEADER and one APPLICATION FOLLOWER]

slide-54
SLIDE 54

[Diagram: event log holding events En-14 to En, with one APPLICATION LEADER and two APPLICATION FOLLOWERs]

slide-55
SLIDE 55

[Diagram: event log holding events En-14 to En, with one APPLICATION LEADER and three APPLICATION FOLLOWERs]

slide-56
SLIDE 56

[Diagram: event log holding events En-14 to En, with applications labelled FOLLOWER, LEADER, FOLLOWER and FOLLOWER]

slide-57
SLIDE 57

[Diagram: event log holding events En-14 to En, with applications labelled LEADER, LEADER, FOLLOWER and FOLLOWER; the most recent events En-3 to En are drawn apart from the rest of the log]

slide-59
SLIDE 59

Future Work

Support for more complex code changes:
    Data structure inference & excavation
    Control flow graph isomorphisms
    Call stack reconstruction
Support for non-crashing type of divergences:
    Infinite loops and deadlocks

slide-60
SLIDE 60

Summary

Novel approach for improving software updates:
    Based on multi-version execution
    Mx can survive crash bugs in real apps
Many opportunities for future work:
    Better performance overhead
    Tolerance to system call divergences
    Support for more complex code changes
    Support for non-crashing type of divergences

slide-61
SLIDE 61

Distinct code bases, manually-generated
    N-version programming: A fault-tolerance approach to reliability of software operation. Chen, L., and Avizienis, A. FTCS’78
    Using replicated execution for a more secure and reliable web browser. Xue, H., Dautenhahn, N., and King, S. T. NDSS’12

Variants of the same code, automatically generated
    N-variant systems: a secretless framework for security through diversity. Cox, B., Evans, D., Filipi, A., Rowanhill, J., Hu, W., Davidson, J., Knight, J., Nguyen-Tuong, A., and Hiser, J. USENIX Security’06
    Run-time defense against code injection attacks using replicated execution. Salamat, B., Jackson, T., Wagner, G., Wimmer, C., and Franz, M. IEEE Transactions 2011

Online validation of different manually-evolved versions
    Efficient online validation with delta execution. Tucek, J., Xiong, W., Zhou, Y. ASPLOS’09
    Tachyon: Tandem Execution for Efficient Live Patch Testing. Maurer, M., Brumley, D. USENIX Security’12