Practical and Effective Sandboxing for Non-root users

Taesoo Kim and Nickolai Zeldovich MIT CSAIL

Why yet another sandbox for desktop applications?

  • There are many existing sandbox mechanisms

– chroot / LXC (Unix/Linux)
– Jail (FreeBSD)
– Seatbelt (Mac OS X)
– VM?
– ...

  • Difficult to use, requiring root privilege, or slow!

TL;DR

$ mbox -- ./downloaded-bin
...
Network Summary:
 > [11279] -> 173.194.43.51:80
 > [11279] Create socket(PF_INET,...)
 > [11279] -> a00::2607:f8b0:4006:803:0
...
Sandbox Root:
 > /tmp/sandbox-11275
 > N:/tmp/index.html
[c]ommit, [d]iff, [i]gnore, [l]ist, [s]hell, [q]uit ?>

  • ./downloaded-bin: an unknown binary downloaded from the Internet; mbox: our tool
  • Network Summary: where to connect?
  • Sandbox Root: protecting the host filesystem from modification
  • Prompt: revision-control-system like interface
  • Without root privilege!

Design overview

  • Layered sandbox filesystem

– Overlaying the host filesystem
– Confining modifications made by sandboxed processes
– Persistent storage: in fact, just a regular directory

  • System call interposition

– Commodity OSes provide one for non-root users
– Enabling a variety of applications: installing packages, restricting network access, build/development environments ...


Installing packages as normal user

  • Mbox provides a writable sandbox layer on top of the host filesystem

– The user owns the sandbox directory
– It contains newly installed files and package databases

  • Mbox emulates a fakeroot environment

– Use standard package managers without modification
– Supported: apt-get (Ubuntu), dpkg (Debian), pip (Python)

$ mbox -R -- apt-get install git
(-R: emulate a fakeroot environment)

Running unknown binary safely

  • Mbox protects the host filesystem from modifications
  • Mbox restricts or monitors network accesses

– Interprets socket-like system calls
– Summarizes network activity when the sandboxed process terminates

$ mbox -n -- ./downloaded-bin
(-n: disable remote network accesses)

Checkpointing filesystem

$ mbox -i -- emacs ~/.emacs
(-i: enable interactive commit-mode)

[Diagram, built up over four slides]

  • Host Filesystem: ~/.emacs
  • Sandbox: emacs edits .emacs; the first read comes from the host filesystem, and the write goes to a copy of ~/.emacs in the Sandbox FS
  • Later reads are served from the copy in the Sandbox FS
  • Commit: the copy in the Sandbox FS is applied back to ~/.emacs on the Host Filesystem
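
The commit step can be pictured with a small sketch. This is not Mbox's code: `list_changes` and `commit` are hypothetical names, the "M" status is an assumption (only the "N:" marker for new files appears in the output shown earlier), and the sketch handles regular files only.

```python
import os
import shutil


def _same_bytes(path_a, path_b):
    """Compare two regular files byte for byte."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        return a.read() == b.read()


def list_changes(sandbox_root, host_root):
    """Return (status, relative_path) pairs for files in the sandbox directory.

    'N' marks a file that does not exist on the host; 'M' (an assumed label)
    marks a file whose contents differ from the host copy.
    """
    changes = []
    for dirpath, _dirnames, filenames in os.walk(sandbox_root):
        for name in filenames:
            sandbox_file = os.path.join(dirpath, name)
            rel = os.path.relpath(sandbox_file, sandbox_root)
            host_file = os.path.join(host_root, rel)
            if not os.path.exists(host_file):
                changes.append(("N", rel))
            elif not _same_bytes(sandbox_file, host_file):
                changes.append(("M", rel))
    return sorted(changes, key=lambda change: change[1])


def commit(sandbox_root, host_root):
    """Copy every new or modified sandbox file onto the host tree."""
    for _status, rel in list_changes(sandbox_root, host_root):
        dest = os.path.join(host_root, rel)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.copy2(os.path.join(sandbox_root, rel), dest)
```

Because the sandbox layer is just a regular directory, listing and committing changes reduce to an ordinary directory walk; a diff command could reuse the same walk.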


Build/development environment

  • Mbox can separate out the generated object files

– make clean == rm -rf outdir

  • Mbox can also be used as a virtual development environment

– Install packages with standard package managers

$ tree linux-git
...
+--mm--mmap.c
+-mlock.c
$ mbox -r outdir -- make
(-r dir: specify a sandbox directory)

[Diagram: the Host Filesystem holds linux-git; the generated *.o files land in the Sandbox FS]

Outline

  • Motivation / use cases
  • Layered sandbox filesystem
  • System call interposition (using seccomp/BPF)
  • Implementation / evaluation
  • Related work
  • Summary

Sandbox filesystem supports copy-on-write

  • open(".emacs", R): the sandboxed process reads .emacs from the host filesystem
  • open(".emacs", RW): .emacs is first copied from the host filesystem into the sandbox filesystem; the sandboxed process then reads and writes the copy

Copy-on-write by rewriting path arguments

  • The sandbox filesystem is a directory: /tmp/sbox/
  • open(".emacs", RW): .emacs is copied to /tmp/sbox/home/taesoo/.emacs, and the path argument is rewritten so that reads and writes go to the copy

All subsequent read/write should happen on the sandbox filesystem

  • open(".emacs", RW): creates the copy /tmp/sbox/home/taesoo/.emacs under /tmp/sbox/
  • open(".emacs", R): a later open, even a read-only one, is served from the copy in the sandbox filesystem, not from the host filesystem
  • ...
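
The path-rewriting rule on the last few slides can be sketched as a single function. This is a minimal illustration under stated assumptions, not Mbox's implementation: `rewrite_path` and its parameters are hypothetical names, paths are assumed to be absolute, and only regular files are handled.

```python
import os
import shutil


def rewrite_path(path, wants_write, sandbox_root, host_root="/"):
    """Return the path that an intercepted open(path, ...) should really use."""
    rel = path.lstrip("/")
    host_path = os.path.join(host_root, rel)
    sandbox_path = os.path.join(sandbox_root, rel)

    # Once a sandbox copy exists, every later open (read or write) uses it.
    if os.path.exists(sandbox_path):
        return sandbox_path

    # Files that were never written are read straight from the host.
    if not wants_write:
        return host_path

    # First write access: copy the host file into the sandbox (copy-on-write).
    os.makedirs(os.path.dirname(sandbox_path), exist_ok=True)
    if os.path.exists(host_path):
        shutil.copy2(host_path, sandbox_path)
    return sandbox_path
```

With `sandbox_root="/tmp/sbox"`, a write open of `/home/taesoo/.emacs` is redirected to `/tmp/sbox/home/taesoo/.emacs`, matching the example on the slides. The `host_root` parameter exists only so the sketch can be tried against a scratch directory instead of the real root.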

Sandbox filesystem keeps track of deleted files

  • unlink(".emacs"): Mbox records .emacs in its hashmap of deleted files
  • open(".emacs", R): the path is found in the hashmap of deleted files, so the sandboxed process sees .emacs as deleted, even though .emacs still exists on the host filesystem
  • The sandbox filesystem lives under /tmp/sbox/ (e.g. /tmp/sbox/home/taesoo/.emacs)
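
A minimal sketch of the deleted-file bookkeeping, assuming a plain in-memory hash set keyed by normalized path. The class and method names are hypothetical, and the rule that re-creating a file clears its entry is an assumption of this sketch, not something stated on the slides.

```python
import errno
import os


class DeletedFiles:
    """Remember paths that the sandboxed process has unlinked."""

    def __init__(self):
        self._deleted = set()  # hash-based set of normalized paths

    def unlink(self, path):
        # Record the deletion instead of touching the host file.
        self._deleted.add(os.path.normpath(path))

    def create(self, path):
        # Assumption: creating the file again makes it visible again.
        self._deleted.discard(os.path.normpath(path))

    def check_open(self, path):
        # A later open of a deleted path must fail as if the file were gone.
        if os.path.normpath(path) in self._deleted:
            raise FileNotFoundError(errno.ENOENT, os.strerror(errno.ENOENT), path)
```

The lookup is needed because the host copy still exists: without the set, a read-only open after unlink would fall through to the host filesystem and succeed.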

Mbox doesn't have to interpose on every system call

fd = open(".emacs", R)
fd = open(".emacs", RW)
read(fd, buf, size)
write(fd, buf, size)

  • After redirecting the path in open(), we don't have to interpose on read()/write() system calls
  • Mbox needs to interpose on the 48 system calls that take a path argument to provide a layered sandbox filesystem
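
The point about file descriptors can be seen in a few lines. This is only an illustration: the scratch directory stands in for the sandbox root, and `redirected` stands in for a path that has already been rewritten.

```python
import os
import tempfile

# Stand-in for the sandbox directory and for an already rewritten path.
sandbox_root = tempfile.mkdtemp(prefix="sbox-")
redirected = os.path.join(sandbox_root, ".emacs")

# The only call that carries a path, and so the only one needing interposition.
fd = os.open(redirected, os.O_RDWR | os.O_CREAT, 0o600)

# These calls take just the descriptor; the kernel already bound it to the
# sandbox copy, so they can run without being intercepted.
os.write(fd, b"(setq x 1)\n")
os.lseek(fd, 0, os.SEEK_SET)
data = os.read(fd, 64)
os.close(fd)
```

Since read() and write() are typically far more frequent than open(), skipping them is where most of the interposition cost is avoided.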

Mechanism: system call interposition

  • Ptrace is a common technique, but slow

– Interposes on the entry and exit of every system call
– Serializes system calls of child processes

  • Using seccomp/BPF (>= Linux 3.5)

– Seccomp is a security mechanism for isolating a process by allowing only a certain set of system calls
– Seccomp/BPF uses BPF (Berkeley Packet Filter) to specify rules for filtering system calls

BPF program for interposition

BPF_STMT(LD, OFF_SYSCALL)
BPF_JUMP(#open, 0, 1)
BPF_STMT(RET, TRACE)
…
BPF_STMT(RET, ALLOWED)

[Diagram, built up over seven slides: Mbox and the sandboxed process in user space, seccomp/BPF in the kernel]

① Mbox installs the BPF program into the kernel's seccomp/BPF with prctl()
② Mbox exec()s the sandboxed process
③ The sandboxed process calls open("/a", RW)
④ The filter returns TRACE for open, and Mbox receives EVENT_SECCOMP from wait()
⑤ Mbox uses ptrace (PEEK/POKE) to rewrite the path argument: "/a" → "/tmp/sbox/a"
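
The abbreviated filter on the slide can be written out with the real classic-BPF opcode values and checked with a toy interpreter. This is a sketch, not Mbox's filter: it contains only the four instructions shown (the elided "…" part is left out), the `run` interpreter and `seccomp_data` helper are hypothetical, and it does not install anything into the kernel. A real filter would normally also check the architecture field before trusting the system call number.

```python
import struct

# Classic BPF opcode fields (values from the Linux UAPI headers).
BPF_LD, BPF_JMP, BPF_RET = 0x00, 0x05, 0x06
BPF_W, BPF_ABS = 0x00, 0x20
BPF_JEQ, BPF_K = 0x10, 0x00

# seccomp filter return actions (values from <linux/seccomp.h>).
SECCOMP_RET_TRACE = 0x7FF00000
SECCOMP_RET_ALLOW = 0x7FFF0000

NR_OPEN_X86_64 = 2            # __NR_open on x86-64
OFF_SYSCALL = 0               # offset of `nr` in struct seccomp_data
AUDIT_ARCH_X86_64 = 0xC000003E


def BPF_STMT(code, k):
    return (code, 0, 0, k)


def BPF_JUMP(code, k, jt, jf):
    return (code, jt, jf, k)


FILTER = [
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, OFF_SYSCALL),            # load syscall nr
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, NR_OPEN_X86_64, 0, 1),  # is it open?
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_TRACE),               # yes: notify tracer
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),               # no: run untouched
]


def pack(filter_):
    """Encode as an array of struct sock_filter {u16 code; u8 jt; u8 jf; u32 k}."""
    return b"".join(struct.pack("=HBBI", *insn) for insn in filter_)


def seccomp_data(nr, args=(0, 0, 0, 0, 0, 0)):
    """Build a struct seccomp_data {int nr; u32 arch; u64 ip; u64 args[6]}."""
    return struct.pack("=iIQ6Q", nr, AUDIT_ARCH_X86_64, 0, *args)


def run(filter_, data):
    """Toy interpreter covering only the three opcodes used in FILTER."""
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = filter_[pc]
        pc += 1
        if code == BPF_LD | BPF_W | BPF_ABS:
            acc = struct.unpack_from("=I", data, k)[0]
        elif code == BPF_JMP | BPF_JEQ | BPF_K:
            pc += jt if acc == k else jf
        elif code == BPF_RET | BPF_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")
```

Here `run(FILTER, seccomp_data(NR_OPEN_X86_64))` yields the TRACE action, while system call 0 (read on x86-64) yields ALLOW. On a real system the packed program is installed with prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, ...), and the tracer must enable PTRACE_O_TRACESECCOMP to receive the seccomp event.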


More story to come ...

  • How to avoid time-of-check-to-time-of-use (TOCTOU) races?
  • How to avoid replicating OS state?
  • ...

Please check the paper!


Implementation

  • Mbox: a prototype for Linux (>= 3.5, x86-64)

– Using seccomp/BPF and ptrace
– Extending strace 4.7
– 1,500 lines of code
– Distributions: Ubuntu 12.04 and Arch (64-bit)

Performance evaluation

  • How much overhead does Mbox exhibit?
  • How much faster is seccomp/BPF than ptrace?

Benchmark

  • Following the benchmark from Apiary
  • Run each benchmark in three configurations

– Normal
– Mbox with ptrace
– Mbox with seccomp/BPF

Task               Description
Octave             Octave benchmark (matrix calculation)
Zip                Compress source files of Linux 3.8
Untar              Decompress source files of Linux 3.8
Build Linux (-j1)  Compile Linux 3.8 kernel

Mbox imposes modest end-to-end performance overhead

Task               Normal   Mbox (seccomp/BPF)   Overhead
Octave             2.1s     2.1s                 0.1%
Zip                15.6s    17.4s                12.0%
Untar              13.6s    16.4s                20.9%
Build Linux (-j1)  43.6s    49.7s                13.9%

  • 0.1% ~ 20.9% overhead
  • Octave: a computation-heavy workload

– Exhibits negligible performance overhead (0.1%)
– Spends 98% of its execution time in userspace
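
For reference, the overhead column is the relative slowdown against the normal run. The sketch below recomputes it from the rounded times in the table; the function name and dictionary layout are illustrative. The results come out near, but not exactly at, the reported percentages (roughly 11.5%, 20.6% and 14.0% for Zip, Untar and Build), presumably because the slide's percentages were derived from unrounded measurements.

```python
# (normal seconds, Mbox with seccomp/BPF seconds), as printed on the slide
TIMES = {
    "Octave": (2.1, 2.1),
    "Zip": (15.6, 17.4),
    "Untar": (13.6, 16.4),
    "Build Linux (-j1)": (43.6, 49.7),
}


def overhead_percent(normal, sandboxed):
    """Relative slowdown of the sandboxed run, in percent."""
    return (sandboxed - normal) / normal * 100.0


for task, (normal, sandboxed) in TIMES.items():
    print(f"{task:18s} {overhead_percent(normal, sandboxed):5.1f}%")
```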

Seccomp/BPF reduces the interposition overhead

  • Compare overheads of using ptrace and seccomp/BPF
  • Seccomp/BPF reduces overhead by up to 24.5 percentage points (Zip: 36.5% with ptrace vs. 12.0% with seccomp/BPF)

Task               Normal   Mbox with ptrace   Mbox with seccomp/BPF
Octave             2.1s     2.1s  (0.1%)       2.1s  (0.1%)
Zip                15.6s    21.2s (36.5%)      17.4s (12.0%)
Untar              13.6s    19.0s (40.3%)      16.4s (20.9%)
Build Linux (-j1)  43.6s    53.2s (21.9%)      49.7s (13.9%)

Seccomp/BPF has better concurrency than ptrace

  • When compiling the Linux kernel with 4 parallel jobs, overhead drops by 64.9 percentage points compared to ptrace (110.1% vs. 45.2%)
  • By avoiding unnecessary serialization of system calls, multiple processes execute system calls concurrently

Task               Normal   Mbox with ptrace   Mbox with seccomp/BPF
Build Linux (-j1)  43.6s    53.2s (21.9%)      49.7s (13.9%)
Build Linux (-j4)  21.7s    45.6s (110.1%)     31.5s (45.2%)
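
The 24.5% and 64.9% figures quoted in the evaluation match the difference between the ptrace and seccomp/BPF overhead percentages, that is, percentage points rather than a relative speedup. The sketch below makes the distinction explicit using the numbers from the slides; the function names are illustrative.

```python
# task: (ptrace overhead %, seccomp/BPF overhead %), as reported on the slides
OVERHEAD = {
    "Octave": (0.1, 0.1),
    "Zip": (36.5, 12.0),
    "Untar": (40.3, 20.9),
    "Build Linux (-j1)": (21.9, 13.9),
    "Build Linux (-j4)": (110.1, 45.2),
}


def gap_points(ptrace_pct, seccomp_pct):
    """Overhead reduction in percentage points."""
    return round(ptrace_pct - seccomp_pct, 1)


def time_saved_percent(ptrace_seconds, seccomp_seconds):
    """Relative reduction in wall-clock time when switching from ptrace."""
    return (ptrace_seconds - seccomp_seconds) / ptrace_seconds * 100.0


for task, (ptrace_pct, seccomp_pct) in OVERHEAD.items():
    print(f"{task:18s} {gap_points(ptrace_pct, seccomp_pct):5.1f} points")
```

For the -j4 build, the gap is 64.9 points, while the wall-clock time itself falls from 45.6s to 31.5s, which `time_saved_percent(45.6, 31.5)` puts at about 31%.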


Related work

  • Layered filesystems: UnionFS [Quigley '06] / Aufs

– Following unification rules / copy-on-write

→ Require no modifications in commodity OSes

  • System call interposition: Ostia [Garfinkel '04]

– Enforcing security policies / studied common pitfalls

→ Summarize our experience of using seccomp/BPF

  • Namespace: Plan9 [Pike '90] / Lxc container (Docker)

– Private namespace for each process

→ Enabling various applications via system call interposition


Summary

Mbox: a lightweight sandboxing mechanism

– Layered sandbox filesystem
– Revision-control-system like sandbox usage model
– Interposing on system calls with seccomp/BPF
– Enabling a variety of applications for non-root users

http://pdos.csail.mit.edu/mbox


Questions (if you don't have any)

  • What if files are modified by other processes running outside of Mbox?
  • Why 20% on tar? Just rewriting path arguments doesn't seem to be demanding work.
  • How complicated is the BPF program? Why not implement everything in BPF then?
  • Why does Mbox support only 64-bit? And is Mbox ready for users (not developers)?
  • Can Mbox be used for A, B and C … ?