
Deterministic Process Groups in dOS



  1. Deterministic Process Groups in dOS
     Tom Bergan, Nicholas Hunt, Luis Ceze, Steven D. Gribble
     University of Washington

  2. A Nondeterministic Program
     global x = 0
     Thread 1          Thread 2
     t := x            t := x
     x := t + 1        x := t + 1
     What is x? Possible outcomes: x == 2 or x == 1
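
The two outcomes on this slide can be checked by brute force. The sketch below is an illustration, not code from the talk: it models each thread as a load step followed by a store step (the names `run`, `schedules`, and `outcomes` are hypothetical) and enumerates every interleaving.

```python
from itertools import permutations

def run(schedule):
    """Execute one interleaving of the two threads' steps and return the final x."""
    x = 0
    t = {1: None, 2: None}      # each thread's private temporary
    pc = {1: 0, 2: 0}           # next step per thread: 0 = load, 1 = store
    for tid in schedule:
        if pc[tid] == 0:
            t[tid] = x          # t := x
        else:
            x = t[tid] + 1      # x := t + 1
        pc[tid] += 1
    return x

# Every ordering of two steps from thread 1 and two steps from thread 2.
schedules = sorted(set(permutations([1, 1, 2, 2])))
outcomes = {s: run(s) for s in schedules}
final_values = set(outcomes.values())

for s, x in outcomes.items():
    print(s, "-> x ==", x)
```

Only the two fully serialized schedules yield x == 2; every schedule in which both loads happen before either store yields x == 1.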

  3. Nondeterministic IPC
     Process 0: send(msg=A), send(msg=B)
     Process 1: recv(..)
     Process 2: recv(..)
     Who gets msg A? Either process may see recv(msg=A), and the other then sees recv(msg=B).

  4. Nondeterminism In Real Systems
     Source              Why nondeterministic
     shared-memory       multiprocessor hardware is unpredictable
     disks               drive latency is unpredictable
     network             packets arrive from external sources
     IPC (e.g. pipes)    multiprocessor hardware is unpredictable
     posix signals       unpredictable scheduling; can also be triggered by users
     . . .

  5. The Problem
     • Nondeterminism makes programs . . .
       ➡ hard to test
         ‣ same input, different outputs
       ➡ hard to debug
         ‣ leads to heisenbugs
       ➡ hard to replicate for fault-tolerance
         ‣ replicas get out of sync
     • Multiprocessors make this problem much worse!

  6. Our Solution
     • OS support for deterministic execution
       ➡ of arbitrary programs
       ➡ attack all sources of nondeterminism (not just shared-memory)
       ➡ even on multiprocessors
     New OS abstraction: Deterministic Process Group (DPG)
     [Diagram: Thread 1, Thread 2, Thread 3, Process A, and Process B inside a deterministic box]

  7. Key Questions
     1. What can be made deterministic?
     2. What can we do about the remaining sources of nondeterminism?

  8. Key Questions
     1. What can be made deterministic?
        - distinguish internal vs. external nondeterminism
     2. What can we do about the remaining sources of nondeterminism?

  9. Internal nondeterminism vs. external nondeterminism
     Internal nondeterminism
     • arises from scheduling artifacts (hw timing, etc)
     • NOT fundamental: can be eliminated!
     External nondeterminism
     • arises from interactions with the external world (networks, users, etc)
     • fundamental: cannot be eliminated

  10. Internal Determinism vs. External Nondeterminism
      [Diagram: a deterministic box (internal determinism); outside it, the sources of external nondeterminism: users, real time, network]

  11. Internal Determinism vs. External Nondeterminism
      [Diagram: a deterministic box holding a programmer-defined process group (Process 1, Process 2, Process 3) together with shared memory, pipes, and private files; outside it: users, real time, network]

  12. Internal Determinism vs. External Nondeterminism
      [Diagram: the deterministic box with Process 1, Process 2, Process 3, shared memory, pipes, and private files; users, real time, and network outside. A Process 4, a pipe, and a shared file are added, marked with a question mark.]

  13. Internal Determinism vs. External Nondeterminism
      [Diagram: the deterministic box with Process 1 through Process 4, shared memory, pipes, private files, a pipe, and a shared file; a shim program stands between the box and users, real time, and network]
      Shim program: precisely controls all external inputs
      • value of input data
      • time input data arrives

  14. Internal Determinism vs. External Nondeterminism
      [Diagram: a deterministic box holding user-space apps and an operating system (a virtual machine); outside it: users, real time, network]
      An entire virtual machine could go inside the deterministic box!
      - too inflexible
      - too costly

  15. Deterministic Process Groups
      [Diagram: shim program, network, and user I/O outside a deterministic box containing Thread 1, Thread 2, Thread 3, Process A, and Process B]
      OS ensures:
      • internal nondeterminism is eliminated (for shared-memory, pipes, signals, local files, ...)
      • external nondeterminism is funneled through the shim program
      Shim Program:
      • user-space program that precisely controls all external nondeterministic inputs

  16. Contributions
      Conceptual:
      - identify internal vs. external nondeterminism
      - key: internal nondeterminism can be eliminated!
      Abstraction:
      - Deterministic Process Groups (DPGs)
      - control external nondeterminism via a shim program
      Implementation:
      - dOS, a modified version of Linux
      - supports arbitrary, unmodified binaries
      Applications:
      - deterministic parallel execution
      - record/replay
      - replicated execution

  17. Outline
      • Example Uses
        ➡ a parallel computation
        ➡ a webserver
      • Deterministic Process Groups
        ➡ system interface
        ➡ conceptual model
      • dOS: our Linux-Based Implementation
      • Evaluation

  18. A Parallel Computation
      [Diagram: local input files feeding a parallel program inside a deterministic box]
      This program executes deterministically!
      • even on a multiprocessor
      • supports parallel programs written in any language
        ‣ no heisenbugs!
        ‣ test input files, not interleavings

  19. A Webserver
      [Diagram: a shim between the network, etc and a webserver (many threads/processes) inside a deterministic box]
      Deterministic Record/Replay
      • implement in shim program
      • requires no webserver modification
      Advantages
      ‣ significantly less to log (internal nondeterminism is eliminated)
      ‣ log sizes 1,000x smaller!

  20. A Webserver
      [Diagram: two replicas connected to the network, etc; each is a shim in front of a webserver inside a deterministic box]
      Fault-tolerant Replication
      • implement replication protocol in shim programs (paxos, virtual synchrony, etc)
      Advantage
      ‣ easy to replicate multithreaded servers (internal nondeterminism is eliminated)

  21. A Webserver
      Using DPGs to construct applications
      • deterministic part (in a DPG): request processing
      • nondeterministic part (in a shim): low-level network I/O (bundle into requests)
      The webserver
      • behaves deterministically w.r.t. requests rather than packets
      Shim program defines the nondeterministic interface
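
To illustrate why bundling helps, here is a minimal sketch of a shim-side helper that reassembles raw byte chunks into whole requests. Everything in it is an assumption made for illustration: the function name `bundle_requests`, the newline-delimited request format, and the sample data do not come from the talk.

```python
def bundle_requests(chunks, delimiter=b"\n"):
    """Hypothetical shim helper: reassemble arbitrary byte chunks into whole requests."""
    buffer = b""
    requests = []
    for chunk in chunks:
        buffer += chunk
        # Emit every complete request currently sitting in the buffer.
        while delimiter in buffer:
            request, buffer = buffer.split(delimiter, 1)
            requests.append(request)
    return requests

# The same byte stream, fragmented differently by the network on two runs.
run_a = [b"GET /a\nGET ", b"/b\n"]
run_b = [b"GE", b"T /a\n", b"GET /b", b"\n"]

requests_a = bundle_requests(run_a)
requests_b = bundle_requests(run_b)
print(requests_a == requests_b)
```

Packet boundaries differ between the two runs, but the request sequence handed to the deterministic part is identical, which is the sense in which the server behaves deterministically with respect to requests rather than packets.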

  22. Outline
      • Example Uses
        ➡ a parallel computation
        ➡ a webserver
      • Deterministic Process Groups
        ➡ system interface
        ➡ conceptual model
      • dOS: our Linux-Based Implementation
      • Evaluation

  23. Deterministic Process Groups
      [Diagram: shim program, network, and user I/O outside a deterministic box containing Thread 1, Thread 2, Thread 3, Process A, and Process B]
      System Interface
      • New system call creates a new DPG: sys_makedet()
        ‣ DPG expands to include all child processes
      • Just like ordinary Linux processes
        ‣ same system calls, signals, and hw instruction set
        ‣ can be multithreaded

  24. Deterministic Process Groups
      [Diagram: shim program, network, and user I/O outside a deterministic box containing Thread 1, Thread 2, Thread 3, Process A, and Process B]
      Two questions:
      • What are the semantics of internal determinism?
      • How do shim programs work?

  25. Deterministic Process Groups
      [Diagram: shim program, network, and user I/O outside a deterministic box containing Thread 1, Thread 2, Thread 3, Process A, and Process B]
      Internal Determinism
      • OS guarantees internal communication is scheduled deterministically
      • Conceptually: executes as if serialized onto a logical timeline
        ‣ implementation is parallel

  26. Internal Determinism
      Logical timeline (operations of Thread 1 and Thread 2):
        t=1  wr x
        t=2  rd x         (always reads same value of x)
        t=3  read(pipe)   (blocking call)
        t=4  rd y
        t=5  rd z
        t=6  read(pipe)   (always blocks for 3 time steps; always returns same data)
        t=7  wr y
      Each DPG has a logical timeline
      ‣ instructions execute as if serialized onto the logical timeline
      ‣ internal events are deterministic
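
The idea of serializing onto a logical timeline can be sketched with a toy scheduler. The slides do not say how dOS picks the order, so the round-robin turn order below is a stand-in, and the names `THREADS` and `run_on_logical_timeline` are hypothetical. The `seed` argument only perturbs simulated physical delays.

```python
import random

# Each thread is a list of (operation, variable) steps; the steps are illustrative.
THREADS = {
    1: [("wr", "x"), ("rd", "y"), ("wr", "y")],
    2: [("rd", "x"), ("rd", "z"), ("wr", "z")],
}

def run_on_logical_timeline(threads, seed):
    """Serialize steps round-robin by thread id; `seed` only changes physical delays."""
    rng = random.Random(seed)
    memory = {"x": 0, "y": 0, "z": 0}
    trace = []                  # (logical time, thread id, op, variable, value seen)
    physical_time = 0.0
    logical_time = 0
    pcs = {tid: 0 for tid in threads}
    while any(pcs[tid] < len(threads[tid]) for tid in threads):
        for tid in sorted(threads):              # fixed, deterministic turn order
            if pcs[tid] == len(threads[tid]):
                continue                         # this thread has finished
            physical_time += rng.random()        # arbitrary physical delay
            logical_time += 1
            op, var = threads[tid][pcs[tid]]
            if op == "wr":
                memory[var] = logical_time       # written value derived from logical time
            trace.append((logical_time, tid, op, var, memory[var]))
            pcs[tid] += 1
    return trace, physical_time

trace_a, time_a = run_on_logical_timeline(THREADS, seed=1)
trace_b, time_b = run_on_logical_timeline(THREADS, seed=2)
print(trace_a == trace_b, time_a == time_b)
```

The two runs take different amounts of simulated physical time but produce the same trace: every read observes the same value at the same logical time.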

  27. Internal Determinism
      Logical timeline (operations of Thread 1 and Thread 2); arbitrary delays in physical time are possible:
        t=1  wr x
        t=2  rd x
        t=3  read(pipe)   (blocking call)
        t=4  rd y
        t=5  rd z
        t=6  read(pipe)
        t=7  wr y
      Physical time is not deterministic
      ‣ deterministic results, but not deterministic performance

  28. External Nondeterminism
      Logical timeline (Thread 1 and Thread 2) shown against physical time, with a packet arriving on an external channel:
        t=1  wr x
        t=2  rd x
        t=3  read(socket)   (blocking call)
        t=4  rd y
        t=5  rd z
        t=6  read(socket)
        t=7  wr y
      Two sources of nondeterminism:
      • data returned by read()
      • blocking time of read()

  29. External Nondeterminism
      Logical timeline (Thread 1 and Thread 2) shown against physical time, with a packet arriving on an external channel:
        t=1  wr x
        t=2  rd x
        t=3  read(socket)   (blocking call)
        t=4  rd y
        t=5  read(socket)
        t=6  wr y
        t=7  rd z
      Two sources of nondeterminism:
      • data returned by read()
      • blocking time of read()

  30. External Nondeterminism
      Logical timeline (Thread 1 and Thread 2) shown against physical time, with a packet arriving on an external channel:
        t=1  wr x
        t=2  rd x
        t=3  read(socket)   (blocking call)
        t=4  read(socket)
        t=5  wr y
        t=6  rd y
        t=7  rd z
      Two sources of nondeterminism:
      • data returned by read()
      • blocking time of read()

  31. External Nondeterminism
      Logical timeline (Thread 1 and Thread 2) shown against physical time, with the packet arriving through a shim program:
        t=1  wr x
        t=2  rd x
        t=3  read(socket)   (blocking call)
        t=4  rd y
        t=5  rd z
        t=6  read(socket)
        t=7  wr y
      Two sources of nondeterminism:
      • data returned by read()
        ‣ the what
      • blocking time of read()
        ‣ the when

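  32. Shim Example: Read Syscall
      [Diagram: columns for Logical Timeline (t=2, t=3, t=4, t=10, t=11), DPG Thread, Shim Program, and OS. The DPG thread calls read(); "hello" comes back and the call completes with return("hello"). Step 1 is highlighted.]
      Shim can either . . .
      1. Monitor call (e.g., for record)
      2. Control call (e.g., for replay)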

  33. Shim Example: Read Syscall
      [Diagram: columns for Logical Timeline (t=2, t=3, t=4, t=10, t=11), DPG Thread, Shim Program, and OS. At t=10 "hello" is passed along and the call completes with return("hello"). Steps 1 and 2 are both marked.]
      Shim can either . . .
      1. Monitor call (e.g., for record)
      2. Control call (e.g., for replay)
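
The monitor and control modes can be sketched as two small shim classes. This is an illustration under stated assumptions, not the dOS shim interface: the class names, the `on_read` method, and the modelling of the external channel as (blocking_steps, data) pairs are all hypothetical. The call at t=2 and return at t=10 echo the labels in the diagram.

```python
class RecordShim:
    """Monitors each read: lets the channel decide, then logs (return time, data)."""
    def __init__(self, channel):
        self.channel = iter(channel)   # stand-in for the real external source
        self.log = []

    def on_read(self, call_time):
        blocking_steps, data = next(self.channel)
        return_time = call_time + blocking_steps
        self.log.append((return_time, data))   # record the "when" and the "what"
        return return_time, data


class ReplayShim:
    """Controls each read: returns the logged data at the logged logical time."""
    def __init__(self, log):
        self.log = iter(log)

    def on_read(self, call_time):
        return next(self.log)


def run_dpg(shim):
    """Toy DPG thread: issues two reads and notes when and what each one returned."""
    observed = []
    logical_time = 2
    for _ in range(2):
        logical_time, data = shim.on_read(logical_time)
        observed.append((logical_time, data))
        logical_time += 1              # one more step before the next read
    return observed


recorder = RecordShim([(8, "hello"), (3, "world")])
recorded = run_dpg(recorder)
replayed = run_dpg(ReplayShim(recorder.log))
print(recorded, replayed)
```

Because the log captures both the data and the logical return time of each read, the replayed run observes exactly what the recorded run did, with no real external channel involved.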

  34. Shim Example: Replication
      Key idea:
      • protocol delivers (time,msg) pairs to replicas
      • ensure replicas see same input at same logical time
      We have implemented this idea (see paper)
      [Diagram: a replication protocol connecting three shims, each in front of a multithreaded server: DPG Replica 1, DPG Replica 2, DPG Replica 3]
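
A minimal sketch of the key idea follows. The replication protocol itself (Paxos, virtual synchrony, etc) is replaced here by a pre-agreed list of (time, msg) pairs, and the `Replica` class with its hash-based state is a hypothetical stand-in for a deterministic server.

```python
import hashlib

class Replica:
    """Toy deterministic server whose state depends only on (logical time, msg) inputs."""
    def __init__(self):
        self.digest = hashlib.sha256()

    def deliver(self, logical_time, msg):
        # Fold both the "when" and the "what" into the replica state.
        self.digest.update(f"{logical_time}:{msg};".encode())

    def state(self):
        return self.digest.hexdigest()


# Assumed output of the replication protocol: the same pairs for every replica.
agreed = [(3, "GET /a"), (6, "GET /b")]

replicas = [Replica() for _ in range(3)]
for replica in replicas:
    for logical_time, msg in agreed:
        replica.deliver(logical_time, msg)
states = {replica.state() for replica in replicas}

# A replica that sees the same messages at a different logical time diverges.
skewed = Replica()
skewed.deliver(3, "GET /a")
skewed.deliver(7, "GET /b")
print(len(states), skewed.state() in states)
```

All three replicas end in the same state, while the skewed replica does not, which is why the protocol must agree on logical delivery times as well as on message contents.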

  35. Outline
      • Example Uses
        ➡ a parallel computation
        ➡ a webserver
      • Deterministic Process Groups
        ➡ system interface
        ➡ conceptual model
      • dOS: our Linux-Based Implementation
      • Evaluation
