TERN: Stable Deterministic Multithreading through Schedule - PowerPoint PPT Presentation

TERN: Stable Deterministic Multithreading through Schedule Memoization
Heming Cui, Jingyue Wu, Chia-che Tsai, Junfeng Yang
Computer Science, Columbia University, New York, NY, USA

Nondeterministic Execution
• Same input → many schedules
• Problem: different runs may show different behaviors, even on the same inputs
[Figure: nondeterministic execution (1 → many), with a buggy schedule marked]

Deterministic Multithreading (DMT)
• Same input → same schedule
  – [DMP ASPLOS '09], [KENDO ASPLOS '09], [COREDET ASPLOS '10], [dOS OSDI '10]
• Problem: minor input change → very different schedule
[Figure: nondeterministic (1 → many) vs. existing DMT systems (1 → 1), with buggy schedules marked]
Confirmed in experiments.

Schedule Memoization
• Many inputs → one schedule
  – Memoize schedules and reuse them on future inputs
• Stability: repeat familiar schedules
  – Big benefit: avoid possible bugs in unknown schedules
[Figure: nondeterministic (1 → many) vs. existing DMT systems (1 → 1) vs. schedule memoization (many → 1), with buggy schedules marked]
Confirmed in experiments.

TERN: the First Stable DMT System
• Runs on Linux as user-space schedulers
• To memoize a new schedule
  – Memoize the total order of synchronization operations as the schedule
    • Race-free schedules, for determinism [RecPlay TOCS]
  – Track the input constraints required to reuse the schedule
    • Symbolic execution [KLEE OSDI '08]
• To reuse a schedule
  – Check the input against the memoized input constraints
  – If the input satisfies them, enforce the same synchronization order
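The reuse step above can be sketched as a lookup over a schedule cache. This is a minimal illustration, not TERN's implementation: the types `constraint_t` and `cache_entry_t`, the function `lookup_schedule`, and the choice to model each constraint as an inclusive range on one input are all assumptions made for the example. Real constraints come from symbolic execution and can be more general.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: one constraint pins one symbolic input to [lo, hi]. */
typedef struct {
    size_t input_index;  /* which symbolic input the constraint covers */
    int lo, hi;          /* inclusive bounds */
} constraint_t;

/* Hypothetical cache entry: a constraint set plus the schedule it guards. */
typedef struct {
    const constraint_t *constraints;
    size_t n_constraints;
    int schedule_id;     /* stands in for the memoized synchronization order */
} cache_entry_t;

/* Returns the schedule id of the first entry whose constraints all hold for
   `inputs`, or -1 on a miss (the caller would then memoize a new schedule). */
int lookup_schedule(const cache_entry_t *cache, size_t n_entries,
                    const int *inputs, size_t n_inputs)
{
    for (size_t e = 0; e < n_entries; ++e) {
        int satisfied = 1;
        for (size_t c = 0; c < cache[e].n_constraints; ++c) {
            const constraint_t *k = &cache[e].constraints[c];
            assert(k->input_index < n_inputs);
            int v = inputs[k->input_index];
            if (v < k->lo || v > k->hi) {
                satisfied = 0;
                break;
            }
        }
        if (satisfied)
            return cache[e].schedule_id;
    }
    return -1;
}
```

With one entry holding the constraints nthread in [2, 2] and nblock in [2, 2], the inputs {2, 2} hit that entry and {2, 3} miss.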

Summary of Results
• Evaluated on a diverse set of 14 programs
  – Apache, MySQL, PBZip2, 11 scientific programs
  – Real and synthetic workloads
• Easy to use: < 10 annotation lines for 13 out of 14 programs
• Stable: e.g., 100 schedules to process over 90% of a real HTTP trace with 122K requests
• Reasonable overhead: < 10% for 9 out of 14 programs

Outline
• TERN overview
• An Example
• Evaluation
• Conclusion

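Overview of TERN
[Figure: architecture diagram; TERN components are shaded]
• Compile time: the developer's program source passes through the LLVM compiler and the TERN instrumentor
• Runtime: an input I is matched against the schedule cache, which holds pairs <C1, S1> … <Cn, Sn>
  – Hit: the replayer runs the program on the OS with I and the matching schedule Si
  – Miss: the memoizer runs the program on the OS with I and adds a new pair <C, S> to the cache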

Outline
• TERN overview
• An Example
• Evaluation
• Conclusion

Simplified PBZip2 Code

    main(int argc, char *argv[]) {
      int i;
      int nthread = argv[1];          // read input
      int nblock = argv[2];
      for(i=0; i<nthread; ++i)        // create worker threads
        pthread_create(worker);
      for(i=0; i<nblock; ++i) {
        block = bread(i,argv[3]);     // read i'th file block
        add(worklist, block);         // add block to work list
      }
    }
    worker() {                        // worker thread code
      for(;;) {
        block = get(worklist);        // get a block from work list
        compress(block);              // compress block
      }
    }

Annotating Source

    main(int argc, char *argv[]) {
      int i;
      int nthread = argv[1];
      int nblock = argv[2];
      symbolic(&nthread);             // marking inputs affecting schedule
      for(i=0; i<nthread; ++i)
        pthread_create(worker);       // TERN intercepts
      symbolic(&nblock);              // marking inputs affecting schedule
      for(i=0; i<nblock; ++i) {
        block = bread(i,argv[3]);
        add(worklist, block);         // TERN intercepts
      }
    }
    worker() {
      for(;;) {
        block = get(worklist);        // TERN intercepts
        compress(block);
      }
    }

TERN tolerates inaccuracy in annotations.

Memoizing Schedules
cmd$ pbzip2 2 2 foo.txt
• Same annotated code as on the "Annotating Source" slide, run with nthread = 2 and nblock = 2
• Synchronization order across threads T1, T2, T3: p…create, p…create, add, get, add, get
  – T1 (main) performs the p…create and add operations; T2 and T3 (workers) perform the get operations
• Constraints
  – 0 < nthread ? true
  – 1 < nthread ? true
  – 2 < nthread ? false
  – 0 < nblock ? true
  – 1 < nblock ? true
  – 2 < nblock ? false
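The constraints on this slide are the outcomes of the loop conditions `i<nthread` and `i<nblock` at each iteration. A minimal sketch of how such outcomes could be collected is given here; the names `branch_t`, `trace_t`, and `record_loop` are hypothetical, and the sketch records concrete comparisons by hand rather than using symbolic execution as TERN does.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BRANCHES 64

/* One recorded branch outcome of the form "bound < x ? outcome". */
typedef struct {
    int bound;    /* value of the loop counter i at the comparison */
    int outcome;  /* 1 if (i < x) was true, 0 if it was false */
} branch_t;

typedef struct {
    branch_t items[MAX_BRANCHES];
    size_t len;
} trace_t;

/* Replays the control flow of `for (i = 0; i < x; ++i)` and records every
   evaluation of the loop condition, including the final failing one.
   Returns the number of recorded outcomes. */
size_t record_loop(int x, trace_t *t)
{
    t->len = 0;
    for (int i = 0; ; ++i) {
        int taken = i < x;
        assert(t->len < MAX_BRANCHES);
        t->items[t->len].bound = i;
        t->items[t->len].outcome = taken;
        t->len++;
        if (!taken)
            break;
    }
    return t->len;
}
```

Calling `record_loop(2, &t)` records three outcomes, (0, true), (1, true), (2, false), which matches the three nthread constraints listed on the slide.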

Simplifying Constraints
cmd$ pbzip2 2 2 foo.txt
• Same annotated code and synchronization order as on the "Memoizing Schedules" slide: p…create, p…create, add, get, add, get
• Constraints after simplification
  – 2 == nthread
  – 2 == nblock
• Constraint simplification techniques are described in the paper
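One simple way to see why six branch outcomes collapse into two equalities is interval narrowing: each "bound < x ? true" raises the lower limit of x to bound + 1, and each "bound < x ? false" lowers the upper limit to bound. The sketch below illustrates only that idea. It is not the simplification technique from the paper, and `branch_t`, `interval_t`, and `simplify` are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* One recorded branch outcome of the form "bound < x ? outcome". */
typedef struct {
    int bound;
    int outcome;  /* 1 = true, 0 = false */
} branch_t;

/* Inclusive range of values of x consistent with all recorded outcomes. */
typedef struct {
    int lo, hi;
} interval_t;

/* Narrows [INT_MIN, INT_MAX] using every recorded outcome.
   If lo == hi afterwards, the constraints reduce to "x == lo". */
interval_t simplify(const branch_t *b, size_t n)
{
    interval_t r = { INT_MIN, INT_MAX };
    for (size_t i = 0; i < n; ++i) {
        if (b[i].outcome) {
            /* bound < x  implies  x >= bound + 1 */
            assert(b[i].bound < INT_MAX);
            if (b[i].bound + 1 > r.lo)
                r.lo = b[i].bound + 1;
        } else {
            /* !(bound < x)  implies  x <= bound */
            if (b[i].bound < r.hi)
                r.hi = b[i].bound;
        }
    }
    return r;
}
```

For the outcomes 0 < x true, 1 < x true, 2 < x false, the result is lo = 2 and hi = 2, that is, x == 2.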

Reusing Schedules
cmd$ pbzip2 2 2 bar.txt
• Same annotated code, now run on a different file with nthread = 2 and nblock = 2
• Constraints checked against the new input
  – 2 == nthread
  – 2 == nblock
• Synchronization order enforced: p…create, p…create, add, get, add, get (threads T1, T2, T3)
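Enforcing a memoized synchronization order can be sketched as turn-taking: before each intercepted synchronization operation a thread waits until the schedule names it as next, and afterwards it hands the turn on. The code below is an illustration under assumptions, not TERN's scheduler: `schedule_t`, `schedule_init`, `turn_begin`, and `turn_end` are hypothetical names, and thread ids are plain integers chosen by the caller. Only standard POSIX threads calls are used.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct {
    const int *order;  /* order[k] = id of the thread allowed to run op k */
    size_t len;
    size_t next;       /* index of the next synchronization op to run */
    pthread_mutex_t mu;
    pthread_cond_t cv;
} schedule_t;

void schedule_init(schedule_t *s, const int *order, size_t len)
{
    assert(order != NULL || len == 0);
    s->order = order;
    s->len = len;
    s->next = 0;
    pthread_mutex_init(&s->mu, NULL);
    pthread_cond_init(&s->cv, NULL);
}

/* Blocks until the schedule says it is `tid`'s turn. Once the schedule is
   exhausted, callers pass through without ordering. */
void turn_begin(schedule_t *s, int tid)
{
    pthread_mutex_lock(&s->mu);
    while (s->next < s->len && s->order[s->next] != tid)
        pthread_cond_wait(&s->cv, &s->mu);
    pthread_mutex_unlock(&s->mu);
}

/* Called after the synchronization op completes: advance and wake waiters. */
void turn_end(schedule_t *s)
{
    pthread_mutex_lock(&s->mu);
    if (s->next < s->len)
        s->next++;
    pthread_cond_broadcast(&s->cv);
    pthread_mutex_unlock(&s->mu);
}
```

A wrapper around an intercepted call would do `turn_begin(&s, my_id)`, perform the operation, then `turn_end(&s)`. For the example on this slide, the order could be written as {1, 1, 1, 2, 1, 3} with T1 as the main thread; which worker performs each get is an assumption, since the slide text does not make it recoverable.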

Outline
• TERN overview
• An Example
• Evaluation
• Conclusion

Stability Experiment Setup
• Program – Workload
  – Apache-CS: 4-day Columbia CS web trace, 122K requests
  – MySQL-SysBench-simple: 200K random select queries
  – MySQL-SysBench-tx: 200K random select, update, insert, and delete queries
  – PBZip2-usr: random 10,000 files from "/usr"
• Machine: typical 2.66GHz quad-core Intel
• Methodology
  – Memoize schedules on a random 1% to 3% of the workload
  – Measure reuse rates on the entire workload (many → 1)
• Reuse rate: % of inputs processed with memoized schedules
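The reuse-rate metric defined above is a plain percentage. A minimal sketch of the arithmetic follows; the function name `reuse_rate_percent` and its handling of an empty workload are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Reuse rate = percentage of inputs processed with a memoized schedule.
   Returns 0.0 for an empty workload (a convention chosen for this sketch). */
double reuse_rate_percent(size_t reused, size_t total)
{
    assert(reused <= total);
    if (total == 0)
        return 0.0;
    return 100.0 * (double)reused / (double)total;
}
```

With hypothetical counts of 903 reused inputs out of 1,000, the function returns a reuse rate of about 90.3%.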

How Often Can TERN Reuse Schedules?

Program-Workload         Reuse Rate (%)   # Schedules
Apache-CS                90.3             100
MySQL-SysBench-simple    94.0             50
MySQL-SysBench-tx        44.2             109
PBZip2-usr               96.2             90

• Over 90% reuse rate for three of the four workloads
• Relatively lower reuse rate for MySQL-SysBench-tx due to random query types and parameters

Bug Stability Experiment Setup
• Bug stability: when the input varies slightly, do bugs occur in one run but disappear in another?
• Compared against COREDET [ASPLOS '10]
  – Open-source, software-only
  – Typical DMT algorithms (one used in dOS)
• Buggy programs: fft, lu, and barnes (SPLASH2)
  – Global variables are printed before they are assigned the correct value
• Methodology: vary thread count and computation amount, then record bug occurrence over 100 runs for COREDET and TERN

Is Buggy Behavior Stable? (fft)
[Figure: two grids, COREDET and TERN, with # of threads (2, 4, 8) against matrix size (10, 12, 14); each cell is marked "no bug" or "bug occurred"]
• COREDET: 9 schedules, one for each cell
• TERN: only 3 schedules, one for each thread count
• Fewer schedules → lower chance to hit bug → more stable
• Similar results for 2 to 64 threads, 2 to 20 matrix size, and the other two buggy programs, lu and barnes

Does TERN Incur High Overhead in Reuse Runs?
[Figure: overhead chart. Smaller is better; negative values mean speedup.]

Conclusion and Future Work
• Schedule memoization: reuse schedules across different inputs (many → 1)
• TERN: easy to use, stable, deterministic, and fast
• Future work
  – Fast & deterministic replay/replication
