Martin Thompson & Dave Farley
http://code.google.com/p/disruptor/ http://www.davefarley.net http://mechanical-sympathy.blogspot.com/
Who are we?
Disruptor
[Diagram: linked-list-backed queue (head and tail pointing to a chain of nodes) vs array-backed queue, where the head, tail and size fields can all share a single cache line]
Sample Folklore: Queues, an efficient way to exchange data
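To make the point concrete, here is a minimal sketch (our illustration, not code from the talk) of the classic bounded queue layout in which head, tail and a shared size counter sit side by side, so a producer and a consumer thread contend for the same cache line:

```java
// Illustration only: a bounded queue whose head, tail and size fields sit
// together in memory (typically on one cache line), so a producer writing
// tail/size and a consumer writing head/size invalidate each other's caches.
final class NaiveBoundedQueue<E> {
    private final Object[] buffer;
    private int head;  // written by the consumer
    private int tail;  // written by the producer
    private int size;  // written by both - the contended hot spot

    NaiveBoundedQueue(int capacity) {
        buffer = new Object[capacity];
    }

    boolean offer(E e) {
        if (size == buffer.length) return false; // full
        buffer[tail] = e;
        tail = (tail + 1) % buffer.length;
        size++;
        return true;
    }

    @SuppressWarnings("unchecked")
    E poll() {
        if (size == 0) return null; // empty
        E e = (E) buffer[head];
        head = (head + 1) % buffer.length;
        size--;
        return e;
    }
}
```

Even once synchronization is added to make this safe, the shared size counter forces the two threads to ping the same cache line back and forth on every single operation.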
Some Results

Test                                                  Queue (ops/sec)   Disruptor (ops/sec)   Factor
OnePublisherToOneProcessorUniCastThroughputTest       2,366,171         72,087,993            30.5
OnePublisherToThreeProcessorDiamondThroughputTest     1,590,126         63,358,798            39.8
OnePublisherToThreeProcessorMultiCastThroughputTest     191,661         54,165,692            282.6
OnePublisherToThreeProcessorPipelineThroughputTest    1,289,199         71,562,125            55.5
OnePublisherToThreeWorkerPoolThroughputTest           2,175,593         10,412,567            4.8
Disruptor
[Diagram: Disruptor architecture – the Sequencer and the Sequence Barriers that coordinate publishers and event processors]
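As a rough illustration of the idea behind the Sequencer and Sequence Barriers (our sketch, not the real Disruptor API, whose names and details differ): producers and consumers coordinate only through monotonically increasing sequence numbers over a preallocated ring, with no locks and no shared size counter.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: a single-producer, single-consumer ring coordinated purely
// by sequence counters, in the spirit of the Disruptor's sequencing.
final class MiniRingBuffer {
    private final long[] entries;
    private final int mask; // capacity must be a power of two
    private final AtomicLong published = new AtomicLong(-1); // last published sequence
    private final AtomicLong consumed = new AtomicLong(-1);  // last consumed sequence

    MiniRingBuffer(int capacityPowerOfTwo) {
        entries = new long[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    void publish(long value) {
        long next = published.get() + 1;
        // wait until the slot has been consumed (this acts as the barrier)
        while (next - consumed.get() > entries.length) Thread.onSpinWait();
        entries[(int) next & mask] = value;
        published.set(next);
    }

    long consume() {
        long next = consumed.get() + 1;
        // wait until the producer has published this sequence
        while (published.get() < next) Thread.onSpinWait();
        long value = entries[(int) next & mask];
        consumed.set(next);
        return value;
    }
}
```

Because each counter is written by exactly one thread, the two sides never contend on a write to the same field.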
A Question…
The Scientific Method
Characterization: Make a guess based on experience and observation.
Hypothesis: Propose an explanation.
Deduction: Make a prediction from the hypothesis.
Experiment: Test the prediction.
Myth – CPU performance has stopped increasing
We have reached the limits! CPU performance isn’t increasing anymore.
If this is the case then an algorithm run on the newest processors will perform at roughly the same rate as on older processors.
…
import java.util.ArrayList;
import java.util.List;

public class BruteForce {
    public static List<String> words(String s) {
        List<String> result = new ArrayList<String>();
        int i = s.length();
        int lastChar = -1;
        while (--i != -1) {
            if (lastChar == -1) {
                if (s.charAt(i) != ' ') {
                    lastChar = i; // end of the current word
                }
            } else if (s.charAt(i) == ' ') {
                result.add(s.substring(i + 1, lastChar + 1));
                lastChar = -1;
            }
        }
        if (lastChar != -1) {
            result.add(s.substring(0, lastChar + 1)); // word at the start of the string
        }
        return result;
    }
}
Processor                                        Operations/sec   Release Date
Intel(R) Core 2 Duo(TM) CPU P8600 @ 2.40GHz      1434             2006
Intel(R) Xeon(R) CPU E5620 @ 2.40GHz             1768             2009
Intel(R) Core(TM) CPU i7-2677M @ 1.80GHz         2202             2010
Intel(R) Core(TM) CPU i7-2720QM @ 2.20GHz        2674             2010
Myth – Go Parallel to scale – part I

I can increase the rate at which I do work by increasing the number of threads that I do work on.
If this is the case then we should be able to measure higher throughput as we add more threads.
Let’s increment a 64 bit counter, a simple Java long, 500 million times…

Method                    Time (ms)
Single thread             300
Single thread with lock   10,000
Two threads with lock     224,000
Single thread with CAS    5,700
Two threads with CAS      30,000
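A sketch of that experiment (our reconstruction; the slides do not show the harness, and the iteration count is scaled down here):

```java
import java.util.concurrent.atomic.AtomicLong;

// Increment a 64-bit counter ITERATIONS times: unsynchronized, with a lock,
// with CAS, and with CAS from two threads. Timing each variant reproduces
// the shape of the table above; only correctness is checked here.
class CounterBench {
    static final long ITERATIONS = 1_000_000; // the slides use 500 million

    static long plain() {
        long counter = 0;
        for (long i = 0; i < ITERATIONS; i++) counter++;
        return counter;
    }

    static long locked() {
        long[] counter = {0};
        Object lock = new Object();
        for (long i = 0; i < ITERATIONS; i++) {
            synchronized (lock) { counter[0]++; }
        }
        return counter[0];
    }

    static long cas() {
        AtomicLong counter = new AtomicLong();
        for (long i = 0; i < ITERATIONS; i++) counter.incrementAndGet();
        return counter.get();
    }

    static long casTwoThreads() throws InterruptedException {
        AtomicLong counter = new AtomicLong();
        Runnable half = () -> {
            for (long i = 0; i < ITERATIONS / 2; i++) counter.incrementAndGet();
        };
        Thread a = new Thread(half), b = new Thread(half);
        a.start(); b.start();
        a.join(); b.join();
        return counter.get();
    }
}
```

The striking result in the table is that two threads are slower than one in every synchronized variant: the cost is the cache-coherence traffic of the shared counter, not the arithmetic.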
Myth – Go Parallel to scale – part II
I can increase the rate at which I do work by increasing the number of threads that I do work on.
If this is the case then we should be able to measure higher throughput as we add more threads.
…
The Experiment: from Guy Steele's talk at the Strange Loop conference
(http://www.infoq.com/presentations/Thinking-Parallel-Programming)
Tested with a copy of the text of ‘Alice in Wonderland’
import java.util.ArrayList;
import java.util.List;

public class BruteForce {
    public static List<String> words(String s) {
        List<String> result = new ArrayList<String>();
        int i = s.length();
        int lastChar = -1;
        while (--i != -1) {
            if (lastChar == -1) {
                if (s.charAt(i) != ' ') {
                    lastChar = i; // end of the current word
                }
            } else if (s.charAt(i) == ' ') {
                result.add(s.substring(i + 1, lastChar + 1));
                lastChar = -1;
            }
        }
        if (lastChar != -1) {
            result.add(s.substring(0, lastChar + 1)); // word at the start of the string
        }
        return result;
    }
}

package strings

object WordState {
  def maybeWord(s: String) = if (s.isEmpty) FastList.empty[String] else FastList(s)

  def processChar(c: Char): WordState = if (c != ' ') Chunk("" + c) else Segment.empty

  def processChar2(a: WordState, c: Char): WordState = if (c != ' ') a.assoc(c) else a.assoc(Segment.empty)

  def compose(a: WordState, b: WordState) = a.assoc(b)

  def wordsParallel(s: Array[Char]): FastList[String] =
    s.par.aggregate(Chunk.empty)(processChar2, compose).toList()

  def words(s: Array[Char]): FastList[String] = {
    val wordStates = s.map(processChar).toArray
    wordStates.foldRight(Chunk.empty)((x, y) => x.assoc(y)).toList()
  }
}

trait WordState {
  def assoc(other: WordState): WordState
  def assoc(other: Char): WordState
  def toList(): FastList[String]
}

case class Chunk(part: String) extends WordState {
  def assoc(other: WordState) = other match {
    case c: Chunk   => Chunk(part + c.part)
    case s: Segment => Segment(part + s.prefix, s.words, s.trailer)
  }
  // the two members below were cut off in the transcript; plausible completions
  def assoc(other: Char) = Chunk(part + other)
  def toList() = WordState.maybeWord(part)
}

object Chunk {
  val empty: WordState = Chunk("")
}

case class Segment(prefix: String, words: FastList[String], trailer: String) extends WordState {
  def assoc(other: WordState) = other match {
    case c: Chunk   => Segment(prefix, words, trailer + c.part)
    case s: Segment => Segment(prefix, words ++ WordState.maybeWord(trailer + s.prefix) ++ s.words, s.trailer)
  }
  // the two members below were cut off in the transcript; plausible completions
  def assoc(other: Char) = Segment(prefix, words, trailer + other)
  def toList() = WordState.maybeWord(prefix) ++ words ++ WordState.maybeWord(trailer)
}

object Segment {
  val empty: WordState = Segment("", FastList.empty[String], "")
}
Test                                        Lines   Ops/sec
Scala: Parallel Collections                 61      400
Java: Imperative single threaded solution   33      1,600
Myth – Adding a batching algorithm increases latency
Waiting for the batch to fill will always add latency
If this is the case then we can never exceed the maximum rate at which a serial approach will work.
…
1. Batching can be implemented as a wait with a timeout
2. Send what is available as soon as possible, then loop
Send 10 concurrent messages to an IO device with 100us latency

              Min (us)   Mean (us)   Max (us)
Serial        100        500         1000
Batch Type 2  100        190         200
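The second strategy, send whatever is available as soon as possible and then loop, can be sketched like this (our illustration, not code from the talk):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch of "batch type 2": the sender never waits for a batch to fill.
// Each pass drains whatever has accumulated and hands it to the IO device
// as one operation, so latency is bounded by the device, not the batch size.
final class OpportunisticBatcher {
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();

    void submit(String message) {
        pending.add(message);
    }

    // One iteration of the send loop: the batch that would go to the device.
    List<String> drainBatch() {
        List<String> batch = new ArrayList<>();
        String message;
        while ((message = pending.poll()) != null) batch.add(message);
        return batch;
    }
}
```

With 10 concurrent messages and a 100us device, the first message goes out alone and completes at 100us; the remaining nine accumulate while it is in flight and leave together as the next batch, completing at 200us. That is exactly the 100us min, 190us mean and 200us max in the table.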
Common Folklore We Have Encountered
Wrap up