HINT: A New Way to Measure Computer Performance

John L. Gustafson and Quinn O. Snell
In Proceedings of the Fifth Annual Hawaii International Conference on System Sciences (HICSS) 1995

Introduction (1 of 2)
• Early computers had single instruction stream
• Floating-point operations took longest
• Thus, computer with higher flops per second would be faster
• Wasn't linear (doubling flop/s didn't quite halve execution time) but predictions were in the "right direction"
• It doesn't work anymore…

Introduction (2 of 2)
• Most algorithms do more "data motion" than arithmetic
  – And "data motion" is often the bottleneck
• Growing rift between nominal speed (as determined by MIPS or MFLOPS) and actual application speed
• Using memory bandwidth figures (say, in Mbytes/sec) is too simplistic
  – Each memory layer (registers, primary cache, secondary cache, main memory, disk …) has its own size and speed
  – Parallel memories make this problem worse

Outline
• Introduction
• Problems
• HINT
• Net QUIPS
• Examples

Failure of Other "Speed" Measures: SPEC
• SPEC
  – Is popular
  – Not independent (is a consortium)
  – Has to be revised when "too small" for workstations
  – Uses the geometric mean of the time-reduction ratios of various kernels
    • Compare to base machine (was VAX-11/780)
  – But some VAX-11/780 systems have SPEC mark of 3!
  – "Survives because lack of credible alternatives"

Failure of Other "Speed" Measures: PERFECT
• PERFECT
  – Benchmark suite
  – Has 100,000 lines of (semi-) standard FORTRAN
  – Not widely used since converting the application is difficult
  – Results available only for a handful of systems
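The SPEC-style summary described above (per-kernel time ratios against a base machine, combined with a geometric mean) can be sketched as follows. This is only an illustration of the arithmetic, not SPEC's actual procedure; the function name and the timings are hypothetical.

```python
import math

def geometric_mean_ratio(base_times, machine_times):
    """Geometric mean of per-kernel speedups relative to a base machine."""
    ratios = [b / m for b, m in zip(base_times, machine_times)]
    # Averaging the logarithms avoids overflow for long kernel lists.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical timings in seconds for three kernels (illustrative only).
base = [10.0, 20.0, 40.0]     # base machine
machine = [5.0, 5.0, 5.0]     # machine under test
mark = geometric_mean_ratio(base, machine)
```

Because the mark is a ratio to one base machine, it inherits any variation in how that base machine itself was measured, which is the weakness the SPEC slide points at.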

Measuring Computer Speed
• Traditional measures of computer performance have little resemblance to other human endeavor fields
  – Meters per second and reaction rate are "hard currency" for measuring speed that is easily understood
• But at a loss for performance of computing method
• Only agreed measure is time
  – So fix problem (work) and run on different computers and see what is faster
  – speed is work / time

Work, Work
• But, since "work" is hard to define, keep it constant and measure relative speeds
  – Dividing one speed by another cancels numerator (work) and leaves ratios of time
  – Avoids definition of work
• Fixing program (work) problematic, since increased performance can attack larger problems or get better quality answer
  – Users scale job to fit time to wait
  – Don't purchase 1000-processor system to do same job in 1/1000th of the time!

Possible Measures of Speed? (1 of 2)
• MHz
  – Universal indicator of speed for PCs
    • Ex: 3.2 GHz computer faster than 2.0 GHz
  – But if memory and hard-disk speeds are bottleneck, slower computer (2.0 GHz) can run faster than faster computer (3.2 GHz)
  – Analogous to noting largest car speedometer number and inferring performance

Possible Measures of Speed? (2 of 2)
• VAX unit of performance
  – But, as SPEC shows, can vary by at least 3
• Mflop/sec
  – No standard "floating point operation" since different computers have different errors
  – No measure of how much progress on computation, only what was done
  – Ex: analogous to measuring speed of human runner by counting footsteps per second, ignoring how large the footsteps are
• Solution? Definition of computational work where there is a quality of an answer
  – Quality Improvement per Second (QUIPS)

The Precedent of SLALOM (1 of 3)
• SLALOM (Scalable, Language-independent, Ames Laboratory, One-minute Measurement)
  – Fixed time of radiosity [1] at one minute
  – Asked how accurate an answer
  – Any answer, any architecture
  – Good because vendors could scale problem to power available → could show power-solving ability
[1] To find the equilibrium radiation inside a box made of diffuse colored surfaces. The faces are divided into regions called "patches," the equations that determine their coupling are set up, and the equations are solved for red, green, and blue spectral components.

The Precedent of SLALOM (2 of 3)
• Troubles
  – Answer is "patches" (number of areas that geometry is divided into)
    • ignores roundoff errors
  – Complexity was $n^3$, $n$ is number of patches
    • Published advances put this at $n^2$
    • Then, $n \log n$ method so hard to compare
  – Ease of use is one advantage of benchmark
    • Otherwise, just run target application!
  – SLALOM was 1000 lines, then 8000 lines ($n \log n$ version) and then to parallelize took 1 graduate student year
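The cancellation mentioned on the "Work, Work" slide, and the QUIPS idea from the "Possible Measures of Speed?" slide, can be written out as below. The symbols are assumptions introduced here for illustration: $W$ is the fixed work, $t_A$ and $t_B$ are run times on machines A and B, and $Q$ is the answer quality reached after $t$ seconds, reading "Quality Improvement per Second" literally.

```latex
\[
\frac{\text{speed}_A}{\text{speed}_B}
  = \frac{W / t_A}{W / t_B}
  = \frac{t_B}{t_A},
\qquad
\text{QUIPS} = \frac{Q}{t}
\]
```

The first identity needs no definition of $W$ at all, which is why fixed-work comparisons are convenient; the second replaces $W$ with a measurable quality so the problem no longer has to stay fixed.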

The Precedent of SLALOM (3 of 3)
• Troubles (continued)
  – Was "forgiving" of machines with inadequate memory bandwidth
  – Did not run for 1 minute on computers with insufficient memory compared with arithmetic speed
    • Conversely, computers with large memories could not take advantage
    • Large memory related to application performance, even if not "speed"

Outline
• Introduction
• Problems
• HINT
• Net QUIPS
• Examples

The HINT Benchmark (1 of 2)
• Hierarchical INTegration
  – Fixes neither time nor problem size
• Find bounds on area for $y = (1-x)/(1+x)$ and $x$ in $[0, 1]$
• Subdivide x and y by equal power of two
• Count the squares that
  – are completely inside the area (lower bound)
  – completely contain the area (upper bound)
• Quality inversely proportional to (upper bound - lower bound)

The HINT Benchmark (2 of 2)
• Obtain highest quality answer in least time
• Quality increases as a step function of time
• Maintain a queue of intervals in memory to split
• Split the intervals in order of largest removable error
• Removable error by subdivision must be calculated exactly when interval is subdivided
• Sort the resulting smaller errors into the last two entries in the queue

Why this HINT?
• Adjusts to precision available
  – Unlimited scalability in that no mathematical upper limit on quality
  – Only limit is precision, memory, speed of computer
• Lower limit is extremely low
  – About 40 operations give quality of 2.0
    • A human can get that in a few seconds
    • ME: work example on board!
• Quality attained in order N for order N storage and order N operations
  – Scaling is linear

HINT Details
• Proof (not shown) that hierarchical integration shows linear improvement
• Tries to capture adaptive methods used by many applications
  – Find largest contributor to error and refine
• Benchmarks must have mathematically sound results
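The procedure on the two "HINT Benchmark" slides can be sketched in a few lines. This is a simplified integer version written for illustration, not the benchmark's reference code: the names `bounds` and `hint_sketch` are hypothetical, a `heapq` priority queue stands in for the sorted queue the slides describe, and the split priority is each interval's plain upper-minus-lower gap rather than the exact "removable error".

```python
import heapq

def bounds(i, nx, ny):
    """Floor and ceiling of ny*(nx-i)/(nx+i), i.e. f(i/nx) in grid units."""
    num, den = ny * (nx - i), nx + i
    return num // den, -(-num // den)

def hint_sketch(bits=8, max_splits=10_000):
    """Return the quality after each split (simplified integer HINT)."""
    nx = ny = 1 << (bits // 2)        # half of the word for each axis
    area = nx * ny

    def make(xl, xr):
        # f is decreasing: left edge bounds from above, right edge from below.
        top = bounds(xl, nx, ny)[1]
        bottom = bounds(xr, nx, ny)[0]
        lb, ub = (xr - xl) * bottom, (xr - xl) * top
        return (-(ub - lb), xl, xr, lb, ub)   # negated gap: heapq is a min-heap

    first = make(0, nx)
    heap = [first]
    lb_total, ub_total = first[3], first[4]
    qualities = []
    for _ in range(max_splits):
        neg_err, xl, xr, lb, ub = heap[0]
        if xr - xl < 2 or -neg_err < 2:
            break                     # insufficient precision to refine further
        heapq.heappop(heap)
        mid = (xl + xr) // 2
        for child in (make(xl, mid), make(mid, xr)):
            heapq.heappush(heap, child)
            lb_total += child[3]
            ub_total += child[4]
        lb_total -= lb                # the parent's bounds are replaced
        ub_total -= ub
        qualities.append(area / (ub_total - lb_total))
    return qualities
```

Calling `hint_sketch(bits=8)` walks the 16 by 16 grid; timing each split (for example with `time.perf_counter`) and dividing quality by elapsed seconds would give a QUIPS-like curve, though the real benchmark's measurement rules are not reproduced here.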

HINT Example (1 of 3)
• Given word size $b_d$ bits, x-axis represented by $b_d/2$ bits, y-axis by $b_d/2$ bits
  – Ex: $b_d$ = 8 bits, so x-axis [0:15], y-axis [0:15]
• If $n_x$ and $n_y$ are numbers of area units along x, y then
  – Compute $(1-x)/(1+x)$ as $n_y (n_x - i)/(n_x + i)$
  – Rounding up will be used for upper bound
  – Rounding down will be used for lower bound
• Then divide by $n_y$

HINT Example (2 of 3)
• $x = 1/2$ then $i = 8$, $n_x = 16$, $n_y = 16$
• $n_y (n_x - i)/(n_x + i) = 16(16-8)/(16+8) = 128/24$
  – Round down = 5, Round up = 6
• So, $5/16 < f(1/2) < 6/16$
• LB = 40, UB = 256 - 80 = 176
• 87 squares UL, 47 LR
• Quality = 256 / 136 = 1.88
• Should next subdivide the 87-square region

HINT Example (3 of 3)
• If no loss in precision, quality then related to number of partitions
• Order N
• A computer with 2x QUIPS is twice as powerful

Termination
• When width is one square or UB - LB < 2 squares then done → "insufficient precision"

Memory Requirements
• Must compute and store record of upper-lower bounding rectangle for each region
  – Left and right x values $x_l$, $x_r$
  – UB and LB
• If $b_d$ bits for data and $b_i$ bits for index
  – $n$ iterations is $(9 b_d + 4 b_i) n$ bits
• Note, program storage varies widely but should not be bottleneck
  – If want to stress instruction caching, do not use HINT

Data Types
• Can use floating points instead of integers
  – Roughly, 40 FLOPs per HINT iteration
• Computers have roughly same QUIPS for different data types
  – But specialized may do better
    • Ex: scientific may have better QUIPS for floating point while business may have better QUIPS for integer
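The numbers on the example slides, and the storage estimate from the "Memory Requirements" slide, can be checked with a short script. The helper names are hypothetical, and the 32-bit data and index widths in the memory call are assumed values chosen only for illustration.

```python
def bounds_at(i, nx, ny):
    """Round-down and round-up values of ny*(nx-i)/(nx+i)."""
    num, den = ny * (nx - i), nx + i
    return num // den, -(-num // den)   # floor, ceiling (integer arithmetic)

def memory_bits(n, bd, bi):
    """Storage after n iterations, using the slide's (9*bd + 4*bi)*n estimate."""
    return (9 * bd + 4 * bi) * n

nx = ny = 16                  # 8-bit word: 4 bits per axis
i = 8                         # x = 1/2
down, up = bounds_at(i, nx, ny)

# Left half: 8 columns, at least `down` high and at most ny high.
# Right half: 8 columns, at least 0 high and at most `up` high.
lb = i * down
ub = i * ny + (nx - i) * up
quality = (nx * ny) / (ub - lb)

bits_needed = memory_bits(1000, bd=32, bi=32)   # assumed widths
```

With these inputs the script reproduces the slide's round-down and round-up values of 5 and 6, LB = 40, UB = 176, and a quality of about 1.88.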
