Heterogeneous Concurrency

  1. Heterogeneous Concurrency
  Michael L. Scott (on leave at Google Madison)
  www.cs.rochester.edu/u/scott/
  Schloss Dagstuhl, January 2015

  2. Future processors may not be pretty
  ● Specter of “dark silicon”: most of the chip may need to be “off” most of the time
  ● Architects likely to fill the space with customized circuits
    » compression, encryption, XML parsing, pattern matching, media transcoding, vector/matrix algebra, arbitrary-precision math, FFT, even FPGA ...
    » Not to mention cores with different computational/energy tradeoffs
  ● “Typical” program may need to jump frequently from one core to another

  3. Progression of functionality
  ● FPU: pure simple function (e.g., arctan)
    » protection not really an issue
  ● GPU: fire-and-forget rendering
  ● GPGPU: compute and return (with memory access)
    » direct access from user space
    » one protection domain at a time
  ● first-class core: juggle multiple contexts safely
    » preemption, multiprogramming

  4. How do we...
  ● arbitrate access to resources (cycles, scratchpad memory, bandwidth, ...)
    » what do we need in HW that we don’t have now?
  ● choose among cores with non-trivial tradeoffs (speed, power, energy, load)
  ● access system services on nontraditional cores
  ● balance computational ability v. locality
    » how fast can we stream data from core to core?
  ● accommodate heterogeneous ISAs (esp. if choosing among cores on which these differ)

  5. And (w.r.t. concurrency), how do we...
  ● dispatch across cores (HW queues? flat combining?)
  ● manage stacks (contiguous v. linked frames)
  ● wait for completion (spin? yield? deschedule? ship continuations?)
  ● avoid writing code in a different language for every accelerator
  ● unblock threads across cores? across languages?
    » connections here to Eliot’s talk

  6. (Unsupported) Hypotheses
  ● Traditional kernel interface will not suffice
    » must expose more of the underlying architecture, so run-time systems can figure out what to do
    » must not make everything a pthread [Capriccio, Akaros, ...]
  ● Contiguous stack frames will not suffice; neither will proliferating languages
    » compiler help will be required
  ● “Accelerator” cores will need “first-class status”
    » ability to request OS services directly [GPUfs, ...]
  ● Tree-structured dynamic call graph will be too restrictive
    » will sometimes want to “return” elsewhere than whence we came (continuation shipping)

  7. Plenty to keep us busy!
  www.cs.rochester.edu/u/scott/
