
Lars Bauer, Jörg Henkel - 1 - Lecture time: Wed., 15:45 - 17:15 - PowerPoint PPT Presentation



  1. Institut für Technische Informatik, Chair for Embedded Systems - Prof. Dr. J. Henkel. Lecture in the summer semester (SS) 2014. Lars Bauer, Jörg Henkel

  2. • Lecture time: Wed., 15:45 - 17:15, Bld. 50.34, HS -102
     • Homepage: http://ces.itec.kit.edu/teaching/ (you can also find the slides from previous years there)
     • Slides login: Login: “student”, Passwd: “CES - Student”
     • Contact: lars.bauer@kit.edu, Haid-und-Neu-Str. 7, Bld. 07.21, Rm. 316.2 (2nd floor!)
     L. Bauer, KIT, 2014

  3. [Figure: campus map with the labels Info-Bau, TFI, Mensa]

  4. • Simply let me know / interrupt me

  5. • CS Diploma:
       ◦ Vertiefungsfach 8 (specialization subject 8): Entwurf eingebetteter Systeme und Rechnerarchitekturen
     • CS Master:
       ◦ Module: Rekonfigurierbare und Adaptive Systeme [IN4INRAS] (3 ECTS)
       ◦ Module: Eingebettete Systeme: Weiterführende Themen [IN4INESWTN] (10 ECTS)
       ◦ Module: Advanced Computer Architecture [IN4INACA] (10 ECTS)
     • Other study courses (e.g. EE): ask individually

  6. • Lectures
       ◦ RAS
       ◦ Low Power Design
       ◦ Embedded Systems for Multimedia and Image Processing
     • Seminars
       ◦ Rekonfigurierbare Eingebettete Systeme
       ◦ Dependability in Embedded Systems
       ◦ Distributed Decision Making
       ◦ Stereo Video Processing
       ◦ Multicore for Multimedia Processors
       ◦ Sensor Networks
     • Labs
       ◦ Entwurf eingebetteter Systeme
       ◦ Entwurf von eingebetteten applikationsspezifischen Prozessoren
       ◦ Low Power Design and Embedded Systems
     More info: ces.itec.kit.edu/teaching

  7. • Note: the info on the homepage is typically not up to date
       ◦ If you are interested in a particular topic, it is better to ask individually
     • There are nearly always SA/DA/BA/MA theses or Hiwi (student assistant) jobs available in the scope of reconfigurable systems
     • Main projects:
       ◦ i-Core: invasive Core
       ◦ OTERA: Online Test Strategies for Reliable Reconfigurable Architectures
       ◦ Compilers for reconfigurable architectures
     • Topics:
       ◦ Algorithms for Runtime System, Operating System, …
       ◦ Toolchain, Compiler, Synthesis, …
       ◦ Architecture, Hardware Prototype, Simulation Environment, …

  8. Prerequisites
     • Rechnerstrukturen
     • Eingebettete Systeme
       ◦ ES1: Optimierung und Synthese Eingebetteter Systeme
       ◦ ES2: Entwurf und Architekturen für Eingebettete Systeme
       ◦ The core topics (e.g. details about FPGA architectures) will be recapitulated in the scope of this lecture
       ◦ Thus, the contents of ES1 and ES2 are beneficial but not required in full detail

  9. • “Fine- and Coarse-Grain Reconfigurable Computing”, S. Vassiliadis and D. Soudris, Springer, 2007.
     • “Runtime adaptive extensible embedded processors – a survey”, H. P. Huynh and T. Mitra, SAMOS, pp. 215-225, 2009.
     • “Reconfigurable computing: architectures and design methods”, T. J. Todman et al., IEE Proceedings Computers & Digital Techniques, vol. 152, no. 2, pp. 193-207, 2005.
     • “Reconfigurable Instruction Set Processors from a Hardware/Software Perspective”, F. Barat et al., IEEE Transactions on Software Engineering, vol. 28, no. 9, pp. 847-862, 2002.

  10. 1. Introduction and Motivation: The Demand for Adaptivity

  11. • Typical approach:
        ◦ Static analysis of system requirements (e.g. computational hot spots)
        ◦ Build optimized system
      • Today’s requirements:
        ◦ Increasing complexity
        ◦ More functionality
      • Problem:
        ◦ Statically chosen design point has to match all requirements
        ◦ Typically inefficient for individual components (e.g. tasks or hot spots)

  12. • Hot spot: a rather small part of the application that accounts for a rather large part of the execution time
        ◦ Also called ‘Computational Kernel’
        ◦ Typically: inner loop
        ◦ 80/20 rule (90/10 rule etc.): roughly 20% of the code accounts for roughly 80% of the execution time
      [Figure: two bars, Code Size and Execution Time, each split 20/80]
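
The 80/20 rule above can be checked against a profile. The following is a minimal sketch, assuming a per-region profile with code size and cycle counts is available; the `Region` type, the function `find_hot_spots`, and all numbers in `PROFILE` are hypothetical and only serve as illustration.

```python
from typing import List, NamedTuple, Tuple


class Region(NamedTuple):
    """One profiled code region (all values are hypothetical)."""
    name: str
    code_lines: int
    exec_cycles: int


def find_hot_spots(profile: List[Region],
                   time_share_percent: int = 80) -> Tuple[List[str], float, float]:
    """Pick the fewest regions that cover the requested share of execution time.

    Returns (names of hot regions, their share of code size,
    their share of execution time).
    """
    total_cycles = sum(r.exec_cycles for r in profile)
    total_lines = sum(r.code_lines for r in profile)
    hot: List[Region] = []
    covered = 0
    # Greedily take the most expensive regions first.
    for region in sorted(profile, key=lambda r: r.exec_cycles, reverse=True):
        # Integer comparison avoids floating-point rounding at the threshold.
        if covered * 100 >= time_share_percent * total_cycles:
            break
        hot.append(region)
        covered += region.exec_cycles
    code_share = sum(r.code_lines for r in hot) / total_lines
    return [r.name for r in hot], code_share, covered / total_cycles


# Hypothetical profile: two inner loops dominate the execution time.
PROFILE = [
    Region("inner_loop_a", 120, 5_000_000),
    Region("inner_loop_b", 80, 3_000_000),
    Region("setup", 400, 900_000),
    Region("io", 300, 700_000),
    Region("error_handling", 100, 400_000),
]

if __name__ == "__main__":
    names, code_share, time_share = find_hot_spots(PROFILE)
    print(names, f"code share {code_share:.0%}", f"time share {time_share:.0%}")
```

With this made-up profile, the two inner loops make up 20% of the code and 80% of the execution time, which is the situation the rule describes.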

  13. [Figure: efficiency (MIPS/$, MHz/mW, MIPS/area, …) plotted against flexibility (1/time-to-market, …); src: Henkel, ESII]
      • ASIC (“hardware solution”): non-programmable, highly specialized
      • ASIP (application specific instruction set processor): instruction set extension, parameterization, inclusion/exclusion of functional blocks
      • GPP (general purpose processor): “software solution”

  14. • Video En-/Decoding
      • Audio En-/Decoding
      • Data (De-)Multiplexing
      • Control protocol
      [Figure: block diagram of a video conferencing terminal. Inputs: remote control (IR), mic, phone, CVBS, CVHS, digital video input. Blocks: audio input, video input, audio encoder (G.723), video encoder (H.263 / H.264), H.245 control, H.223 multiplexer, H.223 de-multiplexer, audio decoder (G.723), video decoder (H.263 / H.264), modem, PSTN interface, audio output, video output. Outputs: display, line, phone, speakers, screen. src: cityrockz.com]

  15. [Figure: bar chart of Processing Time [%] (axis 0 to 12) per Processing Function]

  16. • Design accelerators for the hot spots
      • Connect them as Execution Units, Register Files, and Interfaces
      src: Tensilica, Inc.: “Xtensa LC Product Brief”

  17. • Provides noticeably improved performance after targeting the major hot spots
      • However, performance is still not sufficient to achieve real-time requirements
        ◦ More hot spots need to be accelerated
      [Figure: processor with accelerators for I_ME, TQ_PL, MC_L; src: Tensilica, Inc.: “Xtensa LC Product Brief”]
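
Why accelerating only the major hot spots may not be enough can be estimated with Amdahl's law. The slides do not name this law; it is used here as a standard back-of-the-envelope model. The function `overall_speedup` and the time fractions and speedup factors in the example are hypothetical.

```python
from typing import Iterable, Tuple


def overall_speedup(accelerated: Iterable[Tuple[float, float]]) -> float:
    """Amdahl's law for several accelerated hot spots.

    `accelerated` holds (fraction of original execution time, local speedup)
    pairs. Everything not listed keeps running at the original speed.
    """
    pairs = list(accelerated)
    covered = sum(fraction for fraction, _ in pairs)
    if covered > 1.0 + 1e-12:
        raise ValueError("time fractions must not sum to more than 1")
    remaining = 1.0 - covered
    new_time = remaining + sum(fraction / speedup for fraction, speedup in pairs)
    return 1.0 / new_time


if __name__ == "__main__":
    # Hypothetical: three hot spots take 10%, 8% and 7% of the time
    # and each gets a 10x accelerator.
    three = [(0.10, 10.0), (0.08, 10.0), (0.07, 10.0)]
    print(f"three hot spots accelerated: {overall_speedup(three):.2f}x")
    # Upper bound for the same three hot spots, even with huge speedups.
    bound = [(f, 1e9) for f, _ in three]
    print(f"upper bound: {overall_speedup(bound):.2f}x")
```

When execution time is spread over many functions, the unaccelerated remainder dominates, so the overall speedup stays small until more hot spots are covered.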

  18. • Scalability problem when rather many hot spots exist
        ◦ Note: still not all relevant hot spots are covered
      [Figure: processor with accelerators for CAVLC, CABAC, I_ME, TQ_PL, Dec_MB, MC_L, H245_C, MAC, FM, V34 mod, S_ME; src: Tensilica, Inc.: “Xtensa LC Product Brief”]

  19. • ASIPs perform well when
        1. rather few hot spots need to be accelerated and
        2. those hot spots are well known in advance
      • ASIPs are less efficient when targeting rather many hot spots
        ◦ All accelerators are provided statically (i.e. they require area and consume power) even though typically just a few of them are needed at a certain time
      • ASIPs are less efficient when targeting unknown hot spots
        ◦ Even for a given application it is not necessarily clear which parts of it are ‘hot’ during execution, as this may depend on input data (as demonstrated in the following)
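
The cost of providing all accelerators statically can be illustrated with a small area calculation. This is a sketch under stated assumptions: the accelerator names, the area values in `ACCEL_AREA`, and the execution phases in `PHASES` are hypothetical, and reconfiguration overhead is ignored entirely.

```python
from typing import Dict, List, Set

# Hypothetical accelerator areas in arbitrary units.
ACCEL_AREA: Dict[str, int] = {"ME": 40, "MC": 25, "TQ": 20, "CAVLC": 15}

# Hypothetical execution phases: which accelerators are in use at the same time.
PHASES: List[Set[str]] = [{"ME"}, {"MC", "TQ"}, {"CAVLC"}]


def static_area(areas: Dict[str, int]) -> int:
    """Area of an ASIP that provides every accelerator all the time."""
    return sum(areas.values())


def peak_active_area(areas: Dict[str, int], phases: List[Set[str]]) -> int:
    """Largest area that is actually in use during any single phase."""
    return max(sum(areas[name] for name in phase) for phase in phases)


def mean_utilization(areas: Dict[str, int], phases: List[Set[str]]) -> float:
    """Average fraction of the static area in use, with equally long phases."""
    total = static_area(areas)
    used = sum(sum(areas[name] for name in phase) for phase in phases)
    return used / (len(phases) * total)


if __name__ == "__main__":
    print("static area:", static_area(ACCEL_AREA))
    print("peak active area:", peak_active_area(ACCEL_AREA, PHASES))
    print(f"mean utilization: {mean_utilization(ACCEL_AREA, PHASES):.0%}")
```

In this made-up case only about a third of the accelerator area is in use on average, and no phase needs more than 45 of the 100 area units.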

  20. MB Encoding Loop
      • Iterates on MacroBlocks (MBs, i.e. 16x16 pixels)
      • 2 different MB-types → different computational paths with different computational requirements
        ◦ I-MB (spatial prediction)
        ◦ P-MB (temporal prediction)
      [Figure: flow chart of the MB encoding loop. MB-Type Decision (I or P) and Mode Decision (for I or P); if MB_Type = P_MB then ME: SA(T)D, MC, DCT / Q, IDCT / IQ; else IPRED, DCT / HT / Q, IDCT / IHT / IQ; followed by the Encoding Engine (RD, CAVLC) and the In-Loop De-Blocking Filter; each stage loops over the MBs]
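
The two computational paths of the MB encoding loop can be sketched as control flow that only counts which processing functions are invoked. This is not an encoder: the function names are shortened labels from the flow chart, the MB types are passed in rather than decided, and the I/P mixes in the example are hypothetical.

```python
from collections import Counter
from typing import Iterable

# Processing functions per path, following the flow chart labels.
P_PATH = ("ME", "MC", "DCT_Q", "IDCT_IQ")            # temporal prediction
I_PATH = ("IPRED", "DCT_HT_Q", "IDCT_IHT_IQ")        # spatial prediction
COMMON = ("CAVLC", "DEBLOCK")                        # run for every MB


def encode_frame(mb_types: Iterable[str]) -> Counter:
    """Count function invocations for one frame.

    `mb_types` yields "I" or "P" for each macroblock; in a real encoder
    this comes from the MB-type decision, here it is given as input.
    """
    calls: Counter = Counter()
    for mb_type in mb_types:
        if mb_type == "P":
            path = P_PATH
        elif mb_type == "I":
            path = I_PATH
        else:
            raise ValueError(f"unknown MB type: {mb_type!r}")
        calls.update(path)
        calls.update(COMMON)
    return calls


if __name__ == "__main__":
    # A CIF frame (352x288 pixels) consists of 22 x 18 = 396 macroblocks.
    slow_scene = ["I"] * 40 + ["P"] * 356    # hypothetical: few I-MBs
    fast_scene = ["I"] * 300 + ["P"] * 96    # hypothetical: many I-MBs
    for label, frame in (("slow", slow_scene), ("fast", fast_scene)):
        calls = encode_frame(frame)
        print(label, "ME:", calls["ME"], "IPRED:", calls["IPRED"])
```

The same code invokes motion estimation 356 times for one frame and 96 times for the other, so which function is ‘hot’ depends on the input data.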

  21. [Figure: example of an I-MB and a P-MB]
      Note: 16x16 MBs can be partitioned into sub-MBs, e.g. 16x8, 8x8, down to 4x4

  22. [Figure: INTRA MB in a Frame [%] (0% to 100%) over Frame Number (1 to 301) for the sequences Rugby, Rafting, and Football; annotated regions: scene with very high motion, scene with high-to-medium motion, scene with medium-to-slow motion]

  23. • Even for a well known application it is not always clear which parts will be ‘hot’ (e.g. according to computational complexity) and thus benefit from accelerators
        ◦ This depends on changing input data and control flow
      • Even more complex: multi-tasking scenarios
        ◦ Not clear which applications will execute at the same time
        ◦ Not clear which applications will execute at all (user can download new applications)
        ◦ This significantly increases the number of potential hot spots → hardly possible to address this with an ASIP
      • Systems that fulfill the demand for adaptivity may lead to
        ◦ Better performance (absolute criteria)
        ◦ Higher efficiency (relative criteria, e.g. performance per area etc.)
        ◦ Lower cost (no redesign if specifications change, no overdesign to cover all scenarios)

  24. [Figure: efficiency (MIPS/$, MHz/mW, MIPS/area, …) plotted against flexibility (1/time-to-market, …), now with Reconfigurable and Adaptive Systems added to the design points below]
      • ASIC (“hardware solution”): non-programmable, highly specialized
      • ASIP: application specific instruction set processor
      • GPP (general purpose processor): “software solution”
