Reconfigurable and Adaptive Systems (RAS), Lars Bauer, Jörg Henkel


  1. Institut für Technische Informatik, Chair for Embedded Systems - Prof. Dr. J. Henkel
     Lecture, summer semester (SS) 2012: Reconfigurable and Adaptive Systems (RAS)
     Lars Bauer, Jörg Henkel

     Organisation
     • Lecture time: Mon., 14.00 - 15.30, Bld. 50.34, HS -101
     • Homepage: http://ces.itec.kit.edu/ → Teaching → Slides
       Login: “student”, Passwd: “CES-Student”
     • Contact: lars.bauer@kit.edu, Haid-und-Neu-Str. 7, Bld. 07.21, Rm. 316.2 (2nd floor!)

     L. Bauer, CES, KIT, 2012

  2. CES @ Technologiefabrik (TFI)
     [Campus map; labels: Info-Bau, TFI, Mensa]

     RAS Examination
     • CS Diploma:
       ◦ Vertiefungsfach 8 (specialization subject 8): Entwurf eingebetteter Systeme und Rechnerarchitekturen (Design of Embedded Systems and Computer Architectures)
     • CS Master:
       ◦ Module: Rekonfigurierbare und Adaptive Systeme [IN4INRAS] (3 ECTS)
       ◦ Module: Eingebettete Systeme: Weiterführende Themen [IN4INESWT] (8 ECTS)
       ◦ Module: Advanced Computer Architecture [IN4INACA] (10 ECTS)
     • Other study courses (e.g. EE): ask individually

  3. Teaching @ CES, SS 2012
     • Lectures
       ◦ RAS
       ◦ Low Power Design
       ◦ Embedded Multimedia
       ◦ Wireless Sensor Networks
       ◦ Processor Modeling at Transaction-Level
     • Labs
       ◦ Entwurf eingebetteter Systeme (Design of Embedded Systems)
       ◦ Entwurf von eingebetteten applikationsspezifischen Prozessoren (Design of Embedded Application-Specific Processors)
       ◦ Software-Entwicklung (Software Development)
       ◦ Low Power Design for Embedded Systems
       ◦ Rekonfigurierbare Eingebettete Systeme (Reconfigurable Embedded Systems)
       ◦ Design Tools for Embedded Processors
     • Seminars
       ◦ Dependability in Embedded Systems
       ◦ Distributed Decision Making
       ◦ Dependable Embedded Software
       ◦ Organic Computing
       ◦ Stereo Video Processing
       ◦ Multicore for Multimedia Processors
     More info: ces.itec.kit.edu/teaching

     Theses @ CES
     • Note: the information on the homepage is typically not up to date
       ◦ If you are interested in a particular topic, it is better to ask individually
     • There are nearly always SADABAMA theses or Hiwi (student assistant) jobs available in the scope of reconfigurable systems
     • Main projects:
       ◦ i-Core (invasive Core)
       ◦ OTERA (Online Test Strategies for Reliable Reconfigurable Architectures)
     • Topics:
       ◦ Hardware prototype
       ◦ Simulation environment / algorithms for the runtime system
     • Examples: fault emulation, adaptive redundancy schemes, online monitoring, bitstream manipulation, multicore integration, compiler tools, …

  4. Beneficial Previous Knowledge
     • Rechnerstrukturen (Computer Structures)
       ◦ Prerequisites
     • Eingebettete Systeme (Embedded Systems)
       ◦ ES1: Optimierung und Synthese Eingebetteter Systeme (Optimization and Synthesis of Embedded Systems)
       ◦ ES2: Entwurf und Architekturen für Eingebettete Systeme (Design and Architectures of Embedded Systems)
       ◦ The core topics (e.g. details about FPGA architectures) will be recapitulated in the scope of this lecture
       ◦ Thus, the contents of ES1 and ES2 are beneficial but not required in full detail

     General Literature
     • S. Vassiliadis and D. Soudris, “Fine- and Coarse-Grain Reconfigurable Computing”, Springer, 2007.
     • H. P. Huynh and T. Mitra, “Runtime adaptive extensible embedded processors – a survey”, SAMOS, pp. 215–225, 2009.
     • T. J. Todman et al., “Reconfigurable computing: architectures and design methods”, IEE Proceedings Computers & Digital Techniques, vol. 152, no. 2, pp. 193-207, 2005.
     • F. Barat et al., “Reconfigurable Instruction Set Processors from a Hardware/Software Perspective”, IEEE Transactions on Software Engineering, vol. 28, no. 9, pp. 847-862, 2002.

  5. Reconfigurable and Adaptive Systems (RAS), 1. Introduction and Motivation: The Demand for Adaptivity

     Designing Embedded Systems
     • Typical approach:
       ◦ Static analysis of system requirements (e.g. computational hot spots)
       ◦ Build optimized system
     • Today’s requirements:
       ◦ Increasing complexity
       ◦ More functionality
     • Problem:
       ◦ Statically chosen design point has to match all requirements
       ◦ Typically inefficient for individual components (e.g. tasks or hot spots)

  6. Definition ‘Computational Hot Spot’
     • A rather small part of the application that corresponds to a rather large part of the execution time
       ◦ Also called ‘Computational Kernel’
       ◦ Typically: inner loop
       ◦ 80/20 rule (90/10 rule etc.)
     [Figure: 20/80 split of code size compared with the 80/20 split of execution time]

     Typical Implementation Alternatives
     [Figure: design space with efficiency (Mips/$, MHz/mW, Mips/area, …) on one axis and flexibility, 1/time-to-market, … on the other. src: Henkel, ESII]
     • ASIC (“hardware solution”):
       ◦ Non-programmable
       ◦ Highly specialized
     • ASIP (application specific instruction set processor):
       ◦ Instruction set extension
       ◦ Parameterization
       ◦ Inclusion/exclusion of functional blocks
     • GPP (general purpose processor): “software solution”
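
The hot-spot definition above can be illustrated with a small sketch: given a per-function execution-time profile, pick the fewest functions that together account for a chosen share of the total time. The function name `find_hot_spots`, the 0.8 threshold, and the profile values are assumptions made for illustration; they are not taken from the lecture.

```python
def find_hot_spots(profile, time_share=0.8):
    """Return the fewest functions (largest first) covering `time_share` of total time.

    `profile` maps a function name to its measured execution time.
    """
    total = sum(profile.values())
    hot, covered = [], 0.0
    # Visit functions from most to least expensive.
    for name, time in sorted(profile.items(), key=lambda kv: kv[1], reverse=True):
        if covered >= time_share * total:
            break
        hot.append(name)
        covered += time
    return hot, covered / total


# Hypothetical profile (arbitrary time units), invented for this example.
example_profile = {
    "inner_loop": 70,
    "transform": 15,
    "parse_input": 5,
    "setup": 4,
    "io": 3,
    "logging": 3,
}

hot, share = find_hot_spots(example_profile)
print(hot, share)
```

In this made-up profile, two of six functions cover 85 % of the execution time, which is the kind of imbalance the 80/20 rule describes.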

  7. Example Application: H.324 Video Conferencing
     • Video En-/Decoding
     • Audio En-/Decoding
     • Data (De-)Multiplexing
     • Control protocol
     [Block diagram, src: cityrockz.com. Blocks shown: remote control (IR); audio input (mic, phone); video input (CVBS, CVHS, digital video input); audio encoder (G.723); video encoder (H.263 / H.264); H.245 control; multiplexer (H.223); de-multiplexer (H.223); audio decoder (G.723); video decoder (H.263 / H.264); modem / PSTN interface; audio output (line, phone, speakers); video output (display, screen)]

     Hotspots in H.324 Video Conferencing
     [Figure: bar chart of processing time [%] (axis 0 to 12) per processing function. Functions shown: I_ME, S_ME, PMV, TQ_PL, TQ_IL, TQ_C, LF, MC_L, MC_C, IP_L16, MD_I4, CABAC, CAVLC, Dec_MB, get_pos, IDQ_PL, CABAC_d, CAVLC_d, FM, Q, UP, Enc, Qt, Pred_0, Reconst, ED, BC, TC, BA, CS, NF, LPF, HPF, EE, DRF, Dt, FGA, H245_C, H223_M, H223_DM, V34Mod, USB, MAC]

  8. ASIP Implementation
     • Design accelerators for the hot spots
     • Connect them as Execution Units, Register Files, and Interfaces
     [Figure src: Tensilica, Inc.: “Xtensa LC Product Brief”]

     ASIP Implementation (cont’d)
     • Provides noticeably improved performance after targeting the major hot spots
     • However, performance is still not sufficient to achieve real-time requirements
       ◦ More hot spots need to be accelerated
     [Figure: processor with accelerators for I_ME, TQ_PL, MC_L. src: Tensilica, Inc.: “Xtensa LC Product Brief”]

  9. ASIP Implementation (cont’d)
     • Scalability problem when rather many hot spots exist
       ◦ Note: still not all relevant hot spots are covered
     [Figure: processor with accelerators for CAVLC, CABAC, I_ME, TQ_PL, Dec_MB, MC_L, H245_C, MAC, FM, V34 mod, S_ME. src: Tensilica, Inc.: “Xtensa LC Product Brief”]

     ASIP Implementation (cont’d)
     • ASIPs perform well when
       1. rather few hot spots need to be accelerated and
       2. those hot spots are well known in advance
     • ASIPs are less efficient when targeting rather many hot spots
       ◦ All accelerators are provided statically (i.e. they require area and consume power) even though typically just a few of them are needed at a certain time
     • ASIPs are less efficient when targeting unknown hot spots
       ◦ Performance degenerates to the performance of a GPP
       ◦ Note that even for a given application it is not necessarily clear which parts of it are ‘hot’ during execution, as this may depend on the input data (as demonstrated in the following)
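
The point that statically provided accelerators occupy area even when idle can be made concrete with a rough sketch. It computes, for each execution phase, the fraction of the total accelerator area that is in use. The accelerator names are borrowed from the slides, but the area figures, the phase names, and the mapping of phases to accelerators are hypothetical values chosen only for illustration.

```python
def area_utilization(accelerator_area, phases):
    """Fraction of the total accelerator area that is active in each phase.

    `accelerator_area` maps accelerator name -> area (arbitrary units).
    `phases` maps phase name -> set of accelerators used in that phase.
    """
    total_area = sum(accelerator_area.values())
    return {
        phase: sum(accelerator_area[acc] for acc in used) / total_area
        for phase, used in phases.items()
    }


# Hypothetical area budget (arbitrary units), not measured data.
areas = {"I_ME": 4, "TQ_PL": 2, "MC_L": 2, "CAVLC": 1, "CABAC": 1}

# Hypothetical phases: each phase needs only a few of the accelerators.
phases = {
    "motion_estimation": {"I_ME"},
    "transform": {"TQ_PL"},
    "entropy_coding": {"CAVLC"},
}

utilization = area_utilization(areas, phases)
for phase, fraction in utilization.items():
    print(f"{phase}: {fraction:.0%} of accelerator area in use")
```

Under these assumed numbers no phase uses more than 40 % of the accelerator area, while the remaining area stays allocated, which is the inefficiency the slide attributes to ASIPs with many hot spots.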

  10. Example Application: H.264 Video Encoder
     [Block diagram: loops over MBs. MB encoding loop with MB-type decision (I or P) and mode decision (for I or P); if MB_Type = P_MB then ME: SA(T)D, MC, DCT / Q, IDCT / IQ, else IPRED, DCT / HT / Q, IDCT / IHT / IQ; further blocks: RD, CAVLC encoding engine, in-loop de-blocking filter]
     • Iterates on MacroBlocks (MBs, i.e. 16x16 pixels)
     • 2 different MB types → different computational paths with different computational requirements
       ◦ I-MB (spatial prediction)
       ◦ P-MB (temporal prediction)

     Example: Football Video
     [Figure: video frame with I-MBs and P-MBs marked]
     Note: 16x16 MBs can be partitioned into sub-MBs down to 4x4
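
The dependence of the computational path on the MB type can be sketched as follows. This is a simplified illustration, not the encoder from the lecture: the stage lists are an assumed reading of the block diagram, mode decision, entropy coding and the de-blocking filter are left out, and the frame compositions are invented.

```python
from collections import Counter

# Assumed per-type stage lists, read off the (flattened) block diagram.
P_PATH = ["ME", "MC", "DCT/Q", "IDCT/IQ"]          # temporal prediction
I_PATH = ["IPRED", "DCT/HT/Q", "IDCT/IHT/IQ"]      # spatial prediction


def stages_for(mb_type):
    """Return the processing stages executed for one macroblock."""
    if mb_type == "P":
        return P_PATH
    if mb_type == "I":
        return I_PATH
    raise ValueError(f"unknown MB type: {mb_type!r}")


def stage_counts(mb_types):
    """Count how often each stage runs for a sequence of MB types."""
    counts = Counter()
    for mb_type in mb_types:
        counts.update(stages_for(mb_type))
    return counts


# Hypothetical frames with 100 MBs each: few I-MBs versus many I-MBs.
low_motion = stage_counts(["P"] * 90 + ["I"] * 10)
high_motion = stage_counts(["P"] * 20 + ["I"] * 80)

print(low_motion["ME"], low_motion["IPRED"])
print(high_motion["ME"], high_motion["IPRED"])
```

With these made-up frames, motion estimation dominates in one case and intra prediction in the other, so which stage counts as ‘hot’ shifts with the input data.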

  11. Example: Distribution of I-MBs in Medium-to-Very-High Motion Scenes
     [Figure: intra MBs in a frame [%] (0 % to 100 %) over frame number (1 to 301) for the sequences Rafting, Rugby, and Football; annotated regions: scene with very high motion, scene with high-to-medium motion, scene with medium-to-slow motion]

     Example: Changing Energy Consumption at Frame and MB Level
     [Figure: energy consumption [µWs] (0 to 35) over frame number (0 to 140) for Carphone_QCIF, Clair_QCIF, and SusieTable_QCIF]
     [Figure: per-MB energy maps (11 x 9 MBs) for Frame #99 and Frame #100; legend in steps of 0.1 µWs from 0.0 to 0.5 µWs]