
Architecture and Design Methodology for Autonomic Systems-on-Chip - PowerPoint PPT Presentation



  1. Architecture and Design Methodology for Autonomic Systems-on-Chip (ASoC)
     A. Bernauer, A. Bouajila, J. Zeppenfeld, W. Stechele, O. Bringmann, A. Herkersdorf, W. Rosenstiel
     Universität Tübingen; Technische Universität München; FZI - Forschungszentrum Informatik

  2. Project Reminder
     [Figure: ASoC architecture and design flow. Recoverable labels: Application Requirements & Characteristics; Architecture Characteristics (Performance, Reliability, Power); FE/AE Parameter Selection; Optimization; FE/AE Model Evaluation; AUTONOMIC Layer with autonomic SoC elements (Autonomic Element, AE); FUNCTIONAL Layer with functional SoC elements (Functional Element, FE).]
     October 7, 2010 ASoC - Architecture and Design Methodology for SoC 2

  3. Phase 3 Work Packages
     • Distributed XCS
     • Demonstrator
     • Combining XCS and LCT
     • Estimating minimum population size
     • Run-time reliability optimization

  4. Distributed XCS
     • Cooperating XCS instances: benefits for self-adaptation?
     • Simplified Cell processor, different cooling properties for each core
     • Varying ambient temperature and activity
     • Goal: maximum performance, but no timing errors
     • Configurations
       – Topology: unidirectional, bidirectional, complete graph
       – Emigration strategy: random, numerosity, prediction error
       – Deletion strategy: by fitness, prediction error
       – Don't-care probability (P#): 0.1, 0.3, 0.6, 0.9
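
The don't-care probability controls how general newly created rules are. As a minimal sketch, assuming the standard XCS covering step (the slides do not show the authors' implementation), each bit of the current input is replaced by the wildcard `#` with probability P#. The function names `cover` and `matches` and the 8-bit example state are illustrative only.

```python
import random

def cover(state, p_dont_care, rng=random):
    """Build a ternary condition that matches `state`.

    Each position becomes the wildcard '#' with probability
    `p_dont_care`; otherwise it copies the input bit.
    """
    return "".join(
        "#" if rng.random() < p_dont_care else bit
        for bit in state
    )

def matches(condition, state):
    """True if every non-wildcard position equals the input bit."""
    return all(c in ("#", s) for c, s in zip(condition, state))

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs agree
    for p in (0.1, 0.3, 0.6, 0.9):  # values listed on the slide
        print(p, cover("10110010", p, rng))
```

A higher P# yields more wildcards per rule, so each rule covers more inputs but is more likely to be overgeneral.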

  5. Isolated XCS
     [Figure: results for SXU2 with P# = 0.3 and P# = 0.9.]

  6. Distributed XCS
     • Topology: complete
     • Emigration: selecting by fitness
     • Deletion: selecting by prediction error
     [Figure: results for SXU2 with P# = 0.3 and P# = 0.9.]

  7. Phase 3 Work Packages: Combining XCS and LCT

  8. Combining XCS and LCT [BICC10]
     [Figure: design-time learning (XCS) in software and run-time learning (LCT) in hardware. Recoverable labels: Initial rule set; Rule-set translation; Design-time rule set; Run-time rule set; Fitness update; Rule set update; Action; HW; SW.]
     Goals:
     • retain capability to self-adapt to unforeseen events
     • low hardware overhead

  9. Configurations and benchmarks [BICC10]
     • Rule translation
       – all-XCS: translate all XCS rules to LCT rules
       – top-XCS: translate only top XCS rules
     • Action selection
       – roulette-wheel: randomly, reward-weighted
       – winner-takes-all: highest reward
     • Benchmarks
       – Multiplexer
       – Task allocation
       – SoC component parameterization
     [Figure: task allocation on 9 cores, numbered 0 to 8 (rows 0 1 2 / 3 4 5 / 6 7 8).]
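
The two action-selection strategies can be sketched as follows. This is an illustration under assumptions, not the authors' code: the match set is modelled as hypothetical `(action, reward)` pairs with non-negative rewards, and the function names are made up for this example.

```python
import random

def winner_takes_all(matching):
    """Pick the action of the matching rule with the highest reward."""
    action, _ = max(matching, key=lambda rule: rule[1])
    return action

def roulette_wheel(matching, rng=random):
    """Pick an action at random, with probability proportional to reward."""
    actions = [action for action, _ in matching]
    rewards = [reward for _, reward in matching]
    return rng.choices(actions, weights=rewards, k=1)[0]

if __name__ == "__main__":
    # Hypothetical match set: (action, reward estimate) pairs.
    match_set = [("1001", 11), ("1010", 3), ("1100", 15)]
    print(winner_takes_all(match_set))
    print(roulette_wheel(match_set, random.Random(0)))
```

Winner-takes-all always exploits the best-known rule, while roulette-wheel keeps exploring lower-reward rules in proportion to their reward.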

  10. Task allocation benchmark [BICC10]
      [Figure: benchmark results. Legend: L = number of cores, i = number of tasks.]

  11. Task allocation benchmark [BICC10]
      [Figure: benchmark results. Legend: L = number of cores, i = number of tasks. Annotations: single core failure in LCT; double core failure in LCT.]
      → Self-adaptation at chip level with all-XCS rule translation and winner-takes-all action selection strategy

  12. Phase 3 Work Packages: Estimating minimum population size

  13. Estimating minimum population size N [IJCNN10]
      • State of the art: Calculate N for “regular” problems
        – “Regular” problem: constant number of relevant bits per problem instance
        – Example: Multiplexer problem
          01 1101 → 0 (3 relevant bits)
      • Our extension: Calculate N for “complex” problems
        – “Complex” problem: variable number of relevant bits per problem instance
        – Example: 2-out-of-4 task allocation problem
          1100 → 2 (2 relevant bits): allocation possible
          1110 → 0 (3 relevant bits): allocation impossible
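
To make "relevant bits" concrete for the multiplexer example, here is a small sketch. It assumes that the first two bits form the address and that address 0 selects the rightmost data bit; this ordering is an assumption chosen because it reproduces the slide's example (01 1101 → 0). The function names are illustrative.

```python
def multiplexer(bits, address_bits=2):
    """Return the data bit selected by the leading address bits."""
    address = int(bits[:address_bits], 2)
    data = bits[address_bits:]
    # Assumption: address 0 selects the rightmost data bit.
    return int(data[len(data) - 1 - address])

def relevant_positions(bits, address_bits=2):
    """Indices of the bits that determine the output for this input."""
    address = int(bits[:address_bits], 2)
    data_len = len(bits) - address_bits
    selected = address_bits + data_len - 1 - address
    # All address bits plus the single addressed data bit.
    return list(range(address_bits)) + [selected]

if __name__ == "__main__":
    example = "011101"  # "01 1101" from the slide
    print(multiplexer(example))
    print(relevant_positions(example))
```

Every multiplexer input has the same number of relevant bits (address bits plus one data bit), which is what makes it a "regular" problem in the slide's sense.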

  14. Estimating minimum population size N [IJCNN10]
      Experimental results:
      → Estimated N is an upper bound; using a smaller N incurs a performance penalty (e.g., with only 0.75N, only 70% of the problem instances reach a correctness rate >90%)

  15. Trading classifiers for accuracy [IJCNN10]
      • Idea: subsume classifiers during rule translation, after learning
      • Higher correctness rate than when subsuming during learning (SASO [9])
      • Comparable population size with SASO [9]

  16. Phase 3 Work Packages: Run-time reliability optimization

  17. Run-time reliability optimization
      [Figure: reliability structures labeled "Modular redundancy" and "Series".]
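
The slide names two reliability structures without giving formulas. As a hedged sketch, the textbook expressions for both are shown here, assuming independent component failures and a perfect majority voter; neither assumption nor the function names come from the slides.

```python
import math

def series_reliability(reliabilities):
    """A series system works only if every component works."""
    return math.prod(reliabilities)

def nmr_reliability(r, n=3):
    """n-modular redundancy with a perfect majority voter.

    The system works if more than half of the n identical
    modules (each with reliability r) work; n must be odd.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd for a majority vote")
    majority = n // 2 + 1
    return sum(
        math.comb(n, k) * r**k * (1 - r) ** (n - k)
        for k in range(majority, n + 1)
    )

if __name__ == "__main__":
    print(series_reliability([0.9, 0.9]))
    print(nmr_reliability(0.9, n=3))
```

For n = 3 the sum reduces to the familiar triple-modular-redundancy expression 3r² - 2r³.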

  18. Phase 3 Work Packages: Demonstrator

  19. Demonstrator – Hardware
      • Monitors
        – Frequency (3 bit)
        – Utilization (3 bit)
        – Workload difference (2 bit)
      • Actuators
        – Frequency (4 bit)
        – Task migration (1 bit)
      • Evaluator
        – Learning classifier system adapted for efficient HW implementation
      • Communicator
        – Sharing of global information
        – Migration of tasks
      Learning Classifier Table (excerpt shown in the figure, with fitness update):
        Condition   Action   Fitness
        1 X X 0     1001     11
        X 0 1 X     1010     3
        ...         ...      ...
        X 1 0 X     1100     15
      [Figure: Core1, Core2, Core3 attached to a bus with UART, MEM, and MAC.]
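
A lookup in the Learning Classifier Table can be sketched as follows, using the three example rows from the figure. The conditions there are 4 symbols wide and purely illustrative (the monitors listed on the slide provide 8 bits). Selecting the matching rule with the highest fitness is an assumption for this sketch, as are the function names.

```python
# (condition, action, fitness) rows copied from the slide's example table.
# 'X' is the don't-care symbol used in the figure.
LCT = [
    ("1XX0", "1001", 11),
    ("X01X", "1010", 3),
    ("X10X", "1100", 15),
]

def matches(condition, state):
    """True if the monitor state satisfies the ternary condition."""
    return len(condition) == len(state) and all(
        c in ("X", s) for c, s in zip(condition, state)
    )

def select_action(table, state):
    """Return the action of the fittest matching rule, or None."""
    candidates = [
        (fitness, action)
        for condition, action, fitness in table
        if matches(condition, state)
    ]
    if not candidates:
        return None
    return max(candidates)[1]

if __name__ == "__main__":
    for state in ("1010", "0100", "0000"):
        print(state, select_action(LCT, state))
```

In hardware, all table rows can be compared against the monitor state in parallel; the loop here only models the result of that comparison.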

  20. Demonstrator – Tasks
      • Test tasks 1-N
        – Adjustable, synthetic workloads
        – After completion, pass data to next task
      • User interface
        – Allow for user interaction
      • Data generation
        – Replaces packet reception for standalone demonstration
        – Generates new data packet every 100µs
      • UART
        – Pass buffered output data to the UART
      [Figure: tasks T1 to T5 distributed over Core1, Core2, Core3, each with a Learning Classifier Table (Condition, Action, Fitness, fitness update); bus with UART, MEM, and MAC.]

  21. Demonstration

  22. Hardware Overheads
      Synthesis results for Xilinx Virtex 4 VLX100:

                   Flip-Flops   LUTs    BRAMs   Mult.   Overhead
      Leon3        1749         8936    28      1       –
      Leon3 AE     2122         10213   29      2       14.3%
      LCT          66           116     1       1       1.4%
      Act Task.    57           299     0       0       3.5%
      Act Freq.    7            19      0       0       0.2%
      Mon Util.    35           74      0       0       0.8%
      Mon Load     20           40      0       0       0.5%
      AE IF        173          399     0       0       4.5%
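
The slide does not say which resource the Overhead column is based on. As a quick check, assuming it is the relative increase in LUTs over the plain Leon3 core, the 14.3% figure for the Leon3 with autonomic element can be reproduced. The per-component percentages may use a different basis, so they are not recomputed here.

```python
def relative_overhead(baseline, extended):
    """Percentage increase of `extended` over `baseline`."""
    return 100.0 * (extended - baseline) / baseline

# LUT counts taken from the synthesis table.
LEON3_LUTS = 8936
LEON3_AE_LUTS = 10213

if __name__ == "__main__":
    overhead = relative_overhead(LEON3_LUTS, LEON3_AE_LUTS)
    print(f"{overhead:.1f}%")  # prints 14.3%
```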

  23. Phase 3 Progress
      • 3rd LCS-Workshop, Tübingen, July 1-2, 2010

  24. Future work
      • Continue work on Autonomic Layer reliability
      • Complete dependability assessment
      • Reduce simulation time due to temperature estimation
      • Extend theoretical analysis to run-time system
      • Complete work on demonstrator

  25. Summary
      • Distributed XCS solves previously difficult-to-solve problems
      • Combining software XCS with hardware LCT for lightweight on-chip learning
      • Estimating minimum population size N
      • Trading classifiers for accuracy
      • Demonstrator runs 3 Leon3 cores, distributes tasks, and adjusts frequency autonomously

  26. Recent Publications
      • [ARCS10] J. Zeppenfeld, A. Herkersdorf. Autonomic Workload Management for Multi-Core Processor Systems. ARCS, Hannover, Germany, February 22-25, 2010.
      • [SORT10] J. Zeppenfeld, A. Bouajila, A. Herkersdorf, W. Stechele. Towards Scalability and Reliability of Autonomic Systems on Chip. 1st IEEE Workshop on Self-Organizing Real-Time Systems, Carmona, Spain, May 4, 2010.
      • [IJCNN10] B. Rakitsch, A. Bernauer, O. Bringmann, W. Rosenstiel. Pruning population size in XCS for complex problems. International Joint Conference on Neural Networks at the World Congress on Computational Intelligence (WCCI), Barcelona, Spain, July 18-23, 2010.
      • [BICC10] A. Bernauer, J. Zeppenfeld, O. Bringmann, A. Herkersdorf, W. Rosenstiel. Combining software and hardware LCS for lightweight on-chip learning. DIPES/BICC 2010, IFIP AICT 329, pp. 279-290, Brisbane, Australia, September 20-23, 2010.
