KIT – The cooperation of Forschungszentrum Karlsruhe GmbH and Universität Karlsruhe (TH)
Digital On-Demand Computing Organism for Real-time Systems (DodOrg)
SPP OC Kolloquium, DFG SPP 1183 Organic Computing
Nürnberg, September 15/16, 2011
Talk Overview
- Motivation and Overview
- Current Work Phase III:
  - Organic Hardware
  - Organic Monitoring
  - Organic Low Power Management
  - Organic Middleware
- DodOrg Demonstrator Platform
  - Interaction and Overview
  - Scenarios and Results
- Conclusion
2 SPP 1183 OC Kolloquium – Nürnberg, 15.-16. September 2011 9/16/2011
DodOrg Motivation
Classic Scenario:
Only those scenarios can be handled
- that were considered in advance,
- where the cause can be detected,
- where the corresponding reaction had been explicitly programmed.
Lack of adaptation leads to insufficient reactions (e.g. shutdown …)
DodOrg Scenario:
System reaction based on indications (higher level of abstraction)
- E.g. CRC/bit error rate, network bottleneck, environmental change, or change on application level
Proper reaction possible even if
- the scenario was not considered in advance,
- the cause was not detected,
- the reaction was not explicitly programmed.
Flexible response to changed environmental situation
- Scenario detection: recognize that something is different
- Adapt to changed requirements either by a known path or by a gradual process of rearrangement (optimization, healing)
Demonstrator platform
DodOrg: Refined Layer Model
[Figure: refined layer model with biological analogy. Levels: brain level (nervous system), organ level (heart), cell level (myocardial cell).]
Groups: Application Testbed (all groups); Organic Middleware (Brinkschulte); Organic Processing Cells (Becker); Distributed Low Power Management (Henkel); Biological Considerations (Brändle); Organic Monitoring System (Karl).
Figure labels: application, API; application monitoring, hardware monitoring, middleware monitoring, feedback; hormone level computation; dynamic power management; real-time considerations; temperature, local traffic; proactive intelligent data analysis; self-adaptation, self-optimization, self-healing; stable hormone interaction; thermal-aware energy distribution; OPC extension; stability aspects.
Phase III: Project Objectives
Robustness: extending the stable system property towards more serious system changes.
+ Fault resistance
+ Increased tolerance
- Increased overhead
Stability: the ability of the system to provide the required service while reacting to external and internal events.
+ Oscillation avoidance
+ Normal operating conditions
- Faulty components
Heterogeneous Array of Organic Processing Cells (OPCs)
[Figure: cell array with FPGA, DSP, I/O, Memory, Monitor, FPFA, and µProc cells, plus peripheral devices.]
artNoC
- broadcast
- real time
- adaptive routing
OPC with common structure but with specific functionality
Organic Hardware Approach (Prof. Becker)
Modularity
- Same blueprint for all OPCs
- Common infrastructure
- Cells easily replaceable
Local intelligence
- artNoC router
- Power management
- Monitoring
- Configuration management
Interfaces
- Monitoring
- Middleware
- Low power management
Flexibility
- Reconfigurable data path
[Figure: OPC block diagram. Blocks: cell-specific functionality (µProc, DSP, FPGA, FPFA, memory, monitoring, etc.), clock and power management (DVFS) with global and local clock, configuration control, state interface, configuration cache, observer, low-level monitoring, network interface, artNoC router (ports N, E, S, W, L). Signal labels: power status, power control, monitor status, observer control, cell data path, monitoring data, emergency calls, messenger, channel allocation/release, configuration.]
Organic Processing Cells: Robustness (Prof. Becker)
Robustness during development phase
- Scope
  - Loading new configuration
  - Establishing the inter-cell data path
  - Powering cells up/down
- Method:
  - On-demand hardware monitoring
  - Blank configuration pattern
Robustness during processing phase
- Scope
  - OPC data path (packet sender)
  - OPC-to-OPC communication path (artNoC network)
- Goal: hardware support for cell immune system
- Method:
  - artNoC header packet protection
  - Channel auto release
OPC lifecycle
- Development phase: reconfiguration
- Processing phase: calculation
- Ongoing change
On-Demand Hardware Monitoring (Prof. Becker)
[Figure: monitoring structure built from an oscillator macro, horizontal routing, vertical routing, and a routing base (elements T1 to T6).]
- Memory requirement (T1-T6, OSC) : 302 Bytes
- 80µs/ routing connection
- 880µs/ OSC macro
- Reconfiguration time: 1.5ms (8 bit ICAP)
- Suitable to monitor large areas
Challenge: communication deadlock introduced by faults
OPC-2-OPC communication: wormhole flow control
- Flexible
- Low buffer requirements
- Packet spread across several routers
- Failure/attack affects several routers
Protect control flits:
- Corrupted header flit leads to misrouting of packets
  - Countermeasure: checksum
- Missing tail/header flit leads to blocked virtual channels
  - If downstream cell: release channel via feedback line
  - If upstream cell: inject tail flit with error code to release channel
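The control-flit protection above can be illustrated with a minimal sketch. The slides do not specify the checksum or the release trigger used in the artNoC router, so the 8-bit XOR checksum, the field names (`dest_x`, `dest_y`, `vc`), and the idle-cycle timeout below are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass


def xor_checksum(dest_x, dest_y, vc):
    """8-bit XOR over the header fields (assumed scheme, not the artNoC one)."""
    return (dest_x ^ dest_y ^ vc) & 0xFF


@dataclass
class HeaderFlit:
    dest_x: int
    dest_y: int
    vc: int
    checksum: int


def make_header(dest_x, dest_y, vc):
    """Build a header flit with a matching checksum."""
    return HeaderFlit(dest_x, dest_y, vc, xor_checksum(dest_x, dest_y, vc))


def header_is_valid(flit):
    """A corrupted header must not be routed, otherwise the packet is misrouted."""
    return flit.checksum == xor_checksum(flit.dest_x, flit.dest_y, flit.vc)


class VirtualChannel:
    """Hypothetical virtual channel with auto release after an idle timeout."""

    def __init__(self, timeout_cycles):
        self.timeout_cycles = timeout_cycles
        self.allocated = False
        self.idle_cycles = 0

    def on_header(self, flit):
        """Allocate the channel only for a header that passes the check."""
        if not header_is_valid(flit):
            return False
        self.allocated = True
        self.idle_cycles = 0
        return True

    def on_body_or_tail(self, is_tail):
        """Any flit resets the idle counter; a tail flit releases the channel."""
        self.idle_cycles = 0
        if is_tail:
            self.allocated = False

    def tick(self):
        """Advance one cycle; return True if the channel was auto-released."""
        if not self.allocated:
            return False
        self.idle_cycles += 1
        if self.idle_cycles >= self.timeout_cycles:
            self.allocated = False
            return True
        return False
```

A real router would release the channel via the feedback line or an injected tail flit as described on the slide; the timeout here only stands in for detecting the missing tail flit.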
Organic Processing Cells: Robustness (Prof. Becker)
- ca. 110 additional Slices for Control-Flit Protection
Avoidance of blocked Virtual Channels (VCs) Avoidance of misrouting packets
Synthesis Result: ArtNoC Router on Xilinx FPGA (Prof. Becker)
[Figure: bar chart of slice counts (axis 500 to 4000) for router variants with 2, 3, and 4 VCs.]
Legend: RT: Real-Time; FB: Feedback Channel; WF: West-First Routing; BC: Broadcast; FA: Full Adaptive Routing; RC: Packet Recovery; PR: Control-Flit Protection
Organic Monitoring (Prof. Karl)
Objectives
- Coordinated, cooperative, and system-wide monitoring
  - Fundamental for self-organizing systems
- Providing monitored data to Organic Middleware and Thermal Management for further analysis
- Providing self-awareness
Self-Awareness
- Prerequisite for all self-X features
- Ability of system state determination
- Ability of system state classification
  - Permitting the comparison of two arbitrary system states
[Figure: layer model repeated, plus monitoring data flow between Application, Hardware Monitoring, Middleware, and Thermal Management. Flow labels: raw & cooked data, feedback, configuration, requirements, status.]
Organic Monitoring: State Evaluation (Prof. Karl)
Using a rule-based approach
- Learning evaluation rules in a dedicated training phase
- Defining a normal system state
- Comparison of further system states with the normal state
Properties
- Rules convert an occurrence ratio of an event into a fitness value
- One rule per event / hormone
- Weighted arithmetic mean for determining the fitness value for the system
- Different states can then be compared by comparing the fitness values
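The rule-based evaluation can be sketched as follows. The slides do not give the shape of a rule, so the triangular rule around a learned "normal" ratio, the event names, and the weights are assumptions for illustration; only the weighted arithmetic mean is taken from the text.

```python
def make_rule(normal_ratio, tolerance):
    """Return a rule mapping an occurrence ratio to a fitness value in [0, 1].

    Assumed shape: fitness 1.0 at the normal ratio, falling linearly to 0.0
    once the ratio deviates by `tolerance` or more.
    """
    def rule(ratio):
        return max(0.0, 1.0 - abs(ratio - normal_ratio) / tolerance)
    return rule


def system_fitness(ratios, rules, weights):
    """Weighted arithmetic mean of the per-event fitness values."""
    total_weight = sum(weights[event] for event in rules)
    weighted_sum = sum(weights[event] * rules[event](ratios[event]) for event in rules)
    return weighted_sum / total_weight


# One rule per event; names and numbers are hypothetical.
rules = {
    "crc_error": make_rule(normal_ratio=0.01, tolerance=0.05),
    "cache_miss": make_rule(normal_ratio=0.20, tolerance=0.40),
}
weights = {"crc_error": 3.0, "cache_miss": 1.0}

normal_state = {"crc_error": 0.01, "cache_miss": 0.20}
degraded_state = {"crc_error": 0.035, "cache_miss": 0.20}
```

Comparing `system_fitness(normal_state, rules, weights)` with `system_fitness(degraded_state, rules, weights)` then orders the two states, which is the comparison the slide describes.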
Organic Monitoring: State Classification (Prof. Karl)
- Using k-means clustering for definition of individual system states at runtime
- Treating all available event occurrences as a point in an n-dimensional space
- Clustering of close points within this space
- Using Euclidean distance for online state detection
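A minimal sketch of this classification step, assuming plain Lloyd-style k-means on tuples of event occurrence values; the function names and the fixed iteration count are illustrative choices, not taken from the DodOrg implementation.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, centroids):
    """Online state detection: index of the closest cluster centroid."""
    return min(range(len(centroids)), key=lambda i: euclidean(point, centroids[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Cluster points (tuples) into k groups and return the centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster runs empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids
```

Each centroid stands for one system state; a new monitoring sample is assigned to the state returned by `nearest`.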
Organic Monitoring: Phase Detection and Prediction (Prof. Karl)
Goals:
- Prediction of future system states
- Identification and avoidance of potentially harmful system states
State prediction:
- Using a run-length-encoded Markov chain as predictive model
- Trained in a dedicated learning phase using the previously classified system states
[Figure: timeline of past system states, current system state, and predicted system states (states A, B, C, D).]
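One way to read "run-length-encoded Markov chain" is sketched below: the classified state sequence is compressed into (state, run length) symbols, and a first-order chain is trained over those symbols. The slides do not define the exact model, so this formulation and all function names are assumptions.

```python
from collections import Counter, defaultdict
from itertools import groupby


def run_length_encode(states):
    """Compress a state sequence into (state, run_length) pairs."""
    return [(state, len(list(group))) for state, group in groupby(states)]


def train(states):
    """Count transitions between consecutive runs (the learning phase)."""
    runs = run_length_encode(states)
    transitions = defaultdict(Counter)
    for current, following in zip(runs, runs[1:]):
        transitions[current][following] += 1
    return transitions


def predict_next_run(transitions, current_run):
    """Return the most frequently observed follow-up run, or None if unseen."""
    followers = transitions.get(current_run)
    if not followers:
        return None
    return followers.most_common(1)[0][0]
```

For a training history such as `list("AAABBAAABBAAAC")`, the run `("A", 3)` was followed twice by `("B", 2)` and once by `("C", 1)`, so the model predicts two cycles of state B next.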
Organic Low Power Management: Managing Energy-Distribution (Prof. Henkel)
[Figure: energy budget manager within an OPC. Labels: cost function (efficiency, RT criteria, temperature, local traffic); Organic Middleware (influencing hormone expression); Organic Monitoring; power / RT manager (voltage / frequency setting, power states P1 to P4, assigned tasks, scheduled tasks); energy budget manager with policy and local energy budget (fill, consume, fade, trade & negotiate); energy input from power source and peers in neighboring OPCs. Legend: data / information, actions; future energy level, actual energy level, energy level, actual power state.]
Energy distribution: goals
- Low energy consumption
- Avoidance of local thermal hot-spots
Energy distribution: main concept
- Each OPC has a local energy budget
  - Determines the locally available energy
- Global power source
  - Assigns energy budgets to OPCs (pulse-based)
- Energy budget manager
  - Agent controlling the local energy budget
  - Receives temperature data each pulse
  - Negotiates and trades energy budget with neighboring OPCs
  - Influences power manager policies
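The pulse-based budget concept can be sketched in a few lines. The class and method names, the equal share per OPC, and the energy units are assumptions for illustration; negotiation between neighbors is left out here.

```python
class EnergyBudgetManager:
    """Hypothetical agent controlling the local energy budget of one OPC."""

    def __init__(self):
        self.budget = 0.0
        self.temperature = None

    def on_pulse(self, energy, temperature):
        """Called once per pulse: fill the budget and record the temperature."""
        self.budget += energy
        self.temperature = temperature

    def consume(self, requested):
        """Grant at most the locally available energy and return the grant."""
        granted = min(requested, self.budget)
        self.budget -= granted
        return granted


def distribute_pulse(managers, energy_per_opc, temperatures):
    """Global power source: assign one pulse of energy to every OPC."""
    for manager, temperature in zip(managers, temperatures):
        manager.on_pulse(energy_per_opc, temperature)
```

A power manager policy would then pick voltage and frequency settings that fit within what `consume` grants.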
Organic Low Power Management: Agent Negotiation (Prof. Henkel)
Two types of agent negotiation based on simple economic principles
Fully distributed [ICCAD 09]
- Trading based on supply & demand
- Temperature incorporated as a negotiation penalty
- Agents only trade with their direct neighbors
Hierarchical [CODES 11]
- Local agents make bids to higher-level market agents
- Market agents make requests to global power source
- Local agent has income based on current temperature
[Figure: hierarchy of global power source, market agents, and local agents.]
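The fully distributed variant can be sketched as a toy market. This is not the TAPE algorithm from [ICCAD 09]: the bid formula (unmet demand minus a linear temperature penalty), the unit-sized trades, and all names are assumptions that only illustrate supply and demand, the temperature penalty, and neighbor-only trading.

```python
class Agent:
    """Hypothetical local agent of one core."""

    def __init__(self, name, budget, demand, temperature):
        self.name = name
        self.budget = budget
        self.demand = demand
        self.temperature = temperature
        self.neighbors = []

    def bid(self, penalty_per_degree):
        """Willingness to buy: unmet demand minus a temperature penalty."""
        return (self.demand - self.budget) - penalty_per_degree * self.temperature


def trade_round(agents, unit=1.0, penalty_per_degree=0.5):
    """One round: each agent sells one unit to every neighbor that bids higher."""
    trades = 0
    for seller in agents:
        for buyer in seller.neighbors:
            if seller.budget >= unit and buyer.bid(penalty_per_degree) > seller.bid(penalty_per_degree):
                seller.budget -= unit
                buyer.budget += unit
                trades += 1
    return trades


def connect(a, b):
    """Make two agents direct neighbors of each other."""
    a.neighbors.append(b)
    b.neighbors.append(a)


# A supplier with surplus between a cool and a hot core with equal demand.
supplier = Agent("supplier", budget=4.0, demand=0.0, temperature=10.0)
cool = Agent("cool", budget=0.0, demand=3.0, temperature=8.0)
hot = Agent("hot", budget=0.0, demand=3.0, temperature=18.0)
connect(supplier, cool)
connect(supplier, hot)
agents = [supplier, cool, hot]
for _ in range(10):
    trade_round(agents)
```

With equal demand, the hot core ends up with less budget than the cool one because its bids are penalized, and the total energy is unchanged by trading.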
- Thermal state of core classified as good, neutral, or bad
  - Special case of monitoring state evaluation
- State determines current core power budget, i.e. local agent income depends on state
- Budget trading that results in a transition to a “better” state is reinforced
  - Shift of S1 & S2 to higher temperatures
- Trading that results in a “worse” state is penalized
  - Shift of S1 & S2 to lower temperatures
- In good state, no suppressors are released to the AHS, allowing other optimizations (e.g. communication/performance)
- When approaching the threshold in bad state, thermal suppressors become dominant
Organic Low Power Management:Thermal Management (Prof. Henkel)
- Central approaches benefit from global knowledge
  - Can achieve lowest peak temperatures
  - But: do not scale and have a central point of failure
- Distributed and hierarchical approaches both succeed in lowering peak temperature
- Hierarchical approach sacrifices some scalability in order to achieve lower temperature peaks
Organic Low Power Management: Peak Temperatures (Prof. Henkel)
Organic Middleware: Artificial Hormone System (Prof. Brinkschulte)
Task mapping
- Providing self-X properties:
  - Self-configuration
  - Self-healing
  - Self-optimization
- Good mapping regarding
  - Requirements of each task
  - Relationships of the tasks
  - Condition of each cell and its neighborhood
- Reacting and adapting to changes, e.g. increased bit error rates
- Reaching stable mapping conditions
  - Oscillation avoidance
Organic Middleware: Implementation of the AHS for µCs (Prof. Brinkschulte)
AHS Library:
- AHSlib in pure ANSI C for deployment in environments from small µCs to large PCs
- Abstraction layers:
  - “AHS Basic OS Support”: simple exchange of the underlying OS
  - “AHS Basic Communication Support”: easily interchangeable network layer and protocol
[Figure: AHSlib between the distributed application and the operating system / communication system. Modules: AHS Interface, AHS Task Management, AHS Error Management, AHS Message Communication, AHS List Management, AHS Hormone Communication, AHS Log Management, AHS Basic OS Support, AHS Basic Communication Support.]
AHS in Hardware:
- Simplified, light-weight implementation
- Self-synchronization method
- Less data to store
- Less computational work to do
- Perfect for running the AHS on a µC platform or real hardware
[Figure: AHS hardware data path. Received accelerators and received suppressors as well as own accelerators and own suppressors are accumulated; own eager value + accelerators - suppressors gives the modified eager value, which is compared with the received eager values to decide whether to take the task.]
Legend: AR: Accumulator Register; CR: Cycle Register; +: Adder / Subtractor; >: Comparator
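The decision shown in the data path can be written down as a short sketch. The formula (own eager value plus accelerators minus suppressors, then a comparison against received eager values) follows the figure labels; the function names and the strict "greater than all others" tie handling are assumptions.

```python
def modified_eager_value(own_eager, accelerators, suppressors):
    """Own eager value + accumulated accelerators - accumulated suppressors."""
    return own_eager + sum(accelerators) - sum(suppressors)


def takes_task(own_modified, received_modified):
    """Take the task only if the own modified eager value beats all received ones."""
    return all(own_modified > other for other in received_modified)


# Hypothetical hormone levels for one task on one cell.
own = modified_eager_value(own_eager=5, accelerators=[2, 1], suppressors=[3])
decision = takes_task(own, received_modified=[4, 3])
```

In hardware, the two sums correspond to the accumulator registers and the comparison to the comparator in the legend.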
- Immune mechanisms for advanced self-healing and self-protecting aspects to increase robustness (together with the Organic Processing Cell capabilities and the Organic Monitoring)
- Robustness against misbehaving internal/external components (comparable to illness in a biological system)
  - React to “ill” OPCs
  - Counter-measures against malicious attacks
  - Global and local verification
Organic Middleware: Immune System (Prof. Brinkschulte)
Prototype: Interaction of HW, Mon, MW and TM
HW-Interfaces:
- Networking via HW-Interface
- Config-Manager via HW-Interface
- Energy Budget via Interface
MW/Mon System Status
MW/Thermal
Developed Protocols:
- Output Ranges
- Packet Formats
Demonstrator Hardware Floorplan
[Figure: floorplan of the demonstrator on a Virtex2 FPGA. Each OPC has a static region (config manager, power manager, state interface, network interface, artNoC router) and a reconfigurable region. Shown are a data path cell (data path module 1) and an I/O cell (UART, virtual ICAP interface, network interface, artNoC router). DCMs provide the clocks clk_dp1, clk_dp2, clk_dp3, clk_io, and clk_net. Microblaze (MB) OPCs, empty OPCs, and I/O OPCs are connected via the artNoC.]
Design is flat, except for the OPC reconfigurable data path.
Coordination of the Decentralized Control Loops
[Figure: layer model (brain level, organ level, cell level) annotated with the decentralized control loops: cell-housekeeping loop, intra-cell loop, neighbor-cell loop, inter-cell loop, intra-organ loop, and system loop. Groups: Application Testbed (all groups), Organic Monitoring System (Karl), Organic Processing Cells (Becker), Agent-Based Organic Thermal Management (Henkel), Organic Middleware (Brinkschulte).]
DodOrg Demonstrator Scenarios
Providing self-X properties:
- Self-configuration
- Self-healing
- Self-optimization
[Figure: OPC array with the control loops (cell-housekeeping, intra-cell, neighbor-cell, inter-cell, intra-organ) of thermal management, application testbed, organic monitoring system, organic middleware, and organic processing cells; repeated over four slides.]
Conclusion
DodOrg: new scalable system design
- Biologically inspired, reconfigurable many-core computer architecture
- Adaptation through decentralized control loops
- Modular, reusable system components
- Autonomous
- Robustness and plasticity (dynamic stability)
Provides the foundation for research in the area of self-organizing computing systems
Motivates further and future research
Questions?
Thank you for your attention!
List of Publications:
- D. Kramer, R. Buchty, and W. Karl, “A Scalable and Decentral Approach to Sustained System Monitoring”, ACACES, 2009
- R. Buchty and W. Karl, “Design Aspects for Self-Organizing Heterogeneous Multi-Core Architectures”, IT - Information Technology Journal 5/08, 2008
- R. Buchty, D. Kramer, and W. Karl, “An Organic Computing Approach to Sustained Real-time Monitoring”, BICC08, 2008
- R. Buchty, O. Mattes, and W. Karl, “Self-aware Memory: Managing Distributed Memory in an Autonomous Multi-master Environment”, ARCS, 2008
- R. Buchty and W. Karl, “A Monitoring Infrastructure for the Digital on-demand Computing Organism (DodOrg)”, IWSOS, 2006
- H.-P. Löb, R. Buchty, and W. Karl, “A Network Agent for Diagnosis and Analysis of Real-time Ethernet Networks”, CASES, 2006
- U. Brinkschulte and A. von Renteln, “Analyzing the Behavior of an Artificial Hormone System for Task Allocation”, ICATC, 2009
- U. Brinkschulte, A. von Renteln, and M. Weiss, “Examining Task Distribution by an artificial hormone system based middleware”, ISORC, 2008
- U. Brinkschulte, M. Pacher, and A. von Renteln, “An Artificial Hormone System for Self-Organizing Real-Time Task Allocation”, in Organic Computing, 2007
- U. Brinkschulte, A. von Renteln, and M. Pacher, “Reliability of an Artificial Hormone System with Self-X Properties”, PDCS, 2007
- T. Ebi, M. A. Al Faruque, and J. Henkel, “TAPE: Thermal-aware Agent-based Power Economy for Multi/Many-Core Architectures”, ICCAD, 2009
- T. Ebi, D. Kramer, W. Karl, and J. Henkel, “Economic Learning for Thermal-aware Power Budgeting in Many-core Architectures”, CODES, 2011
- M. Shafique, L. Bauer, and J. Henkel, “REMiS: Run-time Energy Minimization Scheme in a Reconfigurable Processor with Dynamic Power-Gated Instruction Set”, ICCAD, 2009
- M. A. Al Faruque, R. Krist, and J. Henkel, “ADAM: Run-time Agent-based Distributed Application Mapping for on-chip Communication”, DAC, 2008
- C. Schuck, B. Haetzer, and J. Becker, “An Interface for a Decentralized 2d-Reconfiguration on Xilinx Virtex-FPGAs for Organic Computing”, ReCoSoC, 2008
- C. Schuck, M. Kuehnle, M. Huebner, and J. Becker, “A framework for dynamic 2D placement on FPGAs”, IPDPS, 2008
- C. Schuck, S. Lamparth, and J. Becker, “artNoC - A Novel Multi-Functional Router Architecture for Organic Computing”, FPL, 2007