Distributed Embedded System Architecture
Philip Koopman, koopman@cmu.edu
July 12, 2002
Electrical & Computer Engineering
Institute for Complex Engineered Systems
MPSOC 2002
My Perspective On (Distributed) Embedded Systems
◆ Embedded System Architecture =
+Control + other stuff
the system
◆ Make it easier for system to meet requirements
◆ Only “toy” versions are trivial; real world is complex
◆ “Features”
◆ Software
– Control loops
– Finite state machines
– Intra-node communication via calls
– Inter-node communication via messages
◆ Hardware
◆ Must meet non-functional requirements (real-time, ’ilities including profitability)
◆ Loosely: an architecture is how all the pieces fit together
◆ Architecture definitions:
– The structure – in terms of components, connections, and constraints – of a product, process, or element. [Rechtin96]
– The structure or structures of the system, which comprise components, their externally-visible behavior, and the relationships among them. [Bass97]
◆ Informally: Boxes and Arrows
◆ An architecture is an organized collection of components that describes:
» (boxes & arrows)
» (rule for when to create a set of subsystem boxes)
» (rules to evaluate how good the architecture is)
and interface for a component (it’s an instantiation of an architecture)
◆ One person’s component is another person’s system
◆ Functional properties
◆ Control properties
◆ Temporal properties
◆ Data properties
message
◆ The big question – how do you know where to insert the interfaces?
◆ Primary Architectures (almost always used)
(CPU, memory, network, I/O)
(software components, data repositories, message dictionary, external interfaces)
(hierarchy of control algorithms; emergent system behavior)
◆ Secondary Architectures (used when needed)
◆ Partition to meet constraints of:
◆ Traditional approach: hardware first
◆ Alternatives are possible
strategy
◆ General known approaches can apply to new systems
◆ Following slides are some examples
do things!
– The idea is to demonstrate the different flavors of architectural views
◆ Centralized System
[Diagram: a single CPU connected to all sensors (S) and actuators (A)]
◆ Ad Hoc
[Diagram: several CPUs, each with sensors (S) and actuators (A), interconnected ad hoc]
◆ Hierarchical
I/O at bottom
reliability
[Diagram: hierarchy of CPUs with sensors (S) and actuators (A) attached at the lower levels]
◆ Federated/Decentralized Networked System
– Often sensor/actuator/CPU pairing done by 3-D geometric regions
– Design approach is often to add CPUs as you need more I/O connections
[Diagram: CPUs, each with local sensors (S) and actuators (A), joined by a network]
◆ Highly Distributed Networked System
One sensor, actuator, or servo pair per CPU, on a network
– Bus hierarchy may be needed to overcome bandwidth limits
– Good for an idealized MEMS system
[Diagram: many CPUs on a network, each serving one sensor (S) or actuator (A)]
◆ Ad Hoc (with “object-oriented” meatballs)
◆ Client/Server
All data at a server; replicate clients to interface elsewhere
[Diagram: one SERVER holding the DATA, with several CLIENTs]
◆ Object oriented / Federated
methods
– Note: flow of control is completely obscured
(compatible with CORBA)
[Diagram: objects, each with METHODS and DATA, connected by an object “bus”]
◆ Table Driven, phased, flow of control
to specify detailed behavior for general software modules
– This is actually a combination of “control flow” and “table driven” patterns
[Diagram: INIT, then PHASE 1 driven by TABLE 1, then PHASE 2 driven by TABLE 2, then FINISH]
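A minimal sketch of the table-driven, phased pattern, assuming two phases and two generic modules. The module names (`set_output`, `scale_output`), the signal names, and the table contents are hypothetical and serve only to show how tables specify behavior while a fixed phase order supplies the control flow.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical general-purpose modules; the names are illustrative only.
def set_output(state: Dict[str, float], name: str, value: float) -> None:
    state[name] = value

def scale_output(state: Dict[str, float], name: str, factor: float) -> None:
    state[name] = state.get(name, 0.0) * factor

Step = Tuple[Callable[[Dict[str, float], str, float], None], str, float]

# One table per phase: each row names a generic module and its parameters.
PHASE_TABLES: List[List[Step]] = [
    [(set_output, "valve", 1.0), (set_output, "heater", 50.0)],   # phase 1
    [(scale_output, "heater", 0.5), (set_output, "valve", 0.0)],  # phase 2
]

def run(tables: List[List[Step]]) -> Dict[str, float]:
    state: Dict[str, float] = {}            # INIT
    for table in tables:                    # fixed phase order is the control flow
        for module, name, arg in table:     # table rows give the detailed behavior
            module(state, name, arg)
    return state                            # FINISH

final_state = run(PHASE_TABLES)
```

Changing behavior then means editing a table row rather than the modules or the phase loop.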
◆ Master/Slave
Master is potential single point of failure
[Diagram: MASTER sends a POLL to each SLAVE in turn and receives a RESPONSE; round-robin polling]
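A minimal sketch of round-robin polling, assuming each slave can be modeled as a callable that answers only when polled. The function name `poll_round_robin` and the slave names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

def poll_round_robin(slaves: Dict[str, Callable[[], Optional[int]]],
                     rounds: int) -> List[Tuple[str, Optional[int]]]:
    """The master polls every slave in a fixed order; slaves never speak unprompted."""
    log: List[Tuple[str, Optional[int]]] = []
    for _ in range(rounds):
        for name, respond in slaves.items():   # dicts preserve insertion order
            reply = respond()                   # POLL, then RESPONSE
            log.append((name, reply))           # None models a missing response
    return log

# Hypothetical slaves returning fixed readings.
example_slaves = {"door": lambda: 1, "motor": lambda: 7}
example_log = poll_round_robin(example_slaves, rounds=2)
```

All traffic originates in the one loop inside the master, which is why the master is a potential single point of failure.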
◆ Global priority
– Does NOT require a physical node to act as a queue – fully distributed implementations are commonly used!
[Diagram: NODEs sharing a logical PRIORITY QUEUE]
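One well-known fully distributed implementation of global priority is CAN-style bitwise arbitration, where every contending node transmits its identifier and a dominant 0 bit overrides a recessive 1. The sketch below simulates that idea; choosing CAN as the example is an assumption (the slide does not name a protocol), and the function name `arbitrate` is hypothetical.

```python
from typing import List

def arbitrate(pending_ids: List[int], id_bits: int = 11) -> int:
    """Return the winning identifier under wired-AND bitwise arbitration.

    Assumes pending_ids is non-empty and every identifier fits in id_bits.
    """
    contenders = set(pending_ids)
    for bit in range(id_bits - 1, -1, -1):              # most significant bit first
        bus = min((i >> bit) & 1 for i in contenders)   # any 0 (dominant) wins the bus
        # Nodes that sent 1 but observe 0 on the bus stop transmitting.
        contenders = {i for i in contenders if (i >> bit) & 1 == bus}
    (winner,) = contenders
    return winner

winner_id = arbitrate([0x65, 0x12, 0x7F])
```

No node holds the queue: the lowest identifier wins simply because every other node backs off on its own.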
◆ Intelligent Hierarchical Control (IHC)
sensors/actuators
– Use sub-levels as logical sensors & actuators to close a control loop
– Each level may itself have sub-levels
[Diagram: hierarchy of CONTROL boxes; lower levels, with their sensors (S) and actuators (A), appear to the level above as logical “S” and “A”]
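A minimal sketch of one level using its sub-level as a logical sensor and actuator, assuming a two-level position-over-speed arrangement. The class names, gains, and time step are hypothetical and chosen only so the toy simulation settles.

```python
class SpeedLoop:
    """Lower level: closes a loop around a (simulated) physical actuator and sensor."""
    def __init__(self) -> None:
        self.speed = 0.0     # stands in for the physical sensor reading
        self.target = 0.0

    def command(self, target: float) -> None:   # seen from above as a logical actuator
        self.target = target

    def read(self) -> float:                     # seen from above as a logical sensor
        return self.speed

    def step(self) -> None:
        self.speed += 0.5 * (self.target - self.speed)   # proportional action

class PositionLoop:
    """Upper level: closes its own loop using the lower level as its "S" and "A"."""
    def __init__(self, inner: SpeedLoop) -> None:
        self.inner = inner
        self.position = 0.0

    def step(self, goal: float, dt: float = 0.1) -> None:
        self.inner.command(goal - self.position)     # drive the logical actuator
        self.inner.step()
        self.position += self.inner.read() * dt      # integrate the logical sensor

def simulate(goal: float, steps: int) -> float:
    outer = PositionLoop(SpeedLoop())
    for _ in range(steps):
        outer.step(goal)
    return outer.position
```

The upper level never touches hardware directly, so a further level could be stacked on `PositionLoop` in the same way.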
◆ Federated Agents/“Blackboard”
monitor and transmit global state information for coordination
[Diagram: AGENTs, each with sensors (S) and actuators (A), sharing a “BLACKBOARD” of global state information]
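A minimal sketch of agents coordinating through shared global state, assuming each agent posts one numeric load value and decides locally from the total. The class names, the `should_shed_load` rule, and the numbers are hypothetical.

```python
from typing import Dict

class Blackboard:
    """Shared global state that every agent can post to and read from."""
    def __init__(self) -> None:
        self._state: Dict[str, float] = {}

    def post(self, key: str, value: float) -> None:
        self._state[key] = value

    def total(self) -> float:
        return sum(self._state.values())

class Agent:
    def __init__(self, name: str, board: Blackboard) -> None:
        self.name = name
        self.board = board

    def publish(self, load: float) -> None:
        self.board.post(self.name, load)        # transmit local state

    def should_shed_load(self, limit: float) -> bool:
        # Decide locally, but from the global picture rather than local data alone.
        return self.board.total() > limit

board = Blackboard()
door, motor = Agent("door", board), Agent("motor", board)
door.publish(3.0)
motor.publish(4.0)
```

Agents never call each other; all coordination passes through the blackboard.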
◆ State machine model
– This is a classic usability problem
◆ Menu-driven interface
◆ Command line interface
◆ Direct integration
◆ “Basic” middleware
– May facilitate use of COTS software components
◆ Advanced middleware
◆ Automatic safety net approach
– E.g., emergency brake, or other emergency stop system
◆ Rely on human operator to keep system safe
– But, operators are great scapegoats for the accident investigation
◆ Field data collection + engineering feedback
◆ There are non-architectural approaches as well
into other architectural views
◆ “Air Gap” security
◆ Firewall security
◆ Encrypted communication/authentication
firewall trusted zones)
◆ Non-architectural approaches include:
◆ Segregate critical subsystems and recertify only those
◆ Include access points for testing
– But it is tricky to make an API truly bulletproof
◆ Non-architectural approaches:
– In some cases this really works (e.g., keep below certain wattage for RF transmissions)
– “Certification” in that case is being sure you followed the design rules
◆ Software upgrade capability
– Cost vs. flexibility tradeoff
– Upgrades can occur between IC manufacturing and product assembly
◆ Mechanically partitioned units (e.g., socketed chips)
– Replace subsystems to accomplish upgrades/repairs
upgrade maintenance operation
– Can be difficult to accomplish inexpensively if each chip is highly integrated (and therefore expensive)
◆ Non-architectural approaches include:
◆ Replication with failover
– Active replication with hot standby failover
– Passive replication with cold standby + transaction logs for catching up
– Spare resource pool with reboot after reconfiguration
◆ Function/load shedding as replicants fail
– As units fail, capacity is reduced, but each unit can operate standalone if needed
– As units fail, different mappings are used to keep key functions running
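A minimal sketch of remapping functions onto surviving units, assuming functions are listed most important first and each unit offers a fixed number of slots. The function name `remap`, the slot model, and the unit and function names are hypothetical.

```python
from typing import Dict, List

def remap(functions: List[str], units: Dict[str, bool],
          slots_per_unit: int = 1) -> Dict[str, str]:
    """Assign functions (most important first) to the units that are still alive."""
    alive = [unit for unit, ok in units.items() if ok]
    capacity = len(alive) * slots_per_unit
    mapping: Dict[str, str] = {}
    for i, fn in enumerate(functions[:capacity]):   # functions beyond capacity are shed
        mapping[fn] = alive[i // slots_per_unit]
    return mapping

functions = ["brake", "steer", "radio"]
all_up = remap(functions, {"n1": True, "n2": True, "n3": True})
one_down = remap(functions, {"n1": False, "n2": True, "n3": True})
```

With one unit down, the key functions move to the survivors and the least important one is dropped.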
◆ Every real system has several architectural views
hardware, but can have much higher dimensionality
◆ Most times you can use any architectural combination
[Diagram: Point-to-Point Hardware (CPUs with sensors and actuators) shown beside Hierarchical Control (CONTROL boxes with logical “S” and “A”)]
◆ Some patterns are isomorphic across different architectural perspectives
used together
because they are isomorphic does not mean they aren’t all there as distinct concepts!
[Diagram: Federated Hardware (CPUs with sensors and actuators), Object Oriented Software (METHODS and DATA on an object “bus”), and Federated Control (AGENTs sharing a “blackboard”) shown side by side]
◆ Multiple architectural approaches can be combined/nested
PLUS some “objects” are implemented as distributed systems
◆ There are no exactly correct answers
– Architectural selections are not entirely independent
– Tradeoffs can occur due to combinations of patterns
◆ Businesses are systems too
◆ Where do all those “non-architectural” approaches fit?
– e.g., “we don’t have a security strategy”
– e.g., information access architecture uses an NDA in support of “security through
– e.g., “we want consumers to upgrade by throwing the old one away”
» Thus, make products non-repairable, but cheaper than repairable ones
» Perhaps if consumers encounter a bug, tell them their unit has worn out and they need to buy another one to replace it (one that will have newer software…)
◆ Most “systems” are really “systems of systems”
components (this is a traceability problem)
Note: this is a combined view, 1-D approach to architecture
◆ Functional Architecture = subsystems created by splitting “functions”
federated communication architecture (1 “box” = 1 “subsystem”)
◆ Architectural methodology (a guide to “Functional Boxology”)
– Associate secondary mission goals
– One verb per requirement
– Be sure that verbs are orthogonal
– Recurse as necessary
– Stop recursing when each box is a design team of 4 people or fewer
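A minimal sketch of the recursion stopping rule, assuming each box carries a staffing estimate. The `Box` type, the `needs_split` helper, and the team sizes in the example are hypothetical; only the "4 people or fewer" threshold comes from the methodology above.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Box:
    verb: str                    # one verb per box
    team_size: int               # hypothetical staffing estimate
    children: List["Box"] = field(default_factory=list)

def needs_split(box: Box, max_team: int = 4) -> List[str]:
    """Return the leaf boxes whose design team is still larger than max_team."""
    if box.children:
        return [verb for child in box.children
                for verb in needs_split(child, max_team)]
    return [box.verb] if box.team_size > max_team else []

# Illustrative staffing numbers, not taken from the slides.
elevator = Box("PROVIDE PASSAGE", 13, [
    Box("MOVE", 4),
    Box("ENSURE SAFETY", 6),
    Box("DISPATCH", 3),
])
to_split = needs_split(elevator)
```

Any verb the helper returns is a box that should be decomposed one more level.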
Example Functional Architecture for Elevator
Primary Mission: Provide safe, timely, comfortable passage between floors.
Secondary Missions: Deliver Passengers Quickly; Inform Users; Conform To Building Codes; Protect Passengers; Support Customized Behavior; Provide Tranquil Environment; Support Maintenance
Top-Level Functions: MOVE; ENSURE SAFETY; CONTROL ACCESS; INFORM USERS; DISPATCH; DETERMINE PASSENGER INTENT; SET MODES
CONTROL ACCESS: CONTROL CAR ACCESS; CONTROL HOISTWAY ACCESS; DEAL WITH DOOR OBSTRUCTIONS; LOAD; CLOSE; REOPEN FOR LOADING; REOPEN FOR UNLOADING; UNLOAD; OPEN HOISTWAY FOR MAINTENANCE; REVERSE DOOR; SET DWELL TIME; PROVIDE PASSENGER PROMPTING; DISPLAY DESTINATION FOR LOADING; DISPLAY FLOOR FOR UNLOADING
SET MODES: UP-PEAK; DOWN-PEAK; "NORMAL"; FIRE RECALL; FIRE OPERATION
ENSURE SAFETY: MONITOR SAFETY; ALARM; ENTER SAFE MODE (SHUTDOWN); HALL DOORS CLOSED; VELOCITY; HOISTWAY LIMITS; DOORWAY OBSTRUCTION; CAR DOORS CLOSED; NEAR HOISTWAY LIMITS; DOORWAY NOT CLEAR; OVERSPEED; HIGH VELOCITY; PERSISTENT DOOR BLOCK; HALL DOOR OPEN; CAR DOOR OPEN; TRAPPED PASSENGER
MOVE: FOLLOW ACCELERATION PROFILE; LEVEL WITH TARGET FLOOR; SPEED UP; SLOW DOWN; STOP; LEVEL; RE-LEVEL; MAX. SPEED
DETERMINE PASSENGER INTENT: DETERMINE INITIAL DESTINATION; DETERMINE FINAL DESTINATION; CORRECT MISTAKEN INTENT; DETERMINE START FLOOR; DETERMINE # TO ENTER CAR; DETERMINE INTENDED DESTINATION; TOO MANY ON/OFF; TOO FEW ON/OFF; DETERMINE DESTINATION FLOOR; DETERMINE # TO EXIT CAR; PASSENGER CHANGES MIND; DETECT MISCHIEF
DISPATCH: ESTIMATE PASSENGER LOCATIONS; TRACK REQUIRED STOPS; COMPUTE NEXT STOP FLOOR & DIRECTION; ESTIMATE FLOOR POPULATIONS; ESTIMATE EXPECTED NEAR-TERM NEW CALLS; ESTIMATE IN-CAR POPULATION; PLAN OPTIMAL PATH; DETERMINE "GOING UP/DOWN"; WHICH FLOOR STOPS/DIRECTIONS; WHICH CAR STOPS; DETERMINE NEXT FLOOR
INFORM USERS: INFORM PASSENGERS; INFORM BUILDING MANAGERS; INFORM MAINTAINERS; ESTIMATE TIME TO CAR ARRIVAL; ESTIMATE TIME LEFT TO RIDE; REASSURE PASSENGER PICKUP WILL HAPPEN; DISPLAY EFFICIENCY; DISPLAY OPERATIONAL STATUS; PROVIDE INFORMATION FOR OTHER BUILDING SUBSYSTEMS; DIAGNOSIS; PROGNOSIS; SELF-TEST; REASSURE PASSENGER DROPOFF WILL HAPPEN; TIME FOR PERIODIC MAINTENANCE
RoSES = Robust Self-configuring Embedded Systems
◆ Research Context:
fine grain distributed embedded systems
◆ Research vision:
Product families + auto-reconfiguration =
◆ Potential Impact:
◆ What we’re really learning is where all the difficult research issues are!
[Diagram labels: System Variables/Network; Smart Sensors/Actuators; Basic S/A Device; Local CPU & Memory; Baseline Sensor SW Functionality; Dynamic Interface to Object Bus; Customization Manager; Adapter Repository; SW Adapter for High Level Logical Interface; SW Compute/Control Functions]
Some Specification & Evaluation Research Issues
– Problem: given fixed resources, how do you maximize utility?
– What baseline set of components gives most reconfiguration flexibility?
– Product family architecture specification
– Specification of utility for different features & feature sets
– When/how to determine HW/SW/Mechanical/Business tradeoffs
– Is a system really “working” when it is partially disabled?
– Safety/certification of component-based systems with many failure modes
– Many real embedded systems have global modes that break design methods
» Do you do a distinct system design for each mode and merge?
– Many real systems are hybrid discrete+continuous
– Software runtime infrastructure (Jini was a poor fit to an embedded network)
– Real-time scheduling for distributed networked systems
– Security of embedded+enterprise combined system
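The first question in this list (fixed resources, maximum utility) can be posed in its simplest form as a 0/1 knapsack problem. The sketch below is that textbook formulation only, assuming a single integer resource and independent, additive utilities; it is not the RoSES approach, and the feature names, costs, and utilities are hypothetical.

```python
from typing import List, Tuple

def best_feature_set(features: List[Tuple[str, int, int]],
                     budget: int) -> Tuple[int, List[str]]:
    """features holds (name, cost, utility); returns (total utility, chosen names)."""
    # best[b] is the best (utility, names) achievable with at most b resource units.
    best: List[Tuple[int, List[str]]] = [(0, [])] * (budget + 1)
    for name, cost, utility in features:
        for b in range(budget, cost - 1, -1):   # descending, so each feature is used once
            candidate = best[b - cost][0] + utility
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - cost][1] + [name])
    return best[budget]

# Illustrative numbers only.
example_features = [("door_chime", 2, 3), ("voice", 3, 4), ("display", 4, 5)]
example_choice = best_feature_set(example_features, budget=5)
```

Real systems break the assumptions quickly (several resource types, utilities that depend on feature combinations), which is part of why this is listed as a research issue.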
◆ How do we know which architecture to use and when?
degradation in the abstract?
there is more than just one possibility
◆ Can system architects be trained, or must they be born?
– Or is good enough really good enough?
◆ System Architecture via patterns for multiple system views
– Hardware + Software + Communication + Control + others
– Be constrained to a 1-D/low-D decomposition (e.g., functional architecture) vs.
– Deal with allocation incompatibilities when fusing a many-D decomposition
– System-level tradeoffs between mechanical, HW, SW, and other implementation methods are common
– Existence of non-architectural options means some tradeoffs happen between technical and business/non-technical system layers!
◆ Functional architecture: yes, there is a multi-view recipe!