HW/SW Codesign w/ FPGAs Embedded Systems ECE 495/595 Overview



SLIDE 1

HW/SW Codesign w/ FPGAs Embedded Systems ECE 495/595 ECE UNM (2/3/09)

Overview (Slides from Embedded Systems Design, F. Vahid and T. Givargis)

Embedded computing systems definition:

  • Computing systems embedded within electronic devices.
  • Hard to define -- nearly any computing system other than a desktop computer.
  • Billions of units produced yearly, versus millions of desktop units.
  • Perhaps 50 per household and per automobile.

Characteristics of Embedded Systems

  • Single-functioned

Executes a single program, repeatedly.

  • Tightly-constrained

Low cost, low power, small, fast, etc.

  • Reactive and real-time

Continually reacts to changes in the system’s environment. Must compute certain results in real-time without delay. For example, a car’s cruise controller must monitor and react to speed and brake sensors.

SLIDE 2

Embedded System Example

A digital camera chip: single-functioned, tightly-constrained (low cost, low power, small), not real-time.

[Figure: block diagram of a digital camera chip. Blocks: lens, CCD, A/D, CCD preprocessor, pixel coprocessor, JPEG codec, microcontroller, Mult/Accum, DMA controller, Mem. controller, Display Ctrl, LCD Ctrl, UART, ISA bus interface.]

SLIDE 3

Embedded System Challenges

The design challenge for an engineer is to simultaneously optimize a set of possibly conflicting design metrics. A design metric is a measurable feature of a system's implementation:

  • NRE cost (Non-Recurring Engineering cost): the one-time monetary cost of designing the system.
  • Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost.
  • Size: the physical space required by the system, e.g., bytes and gates.
  • Performance: the execution time or throughput of the system.
  • Power: the amount of power consumed by the system, which determines battery life and cooling requirements.
  • Flexibility: the ability to change the functionality of the system without incurring heavy NRE cost.
  • Time-to-prototype: the time needed to build a working version of the system.
  • Time-to-market: the time required to develop a system to the point that it can be released and sold to customers.
  • Maintainability: the ability to modify the system after its initial release (possibly by others).
  • Correctness, safety, etc.
SLIDE 4

Embedded System Challenges

Improving one design metric may worsen others.

[Figure: competing design metrics: power, size, NRE cost, performance.]

Designers must be experts in both software and hardware. They must be comfortable with various technologies (and with moving between them) in order to choose the best one for a given application and its constraints.

Time-to-Market Design Metric

Market window: the period during which the product would have its highest sales.

Challenge: Growing system complexities, driven by increased IC capacities, require designers to do more in less time.

SLIDE 5

Embedded System Design Metrics

Time-to-Market Design Metric

Delays can be costly:

% revenue lost = ((On-time - Delayed) / On-time) * 100

  • On-time = 1/2 * 2W * W = W^2 (assumes market rise is at 45 degree angle)
  • Delayed = 1/2 * (W - D + W) * (W - D)
  • Percentage revenue loss = (D(3W - D) / (2W^2)) * 100%

E.g.: Lifetime 2W = 52 wks, delay D = 10 wks -> (10 * (3*26 - 10) / (2 * 26^2)) * 100% = 50%!

Simplified revenue model: the product life is 2W, with peak revenue at W. The market rises until W and falls afterward. The area of each triangle equals revenue, and the area difference between the on-time and delayed triangles is the revenue loss.

[Figure: revenue versus time, showing the market rise and market fall, the on-time entry triangle (peak revenue at W) and the smaller delayed entry triangle (entry at D, lower peak revenue).]
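The revenue-loss formula above can be checked numerically. The following is a minimal Python sketch of the simplified triangular model, not part of the original slides; the function and parameter names (`revenue_loss_percent`, `half_life_w`, `delay_d`) are illustrative, and the model is only meaningful for 0 <= D <= W.

```python
def revenue_loss_percent(half_life_w: float, delay_d: float) -> float:
    """Percentage of revenue lost under the triangular market model.

    half_life_w: W, half of the product lifetime (the market peaks at W).
    delay_d:     D, the delay in market entry (same time unit as W).
    """
    if not 0 <= delay_d <= half_life_w:
        raise ValueError("the model assumes 0 <= D <= W")
    # Area of the on-time triangle: base 2W, height W.
    on_time = 0.5 * (2 * half_life_w) * half_life_w
    # Area of the delayed triangle: base (W - D + W), height (W - D).
    delayed = 0.5 * (half_life_w - delay_d + half_life_w) * (half_life_w - delay_d)
    return (on_time - delayed) / on_time * 100


if __name__ == "__main__":
    # Slide example: lifetime 2W = 52 weeks, delay D = 10 weeks.
    print(f"{revenue_loss_percent(26, 10):.1f}% of revenue lost")
```

Computing the two triangle areas directly, rather than using the closed form D(3W - D) / (2W^2), keeps the sketch tied to the geometric argument on the slide; both routes give about 50% for the example.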

SLIDE 6

Embedded System Design Metrics

NRE and Unit Cost Metrics:

  • Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost.
  • NRE cost: the one-time monetary cost of designing the system.

Total cost = NRE cost + unit cost * num_units

Per-product cost = total cost / num_units = (NRE cost / num_units) + unit cost

Example: NRE = $2000, unit = $100. For 10 units:

Total cost = $2000 + 10*$100 = $3000

Per-product cost = $2000/10 + $100 = $300

Amortizing NRE cost over the units results in an additional $200 per unit!

When comparing technologies by cost, the best option depends on quantity:

  • Technology A: NRE = $2,000, unit = $100
  • Technology B: NRE = $30,000, unit = $30
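To see how the best option depends on quantity, the cost formulas above can be evaluated for Technology A and Technology B. This is a small Python sketch; the helper names `per_product_cost` and `break_even_units` are illustrative, and the break-even volume is derived here from the slide's numbers rather than stated on the slide.

```python
def per_product_cost(nre: float, unit: float, num_units: int) -> float:
    """Per-product cost = (NRE cost / num_units) + unit cost."""
    if num_units <= 0:
        raise ValueError("num_units must be positive")
    return nre / num_units + unit


def break_even_units(nre_a: float, unit_a: float,
                     nre_b: float, unit_b: float) -> float:
    """Volume at which both technologies have the same total cost.

    Solves nre_a + unit_a * n == nre_b + unit_b * n for n.
    Assumes the unit costs differ.
    """
    return (nre_b - nre_a) / (unit_a - unit_b)


if __name__ == "__main__":
    # Slide example: NRE = $2000, unit = $100, 10 units.
    print(per_product_cost(2000, 100, 10))
    # Technology A (low NRE) versus Technology B (low unit cost).
    n = break_even_units(2000, 100, 30000, 30)
    print(f"A is cheaper below {n:.0f} units, B is cheaper above")
```

Under these numbers the two technologies cost the same per product at 400 units; below that volume the low-NRE option wins, and above it the low-unit-cost option wins.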

SLIDE 7

Embedded System Design Metrics

NRE and Unit Cost Metrics:

[Figure: per-product cost ($0 to $160) versus number of units (volume, 800 to 2400), comparing a low-NRE technology with a low-unit-cost technology. Per-product cost approaches unit cost at high volumes.]

So both per-product cost and time-to-market must be considered to determine revenue impact.

Performance Metric:

  • Clock frequency, instructions per second

Commonly used, but also the most widely abused, measure. For example, the user of a digital camera cares about how fast it processes images, not about the clock speed or instructions per second.

SLIDE 8

Embedded System Design Metrics

Performance Metric: The two main measures of performance are:

  • Latency (response time)

Time between task start and end, e.g., a camera can process an image in 0.25 seconds.

  • Throughput

Tasks processed per second.

Note that throughput is NOT always 1/latency, because of pipelining. A camera with a latency of 0.25 seconds may still be able to process 8 images/sec. by capturing a new image while the previous image is being stored.

Three Key Embedded System Technologies

Technology defined: A manner of accomplishing a task, especially using technical processes, methods, or knowledge.
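Returning to the latency and throughput measures on this slide, the camera numbers can be reproduced with a small Python sketch. The split of the 0.25-second latency into two equal stages (capture, then store) is an assumption made for illustration; the slides give only the total latency and the 8 images/sec. figure.

```python
def throughput_unpipelined(latency_s: float) -> float:
    """Tasks per second when each task must finish before the next starts."""
    return 1.0 / latency_s


def throughput_pipelined(stage_times_s: list[float]) -> float:
    """Tasks per second when stages overlap: limited by the slowest stage."""
    return 1.0 / max(stage_times_s)


if __name__ == "__main__":
    # Hypothetical stage split: capture 0.125 s, store 0.125 s.
    stages = [0.125, 0.125]
    latency = sum(stages)  # still 0.25 s per image
    print(latency, throughput_unpipelined(latency), throughput_pipelined(stages))
```

Pipelining leaves the latency of each image unchanged at 0.25 seconds, but because a new image enters the capture stage while the previous one is being stored, the throughput doubles from 4 to 8 images per second under the assumed stage split.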

SLIDE 9

Embedded System Technologies

Three key technologies for embedded systems:

  • Processor technology
  • IC technology
  • Design technology

Processor Technology

Relates to the architecture of the computation engine used to implement the system's functionality.

The processor does not have to be programmable. Other non-programmable digital systems can be considered processors as well.

Processors can be specialized for implementing a specific function, such as image processing or compression.

A system may be composed of a collection of specialized processors to optimize design metrics for the application, e.g., a digital camera.

SLIDE 10

Embedded System Technologies

Processor Technology

General-Purpose Processors: Program memory and datapath are generic -- can execute any program.

[Figure: three implementations of the same loop (total = 0; for i = 1 to ...). General-purpose: controller (control logic and state regs., IR, PC), program memory holding the assembly code, datapath (reg. file, general ALU), data memory. Application-specific: the same organization with a custom ALU. Single-purpose: controller (control logic, state regs.), datapath (index, total, adder), data memory.]

SLIDE 11

Embedded System Technologies

Processor Technology

General-Purpose Processors:

The datapath typically has a large register file and one or more general-purpose ALUs.

The embedded system designer does NOT need to be concerned with the design of the processor; the designer simply loads a program into memory.

Benefits: time-to-market is short, flexibility is high, unit cost can be low in small quantities (the processor's NRE cost is distributed over other customers), and performance is high when using cutting-edge technologies.

Drawbacks: unit cost can be high in large quantities (at high volume, the NRE cost of a custom processor is amortized over many units and can yield a lower per-unit cost), performance can be low for certain apps, and size/power can be large because of unneeded features.

SLIDE 12

Embedded System Technologies

Processor Technology

Single-Purpose Processors:

Hardware designed to execute exactly one program using a custom digital circuit -- commonly referred to as a coprocessor, accelerator or peripheral.

From the camera example, all components except for the microcontroller are single-purpose processors. The JPEG codec compresses and decompresses video frames.

Features: the circuit contains only the components needed to execute a single program, and no program memory is required.

Benefits: unit cost may be low in large quantities, performance can be high, and size/power can be small (the inverse of a GPP).

Drawbacks: design time and NRE cost may be high, flexibility is low, and unit cost is high for small quantities.

SLIDE 13

Embedded System Technologies

Processor Technology

Application-Specific Processors:

An ASIP is a programmable processor optimized for a particular class of applications -- a compromise between GPPs and single-purpose processors.

Examples include microcontrollers and digital signal processors. Microcontrollers are optimized for embedded control apps, where monitoring and setting of numerous single-bit control signals is common.

The datapath is optimized for the application class by adding special functional units for common operations and eliminating infrequently used units.

Benefits: flexibility can be high while achieving good performance, power and size.

Drawbacks: NRE costs can be high in both circuit and compiler design, and there is some inefficiency because of the features needed to support reprogrammability.

SLIDE 14

Embedded System Technologies

IC Technology

The manner in which a digital (gate-level) implementation is mapped onto a chip.

IC technology is independent of processor technology, i.e., any type of processor can be mapped to any type of IC technology.

A chip is fabricated using a sequence of processing steps, with transistors fabricated in the substrate and metal wires above.

[Figure: processor technologies (GPP, ASIP, single-purpose processor) and IC technologies (PLD, semi-custom, full-custom) placed along a trade-off axis. One end is labeled with flexibility, maintainability, NRE cost, time-to-market and cost (low volume); the other with power efficiency, performance, size and cost (high volume).]

SLIDE 15

Embedded System Technologies

IC Technology: Full Custom/VLSI

In full custom design, all layers of chip fabrication are optimized. Transistors are placed to minimize wire lengths, and are sized to optimize delay characteristics.

Once a chip layout is completed, the mask specification is sent off to be fabricated.

VLSI chip design has a very large NRE cost and long turnaround times (measured in months).

VLSI chip design yields excellent performance, and small power and size. It is usually used only in high-volume or performance-critical applications.

Our VLSI course describes this process in detail. Very sophisticated tools exist to enable layout and simulation.

SLIDE 16

Embedded System Technologies

IC Technology: Semi-custom ASIC (gate array and standard cell)

For ASICs, the lower layers are fully or partially built, leaving the upper layers to be customized for the application.

For gate array ASICs, the masks for the transistor and gate levels are already built -- the chip already consists of arrays of gates. All that remains is to define the interconnections of the gates.

For standard cell ASICs, a set of logic-level cells, e.g., NAND and NOR, is hand designed. Place and route tools automatically configure the gates and interconnect to implement the design.

ASICs are the most popular IC technology. ASICs provide good performance and size, with lower NRE cost than full custom.

Drawback: they still require weeks to months to manufacture.

SLIDE 17

Embedded System Technologies

IC Technology: PLD

For programmable logic devices, all layers already exist.

PLDs can be partitioned into two categories, simple and complex.

  • PLAs, PALs and GALs are simple. They are programmed by configuring AND and/or OR gates.
  • CPLDs and FPGAs are complex and implement a more sophisticated connection scheme between gates.

PLDs offer a low NRE cost, are instantly available and are excellent for prototyping.

Drawbacks: they are larger than ASICs, have higher unit cost, may consume more power and have lower performance.

SLIDE 18

Embedded System Technologies

Design Technology:

Refers to the methods used to convert our concept of desired system functionality into an implementation.

We must be able to implement the design quickly, and the implementation must optimize design metrics.

A key objective of improving design technology is to enhance the productivity of the designer to keep pace with Moore's law: the number of transistors on an IC doubles every 18 months.

Top-down design process: the system is refined through several abstraction levels (system spec., behavior spec., RTL spec., logic spec.).

Three approaches to improving the design process for increased productivity:

  • Compilation/synthesis
  • Libraries/IP
  • Test/verification

SLIDE 19

Embedded System Technologies

Design Technology: Compilation/Synthesis:

Desired functionality is described in an abstract manner and lower-level implementation details are automatically generated. Removal of the details improves productivity by as much as an order of magnitude.

  • Logic synthesis: converts Boolean expressions into a connection of logic gates (netlist).
  • RTL synthesis: converts FSMs and register transfers into a datapath of RT components and a controller of Boolean equations.
  • Behavioral synthesis: converts a sequential program into FSMs and register transfers.
  • Compiler: converts a sequential program to assembly code (essentially register-transfer-level code).
  • System synthesis: converts an abstract system spec. into a set of sequential programs on general- and single-purpose processors.

SLIDE 20

Embedded System Technologies

Design Technology: Libraries/IP:

Libraries allow reuse of pre-existing implementations, and therefore improve productivity. IP cores are a good example.

  • Logic libraries: consist of layouts of gates and cells.
  • RT-library: consists of layouts of registers, MUXs, decoders and functional units.
  • Behavioral library: consists of compression components, bus interfaces, display controllers and GPUs.
  • System-level library: consists of complex systems solving particular problems, such as an interconnection of processors with OS and programs to implement an ethernet protocol.

SLIDE 21

Embedded System Technologies

Design Technology: Test/Verification:

Ensures that functionality is correct, in an effort to avoid time-consuming debugging and iterations from low abstractions to higher levels.

Simulation is the most common form of verification. Simulation exists at every abstraction level, from gate-level simulators, to GPP simulators that execute machine code, to cosimulators that connect HDL and GPP simulators.

At the system level, model simulators simulate the initial system spec. using an abstract computation model, independent of processor tech.

Also, model checkers verify certain properties of the specification, e.g., that certain simultaneous conditions never occur or that the system does not deadlock.

SLIDE 22

Embedded System Technologies

Design Technology:

Standards also serve to improve productivity by establishing well-defined methods and interfaces. They exist at the language, synthesis and library levels.

Languages improve productivity by allowing designers to specify functionality with minimal effort.

Frameworks improve productivity by providing a software environment for the application of numerous tools and for version management.

[Figure: design productivity in transistors/designer-month (K), log scale from 0.01 to 100,000, versus year (1981, 1995, 2010).]

SLIDE 23

Embedded System Technologies

The combination of compilation/synthesis, libraries/IP, test/verification, standards, languages and frameworks has increased productivity dramatically.

  • In 1981, a designer could produce about 100 transistors per month.
  • In 2002, a designer could produce about 5,000 transistors per month.

However, productivity has not kept pace with IC capacity.

[Figure: productivity in transistors/designer-month (K), log scale from 0.01 to 100,000, and IC capacity in transistors per chip (M), log scale from 0.001 to 10,000, versus year (1981, 1995, 2010). The widening space between the two curves is labeled GAP.]

SLIDE 24

Embedded System Technologies

GAP denotes the growing design productivity gap.

For example, in 1981, a leading-edge chip required about 100 designer-months to design:

100 designer-months * 100 transistors/designer-month = 10,000 transistors.

In 2002, a leading-edge chip required about 30,000 designer-months!

30,000 designer-months * 5,000 transistors/designer-month = 150,000,000 transistors.

So the design productivity gap increased the effort needed to design a chip from 100 to 30,000 designer-months.

Assuming a designer costs about $10,000 per month, the cost has risen from $1,000,000 to $300,000,000!

Few products can afford this, so most designs do not even come close to using the potential chip capacity.
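The arithmetic behind the productivity gap can be laid out explicitly. The Python sketch below uses only the figures quoted on this slide; the function names and the `ChipProject` record are illustrative and not part of the original material.

```python
from dataclasses import dataclass

COST_PER_DESIGNER_MONTH = 10_000  # dollars, as assumed on the slide


@dataclass
class ChipProject:
    year: int
    designer_months: int
    transistors_per_designer_month: int

    def transistors(self) -> int:
        """Chip size implied by effort times productivity."""
        return self.designer_months * self.transistors_per_designer_month

    def design_cost(self) -> int:
        """Design cost in dollars at the assumed monthly rate."""
        return self.designer_months * COST_PER_DESIGNER_MONTH


if __name__ == "__main__":
    old = ChipProject(1981, 100, 100)
    new = ChipProject(2002, 30_000, 5_000)
    for p in (old, new):
        print(p.year, f"{p.transistors():,} transistors", f"${p.design_cost():,}")
    # Capacity grew far faster than productivity; effort absorbs the difference.
    print(new.transistors() // old.transistors(),
          new.transistors_per_designer_month // old.transistors_per_designer_month,
          new.designer_months // old.designer_months)
```

With these numbers, chip size grows by a factor of 15,000 while productivity grows only by a factor of 50, so the required effort (and hence cost) grows by the remaining factor of 300.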

SLIDE 25

Embedded System Technologies

The situation is even worse, however, because this assumes that productivity is independent of project team size.

Unfortunately, adding designers to a project team actually decreases productivity. This is true because the complexity of having more designers work together grows, and this slows their individual productivity.

"Too many cooks" can actually lengthen completion time.

[Figure: team productivity (transistors/month, 10,000 to 60,000) versus number of designers (10 to 40), annotated with months to completion: 43, 24, 19, 16, 15, 16, 18, 23. Individual designer productivity decreases as more designers are added.]
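The "too many cooks" effect can be illustrated with a simple model. The Python sketch below is built on assumptions that the slide does not state: a 1,000,000-transistor design, a lone designer producing 5,000 transistors/month, and each additional designer lowering every designer's productivity by 100 transistors/month. These hypothetical parameters were chosen because, at team sizes of 5, 10, ..., 40, they give rounded completion times that match the months-to-completion values annotated in the figure.

```python
# All three constants are illustrative assumptions, not values from the slide.
DESIGN_SIZE = 1_000_000      # transistors in the design
SOLO_PRODUCTIVITY = 5_000    # transistors/month for a single designer
PENALTY_PER_DESIGNER = 100   # productivity lost per additional team member


def individual_productivity(num_designers: int) -> int:
    """Per-designer productivity, shrinking as the team grows."""
    return SOLO_PRODUCTIVITY - PENALTY_PER_DESIGNER * (num_designers - 1)


def team_productivity(num_designers: int) -> int:
    """Total transistors/month produced by the whole team."""
    return num_designers * individual_productivity(num_designers)


def months_to_completion(num_designers: int) -> float:
    """Time to finish the design at the team's combined rate."""
    return DESIGN_SIZE / team_productivity(num_designers)


if __name__ == "__main__":
    for n in range(5, 45, 5):
        print(n, team_productivity(n), round(months_to_completion(n)))
```

Under this model, team productivity peaks at around 25 designers; beyond that, each new designer costs the team more than that designer contributes, and completion time grows again.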

SLIDE 26

The Codesign Ladder

The recent maturation of synthesis enables a unified view of hardware and software:

  • Choosing between hardware and software for a particular function is simply a tradeoff among various design metrics.
  • There is no fundamental difference between what hardware and software can implement.

[Figure: the codesign ladder. Both sides start from sequential program code (e.g., C, VHDL). Software side: compilers (60's and 70's) produce assembly instructions; assemblers and linkers (50's and 60's) produce machine instructions; the implementation is a microprocessor plus program bits ("software"). Hardware side: behavioral synthesis (90's) produces register transfers; RT synthesis (80's and 90's) produces logic equations/FSMs, which are mapped to logic gates; the implementation is VLSI, ASIC or PLD ("hardware").]