ECE232: Hardware Organization and Design, Lecture 29: Computer Input/Output



SLIDE 1

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB

ECE232: Hardware Organization and Design

Lecture 29: Computer Input/Output

SLIDE 2

ECE232: Computer I/O 2

Announcements

  • ECE Honors Exhibition
      • Wednesday, April 30
      • 3:00-4:00 PM
      • M5
  • SDP Demo Day
      • 10AM-2PM, Friday, April 25
      • Gunness Student Center
  • ECE Picnic (tickets in ECE office/Eliza)
      • 3-7PM, Friday, April 25
      • Hadley Young Men’s Club
  • ECE Banquet (tickets in ECE office/Eliza)
      • 6-9PM, Friday, May 2
      • Courtyard Marriott, Hadley
SLIDE 3

Overview

  • Input and output are fundamental for computer operation
      • Typically much slower than computation
  • Two types of transfer
      • Polling – processor constantly checks for data
      • Interrupts – processor is interrupted from activity
  • Need to understand the requirements of data transfer
      • Tied to computer organization (bus, interfaces, etc.)
      • I/O bandwidth is important (how fast, how much)
  • Most interfaces today are standardized (USB, monitor, Ethernet)
SLIDE 4

Anatomy: 5 components of any Computer

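[Figure: the components of a computer: processor (control and datapath), memory, and devices for input (keyboard, mouse), output (display, printer), and disk.]

[Figure: processor and cache connect over the memory-I/O bus to main memory and to I/O controllers for disks, graphics, and network; the I/O controllers signal the processor with interrupts.]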

SLIDE 5

Handling IO

  • Users like to connect devices to their computers
      • Keyboard, mouse, printer…
  • External devices may require attention from the processor at unpredictable times
      • CPU doesn’t know when you’re about to hit a key
  • I/O devices can be very fast or very slow
  • Need to have a flexible way to control all devices
SLIDE 6

I/O Device Examples and Speeds

  • I/O speed: data transferred per second (from mouse to display: million-to-1)

Device            Behavior  Partner  Data Rate (Mbit/sec)
Keyboard          Input     Human    0.0001
Mouse             Input     Human    0.0038
Laser Printer     Output    Human    3.2000
Magnetic Disk     Storage   Machine  240-2560
Modem             I or O    Machine  0.016-0.064
Network-LAN       I or O    Machine  100-1000
Graphics Display  Output    Human    800-8000

SLIDE 7

Hardware Solution (875 Chipset)

[Figure: block diagram of the 875 chipset]

  • Pentium 4 processor connects over the system bus (800 MHz, 6.4 GB/sec) to the memory controller hub (north bridge, 82875P)
  • North bridge links: main memory DIMMs over two DDR 400 channels (3.2 GB/sec each), graphics output over AGP 8X (2.1 GB/sec), 1 Gbit Ethernet over CSA (0.266 GB/sec)
  • North bridge connects at 266 MB/sec to the I/O controller hub (south bridge, 82801EB)
  • South bridge links: two Serial ATA disks (150 MB/sec each), two Parallel ATA channels (100 MB/sec each) for CD/DVD and tape, AC/97 stereo (surround-sound) audio (1 MB/sec), USB 2.0 (60 MB/sec), 10/100 Mbit Ethernet (20 MB/sec), PCI bus (132 MB/sec)

SLIDE 8

Disk Device Terminology

  • Several platters, with information recorded magnetically on both surfaces (usually)
  • Bits recorded in tracks, which in turn are divided into sectors (e.g., 512 Bytes)
  • Actuator moves head (end of arm, 1/surface) over track (“seek”), selects surface, waits for sector to rotate under head, then reads or writes
  • “Cylinder”: all tracks under heads

[Figure: disk drive with labels Platter, Outer Track, Inner Track, Sector, Head, Arm, Actuator]

SLIDE 9

Disk Device Performance

  • Disk Latency = Seek Time + Rotation Time + Transfer Time + Controller Overhead
  • Seek Time - depends on no. of tracks the arm moves, seek speed
      • Average no. of tracks the arm moves?
      • Sum all possible seek distances from all possible tracks / total #
      • Assumes seek distances are random
      • Disk industry standard benchmark
  • Rotation Time - depends on rotation speed, how far sector is from head
      • Average: 1/2 the time of one rotation
      • Example: 7200 Revolutions Per Minute => 120 Rev/sec
      • 1 revolution = 1/120 sec => 8.33 milliseconds
      • 1/2 rotation (revolution) => 4.17 ms
  • Transfer Time - depends on data rate (bandwidth) of disk (bit density), size of request
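The two averages on this slide can be checked numerically. The following is a minimal Python sketch, not part of the original lecture; the function names and the 300-track disk are assumptions chosen only for illustration.

```python
def avg_rotational_delay_ms(rpm):
    """Average rotational delay: half of one full rotation, in milliseconds."""
    ms_per_rotation = 60_000 / rpm
    return ms_per_rotation / 2


def avg_seek_distance(num_tracks):
    """Average |start - target| over every (start, target) pair of tracks."""
    total = sum(abs(start - target)
                for start in range(num_tracks)
                for target in range(num_tracks))
    return total / (num_tracks * num_tracks)


print(round(avg_rotational_delay_ms(7200), 2))  # 4.17
print(avg_seek_distance(300))                   # close to 300 / 3
```

Under the random-seek assumption, the brute-force average comes out at roughly one third of the track count.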

SLIDE 10

Disk Performance Model /Trends

  • Capacity
      • + 100%/year (2X/1 yr)
  • Transfer rate (BW)
      • + 40%/year (2X/2 yrs)
  • Rotation + Seek time
      • – 8%/year (1/2 in 10 yrs)
  • MB/$
      • > 100%/yr (2X/<1.5 yr)
SLIDE 11

Disk Performance

  • Calculate time to read 1 sector (512B) for UltraStar 72 using advertised performance; sector is on outer track
  • Disk latency = average seek time + average rotational delay + transfer time + controller overhead

      = 5.3 ms + 0.5 rotation / (10000 RPM) + 0.5 KB / (50 MB/s) + 0.15 ms
      = 5.3 ms + 3.0 ms + 0.01 ms + 0.15 ms
      = 8.46 ms
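The calculation above can be reproduced with a short Python sketch. The function and parameter names are assumptions for illustration; the numbers are the ones on the slide, and 1 MB is taken as 1000 KB to match the slide's 0.01 ms transfer time.

```python
def disk_latency_ms(seek_ms, rpm, request_kb, transfer_mb_per_s, controller_ms):
    """Seek + half a rotation + transfer + controller overhead, in milliseconds."""
    rotational_ms = 0.5 * 60_000 / rpm            # half of one rotation
    transfer_ms = request_kb / transfer_mb_per_s  # KB / (MB/s) = ms when 1 MB = 1000 KB
    return seek_ms + rotational_ms + transfer_ms + controller_ms


latency = disk_latency_ms(seek_ms=5.3, rpm=10_000, request_kb=0.5,
                          transfer_mb_per_s=50, controller_ms=0.15)
print(round(latency, 2))  # 8.46
```

Seek and rotation account for nearly all of the latency; the transfer of a single sector is negligible.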

SLIDE 12

Instruction Set Architecture for I/O

  • Some machines have special input and output instructions
  • Alternative model (used by MIPS):
      • Input: ~ reads a sequence of bytes
      • Output: ~ writes a sequence of bytes
      • Memory also a sequence of bytes, so use loads for input, stores for output
  • Called “Memory Mapped Input/Output”
  • A portion of the address space dedicated to communication paths to Input or Output devices (no memory there)
  • These addresses are not regular memory; instead, they correspond to registers in I/O devices

[Figure: address space with the region 0xFFFF0000 to 0xFFFFFFFF marked, containing a cmd reg. and a data reg.]
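To make the idea concrete, here is a small Python simulation of address decoding. It is a sketch under stated assumptions: the `Bus` and `Device` classes and the exact register addresses are hypothetical, and only the 0xFFFF0000 base comes from the slide.

```python
IO_BASE = 0xFFFF0000      # start of the I/O region shown on the slide
CMD_REG = IO_BASE         # hypothetical address of the command register
DATA_REG = IO_BASE + 4    # hypothetical address of the data register


class Device:
    """Toy device with one command register and one data register."""
    def __init__(self):
        self.cmd = 0
        self.data = 0


class Bus:
    """Routes each load/store to RAM or to a device register by address."""
    def __init__(self, device):
        self.ram = {}
        self.device = device

    def store(self, addr, value):
        if addr == CMD_REG:
            self.device.cmd = value
        elif addr == DATA_REG:
            self.device.data = value
        elif addr >= IO_BASE:
            raise ValueError("no device register at this address")
        else:
            self.ram[addr] = value

    def load(self, addr):
        if addr == CMD_REG:
            return self.device.cmd
        if addr == DATA_REG:
            return self.device.data
        if addr >= IO_BASE:
            raise ValueError("no device register at this address")
        return self.ram.get(addr, 0)


bus = Bus(Device())
bus.store(0x1000, 7)            # ordinary store: lands in RAM
bus.store(DATA_REG, ord("A"))   # same operation, but it reaches the device
```

The program issues the same store in both cases; only the address decides whether memory or a device register is written.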

SLIDE 13

Memory Mapped IO

  • Make control registers and I/O device data registers appear to be part of the system’s main memory
      • Reads and writes to the mapped region of memory are translated by memory controller hardware into accesses of the hardware device
      • Makes it easy to support variable numbers/types of devices – just map them onto different regions of memory
  • Accessing I/O device registers and memory can be done by accessing data structures via device pointers
      • Most device drivers are now written in C/C++. Memory mapped I/O makes this feasible without any changes to the way a CPU is programmed
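The "data structure via a pointer" idiom can be imitated in Python with the standard `ctypes` module. This is only an illustration: the two-register layout is hypothetical, and a `bytearray` stands in for the mapped region that a real C driver would reach through a device pointer.

```python
import ctypes


class DeviceRegs(ctypes.LittleEndianStructure):
    """Hypothetical register layout: two 32-bit registers, back to back."""
    _fields_ = [("control", ctypes.c_uint32),
                ("data", ctypes.c_uint32)]


# Stand-in for the mapped region; a real driver would use the device's address.
region = bytearray(8)
regs = DeviceRegs.from_buffer(region)   # view the raw bytes as a structure

regs.data = 0x41       # a field write is a store into the mapped bytes
region[0] = 1          # a "device-side" change to the control register...
print(regs.control)    # ...is visible through the structure: 1
```

Field accesses on `regs` read and write the underlying bytes directly, which is the behavior a driver relies on when it overlays a struct on device registers.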

SLIDE 14

Processor-I/O Speed Mismatch

  • 1 GHz microprocessor can execute 1000 million load or store instructions per second, or a 4 million KB/s data rate (4 bytes per load or store)
      • I/O devices from 0.01 KB/s to 30,000 KB/s
  • Input: device may not be ready to send data as fast as the processor loads it
      • Also, might be waiting for human to act
  • Output: device may not be ready to accept data as fast as processor stores it
  • What to do?
SLIDE 15

Processor Checks Status before Acting: Polling

  • Path to device generally has 2 registers:
      • 1 register says it’s OK to read/write (I/O ready), often called Control Register
      • 1 register that contains data, often called Data Register
  • Processor reads from Control Register in loop, waiting for device to set Ready bit in Control reg to say it’s OK (0 -> 1)
  • Processor then loads from (input) or writes to (output) data register
      • Load from device/Store into Data Register resets Ready bit (1 -> 0) of Control Register
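The polling protocol above can be sketched as a small Python simulation. The `PolledDevice` class, the Ready bit position, and the "ready after a few status reads" behavior are assumptions made for the example, not details from the lecture.

```python
READY = 0x1   # assumed position of the Ready bit in the control register


class PolledDevice:
    """Toy input device: becomes ready after a fixed number of status reads."""
    def __init__(self, payload, delay):
        self._payload = payload
        self._delay = delay
        self.control = 0

    def read_control(self):
        if self._delay > 0:
            self._delay -= 1
        else:
            self.control |= READY     # device sets Ready (0 -> 1)
        return self.control

    def read_data(self):
        self.control &= ~READY        # loading the data resets Ready (1 -> 0)
        return self._payload


def poll_and_read(device):
    """Spin on the control register, then load from the data register."""
    polls = 0
    while True:
        polls += 1
        if device.read_control() & READY:
            break
    return device.read_data(), polls


dev = PolledDevice(payload=ord("k"), delay=3)
value, polls = poll_and_read(dev)
print(chr(value), polls)  # k 4
```

Every pass through the loop is processor time spent doing nothing but checking status, which is the cost the next slide quantifies.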

SLIDE 16

Cost of Polling?

  • Assume: a 1 GHz processor takes 400 clock cycles for a polling operation (calling the polling routine, accessing the device, and returning). Determine % of processor time for polling
      • Mouse: polled 30 times/sec - not to miss user movement
      • Hard disk: transfers data in 16-byte chunks and can transfer at 8 MB/second. No transfer can be missed
  • Mouse Polling Clocks/sec = 30 * 400 = 12000 clocks/sec
  • % Processor for polling = 12*10^3 / (1*10^9) = 0.0012% => Polling mouse has little impact on processor
  • Times Polling Disk/sec = 8 MB/s / 16 B = 500K polls/sec
  • Disk Polling Clocks/sec = 500K * 400 = 200,000,000 clocks/sec
  • % Processor for polling: 2*10^8 / (1*10^9) = 20% => Unacceptable
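The same arithmetic as a short Python sketch. The function name is an assumption; the inputs are the slide's numbers, with 1 MB taken as 10^6 bytes so that 8 MB/s in 16-byte chunks gives 500K polls/sec.

```python
def polling_overhead_percent(polls_per_sec, cycles_per_poll, clock_hz):
    """Share of processor cycles spent polling, as a percentage."""
    return 100 * polls_per_sec * cycles_per_poll / clock_hz


CLOCK_HZ = 1_000_000_000      # 1 GHz processor
CYCLES_PER_POLL = 400

mouse = polling_overhead_percent(30, CYCLES_PER_POLL, CLOCK_HZ)

disk_polls_per_sec = 8_000_000 / 16          # 8 MB/s in 16-byte chunks
disk = polling_overhead_percent(disk_polls_per_sec, CYCLES_PER_POLL, CLOCK_HZ)

print(f"mouse: {mouse:.4f}%  disk: {disk:.0f}%")  # mouse: 0.0012%  disk: 20%
```

The cost per poll is identical for both devices; the polling rate alone turns a negligible overhead into an unacceptable one.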
SLIDE 17

What is the alternative to polling? Interrupt

  • Wasteful to have processor spend most of its time “spin-waiting” for I/O to be ready
  • Wish we could have an unplanned procedure call that would be invoked only when I/O device is ready
  • Solution: use exception mechanism to help I/O. Interrupt program when I/O ready, return when done with data transfer

Polling is like picking up the phone every few seconds to see if you have a call. Interrupt is like letting the phone ring.

SLIDE 18

I/O Interrupt

  • Controller sends interrupt to the processor along with additional information
      • which device
      • nature of interrupt: error, no paper, no ink,…
  • Processor halts execution of current program
      • Saves State
  • Processor looks up which handler to start from the interrupt information
  • When interrupt is handled, returns to program state and resumes
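The sequence above (save state, look up the handler, service, resume) can be sketched in Python. The `Processor` class, the handler table keyed by device name, and the handler address are hypothetical stand-ins; real hardware does this with registers and a vector or cause mechanism.

```python
class Processor:
    """Toy processor that only tracks a program counter and interrupt handlers."""
    def __init__(self):
        self.pc = 0
        self.handlers = {}   # device id -> (handler address, handler function)
        self.log = []

    def register(self, device_id, address, handler):
        self.handlers[device_id] = (address, handler)

    def step(self):
        self.pc += 4         # pretend to execute one 4-byte instruction

    def interrupt(self, device_id, cause):
        saved_pc = self.pc                           # save state
        address, handler = self.handlers[device_id]  # look up which handler to start
        self.pc = address                            # transfer control to the handler
        self.log.append(handler(cause))              # service the interrupt
        self.pc = saved_pc                           # restore state and resume


def printer_handler(cause):
    return f"printer: {cause}"


cpu = Processor()
cpu.register("printer", 0x8000_0180, printer_handler)
cpu.step()
cpu.step()
cpu.interrupt("printer", "no paper")   # arrives between two instructions
cpu.step()
print(cpu.pc, cpu.log)  # 12 ['printer: no paper']
```

The interrupted program continues from the saved PC as if nothing had happened, which is what makes the interrupt behave like an unplanned procedure call.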

SLIDE 19

Interrupt Driven Data Transfer

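[Figure: interrupt-driven data transfer. A user program in memory (add, sub, and, or, ...) is running when (1) an I/O interrupt arrives; the processor (2) saves the PC and (3) jumps to the interrupt service address. Steps (4) and (5) are marked on the interrupt service routine (read, store, ..., jr).]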

SLIDE 20

Benefit of Interrupt-Driven I/O

  • 500 clock cycle overhead for each transfer, including interrupt. Find the % of processor consumed if the hard disk is only active 5% of the time
  • If interrupt rate = polling rate
      • Disk Interrupts/sec = 8 MB/s / 16 B = 500K interrupts/sec
      • Disk Interrupt Clocks/sec = 500K * 500 = 250,000,000 clocks/sec
      • % Processor used during transfers: 250*10^6 / (1*10^9) = 25%
  • If disk active 5% => 5% * 25% => 1.25% busy
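A short Python sketch of this calculation follows. The function name is an assumption; the inputs are the slide's numbers (1 GHz clock, 500 cycles per transfer, 8 MB/s in 16-byte chunks, disk active 5% of the time).

```python
def interrupt_overhead_percent(transfers_per_sec, cycles_per_transfer,
                               clock_hz, active_fraction):
    """Return (% busy while transferring, % busy averaged over all time)."""
    during_transfer = 100 * transfers_per_sec * cycles_per_transfer / clock_hz
    return during_transfer, during_transfer * active_fraction


during, overall = interrupt_overhead_percent(
    transfers_per_sec=8_000_000 / 16,   # 500K interrupts/sec
    cycles_per_transfer=500,
    clock_hz=1_000_000_000,
    active_fraction=0.05,
)
print(during, round(overall, 2))  # 25.0 1.25
```

Each interrupt costs more than a poll (500 cycles against 400), but the processor pays only while the disk is actually transferring.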
SLIDE 21

Interrupts – Multiple devices

  • Aggregates interrupts
  • Prioritization (network, keyboard, ...)

[Figure: Device 1, Device 2, ..., Device i, ..., Device n connect to an Advanced Programmable Interrupt Controller (APIC), which connects to the processor.]
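Aggregation and prioritization can be sketched with a priority queue. This is not how an APIC is implemented; the `InterruptController` class and the particular priority ranking are assumptions chosen to illustrate the idea.

```python
import heapq


class InterruptController:
    """Collects requests from many devices and hands over the most urgent one."""
    def __init__(self, priorities):
        self._priorities = priorities   # device name -> priority (lower = more urgent)
        self._pending = []
        self._arrival = 0               # tie-breaker: earlier requests first

    def raise_irq(self, device):
        entry = (self._priorities[device], self._arrival, device)
        heapq.heappush(self._pending, entry)
        self._arrival += 1

    def next_irq(self):
        """Return the highest-priority pending device, or None if idle."""
        if not self._pending:
            return None
        return heapq.heappop(self._pending)[2]


# Illustrative ranking only: network ahead of disk ahead of keyboard.
pic = InterruptController({"network": 0, "disk": 1, "keyboard": 2})
pic.raise_irq("keyboard")
pic.raise_irq("network")
pic.raise_irq("disk")
order = [pic.next_irq() for _ in range(3)]
print(order)  # ['network', 'disk', 'keyboard']
```

The processor sees a single request line and always receives the most urgent pending device, regardless of the order in which the requests arrived.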

SLIDE 22

Interrupt vs. Polling

  • Which is better: Interrupts or Polling?
      • Interrupts are better if the processor has something else to do and the time-to-response is not critical
      • Polling is better if the processor has to respond to an event ASAP
  • Polling is also used when data is expected at regular intervals such as in a modem
      • Modem typically connects to a “com” port
      • The “com” port can be polled at expected intervals
SLIDE 23

Direct Memory Access (DMA)

  • How to transfer large amounts of data between a Device and Memory? Waste of CPU cycles if done through CPU
  • Let the device controller transfer data directly to and from memory => DMA
  • The CPU sets up the DMA transfer by supplying the type of operation, memory address and number of bytes to be transferred
  • The DMA controller contacts the bus directly, provides memory address and transfers the data
  • Once the DMA transfer is complete, the controller interrupts the CPU to signal completion
  • Cycle Stealing – Bus gives priority to DMA controller thus stealing cycles from the CPU
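The DMA steps above can be sketched as a Python simulation. Everything named here (`DMAController`, the `"read"`/`"write"` operation strings, the callback standing in for the completion interrupt, one bus cycle per byte) is an assumption made for illustration.

```python
class DMAController:
    """Toy DMA engine: copies bytes between a device buffer and memory."""
    def __init__(self, memory, on_complete):
        self.memory = memory
        self.on_complete = on_complete   # stands in for the completion interrupt
        self.stolen_cycles = 0

    def start(self, operation, address, count, device_buffer):
        """CPU supplies the operation, memory address, and byte count, then moves on."""
        if operation not in ("read", "write"):
            raise ValueError("unknown operation")
        for i in range(count):
            if operation == "read":      # device -> memory
                self.memory[address + i] = device_buffer[i]
            else:                        # memory -> device
                device_buffer[i] = self.memory[address + i]
            self.stolen_cycles += 1      # assume one bus cycle per byte
        self.on_complete(count)          # "interrupt" the CPU when done


memory = bytearray(64)
completed = []
dma = DMAController(memory, on_complete=completed.append)
dma.start("read", address=16, count=5, device_buffer=bytearray(b"hello"))
print(bytes(memory[16:21]), completed)  # b'hello' [5]
```

The CPU is involved only in the setup call and the completion notification; the byte-by-byte copy is the controller's work.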

SLIDE 24

OS control of I/O operations

  • Low-level control of an I/O device is complex because it requires managing a set of concurrent events and because requirements for correct device control are often very detailed
  • I/O systems often use interrupts to communicate information about I/O operations and these can occur at a random time
  • The I/O system is shared by multiple programs using the processor
  • Would like I/O services for all user programs under safe control