Unit 16: Computer Organization, Design of a Simple Processor



slide-1
SLIDE 1

16.1

Unit 16

Computer Organization Design of a Simple Processor

slide-2
SLIDE 2

16.2

You Can Do That…

[Figure: stack of abstraction levels spanning HW and SW: Voltage / Currents; Transistors; Logic Gates; Functional Units (Registers, Adders, Muxes); Processor / Memory / I/O; Assembly / Machine Code; C / C++ / Java; OS; Libraries; Applications; Scripting & Interfaces; Networked Applications. Related areas: Devices & Integrated Circuits (Semiconductors & Fabrication); Architecture (Processor & Embedded HW); Systems & Networking (Embedded Systems, Networks); Applications (AI, Robotics, Graphics, Mobile); Cloud & Distributed Computing (CyberPhysical, Databases, Data Mining, etc.).]

Where we will head now…

slide-3
SLIDE 3

16.3

Motivation

  • Now that you have some understanding…

– Of how hardware is designed and works
– Of how software can be used to control hardware

  • We will look at how to improve the efficiency of computer systems and software so that…

– …we can start to understand why HW companies create the structures they do (multicore processors)
– …we can begin to intelligently take advantage of the capabilities the HW gives us
– …we can start to understand why SW companies deal with some of the issues they do (efficiencies, etc.)

slide-4
SLIDE 4

16.4

Computer Organization

  • Three primary sets of components

– Processor
– Memory
– I/O (everything else)

  • Tell us where things live?

– Running code
– Compiled program (not running)
– Circuitry to execute code
– Source code file
– Data variables
– Data for the pixels being displayed on your screen

slide-5
SLIDE 5

16.5

Input / Output

  • The processor performs reads and writes to communicate with I/O devices just as it does with memory

– I/O devices have locations (i.e. registers) that contain data that the processor can access
– These registers are assigned unique addresses just like memory

[Figure: Processor connected over the address (A), data (D), and control (C) buses to Memory (addresses up to 3FF), a Video Interface (address 800), and a Keyboard Interface (address 400). Example transfer: A = 800, D = FE, C = WRITE; FE may signify a white dot at a particular location. The Keyboard Interface register at 400 holds 61 ('a' = 61 hex in ASCII).]

This could just as easily be the command and data registers of the LCD shield, or the PORT/DDR registers.
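The idea that I/O registers share the address space with memory can be sketched in a few lines. This is only an illustrative model, not the actual hardware: the `Bus` class and its method names are hypothetical, and the address map (memory below 0x400, keyboard at 0x400, video at 0x800) is assumed from the example values in the figure.

```python
# Hypothetical address map, loosely following the figure's example values.
MEM_SIZE = 0x400        # assume memory occupies addresses 0x000-0x3FF
KEYBOARD_ADDR = 0x400   # keyboard interface register
VIDEO_ADDR = 0x800      # video interface register


class Bus:
    """Routes processor reads and writes to memory or an I/O register by address."""

    def __init__(self):
        self.memory = [0] * MEM_SIZE
        self.keyboard_reg = 0x61   # 'a' = 61 hex in ASCII
        self.video_reg = 0x00

    def read(self, addr):
        if addr < MEM_SIZE:
            return self.memory[addr]
        if addr == KEYBOARD_ADDR:
            return self.keyboard_reg
        if addr == VIDEO_ADDR:
            return self.video_reg
        raise ValueError(f"no device at address {addr:#x}")

    def write(self, addr, data):
        if addr < MEM_SIZE:
            self.memory[addr] = data & 0xFF
        elif addr == VIDEO_ADDR:
            self.video_reg = data & 0xFF
        else:
            raise ValueError(f"no writable device at address {addr:#x}")


bus = Bus()
bus.write(0x800, 0xFE)        # same kind of write as to memory, but it lands in the video register
key = chr(bus.read(0x400))    # same kind of read as from memory, but it returns the key code
```

From the processor's point of view both calls look like ordinary memory accesses; only the address decides which device responds.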

slide-6
SLIDE 6

16.6

Processor

  • 3 primary components inside a processor

– ALU
– Registers
– Control Circuitry

  • Connects to memory and I/O via address, data, and control buses (bus = group of wires)

[Figure: Processor connected to Memory by the Addr, Data, and Control buses.]

slide-7
SLIDE 7

16.7

Arithmetic and Logic Unit (ALU)

  • Executes arithmetic operations like addition and subtraction along with logical operations (AND, OR, etc.)

[Figure: Processor containing an ALU (operations ADD, SUB, AND, OR; inputs in1, in2, and op.; output out), connected to Memory by the Addr, Data, and Control buses.]
slide-8
SLIDE 8

16.8

Registers

  • Some are for general use by software

– Registers provide fast, temporary storage locations within the processor (to avoid having to read/write slow memory)

  • Others are required for specific purposes to ensure proper operation of the hardware

[Figure: Processor containing the ALU (ADD, SUB, AND, OR), registers R0-R15, and the PC, connected to Memory by the Addr, Data, and Control buses.]

slide-9
SLIDE 9

16.9

General Purpose Registers

  • Registers available to software instructions for use by the programmer/compiler
  • Instructions use these registers as inputs (source locations) and outputs (destination locations)

[Figure: Processor containing the ALU (ADD, SUB, AND, OR), registers R0-R15, and the PC, connected to Memory by the Addr, Data, and Control buses.]

slide-10
SLIDE 10

16.10

What if we didn’t have registers?

  • Example w/o registers: F = (X+Y) – (X*Y)

– Requires an ADD instruction, a MULtiply instruction, and a SUBtract instruction
– w/o registers

  • ADD: Load X and Y from memory, store result to memory
  • MUL: Load X and Y again from mem., store result to memory
  • SUB: Load results from ADD and MUL and store result to memory
  • 9 memory accesses

[Figure: the same processor diagram, with X, Y, and F held in Memory.]

slide-11
SLIDE 11

16.11

What if we have registers?

  • Example w/ registers: F = (X+Y) – (X*Y)

– Load X and Y into registers
– ADD: R0 + R1 and store result in R2
– MUL: R0 * R1 and store result in R3
– SUB: R2 – R3 and store result in R4
– Store R4 back to memory
– 3 total memory accesses

[Figure: the same processor diagram, with X, Y, and F held in Memory and copies of X and Y held in registers.]
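The 9-versus-3 count can be checked with a small sketch that tallies every load and store. The `CountingMemory` class and the temporary names `T1` and `T2` are hypothetical, introduced only for this illustration; Python local variables stand in for the registers R0-R4.

```python
class CountingMemory:
    """Toy memory that counts every load and store."""

    def __init__(self, values):
        self.values = dict(values)
        self.accesses = 0

    def load(self, name):
        self.accesses += 1
        return self.values[name]

    def store(self, name, value):
        self.accesses += 1
        self.values[name] = value


def without_registers(mem):
    # Every operand comes from memory and every result goes back to memory.
    mem.store("T1", mem.load("X") + mem.load("Y"))    # ADD: 2 loads + 1 store
    mem.store("T2", mem.load("X") * mem.load("Y"))    # MUL: 2 loads + 1 store
    mem.store("F", mem.load("T1") - mem.load("T2"))   # SUB: 2 loads + 1 store


def with_registers(mem):
    r0 = mem.load("X")    # 1 access
    r1 = mem.load("Y")    # 1 access
    r2 = r0 + r1          # ADD works on registers only
    r3 = r0 * r1          # MUL works on registers only
    r4 = r2 - r3          # SUB works on registers only
    mem.store("F", r4)    # 1 access


slow = CountingMemory({"X": 5, "Y": 3})
fast = CountingMemory({"X": 5, "Y": 3})
without_registers(slow)
with_registers(fast)
```

Both versions compute the same F, but `slow.accesses` ends at 9 while `fast.accesses` ends at 3.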

slide-12
SLIDE 12

16.12

Other Registers

  • Some bookkeeping information is needed to make the processor operate correctly
  • Example: Program Counter (PC)

– Recall that the processor must fetch instructions from memory before decoding and executing them
– The PC register holds the address of the next instruction to fetch

[Figure: Processor containing the ALU (ADD, SUB, AND, OR), registers R0-R15, and the PC, connected to Memory by the Addr, Data, and Control buses.]

slide-13
SLIDE 13

16.13

Fetching an Instruction

  • To fetch an instruction

– PC contains the address of the instruction
– The value in the PC is placed on the address bus and the memory is told to read
– The PC is incremented, and the process is repeated for the next instruction

[Figure: Memory (addresses 0 through FF) holds inst. 1 through inst. 5 starting at address 0. First fetch: PC = Addr = 0, Data = inst. 1 machine code, Control = Read.]

slide-14
SLIDE 14

16.14

Fetching an Instruction

  • To fetch an instruction

– PC contains the address of the instruction
– The value in the PC is placed on the address bus and the memory is told to read
– The PC is incremented, and the process is repeated for the next instruction

[Figure: Memory (addresses 0 through FF) holds inst. 1 through inst. 5 starting at address 0. Second fetch: PC = Addr = 1, Data = inst. 2 machine code, Control = Read.]
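The fetch steps above can be sketched as a loop. This is a simplified model: the list `memory` stands in for the memory in the figure, the strings stand in for machine code, and the `fetch` function is a hypothetical helper, not part of any real processor description.

```python
# Instruction memory from the figure: inst. 1 at address 0, inst. 2 at address 1, ...
memory = ["inst. 1", "inst. 2", "inst. 3", "inst. 4", "inst. 5"]


def fetch(pc, memory):
    """One fetch: put the PC on the address bus, read, then increment the PC."""
    instruction = memory[pc]   # Addr = PC, Control = Read, Data = instruction
    return instruction, pc + 1


pc = 0
fetched = []
while pc < len(memory):
    instruction, pc = fetch(pc, memory)
    fetched.append(instruction)   # a real processor would decode and execute here
```

After the loop, `fetched` lists the instructions in address order and the PC points one past the last instruction.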

slide-15
SLIDE 15

16.15

Control Circuitry

  • Control circuitry is used to decode the instruction and then generate the necessary signals to complete its execution
  • Controls the ALU
  • Selects registers to be used as source and destination locations (using muxes)

[Figure: Processor containing the Control block, the ALU (ADD, SUB, AND, OR), registers R0-R15, and the PC, connected by the Addr, Data, and Control buses to Memory (addresses 0 through FF) holding inst. 1 through inst. 5.]

slide-16
SLIDE 16

16.16

Control Circuitry

  • Assume 0x0201 is machine code for an ADD instruction of R2 = R0 + R1
  • Control Logic will…

– select the registers (R0 and R1)
– tell the ALU to add
– select the destination register (R2)

[Figure: the instruction 0201 is read from Memory into the Control block, which sets the ALU operation to ADD. Instruction format, most significant bits first: Opcode (4-bits), Dst. Reg. (4-bits), Src1 Reg. (4-bits), Src2 Reg. (4-bits).]
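As a rough sketch of what the control logic does with 0x0201, the four 4-bit fields can be pulled out with shifts and masks. The function names are hypothetical, and the assumption that opcode 0 means ADD comes only from this one example.

```python
def decode(word):
    """Split a 16-bit instruction into its four 4-bit fields."""
    return {
        "opcode": (word >> 12) & 0xF,
        "dst": (word >> 8) & 0xF,
        "src1": (word >> 4) & 0xF,
        "src2": word & 0xF,
    }


def execute(word, regs):
    fields = decode(word)
    if fields["opcode"] == 0x0:   # assumed: opcode 0 = ADD, as in the 0x0201 example
        regs[fields["dst"]] = regs[fields["src1"]] + regs[fields["src2"]]
    else:
        raise NotImplementedError("only ADD is modeled in this sketch")


regs = [0] * 16          # R0-R15
regs[0], regs[1] = 5, 7
execute(0x0201, regs)    # R2 = R0 + R1
```

Decoding 0x0201 gives opcode 0, destination 2, and sources 0 and 1, which is exactly the register selection described in the bullets.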

slide-17
SLIDE 17

16.17

INSTRUCTION SET OVERVIEW

slide-18
SLIDE 18

16.18

Instruction Set Overview

  • The instruction set defines the software interface to the processor and memory system
  • Most processors define their own unique instruction set unless they are trying to be compatible with some other vendor

– Which means code compiled for one processor will not run on another

  • The instruction set is the vocabulary the HW processor can understand and the SW is composed with

– Usually the compiler is the one that translates the high level software into the 1s and 0s (aka machine code) that control the processor

slide-19
SLIDE 19

16.19

Components of the Instruction Set

  • Instruction sets specify…

– Maximum bit widths for data and addresses

  • 8-, 16-, 32- or 64-bit

– Which instructions are implemented

  • ADD, NEGate, SUB, MUL

– How many registers the processor provides to instructions

  • Usually 16 to 64

– How each instruction is converted to binary (aka "machine code")

slide-20
SLIDE 20

16.20

Instruction Set Architecture (ISA)

  • 2 instruction set approaches

– CISC = Complex instruction set computer

  • Large, rich set of instructions (vocabulary)
  • Useful when programmers actually wrote assembly code
  • More work per instruction, slower clock cycle

– RISC = Reduced instruction set computer

  • Small, basic, but sufficient instruction set (vocabulary)
  • With the maturation of compilers, programming at the assembly level is rarer
  • Less work per instruction, faster clock cycle
  • Usually a simple and small set of instructions with a regular format facilitates building faster processors

slide-21
SLIDE 21

16.21

Kinds of Instructions

  • Most assembly/machine instructions fall into one of three categories
  • Arithmetic/Logic

– Example: ADD r1, r2, r3 // r1 = r2 + r3

  • Data Transfer (to and from memory)

– Example: LOAD r1, addr // set register 1 to the value in mem. at addr

  • Control

– Example: JUMP addr // Go to instruction in mem. at addr

  • Notice that each instruction has a predefined interpretation of the operands

– LOAD r1, addr // interprets 1st operand as a register number and 2nd as an address
– The interpretation of the meaning of the operand is part of the instruction set and known as "addressing modes"

slide-22
SLIDE 22

16.22

Operands

  • Addressing modes refer to how an instruction specifies where the operands are
  • Can be in a

– register,
– memory location, or a
– constant that is part of the instruction itself (aka immediate value)

  • Most RISC processors: All data operands for arithmetic instructions must be in a register

– This allows the hardware to be simpler and faster

[Figure: three Proc./Mem. diagrams, one per operand location.
Register operands: ADD r1, r2, r3 is encoded as an Opcode and three Reg. # fields.
Memory operand: LOAD r5, 0x1ac8 is encoded as an Opcode, a Reg. # field, and an Address field.
Immediate operand: ADD r5, r3, immediate is encoded as an Opcode, two Reg. # fields, and an Immediate field.]

slide-23
SLIDE 23

16.23

DESIGN OF A SIMPLE INSTRUCTION SET AND PROCESSOR

slide-24
SLIDE 24

16.24

What Shall We Do?

  • Let's design a simple processor to understand the entire flow from writing software to designing the hardware

– This may not be the most advanced processor but the goal is to give you a fully working example from software to hardware

slide-25
SLIDE 25

16.25

The Instruction Set (1)

  • To start we will define the instruction set
  • Let's use 4-bit data values (i.e. all data operands will be 4-bits)
  • Let's make this a simple calculator-like processor that can perform at least the following 3 operations:

– ADD
– SUB
– AND

  • Goal is to evaluate simple arithmetic expressions: (7+4-5)&3
  • To keep the number of bits needed to code an instruction to a minimum, let's use an ACCumulator-based architecture where the ACC register is always one implied operand

– ADD 7 means: ACC += 7
– SUB 5 means: ACC -= 5

slide-26
SLIDE 26

16.26

The Instruction Set (2)

  • Let's assume the output of this computer is just 4 LEDs to display a 4-bit binary number
  • We'll provide some additional instructions to help us perform the calculations:

– Load constant (ACC = const)
– Clear (ACC = 0)
– Out (OUT = ACC)

  • That leaves us with 6 total instructions

– How many bits do we need for the opcode of our instructions? 3-bits (2 bits give only 4 codes; 3 bits give 8)

  • If we want to store data/constants in our instructions (e.g. ADD 7, SUB 5) how many additional bits do we need in our instruction? 4-bits
  • Instructions need 3 opcode + 4 data bits = 7-bits

– Let's round up to 8-bits for each instruction

[Figure: Computer System driving 4 output LEDs (Display = 7 = 0111 in binary).]

Chosen Instruction Format

Bits 7-5   Opcode (3-bits)
Bit 4      Unused
Bits 3-0   Constant (4-bits)

slide-27
SLIDE 27

16.27

Compilation

  • Consider the following "high-level" code

– (7 - 4 + 6) & 3

  • "Compile" it to an appropriate instruction sequence (i.e. assembly)

– Assembly refers to the human readable syntax of each instruction

– CLR
– ADD 7
– SUB 4
– ADD 6
– AND 3

  • Now we need to convert to binary…

Instruction Set Summary

  • ADD k (ACC += k)
  • SUB k (ACC -= k)
  • AND k (ACC &= k)
  • LOAD k (ACC = k)
  • CLR (ACC = 0)
  • OUT (OUT = ACC)
slide-28
SLIDE 28

16.28

Defining the Machine Code

  • Machine code refers to the binary representation of each instruction.
  • We first need to define the actual opcodes so we can translate the assembly you wrote on the previous slide into binary for the hardware to execute
  • Before we do that, let's consider the hardware design as this will help us choose appropriate opcodes
slide-29
SLIDE 29

16.29

Arithmetic and Logic Units

  • Let's use the ALU we designed in a previous unit…

[Figure: EE109 ALU block with inputs X[3:0] and Y[3:0], function select F[2:0], and result R[3:0]. We will design what is inside this block.]

F[2:0]   Op./Result
000      R = X + Y
001      R = X - Y
010      R = X
011      R = Y - X
100      R = X & Y
101      Unused
110      R = 0
111      Unused

We just made up these code assignments and the various operations. Remember, we definitely need to support ADD, SUB, AND, and CLR (R=0).

slide-30
SLIDE 30

16.30

Completed ALU

[Figure: EE109 ALU built from a 4-bit binary adder (inputs A[3:0] and B[3:0], carry-in C0, carry-out C4, sum S[3:0]) and four 2-to-1, 4-bit wide muxes, with inputs X[3:0] and Y[3:0], function select F[2:0], and result R[3:0]. Mux selects and adder carry-in: S0 = F1•F0, S1 = F1•F0', S2 = F1'•F0, S3 = F2, Ci = F0.]

F[2:0]   Op.
000      R = X + Y
001      R = X - Y
010      R = X
011      R = Y - X
100      R = X & Y
101      Unused
110      R = 0
111      Unused
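A behavioral sketch of this ALU can help when checking opcodes later. It models only the function table and the select equations listed for the figure, not the adder-and-mux wiring itself; the function names `alu` and `select_signals` are hypothetical.

```python
def alu(f, x, y):
    """Behavioral model of the 4-bit ALU: f is F[2:0], x and y are 4-bit values."""
    if f == 0b000:
        r = x + y
    elif f == 0b001:
        r = x - y
    elif f == 0b010:
        r = x
    elif f == 0b011:
        r = y - x
    elif f == 0b100:
        r = x & y
    elif f == 0b110:
        r = 0
    else:
        raise ValueError("function codes 101 and 111 are unused")
    return r & 0xF   # keep only the 4 result bits R[3:0]


def select_signals(f):
    """Mux selects and carry-in computed from F[2:0] using the equations above."""
    f2, f1, f0 = (f >> 2) & 1, (f >> 1) & 1, f & 1
    return {
        "S0": f1 & f0,
        "S1": f1 & (1 - f0),
        "S2": (1 - f1) & f0,
        "S3": f2,
        "Ci": f0,
    }
```

For example, `alu(0b011, 4, 7)` computes Y - X = 7 - 4, and results that do not fit in 4 bits wrap around because only R[3:0] is kept.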

slide-31
SLIDE 31

16.31

Control Logic

  • S0 = F1•F0
  • S1 = F1•F0'
  • S2 = F1'•F0
  • Ci = F0
  • S3 = F2

[Table: values of S0, S1, S2, Ci, and S3 for each function code FS[2:0] (X+Y = 000, X-Y = 001, X = 010, Y-X = 011, X & Y = 100, R = 0 for 110); the unused codes 101 and 111 are don't-cares (d) in every column.]

[Figure: Karnaugh maps over F2F1 and F0 for S0, S1, S2, Ci, and S3, from which the equations above are derived.]

slide-32
SLIDE 32

16.32

Defining the Machine Code Format

  • Using the ALU design can you suggest opcodes for the various instructions?

– The accumulator (ACC) will be connected to the result of the ALU
– But should the ACC be connected to the X or Y input of the ALU?

  • Important: We achieve Load by passing X through the ALU to the ACC, so we need the constant to come in on X (so the ACC cannot use X and must connect to Y)

ALU functions

F[2:0]   Op./Result
000      R = X + Y
001      R = X - Y
010      R = X
011      R = Y - X
100      R = X & Y
101      Unused
110      R = 0
111      Unused

Chosen opcodes

Instruc.   OPCODE   Op./Result
ADD        000      ACC = ACC + C
OUT        001      OUT = ACC
LOAD       010      ACC = C
SUB        011      ACC = ACC - C
AND        100      ACC = ACC & C
-          101      Unused
CLR        110      ACC = 0
-          111      Unused

Instruction Set Summary

  • ADD k (ACC += k)
  • SUB k (ACC -= k)
  • AND k (ACC &= k)
  • LOAD k (ACC = k)
  • CLR (ACC = 0)
  • OUT (OUT = ACC)

slide-33
SLIDE 33

16.33

Assembler

  • Now translate the assembly you found a few slides back to machine code and show it as 2 hex digits per instruction
  • The "high-level" code was

– (7 - 4 + 6) & 3

  • "Compile" it to an appropriate instruction sequence (i.e. assembly)

– CLR = 0xc0
– ADD 7 = 0x07
– SUB 4 = 0x64
– ADD 6 = 0x06
– AND 3 = 0x83

Chosen Instruction Format

Bits 7-5   Opcode (3-bits)
Bit 4      Unused
Bits 3-0   Constant (4-bits)

Instruc.   OPCODE   Op./Result
ADD        000      ACC = ACC + C
OUT        001      OUT = ACC
LOAD       010      ACC = C
SUB        011      ACC = ACC - C
AND        100      ACC = ACC & C
-          101      Unused
CLR        110      ACC = 0
-          111      Unused
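The translation on this slide can be reproduced with a tiny assembler sketch. It assumes the chosen format (opcode in bits 7-5, bit 4 unused and left at 0, constant in bits 3-0) and the opcode table above; the `assemble` function is a hypothetical helper written for this illustration.

```python
# Opcode table from the slide.
OPCODES = {
    "ADD": 0b000,
    "OUT": 0b001,
    "LOAD": 0b010,
    "SUB": 0b011,
    "AND": 0b100,
    "CLR": 0b110,
}


def assemble(line):
    """Encode one assembly line, e.g. 'SUB 4', as an 8-bit machine-code value."""
    parts = line.split()
    opcode = OPCODES[parts[0]]
    const = int(parts[1]) if len(parts) > 1 else 0   # CLR and OUT carry no constant
    if not 0 <= const <= 15:
        raise ValueError("constant must fit in 4 bits")
    return (opcode << 5) | const                     # bit 4 stays 0 (unused)


program = ["CLR", "ADD 7", "SUB 4", "ADD 6", "AND 3"]
hex_codes = [f"0x{assemble(line):02x}" for line in program]
```

`hex_codes` comes out as the five values listed on the slide: 0xc0, 0x07, 0x64, 0x06, 0x83.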

slide-34
SLIDE 34

16.34

Processor Datapath

  • Now let's consider the processor data path

[Figure: processor datapath.
Instruction Fetch Logic: a 5-bit counter (CLK, CLR driven by RESET, outputs Q0-Q4) drives address inputs A0-A4 of a 32x8 memory whose data outputs D0-D7 are the instruction bits I0-I7.
Control: combinational logic on I7, I6, I5 produces ACC_LD and OUT_LD.
Datapath: the EE109 ALU (inputs X[3:0] and Y[3:0], function select F2 F1 F0 from I7 I6 I5, result R[3:0]) feeds the 4-bit ACC register (enable ACC_LD, clocked by CLK); ACC[3:0] feeds the 4-bit OUT register (enable OUT_LD), whose outputs OUT[3:0] drive the LEDs.]
slide-35
SLIDE 35

16.35

Sample Execution of SUB 11

[Figure: the processor datapath from the previous slide (instruction fetch logic, control, EE109 ALU, ACC and OUT registers, LEDs), annotated with the signal values present while SUB 11 executes.]
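A software sketch of this datapath can make the execution concrete. It is a behavioral model under stated assumptions, not the circuit: the opcode bits are fed directly to the ALU function select, the constant arrives on X and the ACC on Y (as argued when the opcodes were chosen), ACC and OUT are assumed to start at 0, and the names `ALU_OPS` and `run` are hypothetical.

```python
# ALU function table, keyed by F[2:0]; x is the constant, y is the ACC.
ALU_OPS = {
    0b000: lambda x, y: x + y,
    0b001: lambda x, y: x - y,
    0b010: lambda x, y: x,
    0b011: lambda x, y: y - x,
    0b100: lambda x, y: x & y,
    0b110: lambda x, y: 0,
}

OUT_OPCODE = 0b001
ACC_OPCODES = (0b000, 0b010, 0b011, 0b100, 0b110)   # ADD, LOAD, SUB, AND, CLR


def run(machine_code):
    """Execute a list of 8-bit instructions; return the final (ACC, OUT)."""
    acc, out = 0, 0                        # assumed reset values
    for inst in machine_code:              # the counter steps through memory
        opcode = (inst >> 5) & 0b111       # I7..I5
        const = inst & 0xF                 # I3..I0, wired to ALU input X
        if opcode == OUT_OPCODE:
            out = acc                      # OUT_LD asserted
        elif opcode in ACC_OPCODES:
            acc = ALU_OPS[opcode](const, acc) & 0xF   # ACC_LD asserted
        else:
            raise ValueError(f"unused opcode {opcode:03b}")
    return acc, out


# LOAD 13, SUB 11, OUT  ->  0x4d, 0x6b, 0x20 in the chosen format
acc_after_sub, leds = run([0x4D, 0x6B, 0x20])
# The program assembled earlier: CLR, ADD 7, SUB 4, ADD 6, AND 3
acc_expr, _ = run([0xC0, 0x07, 0x64, 0x06, 0x83])
```

SUB 11 encodes as 0110 1011: opcode 011 selects R = Y - X, so the ALU computes ACC - 11, which is why the ACC has to sit on the Y input.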

slide-36
SLIDE 36

16.36

A Problem

  • Write assembly for:

– ((7 & 3) + (6 & 5))

  • LOAD 7
  • AND 3
  • No place to store the result so that we can compute (6&5) separately
  • No place to store "temporary" results
slide-37
SLIDE 37

16.37

A Solution

  • Let's modify our processor as follows:

– Add two registers for temporary storage: R0 and R1

  • Could add more but we'll keep it simple

– A new instruction to save the ACC to a register: SAVE Rx (Rx = ACC)
– Update ALU instructions to be able to specify a register operand rather than just a constant

  • ADD Rx (ACC = ACC + Rx)
  • SUB Rx (ACC = ACC - Rx)
  • AND Rx (ACC = ACC & Rx)
  • LOAD Rx (ACC = Rx)

– Update the instruction format to use the leftover bit to indicate whether the operand is a constant or should come from a register

New Instruction Format

Constant operand:   bits 7-5 Opcode (3-bits), bit 4 C/R = 0, bits 3-0 Constant (4-bits)
Register operand:   bits 7-5 Opcode (3-bits), bit 4 C/R = 1, bits 3-1 Unused (3-bits), bit 0 Reg 0/1

slide-38
SLIDE 38

16.38

Updated Assembly

  • Write assembly for:

– ( (7 & 3) + (6 & 5) )

  • New assembly & machine code

– LOAD 7 = 0x47
– AND 3 = 0x83
– SAVE R1 = 0xf1
– LOAD 6 = 0x46
– AND 5 = 0x85
– ADD R1 = 0x11
– OUT = 0x20

New Instruction Format

Constant operand:   bits 7-5 Opcode (3-bits), bit 4 C/R = 0, bits 3-0 Constant (4-bits)
Register operand:   bits 7-5 Opcode (3-bits), bit 4 C/R = 1, bits 3-1 Unused (3-bits), bit 0 Reg 0/1

Instruc.   OPCODE   Op./Result
ADD        000      ACC = ACC + C/R
OUT        001      OUT = ACC
LOAD       010      ACC = C/R
SUB        011      ACC = ACC - C/R
AND        100      ACC = ACC & C/R
-          101      Unused
CLR        110      ACC = 0
SAVE Rx    111      Rx = ACC
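The machine code on this slide can be reproduced with a short assembler sketch for the new format. It assumes bit 4 (C/R) is 0 for a constant operand and 1 for a register operand, with the register number in bit 0; the `assemble` function is a hypothetical helper for illustration.

```python
# Opcode table from the slide, now including SAVE.
OPCODES = {
    "ADD": 0b000,
    "OUT": 0b001,
    "LOAD": 0b010,
    "SUB": 0b011,
    "AND": 0b100,
    "CLR": 0b110,
    "SAVE": 0b111,
}
REGISTERS = {"R0": 0, "R1": 1}


def assemble(line):
    """Encode one assembly line in the new 8-bit format."""
    parts = line.split()
    opcode = OPCODES[parts[0]] << 5
    if len(parts) == 1:                      # CLR, OUT: no operand
        return opcode
    operand = parts[1].upper()
    if operand in REGISTERS:                 # register operand: C/R = 1, reg in bit 0
        return opcode | (1 << 4) | REGISTERS[operand]
    const = int(operand)                     # constant operand: C/R = 0
    if not 0 <= const <= 15:
        raise ValueError("constant must fit in 4 bits")
    return opcode | const


program = ["LOAD 7", "AND 3", "SAVE R1", "LOAD 6", "AND 5", "ADD R1", "OUT"]
hex_codes = [f"0x{assemble(line):02x}" for line in program]
```

`hex_codes` matches the slide: 0x47, 0x83, 0xf1, 0x46, 0x85, 0x11, 0x20.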

slide-39
SLIDE 39

16.39

Updated Processor Datapath

[Figure: updated processor datapath.
Data Registers: two 4-bit registers R0 and R1 (data inputs from ACC[3:0], enables R0_LD and R1_LD, clocked by CLK).
Operand muxes: two 2-to-1, 4-bit wide muxes, one choosing between R0[3:0] and R1[3:0] (select I0) and one choosing between the constant and the selected register (select I4).
Instruction Fetch Logic: 5-bit counter (CLK, CLR driven by RESET, outputs Q0-Q4) addressing a 32x8 memory whose outputs D0-D7 are the instruction bits I0-I7.
Control: logic on I7, I6, I5, and I0 produces ACC_LD, OUT_LD, R0_LD, and R1_LD.
The EE109 ALU, ACC register, OUT register, and LEDs are connected as before.]

slide-40
SLIDE 40

16.40

OTHER INSTRUCTION SETS…

slide-41
SLIDE 41

16.41

Historical Instruction Format Options

  • Instruction sets limit the number of operands used in an instruction…

– To limit the complexity of the hardware
– So that when an instruction is coded to binary it can fit in a certain # of bits

  • Different instruction sets specify these differently

– 3 operand instruction set (ARM, PPC) -> (32-bit processors)

  • Usually all 3 operands in registers
  • Format: ADD DST, SRC1, SRC2 (DST = SRC1 + SRC2)

– 2 operand instructions (Intel / Motorola 68K)

  • Second operand doubles as source and destination
  • Format: ADD SRC1, S2/D (S2/D = SRC1 + S2/D)

– 1 operand instructions (Low-End Embedded, Java Virtual Machine)

  • Implicit operand to every instruction usually known as the Accumulator (or ACC) register
  • Format: ADD SRC1 (ACC = ACC + SRC1)

– 0 operand instructions / stack architecture

  • Push operands on a stack: PUSH X, PUSH Y
  • ALU operation: ADD (implicitly adds the top two items on the stack, X + Y, and replaces them with the sum)

slide-42
SLIDE 42

16.42

General Instruction Format Issues

  • Consider the high-level code

– F = X + Y – Z
– G = A + B

  • Simple embedded computers often use single operand format

– Smaller data size (8-bit or 16-bit machines) means limited instruction size

  • Modern, high performance processors (Intel, ARM) use 2- and 3-operand formats

Three-Operand    Two-Operand    Single-Operand    Stack Arch.
ADD F,X,Y        MOVE F,X       LOAD X            PUSH Z
SUB F,F,Z        ADD F,Y        ADD Y             PUSH Y
ADD G,A,B        SUB F,Z        SUB Z             SUB
                 MOVE G,A       STORE F           PUSH X
                 ADD G,B        LOAD A            ADD
                                ADD B             POP F
                                STORE G

(+) More natural program style
(+) Smaller instruction count
(+) Smaller size to encode each instruction
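The stack-architecture column can be traced with a minimal stack-machine sketch. The `run_stack` function is hypothetical, and the operand order for SUB (top of stack minus the item beneath it) is an assumption chosen so that the listing computes F = X + Y – Z; real stack machines may define it the other way around.

```python
def run_stack(program, variables):
    """Interpret PUSH/POP/ADD/SUB on a simple operand stack."""
    stack = []
    for line in program:
        parts = line.split()
        op = parts[0]
        if op == "PUSH":
            stack.append(variables[parts[1]])
        elif op == "POP":
            variables[parts[1]] = stack.pop()
        elif op == "ADD":
            top, below = stack.pop(), stack.pop()
            stack.append(top + below)
        elif op == "SUB":
            top, below = stack.pop(), stack.pop()
            stack.append(top - below)   # assumed order: top minus the item beneath it
        else:
            raise ValueError(f"unknown instruction {op}")
    return variables


variables = {"X": 5, "Y": 9, "Z": 2}
program = ["PUSH Z", "PUSH Y", "SUB", "PUSH X", "ADD", "POP F"]
result = run_stack(program, variables)
```

None of the ALU instructions names an operand, which is why each one can be encoded in so few bits; the cost is the extra PUSH and POP instructions.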

slide-43
SLIDE 43

16.43

MORE PRACTICE

slide-44
SLIDE 44

16.44

More Practice

  • Write assembly for:

– ( (4&14) + (5&3) - (6&11) + (8&13))

  • Try to use as few instructions as you can

– LOAD 6
– AND 11
– SAVE R0
– LOAD 4
– AND 14
– SAVE R1
– LOAD 5
– AND 3
– ADD R1
– SAVE R1
– LOAD 8
– AND 13
– ADD R1
– SUB R0
– OUT

New Instruction Format

Constant operand:   bits 7-5 Opcode (3-bits), bit 4 C/R = 0, bits 3-0 Constant (4-bits)
Register operand:   bits 7-5 Opcode (3-bits), bit 4 C/R = 1, bits 3-1 Unused (3-bits), bit 0 Reg 0/1

Since we can only do ACC – C/R, we should already have the sum of the other terms in the ACC when we subtract. Computing 6&11 later would require us to swap the sum of the other terms back into the ACC before subtracting, costing an extra instruction. Since we have many terms we can use R1 to keep "accumulating" the sum of more terms while we use the ACC to compute the current term.
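The 15-instruction answer can be checked with an assembly-level interpreter sketch for the extended instruction set. It works on the assembly text rather than machine code, assumes 4-bit wraparound and registers that start at 0, and the `run` function is a hypothetical helper for this illustration.

```python
def run(program):
    """Interpret accumulator assembly with R0/R1; return the value sent to OUT."""
    acc, out = 0, 0
    regs = {"R0": 0, "R1": 0}
    for line in program:
        parts = line.split()
        op = parts[0]
        operand = parts[1] if len(parts) > 1 else None
        if op == "OUT":
            out = acc
        elif op == "CLR":
            acc = 0
        elif op == "SAVE":
            regs[operand] = acc
        else:
            # Operand is either a register name or a 4-bit constant.
            value = regs[operand] if operand in regs else int(operand)
            if op == "LOAD":
                acc = value
            elif op == "ADD":
                acc = (acc + value) & 0xF
            elif op == "SUB":
                acc = (acc - value) & 0xF
            elif op == "AND":
                acc = acc & value
            else:
                raise ValueError(f"unknown instruction {op}")
    return out


program = [
    "LOAD 6", "AND 11", "SAVE R0",            # R0 = 6&11, the term to subtract
    "LOAD 4", "AND 14", "SAVE R1",            # R1 = 4&14
    "LOAD 5", "AND 3", "ADD R1", "SAVE R1",   # R1 = (4&14) + (5&3)
    "LOAD 8", "AND 13", "ADD R1",             # ACC = sum of the three added terms
    "SUB R0", "OUT",                          # subtract 6&11 and display
]
leds = run(program)
```

The interpreter ends with 11 on the LEDs, which equals 4 + 1 - 2 + 8, the value of the original expression.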

slide-45
SLIDE 45

16.45

[Figure: the updated processor datapath (data registers R0 and R1, operand muxes, instruction fetch logic, control, EE109 ALU, ACC and OUT registers, LEDs), annotated with signal and register values while ADD 7 executes.]

slide-46
SLIDE 46

16.46

[Figure: the updated processor datapath (data registers R0 and R1, operand muxes, instruction fetch logic, control, EE109 ALU, ACC and OUT registers, LEDs), annotated with signal and register values while ADD R0 executes.]

slide-47
SLIDE 47

16.47

[Figure: the updated processor datapath (data registers R0 and R1, operand muxes, instruction fetch logic, control, EE109 ALU, ACC and OUT registers, LEDs), annotated with signal and register values while SAVE R1 executes; signals that do not matter for this instruction are marked x.]