Computer Architecture Foundations of Global Networked Computing: - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Foundations of Global Networked Computing: Building a Modern Computer From First Principles

IWKS 3300: NAND to Tetris Spring 2019 John K. Bennett

This course is based upon the work of Noam Nisan and Shimon Schocken. More information can be found at (www.nand2tetris.org).

Computer Architecture

slide-2
SLIDE 2

A Brief History of Computer Architecture

slide-3
SLIDE 3

Abacus (2700–2300 BC; Sumeria) Sexagesimal (base-60) number system

slide-4
SLIDE 4

Blaise Pascal (1623-62)

slide-5
SLIDE 5

Jacquard Loom (1801)

slide-6
SLIDE 6

AVL Jacquard Loom (2016)

slide-7
SLIDE 7

Atanasoff–Berry Computer (1937-42; vacuum tubes)

slide-8
SLIDE 8

Zuse Z3 (1941-43; relays)

slide-9
SLIDE 9

ENIAC (1943-46; vacuum tubes)

slide-10
SLIDE 10

ENIAC

slide-11
SLIDE 11

ENIAC

slide-12
SLIDE 12

ENIAC

slide-13
SLIDE 13

ILLIAC I / ORDVAC (1951)

Ordnance Discrete Variable Automatic Computer

slide-14
SLIDE 14

R1- Rice Research Computer (1959)

slide-15
SLIDE 15

IBM 360 (1964)

slide-16
SLIDE 16

R2- Rice Research Computer (1973)

slide-17
SLIDE 17

Hexadecimal Calculator

slide-18
SLIDE 18

Cray 1 (1975)

slide-19
SLIDE 19

Cray 1 (1975)

slide-20
SLIDE 20

Cray 2 (1985)

slide-21
SLIDE 21

The Xerox Alto, 1973

First personal workstation; first wide deployment of:

  • Bit-map graphics
  • Mouse
  • WYSIWYG editing

Hosted the invention of:

  • Local-area networking
  • Laser printing
  • All of modern client / server distributed computing

slide-22
SLIDE 22

MITS Altair 8800 (1975)

slide-23
SLIDE 23

Basic on an 8-bit Computer

slide-24
SLIDE 24

Apple I (1975-6)

slide-25
SLIDE 25

Apple II (1977)

slide-26
SLIDE 26

TRS 80 (1977)

slide-27
SLIDE 27

Apple Lisa (1979)

slide-28
SLIDE 28

IBM PC (1981)

slide-29
SLIDE 29

UW Eden Node Machine (1982)

slide-30
SLIDE 30

IBM PC XT (1983)

slide-31
SLIDE 31

Apple Macintosh (1984)

slide-32
SLIDE 32

Sun 1-3 Workstations (1982-85)

slide-33
SLIDE 33

iPhone (2007)

slide-34
SLIDE 34

Amazon Kindle (2007)

slide-35
SLIDE 35

iPad (2010)

slide-36
SLIDE 36

Von Neumann Machine (circa 1940)

[Diagram: a CPU containing the Arithmetic Logic Unit (ALU), registers, and control; a single memory holding both data and instructions; an input device and an output device.]

John Von Neumann (and others) ... made it possible. Gordon Moore, Andy Grove (and others) ... made it small and fast.

slide-37
SLIDE 37

Harvard Mark 1 (circa 1940)

Howard Aiken

slide-38
SLIDE 38

[Diagram: a CPU containing the Arithmetic Logic Unit (ALU), registers, and control; a memory holding data and instructions; an input device and an output device.]

Processing logic: fetch-execute cycle

Executing the current instruction involves one or more of the following tasks:

  • Have the ALU compute some function out = f (register values)
  • Write the ALU output to selected registers
  • As a side-effect of this computation, determine what instruction to fetch and execute next.
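
The three tasks above can be sketched as a toy loop in Python. This is only an illustration: the instruction format (a tuple of destination register, function, and jump target) and the rule "jump when the output is non-zero" are hypothetical simplifications, not the Hack encoding.

```python
def run(program, registers, max_steps=100):
    """Toy fetch-execute loop over a list of (dest, func, jump_target) tuples."""
    pc = 0
    for _ in range(max_steps):
        if pc >= len(program):
            break                                  # ran off the end: halt
        dest, func, jump_target = program[pc]      # fetch
        out = func(registers)                      # ALU computes out = f(register values)
        if dest is not None:
            registers[dest] = out                  # write the output to the selected register
        # Side effect of the computation: decide which instruction comes next.
        if jump_target is not None and out != 0:
            pc = jump_target
        else:
            pc += 1
    return registers


# Sum 3 + 2 + 1 into S by counting D down to zero.
countdown = [
    ("S", lambda r: r["S"] + r["D"], None),   # S = S + D
    ("D", lambda r: r["D"] - 1, 0),           # D = D - 1; loop while the result is non-zero
]
result = run(countdown, {"D": 3, "S": 0})
```

The point of the sketch is that the "what to run next" decision falls out of the same computation that produced the register update.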

slide-39
SLIDE 39

The Hack chip-set and hardware platform

Elementary Gates

  • Nand
  • Not
  • And
  • Or
  • Xor
  • Mux
  • DMux
  • Not16
  • And16
  • Or16
  • Mux16
  • Or8Way
  • Mux4Way16
  • Mux8Way16
  • DMux4Way
  • DMux8Way

Combinatorial Chips

  • HalfAdder
  • FullAdder
  • Add16
  • Inc16
  • ALU

Sequential Chips

  • DFF
  • Bit
  • Register
  • RAM8
  • RAM64
  • RAM512
  • RAM4K
  • RAM16K
  • PC

Computer Architecture

  • Memory
  • CPU
  • Computer

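Status: Elementary Gates, Combinatorial Chips, and Sequential Chips are done; Computer Architecture is the subject of this lecture.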

slide-40
SLIDE 40

The Hack Computer

Main parts of the Hack computer:

Instruction memory (ROM)

Memory (RAM):

  • Data memory
  • Screen (memory map)
  • Keyboard (memory map)

CPU

Computer (the framework that holds everything together).

  • A 16-bit Harvard platform (data and instructions are separate)
  • The instruction memory and the data memory are physically separate
  • Screen: 256 rows by 512 columns, black and white
  • Keyboard: standard (memory-mapped to a specific RAM address)
  • Designed to execute programs written in the Hack machine language
  • Can be easily built from the chip-set that we have built so far in the course
slide-41
SLIDE 41

Where We Are Headed

[Diagram: Instruction Memory (ROM32K), CPU, and Data Memory (Memory), connected by the signals instruction, inM, outM, writeM, addressM, and pc; reset is the only external input.]

CHIP Computer {
    IN reset;
    PARTS:
    // implementation missing
}

slide-42
SLIDE 42

Instruction Memory

[Diagram: the ROM32K chip, with a 15-bit address input and a 16-bit out output.]

Function:

The ROM is pre-loaded with a program written in the Hack machine language

The ROM chip always emits a 16-bit number:

out = ROM32K[address]

This number is interpreted as the current instruction.

A ROM of 2^n words is implemented using 8 ROMs of 2^(n-3) words each, an 8:1 multiplexor, and a 1:8 demultiplexor.
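
The address split behind this construction can be sketched in Python. The function names below are hypothetical, and the sketch models only the read path (the 8:1 multiplexor): the top 3 address bits pick one of the 8 smaller memories, and the remaining n-3 bits address a word inside it.

```python
def split_address(address, n):
    """Split an n-bit address into (bank, offset) for 8 banks of 2**(n-3) words."""
    offset_bits = n - 3
    bank = address >> offset_bits                   # top 3 bits select one of 8 banks
    offset = address & ((1 << offset_bits) - 1)     # low bits address a word in the bank
    return bank, offset


def rom_read(banks, address, n):
    """Every bank sees the same offset; the 8:1 multiplexor keeps one output."""
    bank, offset = split_address(address, n)
    bank_outputs = [b[offset] for b in banks]       # all 8 banks emit a word
    return bank_outputs[bank]                       # 8:1 mux driven by the top 3 bits


# A 32K ROM (n = 15) built from eight 4K banks, each word holding its own address.
banks = [[(b << 12) | i for i in range(4096)] for b in range(8)]
word = rom_read(banks, 0x5ABC, 15)
```

Applying the same split recursively to each bank gives the usual hierarchy of progressively smaller memories.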

slide-43
SLIDE 43

Data Memory

Function:

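When read, the RAM16K chip always emits a 16-bit value:

out = RAM16K[address]

To write, a 16-bit in value is presented and load is asserted; the value is stored when the clock ticks:

if load then RAM16K[address] = in (at the next clock tick)

A RAM of 2^n words is implemented using 8 RAMs of 2^(n-3) words each (or 4 RAMs of 2^(n-2) words each, as when RAM16K is built from RAM4K chips), a multiplexor to select the output, and a demultiplexor to route load.

[Diagram: the RAM16K chip, with in (16 bits), address, and load inputs and a 16-bit out output.]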
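
The read and write rules of the data memory can be sketched in Python. The class name ClockedRam and its set_inputs and tick methods are hypothetical, chosen for illustration; the sketch models only the behavior (a write takes effect at the clock tick, and only when load is asserted), not the gate-level construction.

```python
class ClockedRam:
    """Behavioral model of a clocked RAM holding 16-bit words."""

    def __init__(self, size):
        self.words = [0] * size
        self.pending = None                 # write latched for the next tick, if any

    def read(self, address):
        return self.words[address]          # out = RAM[address], available at any time

    def set_inputs(self, address, value, load):
        # The in, address, and load pins are sampled now and committed at the tick.
        self.pending = (address, value & 0xFFFF) if load else None

    def tick(self):
        if self.pending is not None:
            address, value = self.pending
            self.words[address] = value     # if load then RAM[address] = in
            self.pending = None


ram = ClockedRam(16384)
ram.set_inputs(5, 42, load=True)
before_tick = ram.read(5)                   # still the old contents
ram.tick()
after_tick = ram.read(5)                    # the new value is visible after the clock
```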

slide-44
SLIDE 44

Data Memory

Low-level (hardware) read/write logic:

To read RAM[k]: set address to k, probe out

To write RAM[k]=x: set address to k, set in to x, set load to 1, run the clock

High-level (OS) read/write logic:

To read RAM[k]: use the OS command out = peek(k)

To write RAM[k]=x: use the OS command poke(k,x)

peek and poke are OS commands whose implementation should effect the same behavior as the low-level commands. More about peek and poke later in the course, when we write the OS.

slide-45
SLIDE 45

Screen

The Screen chip emulates basic RAM chip functionality:

  • read logic: out = Screen[address]
  • write logic: if load then Screen[address] = in

Side effect: Continuously refreshes a 256 by 512 black-and-white screen device

[Diagram: the Screen chip, with in (16 bits), address, and load inputs and a 16-bit out output, driving the physical screen.]

The bit contents of the Screen chip are called the “screen memory map”

When loaded into the hardware simulator, the built-in Screen.hdl chip opens up a screen window showing the simulated 256 by 512 B&W screen; the simulator then refreshes this window from the screen memory map several times each second.

slide-46
SLIDE 46

Screen Memory Map

How to set the (row,col) pixel of the screen to black or to white:

  • Low-level (machine language): Set the col%16 bit of the word found at Screen[row*32+col/16] to 1 or to 0 (col/16 is integer division)
  • High-level: Use the OS command drawPixel(row,col) (effects the same operation, discussed later in the course, when we write the OS).

[Figure: the screen memory map. Each row of 512 pixels occupies 32 consecutive 16-bit words: row 0 is stored in words 0 to 31, row 1 in words 32 to 63, and row 255 in words 8160 to 8191. The physical screen is refreshed from this map several times each second.]

In the Hack platform, the screen is implemented as an 8K 16-bit RAM chip, mapped into the data memory starting at address 16384 (0x4000).
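
The low-level pixel rule can be sketched in Python as follows. The function name set_pixel is hypothetical, and the sketch assumes that bit number col%16 is counted from the least significant bit of the word and that 1 means black.

```python
SCREEN_WORDS = 8192      # the 8K words of the screen memory map
WORDS_PER_ROW = 32       # 512 pixels per row / 16 pixels per word


def set_pixel(screen, row, col, black):
    """Set or clear one pixel; returns the index of the word that was changed."""
    if not (0 <= row < 256 and 0 <= col < 512):
        raise ValueError("pixel out of range")
    index = row * WORDS_PER_ROW + col // 16      # Screen[row*32 + col/16]
    mask = 1 << (col % 16)                       # assumption: bit 0 is the LSB
    if black:
        screen[index] |= mask                    # set the bit to 1
    else:
        screen[index] &= ~mask & 0xFFFF          # clear the bit to 0
    return index


screen = [0] * SCREEN_WORDS
first = set_pixel(screen, 0, 0, True)            # top-left pixel
last = set_pixel(screen, 255, 511, True)         # bottom-right pixel
```

To address the same word through the data memory, add the screen base address 16384 to the returned index.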

slide-47
SLIDE 47

Keyboard

Keyboard chip: a single memory-mapped 16-bit register

Input: scan-code (16-bit value) of the currently pressed key, or 0 if no key is pressed

Output: same

[Diagram: the Keyboard chip, with a single 16-bit out output.]

How to read the keyboard:

  • Low-level (hardware): probe the contents of the Keyboard chip at RAM location 24576 (0x6000)
  • High-level: use the OS command keyPressed() (effects the same operation, discussed later in the course, when we write the OS).

Special keys: [table of special-key codes not preserved]

The keyboard is implemented as a built-in Keyboard.hdl chip. When this Java chip is loaded into the simulator, it connects to the regular keyboard and pipes the scan-code of the currently pressed key to the keyboard memory map.

[Screenshot: the simulated keyboard and its enabler button.]

slide-48
SLIDE 48

Memory Physical Implementation

Access logic:

Access to any address from 0 to 16,383 results in accessing the RAM16K chip-part

Access to any address from 16,384 to 24,575 results in accessing the Screen chip-part

Access to address 24,576 results in accessing the keyboard chip-part

Access to any other address is invalid.

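[Diagram: the Memory chip, with in (16 bits), address (15 bits), and load inputs and a 16-bit out output. Internally, addresses 0 to 16383 map to RAM16K (16K memory chip), addresses 16384 to 24575 map to Screen (8K memory chip), and address 24576 maps to Keyboard (one register). The Screen and Keyboard chip-parts connect to the screen and keyboard devices.]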

The Memory chip is essentially a package that integrates the three chip-parts RAM16K, Screen, and Keyboard into a single, contiguous address space. This packaging effects the programmer’s view of the memory, as well as the necessary I/O side-effects.
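
The access logic of this slide can be sketched in Python as a small class that routes each address to one of the three parts. The class and attribute names are hypothetical. Rejecting writes to the keyboard address is a choice made for this sketch; the slide only states that accesses beyond 24,576 are invalid.

```python
RAM_SIZE = 16384         # addresses 0..16383
SCREEN_BASE = 16384      # addresses 16384..24575
SCREEN_SIZE = 8192
KEYBOARD = 24576         # single register


class Memory:
    """One contiguous address space over RAM16K, Screen, and Keyboard."""

    def __init__(self):
        self.ram = [0] * RAM_SIZE
        self.screen = [0] * SCREEN_SIZE
        self.keyboard = 0                      # would be driven by the keyboard device

    def read(self, address):
        if 0 <= address < SCREEN_BASE:
            return self.ram[address]
        if SCREEN_BASE <= address < KEYBOARD:
            return self.screen[address - SCREEN_BASE]
        if address == KEYBOARD:
            return self.keyboard
        raise ValueError(f"invalid address {address}")

    def write(self, address, value):
        value &= 0xFFFF
        if 0 <= address < SCREEN_BASE:
            self.ram[address] = value
        elif SCREEN_BASE <= address < KEYBOARD:
            self.screen[address - SCREEN_BASE] = value
        else:
            # Sketch assumption: the keyboard is read-only, everything above is invalid.
            raise ValueError(f"cannot write address {address}")


memory = Memory()
memory.write(0, 7)                  # lands in RAM16K
memory.write(16384, 0xFFFF)         # lands in the first screen word
memory.keyboard = 65                # pretend a key with scan-code 65 is pressed
```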

slide-49
SLIDE 49

Memory: Programmer’s View

Using the memory:

To record or recall values (e.g. variables, objects, arrays), use the first 16K words of the memory: (0x0000-3FFF)

To write to the screen (or read the screen), use the next 8K words of the memory: (0x4000-5FFF)

To read which key is currently pressed, use the next word of the memory: (0x6000).

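[Diagram: the Memory address space, divided into data, the screen memory map, and the keyboard map, with the latter two tied to the Screen and Keyboard devices.]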

slide-50
SLIDE 50

LogicCircuit Keyboard Configuration


slide-51
SLIDE 51

LogicCircuit Configuration for the Hack Screen


slide-52
SLIDE 52

LogicCircuit Configuration of the Hack Instruction Memory


slide-53
SLIDE 53

LogicCircuit Configuration of the Hack Data Memory.


slide-54
SLIDE 54

CPU Operation

[Diagram: the CPU chip. Inputs: instruction (16 bits, from instruction memory), inM (16 bits, from data memory), and reset (1 bit). Outputs: outM (16 bits), writeM (1 bit), and addressM (15 bits), all to data memory; pc (15 bits), to instruction memory.]

CPU internal components (invisible in this chip diagram): ALU and 3 registers: A, D & PC

CPU execute logic:

The CPU executes the instruction according to the Hack language specification:

The D and A values, if they appear in the instruction, are read from (or written to) the respective CPU-resident registers

The M value, if there is one in the instruction operand, is read from inM

If the instruction’s result includes M, then the ALU output is placed in outM, the value of the CPU-resident A register is placed in addressM, and writeM is asserted.

The instruction input carries a Hack machine language instruction like M=D+M, represented as a 16-bit value.
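
The rule for results that include M can be sketched in Python. The function name memory_outputs is hypothetical; the sketch assumes the standard Hack encoding, in which bit 15 marks a C-instruction and bit 3 (d3) is the "store in M" destination bit.

```python
def memory_outputs(instruction, alu_out, a_register):
    """Compute the CPU's three data-memory outputs for one instruction."""
    is_c_instruction = (instruction >> 15) & 1
    d3 = (instruction >> 3) & 1                       # destination includes M
    return {
        "outM": alu_out & 0xFFFF,                     # ALU output offered to memory
        "addressM": a_register & 0x7FFF,              # A register selects the word
        "writeM": bool(is_c_instruction and d3),      # asserted only when dest has M
    }


# M=D+M encodes as 1111 0000 1000 1000 (0xF088).
outputs = memory_outputs(0xF088, alu_out=7, a_register=100)
```

Note that outM and addressM carry values on every cycle; only writeM decides whether the memory actually stores anything.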

slide-55
SLIDE 55

CPU Operation

[Diagram: the same CPU chip interface as on the previous slide.]

CPU internal components (invisible in this chip diagram): ALU and 3 registers: A, D, PC

CPU fetch logic:

Recall that:

  • the instruction may include a jump directive (if the jump bits are non-zero)
  • the ALU emits two control bits, indicating if the ALU output is zero or less than zero

If reset==0: the CPU uses this information (the jump bits and the ALU control bits) as follows: if there should be a jump, the PC is set to the value of A; else, PC is set to PC+1

If reset==1: the PC is set to 0 (thus restarting the computer)
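
The fetch logic can be sketched in Python. The function names are hypothetical; zr and ng stand for the two ALU control bits (output is zero, output is negative), and the sketch assumes the standard Hack meaning of the jump bits: j1 jumps on a negative output, j2 on zero, j3 on a positive output.

```python
def should_jump(jump_bits, zr, ng):
    """Combine the 3 jump bits with the ALU flags."""
    j1 = (jump_bits >> 2) & 1        # jump if out < 0
    j2 = (jump_bits >> 1) & 1        # jump if out == 0
    j3 = jump_bits & 1               # jump if out > 0
    positive = not zr and not ng
    return bool((j1 and ng) or (j2 and zr) or (j3 and positive))


def next_pc(pc, a_register, jump_bits, zr, ng, reset):
    """Value loaded into the PC at the end of the cycle."""
    if reset:
        return 0                                 # restart the program
    if should_jump(jump_bits, zr, ng):
        return a_register & 0x7FFF               # PC = A
    return (pc + 1) & 0x7FFF                     # PC = PC + 1
```

For example, with jump bits 001 (JGT) the PC takes the value of A only when the ALU output is neither zero nor negative.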

slide-56
SLIDE 56

The C-instruction Revisited

Symbolic: dest = comp; jump

Binary: 1 1 1 a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3

(a and c1..c6 are the comp bits, d1..d3 the dest bits, and j1..j3 the jump bits)
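
The bit layout can be checked with a short Python sketch that pulls the fields out of a 16-bit instruction. The function name decode and the dictionary keys are hypothetical; the sketch also assumes the standard Hack convention that an instruction whose top bit is 0 is an A-instruction carrying a 15-bit value.

```python
def decode(instruction):
    """Split a 16-bit Hack instruction into its fields."""
    if (instruction >> 15) & 1 == 0:
        return {"type": "A", "value": instruction & 0x7FFF}
    return {
        "type": "C",
        "a": (instruction >> 12) & 1,          # selects A or M as the ALU operand
        "comp": (instruction >> 6) & 0x3F,     # c1..c6
        "dest": (instruction >> 3) & 0x7,      # d1 d2 d3
        "jump": instruction & 0x7,             # j1 j2 j3
    }


fields = decode(0b1111000010001000)            # M=D+M
```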

slide-57
SLIDE 57

CPU Implementation

[CPU schematic, showing the ALU, the D, A, and PC registers, and two multiplexors (one choosing the A register's input from the instruction or the ALU output, one choosing A or M as an ALU operand), with inputs instruction, inM, and reset and outputs outM, writeM, addressM, and pc. Control bits are marked “C”.]

Binary instruction format: 1 1 1 a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3 (symbolic: dest = comp; jump)

Cycle:

  • Execute
  • Fetch

Execute logic:

  • Decode
  • Execute

Fetch logic: If there should be a jump, set PC to A; else set PC to PC+1

Resetting the computer: Set reset to 1, then set it to 0.

CPU Schematic:

  • Includes most of the CPU’s execution logic
  • The CPU’s control logic is hinted: each circled “c” represents one or more control bits, taken from the instruction
  • The “decode” bar does not represent a chip, but rather indicates that the instruction bits are decoded (somehow).

slide-58
SLIDE 58

[CPU schematic and C-instruction format as on the previous slide.]

CPU Implementation Issue #1: How to interpret instructions like A=A+1;JMP or A=M;JMP?

The CPU as diagrammed in the book uses the initial value of the A-register for the jump address, which is perhaps counter-intuitive. An alternative implementation would use the new value being loaded into A (computed from A or M) as the jump address. One could implement this alternative by adding a multiplexor that selects the A-register input value for the jump address during instructions that change the A-register.

HDL Implementation of this alternative:

    // If A is changing, use its new value for the target address
    Mux16 (sel=loadA, a=aReg, b=aIn, out=jmpAddr);
    PC (in=jmpAddr, reset=reset, inc=true, load=jmp, out[0..14]=pc);
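
The difference between the two interpretations can be sketched in Python. The function jump_address and its use_new_a flag are hypothetical; the parameter names a_reg, a_in, and load_a mirror the aReg, aIn, and loadA pins of the HDL fragment on this slide.

```python
def jump_address(a_reg, a_in, load_a, use_new_a):
    """Pick the value offered to the PC as the jump target.

    use_new_a=False models the CPU as diagrammed in the book (old A).
    use_new_a=True models the alternative: Mux16(sel=loadA, a=aReg, b=aIn).
    """
    if use_new_a and load_a:
        return a_in          # A is changing this cycle: jump to its new value
    return a_reg             # otherwise jump to the current A


# A=A+1;JMP with A holding 10: the ALU offers 11 as the new A value.
book_target = jump_address(a_reg=10, a_in=11, load_a=True, use_new_a=False)
alternative_target = jump_address(a_reg=10, a_in=11, load_a=True, use_new_a=True)
```

Instructions that do not load A behave the same under both readings, so the choice only matters for the combined "change A and jump" case.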

slide-59
SLIDE 59

[CPU schematic and C-instruction format as on the previous slides, with an added Instruction Register.]

CPU Implementation Issue #2: How to handle potential instruction execution asynchrony?

The CPU as diagrammed in the book glosses over the order in which the A, D, and PC registers, and memory, change. An as-described implementation of the Hack architecture using a simulator (or hardware) that notices this asynchrony can create problems, e.g., if the PC changes as the result of a jump before the instruction has completely finished executing. One way to address this issue would be to add an instruction register that would remain stable throughout the instruction execution cycle.

slide-60
SLIDE 60

Computer Interface

[Diagram: the Computer chip, with a single reset input, connected to the Keyboard and Screen devices.]

slide-61
SLIDE 61

Computer Implementation

[Diagram: Instruction Memory (ROM32K), CPU, and Data Memory (Memory), connected by the signals instruction, inM, outM, writeM, addressM, and pc; reset is the only external input.]

CHIP Computer {
    IN reset;
    PARTS:
    // implementation missing
}

Implementation: Simple, the chip-parts do all the work.
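
How the three parts cooperate can be sketched as a small behavioral emulator in Python. This is not the HDL solution, which the slide deliberately leaves as an exercise. The class and method names are hypothetical, the data memory is simplified to a flat list, and the ALU follows the standard Hack control bits (zx, nx, zy, ny, f, no), which are assumed here rather than taken from this slide.

```python
def alu(x, y, zx, nx, zy, ny, f, no):
    """Hack ALU on 16-bit words; returns (out, zr, ng)."""
    if zx: x = 0
    if nx: x = ~x & 0xFFFF
    if zy: y = 0
    if ny: y = ~y & 0xFFFF
    out = (x + y) & 0xFFFF if f else x & y
    if no: out = ~out & 0xFFFF
    return out, out == 0, bool(out >> 15)


class Computer:
    def __init__(self, program):
        self.rom = list(program)          # instruction memory
        self.ram = [0] * 24577            # RAM16K + Screen + Keyboard addresses
        self.a = self.d = self.pc = 0

    def step(self, reset=False):
        if reset:
            self.pc = 0
            return
        instruction = self.rom[self.pc]               # fetch, addressed by pc
        if not instruction >> 15:                     # A-instruction: load A
            self.a = instruction
            self.pc += 1
            return
        a_bit = (instruction >> 12) & 1
        c = [(instruction >> b) & 1 for b in range(11, 5, -1)]    # c1..c6
        y = self.ram[self.a] if a_bit else self.a     # inM is read only when used
        out, zr, ng = alu(self.d, y, *c)
        d1, d2, d3 = [(instruction >> b) & 1 for b in (5, 4, 3)]
        j1, j2, j3 = [(instruction >> b) & 1 for b in (2, 1, 0)]
        jump = (j1 and ng) or (j2 and zr) or (j3 and not zr and not ng)
        target = self.a                   # book behaviour: the old A is the jump target
        if d3: self.ram[self.a] = out     # writeM, with addressM = old A
        if d2: self.d = out
        if d1: self.a = out
        self.pc = target if jump else self.pc + 1


# @2, D=A, @3, D=D+A, @0, M=D: stores 2 + 3 in RAM[0].
add_program = [0x0002, 0xEC10, 0x0003, 0xE090, 0x0000, 0xE308]
computer = Computer(add_program)
for _ in range(len(add_program)):
    computer.step()
```

The emulator holds only three pieces of CPU state (A, D, PC), which matches the slide's remark that the chip-parts do all the work and the Computer chip merely wires them together.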

slide-62
SLIDE 62

LogicCircuit Implementation of the Hack Computer

slide-63
SLIDE 63

Perspective: What We Have Left Out

  • Caching
  • Instruction pipelining
  • More I/O units
  • Special-purpose processors (I/O, graphics, communications, …)
  • Multi-core / parallelism
  • Efficiency considerations
  • Energy consumption considerations
  • And a bunch of other stuff (take a good computer architecture course!)