SLIDE 1 Foundations of Global Networked Computing: Building a Modern Computer From First Principles
IWKS 3300: NAND to Tetris Spring 2019 John K. Bennett
This course is based upon the work of Noam Nisan and Shimon Schocken. More information can be found at (www.nand2tetris.org).
Sequential Circuits
SLIDE 2
First, some follow-up from last time…
SLIDE 3 Demonstrating the Book’s ALU
[Figure: the book’s ALU datapath. Input stages: x=0, x=!x, y=0, y=!y; function: + / AND; status outputs: if out = 0, zr = 1; if out < 0, ng = 1]
SLIDE 4
The Book’s ALU
SLIDE 5
4:1 Multiplexor as a Function Generator
A-D are the Function Select; S0 and S1 are the Variables
SLIDE 6
A B C D   Function
0 0 0 0   0
0 0 0 1   S0 • S1
0 0 1 0   S0 • !S1
0 0 1 1   S0
0 1 0 0   !S0 • S1
0 1 0 1   S1
0 1 1 0   S0 XOR S1
0 1 1 1   S0 + S1
1 0 0 0   !S0 • !S1
1 0 0 1   !(S0 XOR S1)
1 0 1 0   !S1
1 0 1 1   S0 + !S1
1 1 0 0   !S0
1 1 0 1   !S0 + S1
1 1 1 0   !S0 + !S1
1 1 1 1   1
4:1 Multiplexor as a Function Generator
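The function-generator idea can be checked mechanically. The following is a minimal Python sketch, assuming the select wiring that the function column implies: A is passed when S0 = 0 and S1 = 0, B when S0 = 0 and S1 = 1, C when S0 = 1 and S1 = 0, and D when S0 = 1 and S1 = 1. The names `mux4` and `truth_table` are illustrative only and are not part of the course tools.

```python
from itertools import product


def mux4(a, b, c, d, s0, s1):
    """4:1 multiplexor: (s0, s1) picks one of the data inputs a..d.

    Assumed wiring: index 2*s0 + s1 selects a, b, c, d in that order.
    """
    return (a, b, c, d)[2 * s0 + s1]


def truth_table(a, b, c, d):
    """Outputs for (S0, S1) = 00, 01, 10, 11 with A-D held constant."""
    return [mux4(a, b, c, d, s0, s1) for s0, s1 in product((0, 1), repeat=2)]


# Holding A-D constant turns the mux into a fixed two-variable function.
xor_out = truth_table(0, 1, 1, 0)   # A B C D = 0 1 1 0 -> S0 XOR S1
and_out = truth_table(0, 0, 0, 1)   # A B C D = 0 0 0 1 -> S0 AND S1
or_out = truth_table(0, 1, 1, 1)    # A B C D = 0 1 1 1 -> S0 OR S1
print(xor_out, and_out, or_out)
```

Each of the 16 settings of A-D yields one of the 16 possible Boolean functions of two variables, which is the same principle a lookup table in programmable logic relies on.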
SLIDE 7
Programmable Logic
SLIDE 8
Programmable Logic Example 16R4 PAL
SLIDE 9
Field Programmable Gate Array (FPGA)
SLIDE 10
FPGA Logic Arrays (simple and complex)
SLIDE 11
FPGA Special Cells
Soft (e.g., MicroBlaze) and hard (e.g., PowerPC) processors
Special pre-configured cells
SLIDE 12
Metal Oxide Transistor
SLIDE 13 Building Gates on Silicon
NAND
By Reza Mirhosseini - originally uploaded to en.wikipedia
SLIDE 14
Building Gates on Silicon
5 x 2:1 Multiplexor
SLIDE 15
Now, back to sequential circuits…
SLIDE 16
Sequential VS Combinatorial Circuits
Combinatorial devices: outputs are a function of inputs only. Input changes propagate to the output after a finite (usually small) propagation delay.
Sequential devices: outputs are a function of both inputs and current state. Inputs and current state are periodically sampled and new outputs are computed.
Sequential devices are sometimes called “clocked devices,” because the periodic sampling is typically initiated by a “clock” signal, such as a square wave.
The low-level behavior of sequential circuits, particularly circuits that employ feedback, can be complex to analyze and design.
The good news:
- In the Hack computer, all sequential chips are based upon a single low-level sequential gate, called a “D flip flop”, or DFF.
- Clock-dependency details are encapsulated at the DFF level (almost).
- Higher-level sequential chips are built on top of DFF gates using combinatorial logic only.
SLIDE 17 Sequential VS combinational logic
Combinational chip: in -> comb. logic -> out
out = some function of (in)
Sequential chip: in -> comb. logic (optional) -> DFF gate(s) (time delay) -> comb. logic (optional) -> out
out(t) = some function of (in(t-1), out(t-1))
Best to not have combinatorial logic on both sides. The Book’s HDL simulator is OK with this; LogicCircuit less so. Why?
SLIDE 18
Basic S-R Flip Flop (no clock)
SLIDE 19 The Clock
[Figure: clock signal, a square wave of repeating cycles; each cycle is a tick followed by a tock]
In Hack jargon, a clock cycle = a tick-phase (low), followed by a tock-phase (high).
In real hardware, the clock is implemented by an oscillator (a special circuit that, well, oscillates).
In the Hack hardware simulator, clock cycles can be simulated either:
- Manually, by the user, or
- “Automatically,” by a test script.
SLIDE 20
Implementing a Crystal Oscillator with NAND Gates
Practical Note: Design for 2x desired frequency and divide by 2; (Xtal oscillators have accurate frequency, but asymmetric period)
SLIDE 21
Implementing a Divide by Two
SLIDE 22 The Book’s D Flip-flop
A fundamental state-keeping device.
The book assumes that the DFF implementation is magic, but in fact it can be readily implemented using NAND gates.
In the Hack computer, memory devices are made from numerous flip-flops, all regulated by the same master clock signal.
Notational convention (we sometimes omit the “CLK”):
[Figure: a sequential chip drawn with an explicit clock signal is equivalent to the shorthand notation for the same chip; example: a DFF with input in]
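The DFF's clocked behavior, out(t) = in(t-1), can be sketched behaviorally in Python. This models only the tick/tock convention described earlier, not a NAND-level implementation; the class `DFF`, its methods, and the helper `run` are hypothetical names chosen for illustration.

```python
class DFF:
    """Behavioral D flip-flop: out(t) = in(t-1)."""

    def __init__(self):
        self.out = 0      # value visible to the rest of the circuit
        self._next = 0    # value sampled during the tick phase

    def tick(self, d):
        """First half of the cycle: sample the input; out is unchanged."""
        self._next = d

    def tock(self):
        """Second half of the cycle: commit the sampled value to out."""
        self.out = self._next


def run(dff, inputs):
    """Apply one input per clock cycle; record out as seen in each cycle."""
    seen = []
    for d in inputs:
        seen.append(dff.out)  # output visible while d is being applied
        dff.tick(d)
        dff.tock()
    return seen


print(run(DFF(), [1, 0, 1, 1]))
```

The recorded outputs are the inputs delayed by one cycle, with the assumed initial state 0 in the first position.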
SLIDE 23 Two Ways to Implement the Book’s D Flip-flop: 1
SLIDE 24 Two Ways to Implement the Book’s D Flip-flop: 2
SLIDE 25 You Do Not Have to Implement the Book’s D Flip-flop
(unless you want to simulate inside LogicCircuit)
Just download (or create) the Nand and DFF templates.
SLIDE 26
A Real D Flip Flop With Asynchronous Set’ and Reset’
SLIDE 27 Flip Flops in the Real World
The critical design parameters for a flip flop are setup and hold times, and propagation delay.
- Setup Time (Tsu): the length of time before the clock edge that inputs must be stable
- Hold Time (Th): the length of time after the clock edge that inputs must be stable
- Propagation Delay (Pd): the time it takes for outputs to become stable after the clock edge
Maximum frequency can be calculated: Fmax = 1 / (Pd + Tsu)
Fmax Example: 74F74
- Tsu = 3 ns (from data sheet)
- Th = 1 ns (from data sheet)
- Pd = 9 ns (from data sheet)
- Fmax = 1 / (9 ns + 3 ns) = 1 / (12 ns) ≈ 83 MHz
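The Fmax arithmetic above can be reproduced with a few lines of Python. This is just the slide's formula, Fmax = 1 / (Pd + Tsu), applied to the quoted 74F74 figures; the function name `fmax_hz` is an illustrative choice.

```python
NS = 1e-9  # one nanosecond, in seconds


def fmax_hz(pd_s, tsu_s):
    """Maximum clock frequency in Hz from propagation delay and setup time."""
    return 1.0 / (pd_s + tsu_s)


# 74F74 values quoted on the slide: Pd = 9 ns, Tsu = 3 ns.
f = fmax_hz(pd_s=9 * NS, tsu_s=3 * NS)
print(f"{f / 1e6:.1f} MHz")
```

Note that the hold time Th does not enter this formula; it constrains how soon the input may change after the clock edge rather than how fast the clock may run.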
Metastability: What happens if Tsu or Th are violated
This is a worst case example
SLIDE 28
D Flip Flop With Asynchronous Set’ and Reset’ From NANDs
CLK Q Q-
½ of 74x74
SLIDE 29
Divide By Two Circuit (book’s simulator will not accept this)
Q Q- D CLK
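A behavioral sketch of a divide-by-two in Python follows. It assumes the usual wiring, in which Q- is fed back to D of a positive-edge-triggered D flip flop; the figure is not reproduced in this text, so that wiring is an assumption, and `divide_by_two` is a hypothetical name.

```python
def divide_by_two(clk_samples):
    """Model a positive-edge-triggered D flip flop with D wired to Q-.

    clk_samples is a sequence of 0/1 clock levels; the result is Q at
    each sample. Q toggles on every rising clock edge.
    """
    q = 0
    prev = 0
    out = []
    for clk in clk_samples:
        if prev == 0 and clk == 1:  # rising edge detected
            q = 1 - q               # D = Q-, so Q takes the opposite value
        prev = clk
        out.append(q)
    return out


clk = [0, 1] * 4
print(divide_by_two(clk))
```

Q completes one full cycle for every two input clock cycles, and its high and low phases are equal in length regardless of the input's duty cycle, which is why the design-for-2x-and-divide approach yields a symmetric clock.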
SLIDE 30
Implementing Edge Triggering (74x74 is edge triggered)
SLIDE 31
Q: Are there other kinds of FF’s?
A: Yes
We have covered:
- SR (NAND) FF
- D (NAND) FF
- D (NAND) with async. S&R
- Positive-edge-triggered D FF
Some others:
- JK FF
- D FF with sync S&R
- Toggle FF
- SR NOR FF
- SR AND-OR FF
- Gated SR FF
- Earle latch
- Master–slave D FF
- T FF
- Gated versions of all of these
SLIDE 32
What’s a JK Flip Flop?
SLIDE 33
JK Flip Flops Require Care in Design
SLIDE 34 Who invented the JK Flip Flop?
- a. Jack Kilby (what some think)
- b. Edward Nelson (what others think)
- c. We don’t know
Answer: I’m going to go with (c.). Here’s why:
1. Jack Kilby’s introduction at Rice University in 1973: “He has been credited with the invention of the JK flip flop.” Lots of web sites state that Kilby invented JK FFs. Kilby himself never made this claim. But he did win a Nobel Prize in Physics for invention of the integrated circuit.
2. Kilby started at TI in 1958, and JK FFs are mentioned in a 1953 patent (US 2,850,566) by Edward C. Nelson. Some web sites use this datum to assert that Nelson invented JK FFs.
3. But Nelson does not claim invention of JK FFs in the patent.
SLIDE 35 What’s in a Patent?
- 1. Specification – invention background and description
- 2. Claims – what is unique about the invention
Nelson’s Patent: “HIGH-SPEED PRINTING SYSTEM”
In the Specification: “…Each flip-flop or bistable multivibrator includes two input terminals, hereinafter termed the j-input and the k-input terminals, respectively, and two output terminals for producing complementary bi-valued electrical output signals hereinafter termed Q and Q̄, respectively. Signals applied separately to the j-input and k-input terminals set the flip-flop to conduction states corresponding to the binary values one and zero, respectively, while signals applied simultaneously to both input terminals trigger or change the conduction state of the flip-flop…”
Flip-flops are not mentioned in the claims, which all relate to printing. “What is claimed as new is: 1. A high-speed printing system for printing on a printing medium a line of intelligence information …”
SLIDE 36
The JK Flip-Flop was Patented in 1969 (filed in 1966)
Miller did not disclose Nelson’s patent to the examiner. Miller’s patent is likely invalid due to obviousness and prior art.
SLIDE 37 JKB’s Best Guess
The JK Flip Flop was likely invented sometime in the early days of vacuum tube computing by an engineer who was too busy getting things done to worry about patents. The idea probably became common knowledge to digital designers by the early 1950s.
…Miller’s patent probably should never have issued.
We now return to your regularly scheduled programming…
SLIDE 38 1-bit register (the book calls it “Bit”)
Objective: build a storage unit that can:
(a) Change its state to a given input
(b) Maintain its state over time (until changed)
Basic building block: DFF (input: in)
out(t) = out(t-1) ?
out(t) = in(t-1) ?
Bit (inputs: in, load)
if load(t-1) then out(t)=in(t-1) else out(t)=out(t-1)
The book and the book’s simulator prohibit such feedback, but in practice feedback can be very useful (e.g., divide-by-2), although you cannot tie two normal outputs together.
SLIDE 39 Bit register (cont.)
Interface: Bit (inputs: in, load)
if load(t-1) then out(t)=in(t-1) else out(t)=out(t-1)
Implementation: DFF and MUX (inputs: in, load)
- Load bit
- Read logic
- Write logic
SLIDE 40 Bit Register Implementation
[Figure: Bit implementation using a DFF and a MUX, with inputs in and load]
SLIDE 41 Better Bit Register Implementation
[Figure: logical design of a Bit built from a DFF (red circle) and a 2:1 mux (green circle), with inputs in and load]
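The Bit specification, "if load(t-1) then out(t)=in(t-1) else out(t)=out(t-1)", can be sketched in Python as a 2:1 mux feeding a one-bit state. This is a behavioral model under the assumption that the mux chooses between the stored value and the new input; `mux` and `Bit.step` are illustrative names, not the simulator's API.

```python
def mux(a, b, sel):
    """2:1 multiplexor: returns a when sel is 0, b when sel is 1."""
    return b if sel else a


class Bit:
    """1-bit register: a DFF whose input comes from a 2:1 mux."""

    def __init__(self):
        self.out = 0  # the DFF's stored value

    def step(self, inp, load):
        """Advance one clock cycle and return the new output.

        With load = 0 the mux recirculates the old output (state is kept);
        with load = 1 the mux passes the new input (state is changed).
        """
        self.out = mux(self.out, inp, load)
        return self.out


bit = Bit()
trace = [bit.step(1, 0), bit.step(1, 1), bit.step(0, 0), bit.step(0, 1)]
print(trace)
```

The recirculating path through the mux is what lets a DFF, which by itself only delays its input by one cycle, hold a value indefinitely.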
SLIDE 42 Multi-bit register
1-bit register: Bit (inputs: in, load)
if load(t-1) then out(t)=in(t-1) else out(t)=out(t-1)
w-bit register: w Bit chips side by side (inputs: in (w bits), load; output: w bits)
if load(t-1) then out(t)=in(t-1) else out(t)=out(t-1)
- Register’s width: a trivial parameter
- Read logic
- Write logic
SLIDE 43
Multi-Bit (16 bit) Register
SLIDE 44
Aside: Hardware Simulation
Relevant topics from the HW simulator tutorial:
Clocked chips: When a clocked chip is loaded into the simulator, the clock icon is enabled, allowing clock control.
Built-in chips:
- Feature a standard HDL interface (implemented in Java)
- Provide behavioral simulation of low-level parts
- May feature GUI effects (at the simulator level only)
SLIDE 45 Random Access Memory (RAM)
[Figure: RAMn: direct access logic over n registers (register 0, register 1, register 2, ..., register n-1); inputs: in (word), address (0 to n-1), load; output: (word)]
SLIDE 46 RAM interface
[Figure: RAMn chip; inputs: in (16 bits), address (log2 n bits), load; output: 16 bits]
SLIDE 47
Book’s Implementation of RAM8
This is not how RAM is actually implemented.
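Behaviorally, a RAM8 of this kind amounts to eight registers, a demultiplexor that steers load to the addressed register, and a multiplexor that selects the addressed register's output. The following Python sketch assumes 16-bit words and uses hypothetical names (`RAM8`, `step`); it models behavior only and says nothing about how real RAM is built.

```python
class RAM8:
    """Behavioral model of an 8-word RAM with 16-bit words."""

    def __init__(self):
        self.regs = [0] * 8  # eight 16-bit registers

    def step(self, inp, load, address):
        """Advance one clock cycle and return the addressed word."""
        # Demultiplex: only the addressed register sees load = 1.
        loads = [load if i == address else 0 for i in range(8)]
        for i, ld in enumerate(loads):
            if ld:
                self.regs[i] = inp & 0xFFFF  # keep words to 16 bits
        # Multiplex: the output is the addressed register's value.
        return self.regs[address]


ram = RAM8()
ram.step(7, 1, 3)            # write 7 to address 3
print(ram.step(0, 0, 3))     # read address 3 back without writing
```

Every register receives the same in value on every cycle; the demultiplexed load signal alone decides which one actually stores it.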
SLIDE 48 How Static RAM is Implemented
- Approx. 10 transistors per cell
SLIDE 49
How Dynamic RAM is Implemented
Static RAM Cell Dynamic RAM Cell
1 transistor/cell + R/W logic
SLIDE 50 What is Tri-State?
The idea is to use two transistors to isolate the output so that multiple outputs can be tied together (as long as only one is enabled at a time).
If both transistors are off, the output is in a disconnected (actually, a high-impedance) state.
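The rule that tied-together outputs are safe only when at most one is enabled can be modelled in a few lines of Python. This is a logical sketch only: the high-impedance state is represented by `None`, electrical behavior is ignored, and `tri_state` and `bus` are hypothetical names.

```python
HIGH_Z = None  # stands in for the disconnected (high-impedance) state


def tri_state(value, enable):
    """Tri-state buffer: drives value when enabled, otherwise high-Z."""
    return value if enable else HIGH_Z


def bus(*outputs):
    """Resolve several tri-state outputs tied to one wire."""
    driven = [v for v in outputs if v is not HIGH_Z]
    if len(driven) > 1:
        # Two enabled drivers on one wire is a design error.
        raise ValueError("bus contention: more than one output enabled")
    return driven[0] if driven else HIGH_Z


# Two drivers share a wire; only the second is enabled.
print(bus(tri_state(1, 0), tri_state(0, 1)))
```

When no driver is enabled the model returns `HIGH_Z`; in a real circuit the wire would float unless a pull-up or pull-down resistor defines its level.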
SLIDE 51 Conceptual TTL Tri-State Implementation (no bias considerations)
VCC Out- Enable- In
SLIDE 52 The Book’s RAM Hierarchy
[Figure: recursive ascent. Bit chips form a Register; 8 registers form a RAM8; 8 RAM8 chips form a RAM64; and so on.]
For our projects, we will use multiplexors and demultiplexors to manage RAM input and output
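The recursive ascent can be sketched in Python as a memory built from eight smaller memories, with the high part of the address choosing the sub-memory and the low part passed down. This is a behavioral sketch assuming 16-bit words; the class `RAM` and its `levels` parameter are hypothetical, with levels = 1 standing for RAM8 and levels = 2 for RAM64.

```python
class RAM:
    """Memory of 8**levels words, built recursively from 8 smaller parts."""

    def __init__(self, levels):
        self.levels = levels
        if levels == 0:
            self.word = 0  # base case: a single 16-bit register
        else:
            self.parts = [RAM(levels - 1) for _ in range(8)]

    def step(self, inp, load, address):
        """Advance one clock cycle and return the addressed word."""
        if self.levels == 0:
            if load:
                self.word = inp & 0xFFFF
            return self.word
        span = 8 ** (self.levels - 1)     # words held by each part
        high, low = divmod(address, span)  # which part, and where inside it
        # Only the selected part sees load (demux); the others keep their
        # state, so skipping them is equivalent. Its output is returned (mux).
        return self.parts[high].step(inp, load, low)


ram64 = RAM(2)
ram64.step(42, 1, 63)          # write 42 to the last address
print(ram64.step(0, 0, 63))    # read it back
```

Splitting the address with `divmod` corresponds to using the upper three address bits for the demultiplexor and multiplexor at one level and handing the remaining bits to the level below.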
SLIDE 53 Counter
Typical function: program counter
Implementation: register chip + some combinatorial logic.
PC (counter): inputs in (w bits), inc, load, reset; output (w bits)
If reset(t-1) then out(t)=0
else if load(t-1) then out(t)=in(t-1)
else if inc(t-1) then out(t)=out(t-1)+1
else out(t)=out(t-1)
Needed: a storage device that can:
(a) set its state to some base value (LOAD)
(b) increment the state in every clock cycle (INC)
(c) maintain its state (stop incrementing) over clock cycles (NOT LOAD, INC, or RST)
(d) reset its state (RESET)
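The counter's priority rule (reset, then load, then inc, otherwise hold) can be sketched in Python. This is a behavioral model with hypothetical names (`PC`, `step`); the 16-bit default width and the wrap-around on overflow are assumptions made for the sketch.

```python
class PC:
    """Behavioral program counter with reset > load > inc priority."""

    def __init__(self, width=16):
        self.mask = (1 << width) - 1  # keeps values within w bits
        self.out = 0

    def step(self, inp=0, inc=0, load=0, reset=0):
        """Advance one clock cycle and return the new output."""
        if reset:
            self.out = 0
        elif load:
            self.out = inp & self.mask
        elif inc:
            self.out = (self.out + 1) & self.mask
        # Otherwise the state is simply held.
        return self.out


pc = PC()
trace = [
    pc.step(inc=1),                   # count up
    pc.step(inc=1),                   # count up again
    pc.step(inp=10, load=1, inc=1),   # load wins over inc
    pc.step(),                        # hold
    pc.step(inp=5, load=1, reset=1),  # reset wins over load
]
print(trace)
```

In a chip built from a register plus combinatorial logic, the same priority falls out of chaining multiplexors so that the reset mux sits closest to the register input.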
SLIDE 54
Implementing the Book’s PC Counter
SLIDE 55 Time Matters
Implications:
- Challenge: propagation delays
- Solution: clock synchronization
- Cycle length and processing speed
In actual designs, DFFs are used to synchronize asynchronous inputs. This is a complex subject.
[Figure: example circuit with registers Reg1 and Reg2, an adder (+), and signals a, b, and sel, shown above a clock signal whose cycles each consist of a tick followed by a tock]
During a tick-tock cycle, the internal states of all the clocked chips are allowed to change, but their outputs are “latched”.
At the beginning of the next cycle, the outputs of all the clocked chips in the architecture commit to the new values.
SLIDE 56 Perspective
All the memory units described in this lecture are standard (but the described implementation is not).
Typical memory hierarchy (figure axes: access time, cost):
- SRAM (“static”), typically used for the cache
- DRAM (“dynamic”), typically used for main memory
- Disk (used for non-volatile long-term storage)
- (Elaborate caching / paging algorithms)
Flip-flops can be built from NAND gates.
Real memory devices (like RAM) are highly optimized, using a great variety of storage technologies. For example, Dynamic RAM requires only one CMOS transistor (and some capacitance) per bit of memory (neglecting control logic).