Serial Parallel Multiplier Design in Quantum-dot Cellular Automata



SLIDE 1

Application Specific Processor Group ARITH18

Serial Parallel Multiplier Design in Quantum-dot Cellular Automata

Heumpil Cho and Earl E. Swartzlander, Jr. Application Specific Processor Group Department of Electrical and Computer Engineering The University of Texas at Austin Austin, TX 78712 USA

SLIDE 2

Outline

  • Motivation
  • Quantum-dot Cellular Automata
  • Serial Multiplier Designs in QCA
  • Conclusions

SLIDE 3

Motivation

Quantum-dot Cellular Automata (QCA)

  • Alternative to Transistor Technologies

– Avoids High Power Consumption Due to Leakage Currents

  • Quantum-dot Cellular Automata (QCA)

– Emerging Nanotechnology for Electronic Circuits

  • Introduced in 1993
  • Freedom from Complicated Physics

– High Density, Low Power Consumption, and Fast
– Some Experimental Devices Have Been Created

  • Several QCA Circuits Have Been Proposed

– Ripple Carry Adders, Barrel Shifters, and Memories
– Complex Designs Are Rare

  • Key Characteristics

– Inverters and 3-Input Majority Gates
– Interconnect Consumes Time and Space

  • Difficult to Estimate Timing Until the Layout is Done

– Signal Synchronization and Refresh

  • Wires Act as Latches
  • Best for Pipeline Architectures Without Feedback

SLIDE 4

QCA Technology

Basic Quantum-dot Cell

[Figure: basic quantum-dot cell showing the electrons, quantum dots, and tunnel junctions, with polarizations P = +1 (binary 1) and P = -1 (binary 0); regular cells]

  • Square Nanostructure

– Each Cell

  • Has Four Quantum Dots
  • Can Possess a Single Electron per Dot
  • Charged With Two Electrons

– Two Polarizations are Possible by Coulombic Repulsion
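
The two polarizations can be illustrated numerically. The sketch below uses the polarization measure commonly quoted in the QCA literature, P = ((ρ1 + ρ3) − (ρ2 + ρ4)) / (ρ1 + ρ2 + ρ3 + ρ4), where ρk is the charge on dot k. The dot numbering (1 to 4 around the square, so that dots 1 and 3 are diagonal) and the function name `polarization` are assumptions made for illustration; the slides themselves do not give this formula.

```python
def polarization(rho):
    """Polarization of one four-dot cell.

    rho holds the charge on dots 1..4, numbered around the square
    (an assumed convention), so dots 1/3 and 2/4 form the two diagonals.
    """
    r1, r2, r3, r4 = rho
    total = r1 + r2 + r3 + r4
    return ((r1 + r3) - (r2 + r4)) / total


# Coulombic repulsion pushes the two electrons onto diagonally opposite dots.
P_PLUS = polarization((1, 0, 1, 0))   # electrons on dots 1 and 3 -> binary 1
P_MINUS = polarization((0, 1, 0, 1))  # electrons on dots 2 and 4 -> binary 0
```

With one electron per occupied dot, the two diagonal arrangements evaluate to the P = +1 and P = -1 states named on the slide.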

SLIDE 5

QCA Technology

Signal Propagation

  • Main Roles of Cells

– Computation, Storage, and Communication
– Wire Dominant Design

  • Series of QCA Cells

– Act Like a Wire
– Propagate the Signal

  • Clock Zones

– Four Clock Phases
– Control the Signal Flow
– Cells Are Refreshed on Every Cycle

SLIDE 6

QCA Technology

Fundamental Gates

  • Inverter

– Conventional Gate

  • 3-Input Majority Gate

$$M(a, b, c) = ab + bc + ca$$

  • 2-Input AND/OR Gate

– Implemented by Setting One Input to a Constant

$$a \cdot b = M(a, b, 0), \qquad a + b = M(a, b, 1)$$
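
The gate identities above can be checked exhaustively. The sketch below models the majority gate and inverter on 0/1 integers and derives AND and OR by fixing one input. The `full_adder` function is an added illustration: it uses a three-majority-gate full adder known from the QCA literature, which is an assumption here and not necessarily the adder used in the authors' layouts. All function names are hypothetical.

```python
def majority(a: int, b: int, c: int) -> int:
    """3-input majority gate: M(a, b, c) = ab + bc + ca."""
    return (a & b) | (b & c) | (c & a)


def inverter(a: int) -> int:
    return 1 - a


def and_gate(a: int, b: int) -> int:
    """AND is a majority gate with one input fixed to 0."""
    return majority(a, b, 0)


def or_gate(a: int, b: int) -> int:
    """OR is a majority gate with one input fixed to 1."""
    return majority(a, b, 1)


def full_adder(a: int, b: int, cin: int):
    """Full adder from three majority gates and two inverters (assumed design)."""
    cout = majority(a, b, cin)
    s = majority(inverter(cout), cin, majority(a, b, inverter(cin)))
    return s, cout
```

Because every function works on single bits, looping over all eight input combinations is enough to confirm each identity.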

SLIDE 7

QCA Technology

Multi-Layer Wire Crossovers

  • Use Several Layers for Crossovers

– 3D Structure

  • Pros and Cons

+ Area Efficient Design
– Manufacturability Issues

[Figure: Layout Structure]

SLIDE 8

QCA Technology

QCA Circuit Design Rules

  • Cell Size

– Set at 20 nm

  • Width and Height: 18 nm
  • Quantum-dot Diameter: 5 nm
  • Cell Center-to-Center Spacing: 20 nm
  • Size Limit on Cells Per Clock Zone

– Limit: 15 Cells

  • Proper Propagation Delay and Reliable Signal Transmission
  • Freedom for Routing and a Reasonable Clock Zone Size
  • Minimum Separation of Two Different Wires

– The Width of Two Cells

  • Clock Increment Rule

– Increment the Clock Zone at the Clock Arrangement Positions
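
As a rough illustration of how these rules constrain a wire, the sketch below estimates the length and the minimum number of clock zones for a straight run of cells. The constants come from the slide (20 nm spacing, at most 15 cells per zone) and from the four clock phases; the helper name `wire_estimate` and the assumption that every zone is filled to the limit are illustrative only, since a real layout may use smaller zones.

```python
import math

CELL_PITCH_NM = 20        # cell center-to-center spacing from the design rules
MAX_CELLS_PER_ZONE = 15   # size limit on cells per clock zone
ZONES_PER_CLOCK = 4       # four clock phases make up one clock cycle


def wire_estimate(n_cells: int) -> dict:
    """Estimate length and minimum delay of a straight wire of n_cells cells.

    Assumes every clock zone is filled up to the 15-cell limit, which gives
    a lower bound on the number of zones.
    """
    zones = math.ceil(n_cells / MAX_CELLS_PER_ZONE)
    return {
        "length_nm": n_cells * CELL_PITCH_NM,
        "clock_zones": zones,
        "clocks": zones / ZONES_PER_CLOCK,
    }
```

Under these assumptions a 60-cell wire spans four clock zones, so a signal needs at least one full clock to traverse it, which is why interconnect dominates timing.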

SLIDE 9

Multiplier Design

Serial Parallel Multiplier Designs in QCA

  • Multiplier Design Issues

– A Parallel Multiplier is a Very Complex Circuit
– Complex Circuits Often Incur Significant Delay

  • Simple Structure is Desirable

– Serial-Parallel Multiplier Design Selected
– Filter Design Methodology is Used

SLIDE 10

Multiplier Design

Algorithmic Design

  • FIR Filter Design Example

– Delay Operator: Z^{-1}
– Filter Equation

[Figure: FIR filter network with input x_i, output y_i, coefficients b_0 ... b_{N-1}, adders, and Z^{-1} delay elements]

$$
y_i = b_0 x_i + b_1 x_{i-1} + b_2 x_{i-2} + \cdots + b_{N-2} x_{i-N+2} + b_{N-1} x_{i-N+1}
    = \sum_{k=0}^{N-1} b_k x_{i-k}
    = \sum_{k=0}^{N-1} b_k Z^{-k} x_i
$$

$$
Z^{-1} x_i = x_{i-1}, \qquad Z^{-n} = Z^{-1} Z^{-(n-1)}
$$
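
The filter equation can be exercised with a short direct-form sketch. Each loop iteration applies the delay operator once by shifting a delay line, then forms the sum of b_k times x_{i-k}. The function name `fir_filter` and the zero initial state are assumptions for illustration.

```python
def fir_filter(b, x):
    """Direct-form FIR filter: y_i = sum over k of b[k] * x[i - k].

    Samples before the start of x are taken to be zero (assumed initial state).
    """
    delay_line = [0] * len(b)   # holds x_i, x_{i-1}, ..., x_{i-N+1}
    y = []
    for sample in x:
        # One application of Z^-1: every stored sample moves one tap along.
        delay_line = [sample] + delay_line[:-1]
        y.append(sum(bk * xk for bk, xk in zip(b, delay_line)))
    return y
```

Feeding a unit impulse returns the coefficients b_0 ... b_{N-1} in order, which is the property the multiplier networks later reuse with the bits of one operand in place of the coefficients.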

SLIDE 11

Multiplier Design

Pipelined FIR Filter Network

  • Pipelined FIR Filter Output (Right-to-Left Structure)

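[Figure: pipelined FIR filter network with input x_i, output Z^{-N/2} y_i, coefficients b_0 ... b_{N-1}, adders, and Z^{-1/2} and Z^{-3/2} delay elements]

$$
\left( Z^{-\frac{1}{2}} b_{N-1} Z^{-\frac{3}{2}(N-1)} + Z^{-1} b_{N-2} Z^{-\frac{3}{2}(N-2)} + \cdots + Z^{-\frac{N}{2}} b_0 \right) x_i
= \sum_{k=0}^{N-1} Z^{-\frac{N-k}{2}} b_k Z^{-\frac{3k}{2}} x_i
= Z^{-\frac{N}{2}} \sum_{k=0}^{N-1} b_k Z^{-k} x_i
= Z^{-\frac{N}{2}} y_i
$$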

SLIDE 12

Multiplier Design

Redirected FIR Filter Network

  • General Structure (Right-to-Right Structure)
  • Pipelined Structure

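[Figure: general right-to-right FIR filter network with input x_i, output y_i, coefficients b_0 ... b_{N-1}, adders, and Z^{-1} delay elements]

$$
Z^{-\frac{1}{2}} y_i
= Z^{-\frac{1}{2}} \sum_{k=0}^{N-1} b_k Z^{-k} x_i
= \sum_{k=0}^{N-1} Z^{-\frac{1}{2}} b_k Z^{-\frac{k}{2}} Z^{-\frac{k}{2}} x_i
= \sum_{k=0}^{N-1} Z^{-\frac{1}{2}} Z^{-\frac{k}{2}} b_k Z^{-\frac{k}{2}} x_i
$$

[Figure: pipelined right-to-right network with input x_i, output Z^{-1/2} y_i, coefficients b_0 ... b_{N-1}, adders, and Z^{-1/2} delay elements]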

SLIDE 13

Multiplier Design

Multiplication Signal Flows

  • Unsigned Number Multiplication

– Bit Product Matrix and Signal Flow of Right-to-Left Structure
– Bit Product Matrix and Signal Flow of Right-to-Right Structure

[Figure: for each structure, the 4-bit bit product matrix (operands a3 ... a0 and b3 ... b0, bit products a_i b_j, product bits p7 ... p0) and the signal flow of the bit products by step over time slots beginning at t-3]
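
The bit product matrix for unsigned multiplication can be sketched in a few lines: every bit product a_i b_j carries weight 2^(i+j), so the products fall into columns indexed by i + j. The function names below are hypothetical, and the sketch only reproduces the arithmetic of the matrix, not the time scheduling shown in the signal flow diagrams.

```python
def bit_product_columns(a: int, b: int, n: int = 4):
    """Group the bit products a_i * b_j of two n-bit operands by weight i + j."""
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> j) & 1 for j in range(n)]
    columns = [[] for _ in range(2 * n - 1)]
    for j in range(n):
        for i in range(n):
            columns[i + j].append(a_bits[i] & b_bits[j])
    return columns


def product_from_columns(columns) -> int:
    """Sum each column and apply its weight 2**w to recover the product."""
    return sum(sum(col) << w for w, col in enumerate(columns))
```

For 4-bit operands the columns hold 1, 2, 3, 4, 3, 2, and 1 bit products, which is the diamond shape of the matrix; the carries out of these column sums produce the eighth product bit p7.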

SLIDE 14

Multiplier Design

Network Diagrams-I

  • Right-to-Left Carry Shift Multiplication (CSM)
  • Right-to-Left Carry Delay Multiplication (CDM)

[Figure: right-to-left CSM and CDM network diagrams with serial input a_i, coefficients b_0 ... b_{N-1}, adders, and fractional Z delay elements; the two output delays are Z^{-N/2} p_i and Z^{-3N/4} p_i]

SLIDE 15

Multiplier Design

Network Diagrams-II

  • Right-to-Right Carry Shift Multiplication (CSM)
  • Right-to-Right Carry Delay Multiplication (CDM)

[Figure: right-to-right CSM and CDM network diagrams with serial input a_i, coefficients b_0 ... b_{N-1}, adders, and fractional Z delay elements; the two output delays are Z^{-1/2} p_i and Z^{-3/4} p_i]

SLIDE 16

Multiplier Design

QCA Multiplication Diagrams

  • Multiplication Networks for QCA

– One Clock Zone Delay: D^{-1} (D^{-4} = Z^{-1})

  • Filter Network Transformation Using D Operators

[Figure: right-to-left and right-to-right multiplication networks redrawn with D-operator delay elements; the two output delays are D^{-2N-2} p_i and D^{-4} p_i]
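
Since one clock (Z^{-1}) equals four clock zones (D^{-4}), a fractional Z delay converts to a whole number of D delays by multiplying the exponent by four. The helper below is a minimal sketch of that bookkeeping; the name `z_to_d` and the choice to pass delay magnitudes (1/2 for Z^{-1/2}) are assumptions for illustration.

```python
from fractions import Fraction

ZONES_PER_CLOCK = 4  # D^-4 = Z^-1


def z_to_d(z_delay) -> int:
    """Convert a Z delay magnitude (in clocks) to a D delay (in clock zones).

    z_delay is the magnitude of the exponent, e.g. Fraction(1, 2) for Z^(-1/2).
    Raises ValueError if the delay is not a whole number of clock zones.
    """
    zones = Fraction(z_delay) * ZONES_PER_CLOCK
    if zones.denominator != 1:
        raise ValueError("delay is not a whole number of clock zones")
    return int(zones)
```

For example, `z_to_d(Fraction(3, 2))` gives 6, matching the D^{-6} elements that replace Z^{-3/2}, and a latency of 1.25 clocks corresponds to 5 clock zones.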

SLIDE 17

Multiplier Design

Right-to-Left Networks

  • Equations

[Figure: right-to-left CSM Network and CDM Network diagrams with serial input a_i, coefficients b_0 ... b_{N-1}, adders, and D-operator delay elements; the two output delays are D^{-3N-2} p_i and D^{-2N-2} p_i]

[Equations: for each network, the sum and carry pair (s_ij, c_ij) is defined as an Addition of three operands: the bit product of b_j with a D-delayed a_i, a D-delayed sum, and a D-delayed carry]

SLIDE 18

Multiplier Design

Right-to-Right Networks

  • Equations

[Figure: right-to-right CSM Network and CDM Network diagrams with serial input a_i, coefficients b_0 ... b_{N-1}, adders, and D-operator delay elements; the two output delays are D^{-5} p_i and D^{-4} p_i]

[Equations: for each network, the sum and carry pair (s_ij, c_ij) is defined as an Addition of three operands: the bit product of b_j with a D-delayed a_i, a D-delayed sum, and a D-delayed carry]

SLIDE 19

Multiplier Design

Multiplier Block Diagrams

  • Nominal Design and Modified Design

– Right-to-Right CSM
– Right-to-Right CDM

[Figure: multiplier block diagrams built from chains of full adders (FA), each with serial input a_i, parallel inputs b_0 ... b_{N-1}, and serial output p_i]
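
To make the serial-in, serial-out behavior of such a full-adder chain concrete, the sketch below simulates a generic carry-save serial-parallel multiplier at clock-cycle granularity. It is an illustrative model under stated assumptions, not the authors' CSM or CDM layout: each adder holds its own carry for one clock, passes its sum to the neighboring adder, and clock zones are ignored. The names `full_adder` and `serial_parallel_multiply` are hypothetical.

```python
def full_adder(a: int, b: int, cin: int):
    """Return (sum, carry_out) for three input bits."""
    total = a + b + cin
    return total & 1, total >> 1


def serial_parallel_multiply(a: int, b: int, n: int) -> int:
    """Multiply n-bit unsigned a (serial, LSB first) by n-bit b (parallel)."""
    b_bits = [(b >> j) & 1 for j in range(n)]
    sum_reg = [0] * n     # sum latched by each adder on the previous clock
    carry_reg = [0] * n   # carry latched by each adder on the previous clock
    product = 0
    for t in range(2 * n):                      # 2n clocks flush all product bits
        a_t = (a >> t) & 1 if t < n else 0      # zeros follow the last operand bit
        new_sum = [0] * n
        new_carry = [0] * n
        for j in range(n):
            # Adder j combines its bit product, the delayed sum from adder j+1,
            # and its own delayed carry; all three have weight 2**(t + j).
            s_in = sum_reg[j + 1] if j + 1 < n else 0
            new_sum[j], new_carry[j] = full_adder(a_t & b_bits[j], s_in, carry_reg[j])
        sum_reg, carry_reg = new_sum, new_carry
        product |= sum_reg[0] << t              # adder 0 emits product bit p_t
    return product
```

Calling `serial_parallel_multiply(13, 11, 4)` shifts in the four bits of 13, then four zeros, and collects the eight product bits serially from the first adder.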

SLIDE 20

Multiplier Design

4-bit and 32-bit CSM Layouts

SLIDE 21

Multiplier Design

4-bit and 32-bit CDM Layouts

SLIDE 22

Multiplier Design

4-bit CSM and CDM Simulation Results

  • CSM Results
  • CDM Results

SLIDE 23

Multiplier Design

Carry Shift Multipliers

Size      Complexity      Area                  Latency
CSM-4     507 cells       1.04 µm x 0.61 µm     1.25 clocks
CSM-8     1,011 cells     1.93 µm x 0.61 µm     1.25 clocks
CSM-16    2,043 cells     3.67 µm x 0.61 µm     1.25 clocks
CSM-32    4,299 cells     7.24 µm x 0.67 µm     1.25 clocks
CSM-64    9,579 cells     14.3 µm x 0.85 µm     1.25 clocks

SLIDE 24

Multiplier Design

Carry Delay Multipliers

Size      Complexity      Area                  Latency
CDM-4     406 cells       1.05 µm x 0.47 µm     1 clock
CDM-8     903 cells       2.12 µm x 0.47 µm     1 clock
CDM-16    1,999 cells     4.19 µm x 0.47 µm     1 clock
CDM-32    4,575 cells     8.47 µm x 0.65 µm     1 clock
CDM-64    11,264 cells    16.8 µm x 0.95 µm     1 clock

SLIDE 25

Multiplier Design

Comparison of Different Multipliers

  • Multiplier Comparisons

– Delay: CSM=1.25 clocks, CDM=1 clock

[Figure: Complexity. Number of QCA cells versus word size (4, 8, 16, 32, 64 bits) for CDM and CSM]

[Figure: Area. Size (µm^2) versus word size (4, 8, 16, 32, 64 bits) for CDM and CSM]
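
The two comparison plots can be regenerated from the tabulated figures. The sketch below stores the cell counts and dimensions reported for the carry shift and carry delay multipliers and derives the area in µm^2 plus the design with fewer cells at each word size. The dictionary layout and helper names are assumptions for illustration; the numbers are the ones listed in the two tables.

```python
# word size -> (cells, width in µm, height in µm), as tabulated on the slides
CSM = {4: (507, 1.04, 0.61), 8: (1011, 1.93, 0.61), 16: (2043, 3.67, 0.61),
       32: (4299, 7.24, 0.67), 64: (9579, 14.3, 0.85)}
CDM = {4: (406, 1.05, 0.47), 8: (903, 2.12, 0.47), 16: (1999, 4.19, 0.47),
       32: (4575, 8.47, 0.65), 64: (11264, 16.8, 0.95)}


def area_um2(entry) -> float:
    """Bounding-box area of a layout in square micrometers."""
    _, width, height = entry
    return width * height


def fewer_cells(bits: int) -> str:
    """Name of the design with the lower cell count at the given word size."""
    return "CDM" if CDM[bits][0] < CSM[bits][0] else "CSM"


if __name__ == "__main__":
    for bits in sorted(CSM):
        print(bits, round(area_um2(CSM[bits]), 2), round(area_um2(CDM[bits]), 2),
              fewer_cells(bits))
```

On these figures the CDM uses fewer cells up to 16 bits and the CSM uses fewer cells at 32 and 64 bits, while the CDM keeps the shorter latency of 1 clock throughout.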

SLIDE 26

Conclusions

  • Summary

– Serial Parallel Multiplier Architecture

  • Simple Structure Chosen for Wire Delay Minimization
  • Regular Cells for Design Reuse
  • Optimized to Minimize Latency

– Serial Parallel Multiplication Network

  • Based on Filter Network Example
  • Derived the Equations and Network Graph

– Designed Two Serial Parallel Multipliers

  • Carry Shift Multiplier and Carry Delay Multiplier
  • Contributions

– Extended QCA Circuit Designs to Multiplication
– Explored Various Serial Parallel Multiplication Algorithms

SLIDE 27

QUESTIONS???