Low-Cost 3D Chip Stacking with ThruChip Wireless Connections



slide-1
SLIDE 1

Low-Cost 3D Chip Stacking with ThruChip Wireless Connections

Dave.Ditzel@ThruChip.com Tadahiro.Kuroda@ThruChip.com ThruChip Communications October 24, 2014 Stanford EE Computer Systems Colloquium

slide-2
SLIDE 2


ThruChip Wireless 3D Stacking October 24, 2014

Credit to Professor Tadahiro Kuroda of Keio University

Prof. Kuroda leads one of the world’s top circuit labs at Keio University.

Most of the ideas in this talk come from more than a decade of work on near-field inductive coupling for 3D stacking by Professor Tadahiro Kuroda of Keio University and his students. Kuroda founded ThruChip in 2008 and, as ThruChip’s CTO, is helping companies develop lower-cost 3D chip stacking. ThruChip provides design information and licensing of Professor Kuroda’s inventions.

Tadahiro.Kuroda@ThruChip.com

slide-3
SLIDE 3


Wireless 3D stacking

• Current 3D stacking methods have challenges
  • Main challenge is the high cost of Thru Silicon Vias
• Wireless is a better approach for stacking
  • Lower cost, lower power, higher bandwidth
  • Less costly if we can avoid having to add vertical wires
• Cost reduction possible, instead of increase, with:
  • Advances in wafer thinning
  • Wireless data communication between stacked die
  • Lower-cost power distribution from front to back of die

slide-4
SLIDE 4


Challenges with current 3D stacking

slide-5
SLIDE 5


3D Stacking with Wire Bonds

Staircase stacking constrains wire bond access to one side of each die.

spacer

Cons:
• High wire bond inductance
• Higher power IO
• Bandwidth limited to a few GHz
• Staircase stacking constraints
  • Limited number of bond wires
  • Underside clearance limits die thinness

Pros:
• Low cost
• Good yield
• Allows ~50 µm thin die
• Existing infrastructure

slide-6
SLIDE 6


Wire bonding: Pretty example

Akita Elpida wire bond example of 20 stacked die (40 µm pitch)

slide-7
SLIDE 7


Wire bonding: Not so pretty

slide-8
SLIDE 8


3D Stacking with Thru Silicon Vias (TSV)

Cons:
• High cost (1.4x - 2x) over bare die
• Requires new CMOS process
• Yield reductions from bumps
• Area impact from TSV and keep-out zone (KOZ)
• Effects on nearby transistors

Pros:
• ~10x lower power IO
• Thousands of IO possible

slide-9
SLIDE 9


Proposal for lower cost 3D stacking

• Separate data communication from power distribution
• Data communication: use wireless near-field inductive coupling
  • Uses simple CMOS digital circuits: no new semiconductor process expense
  • Provides best-in-class inter-die power and bandwidth
  • May reduce chip cost if IO area can be reduced
  • Well-understood technology validated with dozens of test chips
  • Becomes more compelling as die get thinner
• Power distribution: many options available when wireless is used for data
  • Wire bond: low cost, in high volume production
  • TAB: low cost, in high volume production
  • RDL/FOWLP: medium cost, production ready
  • TSV: high cost, early production
  • Recommend Highly Doped Silicon Vias: new lowest cost proposal, discussed later

slide-10
SLIDE 10


NAND goal is to go from this to this

                      From this       To this
Example               NAND FLASH      NAND FLASH
# stacked die         16              16
Die pitch             50 µm           5 µm
Total height          ~1000 µm        ~80 µm
Die area              1x              ~0.9x
Data communication    wire bond       wireless
Power delivery        wire bond       wireless (no metal)
IO energy/bit         1x              < 1/400x

slide-11
SLIDE 11


DRAM goal is to go from this to this

                      From this        To this
Example               DRAM with TSV    DRAM
# stacked die         5                5
Die pitch             55 µm            8 µm
Total height          ~275 µm          ~40 µm
Die area              1x               0.87x
Data communication    TSV              wireless
Power delivery        TSV              wireless (no metal vias)
IO energy/bit         1x               < 1/10x

[Figure: stack of four DRAM dies on a base logic die]
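The total heights quoted for the NAND and DRAM goals follow from the number of stacked die times the die pitch. The sketch below only reproduces that arithmetic; the function name is hypothetical, and it ignores spacers, mold and package overhead, which the slides do not quantify. Note that the wire-bonded NAND figure of ~1000 µm is larger than 16 x 50 µm = 800 µm; the slides do not say what accounts for the difference.

```python
def stack_height_um(num_die: int, die_pitch_um: float) -> float:
    """Height of a stack of identical dies placed at a fixed vertical pitch.

    Simplifying assumption (not from the slides): no spacer, mold or
    substrate thickness is added on top of num_die * die_pitch_um.
    """
    return num_die * die_pitch_um


# Values taken from the NAND and DRAM goal tables.
nand_wireless_um = stack_height_um(16, 5)   # slide quotes ~80 um
dram_tsv_um = stack_height_um(5, 55)        # slide quotes ~275 um
dram_wireless_um = stack_height_um(5, 8)    # slide quotes ~40 um

print(nand_wireless_um, dram_tsv_um, dram_wireless_um)
```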

slide-12
SLIDE 12


Relevant advances in wafer thinning

slide-13
SLIDE 13


Ultra-Thin 4 µm wafer breakthrough

• Wafer thinning has been stuck at ~40 µm due to the “gettering problem”
  • The barrier was due in part to loss of the “gettering effect” at smaller dimensions during back grinding, which lets impurities degrade device performance (particularly leakage) and yield.
• DISCO Corporation solution can now thin to a few microns
  • DISCO introduced a “Gettering Dry Polish” wheel which forms gettering sites while grinding, allowing thinning of wafer silicon to a few microns without device damage. [35]
• Example: DRAM silicon thinned to 4 microns
  • See “Ultra Thinning down to 4 µm using 300-mm Wafer proven by 40-nm Node 2 Gb DRAM for 3D Multi-stack WOW Applications.” [36] They concluded: “No degradation in terms of retention characteristics and distribution employing 2 Gb DRAM wafer was found after ultra-thinning.”

Ultra-thin wafers can be handled (from DISCO website)

[Reference 36]

2Gb DRAM thinned to 4 microns

slide-14
SLIDE 14


Wireless 3D data

slide-15
SLIDE 15


Wireless Near-Field Inductive Coupling

• Chip designers often spend a lot of time making sure they do not have too much coupling between adjacent wires.
• Idea: Turn that coupling into an advantage.
• Use inductive coupling for 3D wireless data communication
  • Inductive coils made with a few turns in standard metal layers
  • Coil diameter is about 3x the communication distance
  • Coils communicate vertically to adjacent chips by magnetic field
  • Receive and transmit coils can be placed concentrically on each die to form a transceiver
  • Multiple coils used to increase bandwidth
  • Bandwidth improves with Moore’s law improvement in devices

slide-16
SLIDE 16


Receiver Coil

Communication is via magnetic field

Magnetic field can pass through silicon, including over active circuitry.

$$V_R = k \sqrt{L_T L_R} \, \frac{dI_T}{dt}$$

Can easily induce a 200 mV signal in receiver coil.
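The coupling relation on this slide reads as V_R = k * sqrt(L_T * L_R) * dI_T/dt, where k * sqrt(L_T * L_R) is the mutual inductance between the transmitter and receiver coils. A minimal sketch of that formula follows. The numeric values (k = 0.2, 5 nH coils, a 2 mA current edge in 10 ps) are illustrative assumptions chosen only to land near the 200 mV figure quoted above; they are not taken from the talk.

```python
import math


def received_voltage(k: float, l_tx_h: float, l_rx_h: float, di_dt_a_per_s: float) -> float:
    """Induced receiver voltage V_R = k * sqrt(L_T * L_R) * dI_T/dt.

    k             : magnetic coupling coefficient (dimensionless)
    l_tx_h/l_rx_h : transmitter/receiver coil inductance in henries
    di_dt_a_per_s : slew rate of the transmitter coil current in A/s
    """
    mutual_inductance_h = k * math.sqrt(l_tx_h * l_rx_h)
    return mutual_inductance_h * di_dt_a_per_s


# Assumed example values (hypothetical, for illustration only).
k = 0.2                   # coupling coefficient
l_coil = 5e-9             # 5 nH for both coils
di_dt = 2e-3 / 10e-12     # 2 mA swing in 10 ps -> 2e8 A/s

v_r = received_voltage(k, l_coil, l_coil, di_dt)
print(f"V_R = {v_r * 1e3:.0f} mV")
```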

slide-17
SLIDE 17


ThruChip Interface (TCI)

[Figure: TCI transmitter on chip 1 driving a transmitter coil, and TCI receiver on chip n fed by a receiver coil; waveforms of Txdata, I_T, V_R and Rxdata versus time]

• Simple transmitter and receiver circuits (basic form shown)
• Standard digital CMOS: scales with Moore’s Law
• Bandwidth: >40 Gigabits/second/coil with modern digital CMOS
• Delay: about 7 equivalent logic gates (NAND2 FO4)
• Energy: about 80 equivalent gates

slide-18
SLIDE 18


TCI coil example

[Figure: 3 chips with staircase stacking, and a TCI wireless transceiver coil (dimension label 200 µm) with a 4-turn transmitter and a 4-turn receiver]

slide-19
SLIDE 19


TCI bandwidth vs communication distance

[Plot: usable coil bandwidth (Gb/s) versus communication distance Z (µm), with curves for coil diameters D = 100, 200, 300, 400 and 500 µm]

• Coil diameter D = 3 x Z; assumes 8 µm die pitch
• 5-die stacking: D = 100 µm, Z = 32 µm, usable BW of 66 Gbps
• 9-die stacking: D = 200 µm, Z = 64 µm, usable BW of 28 Gbps
• Usable circuit bandwidth depends on device
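The two operating points on this slide are consistent with a simple sizing rule: the communication distance Z spans the stack from the bottom die to the top die, and the coil diameter is about 3 x Z. The sketch below assumes Z = (number of die - 1) x die pitch, which is an inference that happens to reproduce the quoted 32 µm and 64 µm; the function names are hypothetical. The slide rounds the resulting diameters to 100 µm and 200 µm.

```python
def communication_distance_um(num_die: int, die_pitch_um: float = 8.0) -> float:
    """Distance between the bottom and top die of a stack.

    Assumption: the link must span (num_die - 1) die pitches.
    """
    return (num_die - 1) * die_pitch_um


def coil_diameter_um(distance_um: float, ratio: float = 3.0) -> float:
    """Rule of thumb from the slides: coil diameter D is about 3x distance Z."""
    return ratio * distance_um


for num_die in (5, 9):
    z = communication_distance_um(num_die)
    d = coil_diameter_um(z)
    print(f"{num_die}-die stack: Z = {z:.0f} um, D = {d:.0f} um")
```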

slide-20
SLIDE 20

Data from references [16,25,28]

TCI scales with digital CMOS

• High BW: Data rate is equivalent to 1.5x that of a 5-stage ring oscillator
• Fast: Delay is equivalent to 7x that of a NAND2 FO4 gate
• Low Power: Energy is equivalent to 80x that of a NAND2 FO4 gate
• Small: Circuit layout area is equivalent to 36x that of a NAND2 gate

[Plots: delay (ps), energy dissipation (pJ/b) and data rate (Gb/s) versus process node (180, 90, 65, 45 and 32 nm CMOS), showing measured silicon data and simulated data against the 7x, 80x and 1.5x reference lines]

slide-21
SLIDE 21


Energy per Bit becomes very compelling

• Pin-to-pin data transfer

Node    TCI (2 coils)   TSV          Wire bond
32nm    0.40 pJ/b       0.35 pJ/b    3.45 pJ/b
22nm    0.20 pJ/b       0.30 pJ/b    3.35 pJ/b
16nm    0.10 pJ/b       0.28 pJ/b    3.30 pJ/b
11nm    0.05 pJ/b       0.26 pJ/b    3.27 pJ/b

TCI energy will be >65x lower than wire bond, >5x lower than TSV by 11nm.

• Bus data transfer (8 memory chips + 1 SoC)

Node    TCI (9 coils)   TSV          Wire bond
32nm    0.40 pJ/b       2.45 pJ/b    24.15 pJ/b
22nm    0.20 pJ/b       2.10 pJ/b    23.45 pJ/b
16nm    0.10 pJ/b       1.96 pJ/b    23.10 pJ/b
11nm    0.05 pJ/b       1.82 pJ/b    22.89 pJ/b

TCI energy will be >450x lower than wire bond, >36x lower than TSV by 11nm.
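The ">65x / >5x" and ">450x / >36x" claims can be checked directly against the 11nm rows of the two energy tables on this slide. The sketch below does only that division; the dictionary layout and function name are illustrative choices, not anything from the talk.

```python
# 11nm rows of the pin-to-pin and bus tables, in pJ/bit.
ENERGY_11NM_PJ_PER_BIT = {
    "pin_to_pin": {"TCI": 0.05, "TSV": 0.26, "wire_bond": 3.27},
    "bus": {"TCI": 0.05, "TSV": 1.82, "wire_bond": 22.89},
}


def tci_advantage(transfer: str, other: str) -> float:
    """How many times lower the TCI energy per bit is than `other`."""
    row = ENERGY_11NM_PJ_PER_BIT[transfer]
    return row[other] / row["TCI"]


for transfer in ("pin_to_pin", "bus"):
    for other in ("wire_bond", "TSV"):
        print(f"{transfer}: TCI is {tci_advantage(transfer, other):.1f}x lower than {other}")
```

Both the TSV and the wire bond bus figures are 7x their pin-to-pin values in every row, while the TCI figure is unchanged, which is why the gap widens for the bus case.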

slide-22
SLIDE 22


Constant Magnetic Field Scaling

Evaluation value      Dimension            Scaling
Device size           [x]                  1/a
Voltage               [V]                  1/a
Current               [I]                  1/a
Capacitance           [C]~[xx/x]           1/a
Delay time            [t]~[CV/I]           1/a
Chip thickness        [z]                  1/z
Coil size             [D]                  1/z
Coil turn number      [n]                  z^0.8
Inductance            [L]~[n^2 D^1.6]      1
Magnetic coupling     [k]~[z/D]            1
Received signal       [vR]~[kL(I/t)]       1
Data rate / channel   [1/t]                a
Channel / area        [1/D^2]              z^2
Data rate / area      [1/tD^2]             a z^2
Area / data rate      [tD^2]               1/(a z^2)
Energy / bit          [IVt]                1/a^3

[Figure: constant electric field scaling for the FET (voltage 1/a, size 1/a) alongside constant magnetic field scaling for TCI (coil diameter 1/z, turns z^0.8, chip thickness 1/z)]
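The derived rows of the scaling table follow from the base rows by substitution. The worked steps below are a sketch of that substitution, writing S[q] for the scaling factor of a quantity q (a notation introduced here, not in the talk) and using a for the device scaling factor and z for the thickness scaling factor as in the table.

```latex
\begin{align*}
S[L] &= S[n]^{2}\, S[D]^{1.6} = \left(z^{0.8}\right)^{2} \left(z^{-1}\right)^{1.6} = z^{1.6}\, z^{-1.6} = 1 \\
S[k] &= \frac{S[z]}{S[D]} = \frac{1/z}{1/z} = 1 \\
S[v_R] &= S[k]\, S[L]\, \frac{S[I]}{S[t]} = 1 \cdot 1 \cdot \frac{1/a}{1/a} = 1 \\
S\!\left[\tfrac{1}{t D^{2}}\right] &= \frac{1}{(1/a)\,(1/z)^{2}} = a z^{2} \\
S[IVt] &= \frac{1}{a} \cdot \frac{1}{a} \cdot \frac{1}{a} = \frac{1}{a^{3}}
\end{align*}
```

The turn count exponent of 0.8 is what keeps the inductance, and therefore the received signal, constant as the coil shrinks with chip thickness.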

slide-23
SLIDE 23


TCI broadcasting more efficient than TSV

[Figure: four-chip stack (Chip1 to Chip4) with an interface, comparing a TSV path (IO and ESD on each chip) with TCI Tx/Rx coils; plot of transmission power and delay versus number of stacked chips for TSV and TCI]

• TSV power and delay increase in proportion to the number of stacked chips.
• TCI transmitter power and delay stay constant.

slide-24
SLIDE 24


Crosstalk decays rapidly

Received signal rapidly decays in the near field (at distance x > D/2). Crosstalk is sufficiently suppressed. Ref [07],[10],[11],[27]

[Plot: received signal strength (a.u., 0 down to -240) versus distance x (0.01 to 1000 mm) for coils of diameter D = 0.2 mm at f = 1 GHz. In the near field V_RX ∝ 1/x^3 (-60 dB/dec); beyond x = λ/2π, in the far field, the signal falls as 1/x (-20 dB/dec). Signal and crosstalk regions are marked near x = D/3 to D/2.]
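The slopes and the near-field boundary on this plot can be reproduced with two short formulas: a signal that falls as 1/x^n drops by 20 x n dB per decade of distance, and the near field extends to roughly λ/2π. The sketch below assumes free-space propagation (speed of light in vacuum) for the wavelength, which is a simplification; the function names are hypothetical.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def rolloff_db_per_decade(exponent: float) -> float:
    """Change in dB over one decade of distance for a signal ~ 1/x**exponent."""
    return 20.0 * math.log10(10.0 ** -exponent)


def near_field_boundary_mm(freq_hz: float) -> float:
    """Approximate near/far-field boundary, lambda / (2*pi), in millimetres.

    Assumes the free-space wavelength; real on-chip materials would change it.
    """
    wavelength_m = SPEED_OF_LIGHT_M_PER_S / freq_hz
    return wavelength_m / (2.0 * math.pi) * 1e3


print(rolloff_db_per_decade(3))        # near field, 1/x^3
print(rolloff_db_per_decade(1))        # far field, 1/x
print(near_field_boundary_mm(1e9))     # boundary at f = 1 GHz, in mm
```

At 1 GHz the boundary comes out near 48 mm, far larger than the 0.2 mm coil diameter, so neighbouring coils on a die sit well inside the steep 1/x^3 region.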

slide-25
SLIDE 25


Channel Pitch vs. Crosstalk

Ref [03]

[Plots: crosstalk-to-signal ratio (dB) versus normalized channel pitch Y/D, for coils arranged in a line and in an array, with D = 3X]

Channel pitch: Y_Line = D~2D, Y_Array = 2D~3D

slide-26
SLIDE 26


Quadrature Phase Division Multiplexing (QPDM)

TCI coils can be overlapped with QPDM

[Figure: (a) conventional TCI coil spacing, with coils of diameter D placed 1 D apart; (b) overlapping TCI coils driven by four clock phases θ = 0, π/2, π, 3π/2 (CLK0, CLKπ/2, CLKπ, CLK3π/2)]

• 1 D coil spacing avoids crosstalk
• Can pack coils 4x denser with QPDM: area efficiency is improved by 4 times with overlapping coils
• Receiver circuits disable out-of-phase channels to further improve noise immunity [37].

slide-27
SLIDE 27

Reference: “A 0.55V 10 fJ/bit Inductive-Coupling Data Link with Dual-Coil Transmission Scheme,” IEEE JSSC, April 2011.

Demonstrated lowest die-to-die energy: 10 fJ/bit

[Plot: bit error rate (BER, 10^-12 to 1) and energy dissipation (10 to 60 fJ/bit) versus supply voltage VDD (0.5 V to 1.3 V) at a data rate of 1.1 Gb/s; energy dissipation reaches 10 fJ/b at 0.55 V]

“Dual coil TCI”: lowest energy/bit, 65nm CMOS

slide-28
SLIDE 28


Compatible with Conventional EDA

[Figure: layout of TCI Tx/Rx clock and data links next to an IP module (M4 to M6 shown). Labels: coil wires (M5, M6), routing wires (M4, M5, M6), power lines (M4, M6), routing blockages (M5, M3) and (M6, M4), dimensions 7mm x 7mm.]

Ref [31]

slide-29
SLIDE 29


TCI has High Reliability

[Figures: horn antenna and electric field sensor test setups over stacked memory chips with TCI; jitter histogram of transmitter clock and recovered clock (operating frequency = 8 GHz, RMS jitter = 6 ps, < 5% UI)]

• Small bit error rate, < 10^-14: as reliable as wireline
• Small jitter, < 5% UI
• Small degradation
  • by eddy current in substrate
  • by eddy current in power mesh
  • by eddy current in bit/word lines
  • by chip misalignment
• Small inter-channel crosstalk when pitch > 2*diameter
• No interference
  • from digital to SRAM
  • from environment (EMS)
  • to environment (EMI)

References: [01] ISSCC’04, [12] A-SSCC’07, [26] ISSCC’10, [15] SSDM’08, [03] CICC’04, [30] A-SSCC’09, [22] SSDM’09, [05] SSDM’05

slide-30
SLIDE 30


Compatible with Conventional Testing

[Figure: test circuit for channels 0 to N. Each channel has a Tx and an Rx with their coils, Txdata and Rxdata signals, and a selector controlled by a mode signal; a test path runs from Testin to Testout.]

• Although wide coil line/spacing and small transceiver circuits will have zero impact on yield, wafer-level testing is also possible.

Ref [24]

slide-31
SLIDE 31


TCI EM Compatibility

[Figure: interference paths between an RF block, a clock line and TCI: EMI from clock line to RF, EMI from TCI to RF, and EMS from RF to TCI]

• EMI to RF: Magnetic field generated by TCI is only 0.0001% of that by clock lines.
• EMS from RF: SNR is 200, good enough for a receiver with a hysteresis comparator.
• EMS from environment: yields small discrepancy in VDDmin.

slide-32
SLIDE 32


Misalignment Tolerant

±10% alignment error can be compensated by a 5% power increase.

[Plot: normalized received signal (0.25 to 1.00) versus misalignment X/D, Y/D (-40% to +40%) for D = 120 µm, Z = 40 µm, with the ±10% range marked]

• TCI tolerates alignment error in chip stacking today.
• TSV requires much finer alignment control, as its size is 1/10.

Ref [15]

slide-33
SLIDE 33


TCI demonstrated with 28 test chips

• High Integration: 128-die stacking
• High Speed: 11Gb/s/ch (180nm), 30Gb/s/ch (65nm)
• High Bandwidth: 8Tb/s (180nm, 1000ch)
• Low Power: 0.01pJ/b (65nm)
• CPU/Memory: 8 CPUs (CPU0 to CPU7) on a system bus, linked by TCI to 1 MB SRAM
• 4x coil density: overlapped coils with QPDM

References: [26] [37] [17,18] [25] [39] [13] [38]. Chip photo labels: (90nm) (65nm) (180nm).

slide-34
SLIDE 34


ThruChip introduces Highly Doped Silicon Vias (HDSV) for “Wireless” Power Delivery

slide-35
SLIDE 35


HDSV: A new way to deliver power

• Ultra-thin wafers make inductive coupling for data very compelling
• Ultra-thin wafers are key to a novel mechanism for power delivery
• At <10 µm thickness, power vias can be created by highly doping the silicon
• With high levels of doping, silicon regions are conductive like metal
• Can pattern front-to-back conductive regions with an ion implant mask
• P+ and N+ doping increased by ~10-100x in desired regions
• Can be done with standard fab equipment
• Low cost step, less expensive than wire bonds
• Let’s look at an example of Highly Doped Silicon Vias (HDSV)

slide-36
SLIDE 36


Highly Doped Silicon Vias for power distribution

1. Start with a standard wafer (~700 µm).
2. Add implants to create highly doped regions for power vias.
3. Then add transistors and metal normally, with metal caps on the HDSV.
4. Thin silicon to ~4 microns (~4 µm).

slide-37
SLIDE 37


Highly Doped Silicon Vias for power distribution

A deeper than normal, and more highly doped well is used to make a low resistance HDSV pathway directly through the thinned wafer using the silicon itself.

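[Figure: cross-section of a thinned die (< 10 µm thick) showing a conventional device (P-sub, N-sub, N-well, P-well, N+, P+) with VDD and VSS connections, flanked by P++ well and N++ well HDSVs]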

The HDSV on one die and the electrodes on the next die are connected by pressure from a Room-Temperature Wafer Level Bonding machine (solid intermetallic bonding by diffusion) to create larger stacks.

[Figure: the same cross-section with Electrode(VSS) and Electrode(VDD) contacting the P++ well and N++ well HDSVs]

slide-38
SLIDE 38


Highly Doped Silicon Vias for power distribution

[Figure: three stacked dies, each with a conventional device and a pair of HDSVs; Electrode(VSS) carries ground and Electrode(VDD) carries power through the HDSVs of every die]

slide-39
SLIDE 39


TCAD modeling: HDSV resistance

• Desire < 3 milliOhms front-to-back resistance for HDSV with 4 µm wafer thickness
• Front-to-back resistance can be made sufficiently low for power distribution
• Dose of 1x10^16 cm^-2 can be done on conventional implant equipment (about 10x normal)
• HDSV probably not usable for high speed data due to high capacitance; TCI is needed for data

[Plots: resistance (Ω, 10^-5 to 10^1) versus substrate thickness (5 to 15 µm) for phosphorus and for boron implants, at doses of 1x10^16 cm^-2 and 1x10^17 cm^-2. Simulated structure: Al contact with L = 7 mm and W = 100 µm, top oxide 10 nm, dimension label 3 µm.]

slide-40
SLIDE 40


HDSV Wireless Power Distribution

• No metallic TSVs, no wire bonds, no solder bumps
• Just stack chips and connect the stack to power
• Very loose alignment requirements on both data and power
• Data transmitted wirelessly with near-field inductive coupling
• Power and ground go directly through the silicon, by using high levels of doping on ultra-thin die.
• Since silicon provides the power conduits instead of “metal wires”, the power distribution is “wireless” ;-)
• HDSV should be low cost: extra implants are the only change to chips

slide-41
SLIDE 41


Comparison Example Stacked HBM DRAM TSV vs TCI/HDSV

slide-42
SLIDE 42


Example HBM DRAM with TSV

[Figure: HBM DRAM die, 6.91 mm x 5.1 mm]

TSVs provide 8 channels of independent 128-bit I/O, for a total of 1024 TSV I/O at 1 Gbps each, or 128 GB/s.

This is a simplified hypothetical example using Hynix HBM as a point of comparison for stacking 5 die.

~ 18% of die area dedicated to TSV IO

slide-43
SLIDE 43


Replace TSV signals with TCI coils

[Figure: TCI coil layout for two of eight DRAM channels. The block is 7.5 coils x 100 µm = 750 µm by 2.5 coils = 250 µm, with clock phases CLK Φ1 to CLK Φ4.]

• Each TCI coil is 100 µm x 100 µm
• Each TCI coil can run at 8 Gbps with slow DRAM transistors
• 26 coils/DRAM-channel provide the same bandwidth as HBM
  • 16 coils for data x 8 Gbps/coil = 128 Gbps / DRAM-channel
  • 8 coils for 64 address/control signals
  • 2 coils for half of QPDM clocks (4 in a pair)
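The claim that 26 coils per DRAM channel match the HBM bandwidth can be checked with the numbers given on this slide and the TSV baseline slide before it. The sketch below is just that arithmetic; the constant and function names are illustrative.

```python
# Baseline from the HBM example: 1024 TSV I/O at 1 Gbps each.
TSV_IO_COUNT = 1024
TSV_GBPS_PER_IO = 1

# TCI replacement, per DRAM channel (8 channels in total).
DRAM_CHANNELS = 8
COIL_GBPS = 8
COILS_PER_CHANNEL = {"data": 16, "address_control": 8, "clock": 2}


def gbps_to_gbytes_per_s(gbps: float) -> float:
    """Convert gigabits per second to gigabytes per second."""
    return gbps / 8


tsv_total_gbytes = gbps_to_gbytes_per_s(TSV_IO_COUNT * TSV_GBPS_PER_IO)
tci_channel_gbps = COILS_PER_CHANNEL["data"] * COIL_GBPS
tci_total_gbytes = gbps_to_gbytes_per_s(tci_channel_gbps * DRAM_CHANNELS)
coils_per_channel = sum(COILS_PER_CHANNEL.values())
total_coils = coils_per_channel * DRAM_CHANNELS

print(tsv_total_gbytes, tci_channel_gbps, tci_total_gbytes)
print(coils_per_channel, total_coils)
```

Under these figures, 208 coils replace 1024 TSV data connections at the same aggregate data bandwidth.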

slide-44
SLIDE 44


Remove TSV section

[Figure: TSV section removed from the die, 0.907 mm x 6.91 mm]

slide-45
SLIDE 45


Add TCI IO and shrink die area

[Figure: TCI IO strip added to the die, 0.250 mm x 6.91 mm]

slide-46
SLIDE 46


Define Power/Ground with HDSV

Vss Vdd

These are the mask patterns for low resistance implants for HDSV conduits from the front to back side of each die.

slide-47
SLIDE 47


Add HDSV for Vdd/Vss

[Figure: die with TCI and HDSV, 4.443 mm x 6.91 mm]

Original die size with TSV = 35.241 mm²
Die size with TCI & HDSV = 30.701 mm²
Area savings = 4.540 mm², -13%

13% area reduction is a significant cost reduction.
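The area figures follow from the die dimensions given across the last few slides: a 6.91 mm x 5.1 mm die loses the 0.907 mm TSV strip and gains a 0.250 mm TCI strip. A minimal sketch of that calculation, with variable names chosen here for illustration:

```python
DIE_WIDTH_MM = 6.91
ORIGINAL_HEIGHT_MM = 5.1
TSV_STRIP_MM = 0.907   # removed with the TSV section
TCI_STRIP_MM = 0.250   # added for the TCI coils

new_height_mm = ORIGINAL_HEIGHT_MM - TSV_STRIP_MM + TCI_STRIP_MM
original_area_mm2 = DIE_WIDTH_MM * ORIGINAL_HEIGHT_MM
new_area_mm2 = DIE_WIDTH_MM * new_height_mm
savings_mm2 = original_area_mm2 - new_area_mm2
savings_percent = 100.0 * savings_mm2 / original_area_mm2

print(f"new height    = {new_height_mm:.3f} mm")
print(f"original area = {original_area_mm2:.3f} mm^2")
print(f"new area      = {new_area_mm2:.3f} mm^2")
print(f"savings       = {savings_mm2:.3f} mm^2 ({savings_percent:.1f}%)")
```

The saving is 12.9%, which the slide rounds to 13%.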

slide-48
SLIDE 48


Final stack DRAM example

[Figure: final stack, ~40 µm tall: four HBM DRAM dies with TCI on a base die (face down), with TCI channels and alternating Vss and Vdd HDSV columns]

Assumptions:

  • Each die is 8 µm thick: 4 µm silicon and 4 µm metal stack.
  • Data sent wirelessly with TCI inductive coupling links.
  • Power passes through existing silicon with Highly Doped Silicon Vias.
  • Base die can translate to standard IO or TCI link to a SoC.
  • Smaller die size provides significant cost reduction.
  • Cost of implants for HDSV and circuits for TCI relatively negligible.
  • Seems likely this will result in a net cost reduction when using this stacking approach.
  • No vertical metal wires! Wireless 3D stacking.

slide-49
SLIDE 49


Ultra-Thin Lowest Cost 3D Packaging

• Panel-level stacking as a batch (wafer-scale) process

1) Known-good memory dies (7.2mm x 7.2mm) are placed face down on a support panel (465mm x 320mm) at the pitch of the customer's chip size, and mold is poured into the gaps to form a memory panel. This is done by a memory vendor.
2) The memory panels are provided to an SoC vendor.
3) Known-good SoC dies (8.3mm x 8.0mm) are placed face down on a support panel (465mm x 320mm) at the pitch of the SoC size (2240 chips in total) by the SoC vendor.
4) The SoC panel is then thinned from the back.
5) The memory panel is placed on top of the SoC panel, face down, and bonded by an RT (room-temperature) pressure bonding machine.
6) The panel is thinned from the back.
7) The process is repeated to build up an 8-layer memory tower on the SoC panel.
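The figure of 2240 chips per panel in step 3 matches a simple grid count of 8.3 mm x 8.0 mm SoC dies on a 465 mm x 320 mm panel. The sketch below assumes the dies are placed edge to edge at exactly the SoC pitch, with no extra gaps or edge exclusion; that assumption and the function name are not from the talk.

```python
import math


def dies_per_panel(panel_w_mm: float, panel_h_mm: float,
                   pitch_w_mm: float, pitch_h_mm: float) -> int:
    """Number of whole dies that fit on a rectangular panel at a given pitch.

    Assumes a plain rectangular grid with no edge exclusion (simplification).
    """
    columns = math.floor(panel_w_mm / pitch_w_mm)
    rows = math.floor(panel_h_mm / pitch_h_mm)
    return columns * rows


soc_count = dies_per_panel(465, 320, 8.3, 8.0)
print(soc_count)
```

Because the memory dies are placed at the SoC pitch rather than their own 7.2 mm size, the same count applies to each memory panel.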

[Figure: package containing 8 stacked memory dies on an SoC die, surrounded by mold. 80 micron communication distance (90 µm high). Wireless power delivery with Thru-Well-Vias (yellow; implant change only). Wireless data delivery with TCI coils (red).]

slide-50
SLIDE 50


Technical Summary

• The synergy of ultra die thinning, TCI wireless data communication and Highly Doped Silicon Vias for power provides a future path for cost reduction using 3D stacking.
• Wireless TCI near-field inductive coupling has been well proven with 28 silicon test chips.
• Power distribution when using TCI can be done with proven techniques such as wire bond, TAB or even TSV.
• Power distribution for TCI with Highly Doped Silicon Vias is a new and still untested technique, which offers great promise for lowering 3D stacking costs. Help us make it happen.

slide-51
SLIDE 51


References

[01] D. Mizoguchi, et al., “A 1.2Gb/s/pin Wireless Superconnect Based on Inductive Inter-chip Signaling (IIS),” ISSCC, pp. 142-143, Feb. 2004.
[02] N. Miura, et al., “Analysis and Design of Inductive Coupling and Transceiver Circuit for Inductive Inter-Chip Wireless Superconnect,” Symp. VLSI Circuits, pp. 246-249, Jun. 2004.
[03] N. Miura, et al., “Cross Talk Countermeasures in Inductive Inter-Chip Wireless Superconnect,” CICC, pp. 99-102, Oct. 2004.
[04] N. Miura, et al., “A 195Gb/s 1.2W 3D-Stacked Inductive Inter-Chip Wireless Superconnect with Transmit Power Control Scheme,” ISSCC, pp. 264-265, Feb. 2005.
[05] D. Mizoguchi, et al., “Measurement of Inductive Coupling in Wireless Superconnect,” SSDM, pp. 670-671, Sep. 2005.
[06] N. Miura, et al., “A 1Tb/s 3W Inductive-Coupling Transceiver for Inter-Chip Clock and Data Link,” ISSCC, pp. 424-425, Feb. 2006.
[07] T. Kuroda, et al., “Perspective of Low-Power and High-Speed Wireless Inter-Chip Communications for SiP Integration,” ESSCIRC, pp. 3-6, Sep. 2006.
[08] D. Mizoguchi, et al., “Constant Magnetic Field Scaling in Inductive-Coupling Data Link,” SSDM, pp. 606-607, Sep. 2006.
[09] N. Miura, et al., “A 0.14pJ/b Inductive-Coupling Inter-Chip Data Transceiver with Digitally-Controlled Precise Pulse Shaping,” ISSCC, pp. 264-265, Feb. 2007.
[10] T. Kuroda, “CMOS Proximity Wireless Communications for SiP Integration (Invited),” ISSCC, Feb. 2007.
[11] T. Kuroda, “Low power technology for system LSI,” J. IEICE, vol. 90, no. 11, pp. 977-981, Nov. 2007.
[12] K. Niitsu, et al., “Interference from Power/Signal Lines and to SRAM Circuits in 65nm CMOS Inductive-Coupling Link,” A-SSCC, pp. 131-134, Nov. 2007.
[13] N. Miura, et al., “An 11Gb/s Inductive-Coupling Link with Burst Transmission,” ISSCC, pp. 298-299, Feb. 2008.
[14] D. Mizoguchi, et al., “Constant Magnetic Field Scaling in Inductive-Coupling Data Link,” IEICE Trans. Electronics, Vol. E91-C, No. 2, pp. 200-205, Feb. 2008.
[15] K. Niitsu, et al., “Misalignment Tolerance in Inductive-Coupling Inter-Chip Link for 3D System Integration,” SSDM, pp. 86-87, Sep. 2008.
[16] Y. Sugimori, et al., “A 2Gb/s 15pJ/b/chip Inductive-Coupling Programmable Bus for NAND Flash Memory Stacking,” ISSCC, pp. 244-245, Feb. 2009.
[17] K. Niitsu, et al., “An Inductive-Coupling Link for 3D Integration of a 90nm CMOS Processor and a 65nm CMOS SRAM,” ISSCC, pp. 480-481, Feb. 2009.
[18] K. Osada, et al., “3D System Integration of Processor and Multi-Stacked SRAMs by Using Inductive-Coupling Links,” Symp. VLSI Circuits, pp. 256-257, Jun. 2009.
[19] Y. Kohama, et al., “A Scalable 3D Processor by Homogeneous Chip Stacking with Inductive-Coupling Link,” Symp. VLSI Circuits, pp. 94-95, Jun. 2009.
[20] S. Kawai, et al., “A 4.7Gb/s Inductive Coupling Interposer with Dual Mode Modem,” Symp. VLSI Circuits, pp. 92-93, Jun. 2009.
[21] M. Saito, et al., “47% Power Reduction and 91% Area Reduction in Inductive-Coupling Programmable Bus for NAND Flash Memory Stacking,” CICC, pp. 449-452, Sep. 2009.
[22] K. Kasuga, et al., “Electromagnetic Interference and Susceptibility in Inductive-Coupling Link,” SSDM, pp. 62-63, Nov. 2009.
[23] M. Saito, et al., “An Extended XY Coil for Noise Reduction in Inductive-coupling Link,” A-SSCC, pp. 305-308, Nov. 2009.
[24] K. Kasuga, et al., “A Wafer Test Method of Inductive-Coupling Link,” A-SSCC, pp. 301-304, Nov. 2009.
[25] N. Miura, et al., “An 8Tb/s 1pJ/b 0.8mm²/Tb/s QDR Inductive-Coupling Interface Between 65nm CMOS and 0.1um DRAM,” ISSCC, pp. 436-437, Feb. 2010.
[26] M. Saito, et al., “A 2Gb/s 1.8pJ/b/chip Inductive-Coupling Through-Chip Bus for 128-Die NAND-Flash Memory Stacking,” ISSCC, pp. 440-441, Feb. 2010.
[27] T. Kuroda, “Inductively Coupled ThruChip Interface,” ISSCC, ES3 (Energy-Efficient High-Speed Interfaces), Feb. 2010.
[28] N. Miura, et al., “A 0.7V 20fJ/bit Inductive-Coupling Data Link with Dual-Coil Transmission Scheme,” Symp. VLSI Circuits, pp. 201-202, Jun. 2010.
[29] T. Kuroda, et al., “ThruChip Interface (TCI) for 3D Integration of Low-Power System (Invited),” IEDM, p. 17.1.1, Dec. 2010.
[30] N. Miura, et al., “A 2.7Gb/s/mm² 0.9pJ/b/Chip 1Coil/Channel ThruChip Interface for NAND Flash Memory Stacking,” ISSCC, pp. 490-491, Feb. 2011.
[31] Y. Shimazaki, et al., “A 5Gbps/ch ThruChip Interface and Autom. P&R Design Methodology for 3-D Integration of 45nm CMOS Processors,” COOL Chips XV, pp. 1-3, Apr. 2012.
[32] Y. Koizumi, et al., “Dynamic power control with a heterogeneous multi-core system using a 3-D wireless inductive coupling interconnect,” ICFPT'12, pp. 293-296, Dec. 2012.
[33] H. Matsutani, et al., “A Case for Wireless 3D NoCs for CMPs,” ASP-DAC'13, pp. 23-28, Jan. 2013.
[34] Y. Take, et al., “3D Clock Distribution Using Vertically/Horizontally Coupled Resonators,” ISSCC, pp. 258-259, Feb. 2013.
[35] “Introduction of Gettering DP Wheel,” DISCO Website, in both English and Japanese, http://www.disco.co.jp/jp/solution/apexp/polisher/gettering.html
[36] Y.S. Kim, et al., “Ultra Thinning down to 4 µm using 300-mm Wafer proven by 40-nm Node 2 Gb DRAM for 3D Multi-stack WOW Applications,” Symp. VLSI Circuits, pp. 22-23, Jun. 2014.
[37] A.R. Junaidi, Y. Take, T. Kuroda, “A 352 Gb/s Inductive-Coupling DRAM/SoC Interfaces Using Overlapping Coils with Phase Division Multiplexing and Ultra-Thin Fan-Out Wafer Level Package,” Symp. VLSI Circuits, Jun. 2014.
[38] Y. Take, N. Miura, T. Kuroda, “A 30 Gb/s/Link 2.2 Tb/s/mm² Inductively-Coupled Injection-Locking CDR for High-Speed DRAM Interface,” JSSC, pp. 2552-2559, Nov. 2011.
[39] N. Miura, et al., “A 0.55V 10fJ/bit Inductive-Coupling Data Link and 0.7V 135fJ/Cycle Clock Link with Dual-Coil Transmission Scheme,” IEEE JSSC, pp. 965-973, Apr. 2011.