
Multiplying Moore's Law with Proximity Communication, Robert Drost (PowerPoint presentation)



  1. Multiplying Moore's Law with Proximity Communication. Robert Drost, Ph.D., Director and Distinguished Engineer, Sun Microsystems Laboratories

  2. Outline • The Bandwidth Motivation • Proximity Communication Technology • Multiplying Moore's Law

  3. The Team: VLSI Research Group at Sun Labs. Igor Benko, Alex Chow, Wes Clark, Bill Coates, Robert Drost, Jo Ebergen, Scott Fairbanks, Jonathan Gainsley, Gilda Garreton, Yaeko Hirotsuka, Ron Ho, David Hopkins, Ian Jones, Russell Kao, Jon Lexau, Dimitri Nadezhin, Tarik Ono, Steve Rubin, Jeff Rulifson, Justin Schauer, Ivan Sutherland; and friends: David Harris, Mark Greenstreet, Ken Yang; and many others at Sun

  4. Why do we want more off-chip bandwidth anyway?

  5. Motivation: CPU vs. DRAM (J.L. Hennessy and D.A. Patterson, Computer Organization and Design, 2nd ed.)

  6. Motivation: BBW vs. Flops (Ref 1) [Figure: Performance (TFlops/sec) versus Bisection Bandwidth (TBytes/sec), with reference lines at 0.001, 0.01, 0.1, 1, and 10 bytes/flop and an arrow marking the direction of more bandwidth/flop. Systems plotted: Blue Gene/L (2005), ¼ Blue Gene/L, SX-8, NASA Columbia, Sandia Red Storm, Colsa Mach5, Earth Sim, ASCI-Q, LLNL Thunder, NCSA Tungsten. Class labels: Vector, MPP, Thin-node Cluster, Fat-node Cluster.]
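
The bytes/flop reference lines on this slide are the ratio of bisection bandwidth to floating-point performance. A minimal sketch of that ratio follows; the function names and the example machine (100 TFlops/sec, 1 TByte/sec) are hypothetical and are not read off the chart.

```python
def bytes_per_flop(bisection_bw_tbytes_per_s: float, perf_tflops: float) -> float:
    """Ratio of bisection bandwidth (TBytes/sec) to performance (TFlops/sec)."""
    if perf_tflops <= 0:
        raise ValueError("performance must be positive")
    return bisection_bw_tbytes_per_s / perf_tflops


def bandwidth_needed_tbytes_per_s(perf_tflops: float, target_bytes_per_flop: float) -> float:
    """Bisection bandwidth required to hit a target bytes/flop ratio."""
    return perf_tflops * target_bytes_per_flop


if __name__ == "__main__":
    # Hypothetical machine, for illustration only.
    ratio = bytes_per_flop(bisection_bw_tbytes_per_s=1.0, perf_tflops=100.0)
    print(f"{ratio:.3f} byte/flop")
    print(f"{bandwidth_needed_tbytes_per_s(100.0, 1.0):.0f} TBytes/sec for 1 byte/flop")
```

Moving such a machine from one reference line to the next takes ten times the bisection bandwidth at the same performance.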

  7. Bandwidth versus Memory Capacity (Ref 1)

  8. Motivation: Lack of Data Locality (Ref 2) [Figure: processor-pair communication maps for Dense Linear Algebra and 3D FFT. Black = no processor pair communication; White = heavy processor pair communication.]

  9. Proximity Communication Technology

  10. Proximity Communication ● Avoids Off-Chip Wires ● Increases Bandwidth/Area ● Makes Chips Replaceable ● Enhances Testing Capability ● Enables Smaller Chips ● Obviates ESD Protection ● Shrinks Transceiver Circuits [Figure: Chip1, Chip2, and Chip3 with facing Transmit and Receive circuits.] [Figure: Proximity Communication compared with Area Ball Bonding by Year, 2003 to 2009.]

  11. Simple Circuits

  12. Proximity Packaging Challenges • Performance is a function of Z, Ψ, Φ misalignments • With reasonable misalignment control, tens of Tbps of bandwidth per chip can be realized [Figure labels: Power Connection, Alignment Force Vector, Heat Extraction]

  13. Alignment is Multi-Dimensional [Figure: Chip1 and Chip2 with the six alignment axes X, Y, Z, Θ, Ψ, Φ]

  14. Alignment is the major challenge • Must align chips in X, Y, Θ, Z, Ψ, Φ • X, Y, Θ misalignments are corrected electronically [Figure: Measure... and correct... on-chip. Chip1 and Chip2 carry an X Vernier and a Y Vernier with Tstrobe and Rstrobe signals; each Rx pad faces an array of Tx micropads, some active and some inactive.]
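
The measure-and-correct idea on this slide can be sketched as follows, covering only the X and Y translation (not Θ). Given an offset measured by the verniers, the transmitter shifts its pattern of active micropads by a whole number of micropad pitches, leaving a residual of at most half a pitch. The function name `steer`, the `Steering` tuple, and the 5 µm micropad pitch are hypothetical; the slides give no such values.

```python
import math
from typing import NamedTuple


class Steering(NamedTuple):
    shift_x: int           # whole micropads to shift the active Tx pattern in X
    shift_y: int           # whole micropads to shift the active Tx pattern in Y
    residual_x_um: float   # misalignment left after steering, in microns
    residual_y_um: float


def steer(offset_x_um: float, offset_y_um: float, micropad_pitch_um: float) -> Steering:
    """Pick the micropad shift that best cancels a measured X/Y offset."""
    if micropad_pitch_um <= 0:
        raise ValueError("micropad pitch must be positive")

    def nearest(offset_um: float) -> int:
        # Round half up so the result does not depend on banker's rounding.
        return math.floor(offset_um / micropad_pitch_um + 0.5)

    sx, sy = nearest(offset_x_um), nearest(offset_y_um)
    return Steering(
        sx,
        sy,
        offset_x_um - sx * micropad_pitch_um,
        offset_y_um - sy * micropad_pitch_um,
    )


if __name__ == "__main__":
    # Hypothetical measurement: chips are off by +13 um in X and -7 um in Y.
    print(steer(13.0, -7.0, micropad_pitch_um=5.0))
```

A finer micropad pitch leaves a smaller residual but needs more micropads and steering logic per receiver pad.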

  15. Steering Circuit: steering the Tx pad in two dimensions [Figure: Tx micropads spanning one receiver pad pitch, with signals labeled B1, B2, C1, C2]

  16. Pads Cross-Section [Figure: transmitter plates on Chip 1 facing receiver plates on Chip 2; labels include 50 μm and Plate Separation]

  17. Signal and Noise [Figure: Simulated Coupling] Combining estimates for ● Channel speed ● Receiver sensitivity ● Signal vs. noise for pads ● Clocking and overhead, we can estimate... where G = pad separation, or gap, in microns

  18. A tileable PxC block [Figure: a row of Tx data channels and a row of Rx data channels, each flanked by Align blocks, with a clock channel]

  19. Measured results • TSMC 180nm CMOS • 72 transmit, 72 receive channels • 1.8 Gb/s per channel, 10⁻¹⁵ bit error rate • Aggregate 260 Gb/s/chip, density 430 Gb/s/mm² • 3 pJ/bit
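
The measured figures can be cross-checked with a little arithmetic. The sketch below assumes the 260 Gb/s aggregate counts both transmit and receive channels (144 × 1.8 Gb/s is about 259 Gb/s, which matches). The implied pad area, I/O power, and time between errors are derived here from the slide's numbers and are not reported in the slides.

```python
TX_CHANNELS = 72
RX_CHANNELS = 72
GBPS_PER_CHANNEL = 1.8
BIT_ERROR_RATE = 1e-15
REPORTED_AGGREGATE_GBPS = 260.0
DENSITY_GBPS_PER_MM2 = 430.0
ENERGY_PJ_PER_BIT = 3.0


def aggregate_gbps() -> float:
    """Aggregate bandwidth, assuming Tx and Rx channels are both counted."""
    return (TX_CHANNELS + RX_CHANNELS) * GBPS_PER_CHANNEL


def implied_area_mm2() -> float:
    """Area implied by the reported aggregate bandwidth and density."""
    return REPORTED_AGGREGATE_GBPS / DENSITY_GBPS_PER_MM2


def implied_power_watts() -> float:
    """I/O power implied by the energy per bit at the reported aggregate rate."""
    return REPORTED_AGGREGATE_GBPS * 1e9 * ENERGY_PJ_PER_BIT * 1e-12


def seconds_between_errors_per_channel() -> float:
    """Mean time between bit errors on one channel at the stated error rate."""
    return 1.0 / (GBPS_PER_CHANNEL * 1e9 * BIT_ERROR_RATE)


if __name__ == "__main__":
    print(f"aggregate: {aggregate_gbps():.1f} Gb/s")
    print(f"implied area: {implied_area_mm2():.2f} mm^2")
    print(f"implied I/O power: {implied_power_watts():.2f} W")
    print(f"days between errors per channel: {seconds_between_errors_per_channel() / 86400:.1f}")
```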

  20. Experimental Setup [Figure: views of PCB1 and PCB2 carrying Chip1 and Chip2]

  21. BER vs. chip separation

  22. Eye opening at 1.8Gb/s

  23. How do we multiply Moore's Law?

  24. The Key Idea in Moore's Law • Double the number of transistors/chip (for same cost) every 24 months > The principal driving force behind the past 40 years of integrated circuit industry advancement > An amazing prediction in 1965 based on fewer than a hundred transistors/chip
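
Doubling every 24 months corresponds to N(year) = N0 × 2^((year − year0) / 2). A minimal sketch follows; the starting point of 1,000 transistors in 1970 is a hypothetical value chosen for round numbers, not a data point from the slides.

```python
def moores_law_transistors(year: float, base_year: float, base_count: float,
                           doubling_years: float = 2.0) -> float:
    """Transistors per chip if the count doubles every `doubling_years` years."""
    if doubling_years <= 0:
        raise ValueError("doubling period must be positive")
    return base_count * 2.0 ** ((year - base_year) / doubling_years)


if __name__ == "__main__":
    # Hypothetical starting point, for illustration only.
    for year in (1970, 1990, 2010):
        count = moores_law_transistors(year, base_year=1970, base_count=1_000)
        print(year, f"{count:,.0f}")
```

Twenty years at this rate is ten doublings, roughly a thousandfold increase.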

  25. The Key Idea in Proximity Comm. • We connect chips with enough bandwidth that they can perform as a single integrated chip • Hence, PxC increases the effective number of transistors/chip over and above Moore's Law

  26. Multiplying Moore's Law • Assuming Moore's Law continues [Figure: Transistors per Chip (log scale, ticks at 1,000, 1,000,000, and 1,000,000,000) versus year, 1970 to 2020. Curves: Moore's Law scaling; PxC Arrays with increasing chip counts.]

  27. What if Moore's Law stalls? • Many have (incorrectly) predicted the demise of Moore's Law • Technical causes > Short channel effects in transistors leading to too much leakage and hence power consumption > Wire delay limiting performance • Financial causes > Fabs cost too much to yield a return on investment: 65nm fabs cost $3 billion to build (and going up 2x per generation) > Chips cost too much to yield a return on investment
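
The fab-cost claim compounds quickly. A minimal sketch, taking the slide's $3 billion at 65nm and 2x per generation at face value and assuming (hypothetically) that the doubling continues unchanged:

```python
def fab_cost_billion_usd(generations_after_65nm: int) -> float:
    """Fab cost in billions of dollars, doubling with each generation after 65nm."""
    if generations_after_65nm < 0:
        raise ValueError("generation count must be non-negative")
    return 3.0 * 2 ** generations_after_65nm


if __name__ == "__main__":
    for n in range(4):
        print(f"{n} generation(s) after 65nm: ${fab_cost_billion_usd(n):.0f} billion")
```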

  28. Multiplying a stalled Moore's Law • Proximity Communication keeps increasing transistors/chip without a fabrication contribution [Figure: Transistors per Chip (log scale, ticks at 1,000, 1,000,000, and 1,000,000,000) versus year, 1970 to 2020. Curves: Moore's Law scaling; If Moore's Law stalls; PxC Arrays with increasing chip counts.]
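
The argument of the last few slides reduces to a product: effective transistors = transistors per chip × chips in the PxC array. The sketch below holds per-chip growth flat after a stall year and lets the array's chip count supply the growth. The stall year, base values, and chip count are hypothetical, chosen only to show the shape of the curves.

```python
from typing import Optional


def per_chip_transistors(year: float, base_year: float, base_count: float,
                         stall_year: Optional[float] = None) -> float:
    """Per-chip count doubling every 2 years, frozen after `stall_year` if given."""
    effective_year = year if stall_year is None else min(year, stall_year)
    return base_count * 2.0 ** ((effective_year - base_year) / 2.0)


def effective_transistors(year: float, chips_in_array: int, base_year: float,
                          base_count: float, stall_year: Optional[float] = None) -> float:
    """Transistors in a PxC array that behaves as a single integrated chip."""
    if chips_in_array < 1:
        raise ValueError("array needs at least one chip")
    return chips_in_array * per_chip_transistors(year, base_year, base_count, stall_year)


if __name__ == "__main__":
    # Hypothetical scenario: 1e6 transistors/chip in 2000, scaling stalls in 2010.
    stalled = per_chip_transistors(2020, 2000, 1e6, stall_year=2010)
    array = effective_transistors(2020, 16, 2000, 1e6, stall_year=2010)
    print(f"single chip, stalled: {stalled:,.0f}")
    print(f"16-chip PxC array:    {array:,.0f}")
```

In this scenario the array recovers growth through chip count alone, which is the sense in which PxC needs no fabrication contribution.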

  29. Summary • Need for off-chip bandwidth motivates PxC • Good mechanical alignment enables PxC and its tremendous bandwidth increase • PxC multiplies Moore's Law by providing enough bandwidth to realize wafer-scale integration

  30. Multiplying Moore's Law with Proximity Communication http://research.sun.com/vlsi

  31. References (1) D. Hopkins, et al., “Circuit Techniques to Enable 430Gb/s/mm² Proximity Communication,” IEEE Int'l Solid-State Circuits Conference, Feb. 2007. (2) R. Drost, et al., “Challenges in building a flat-bandwidth memory hierarchy for a large-scale computer with proximity communication,” Proceedings of the 13th Symposium on High Performance Interconnects, pp. 13-22, Aug. 2005. (3) Krste Asanović, et al., “The Landscape of Parallel Computing Research: A View from Berkeley,” EECS Technical Report, in press, December 3, 2006. (4) R. Drost, R. Ho, R. D. Hopkins, I. Sutherland, “Electronic Alignment for Proximity Communication,” IEEE Int'l Solid-State Circuits Conference, Feb. 2004. (5) R. Drost, R. D. Hopkins, I. Sutherland, “Proximity Communications,” IEEE Custom Integrated Circuits Conference, pp. 469-472, Sept. 2003. (6) J.L. Hennessy and D.A. Patterson, Computer Organization and Design, 2nd ed., Morgan Kaufmann Publishers, San Francisco, 1997.
