Using Today’s Fastest Chips to Design the Chips of Tomorrow
Mauro Calderara, Sascha Brück, Mathieu Luisier

Overview
• What we want to do → Quantum Transport: electrons and structures
• How we do it → How GPUs saved the day
Mauro Calderara | Apr 08 2016

Probably you’re familiar with this

Zooming in

The future? (link to video: http://iis.ee.ethz.ch/~mauro/movie_SC15.avi)

From a somewhat more abstract POV
[Figure: schematic with the labels “Device”, “e” (electrons) and “?”]

This is what we’re ultimately interested in!
• How do electrons behave w.r.t. the device?
• Change in parameters → change in behavior?
• Applies not just to transistors
  • Batteries
  • Storage devices
  • ...
[Figure: device schematic with electrons (e); labelled parameters: gate voltage, material properties, dimensions]

How would we do that? The “easy” case:
→ device behaves like bulk material

How would we do that? The “difficult” case:
→ device behaves like atomic structure

The cost of going small
Why is this “easy” ... and this “difficult”?

The cost of going small
• “Easy” case: can assume the device is “infinite” and use a semi-empirical model.
• “Difficult” case: very finite! Need to do it from first principles.

The cost of going small
Runtime:
• Semi-empirical → O(Hours)
• First principles → O(Months)

Overview
• What we want to do → Quantum Transport: electrons and structures
• How we do it → How GPUs saved the day

Where does all that time go?
• Invert the matrix from before (selectively!) using a recursive algorithm.
• Solve an eigenvalue problem (not discussed here).
[Figure: runtime chart, annotated “~ 40x”]

Avoiding the inversion, use a sparse solver instead
• Instead of trying to invert selectively, solve the system using a generic sparse solver package
• Gain: speed, parallelism, capacity for somewhat larger systems
• Cost: code is now memory-bandwidth bound
• And: not such a good fit for GPUs ...
[Figure: runtime chart, annotated “~ 40x”]

Tackling the eigenvalue problem
• We’ve been able to solve that one
[Figure: runtime charts, annotated “~ 200x”]

Now what?
• Good speedup so far: ~ 70x overall (now: O(Days), still not quite there...)
• But: memory-bandwidth bound by the sparse solver
[Figure: runtime chart; cartoon labelled “Advisor” and “PhD student”, with a question mark]

A Sparse Solver for Transport Problems running on GPUs
• Inverting the sparse system is not feasible
• In our case: also not necessary
• Need first and last block rows only
• If we can compute this fast, we can
  • interleave the solving step with the BC computation
  • obtain the full solution very efficiently
[Figure: schematic of the matrix inverse, “(...)^-1 = ...”]
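To make the "only a few blocks of the inverse" idea concrete, here is a minimal NumPy sketch. It is not the authors' solver: it assumes a small dense block tridiagonal test matrix (the helper `block_tridiagonal` and all sizes are hypothetical) and obtains the first and last block columns of the inverse by solving against two unit block columns, without ever forming the full inverse.

```python
import numpy as np


def block_tridiagonal(n_blocks, bs, rng):
    """Build a random block tridiagonal test matrix (hypothetical stand-in).

    The diagonal blocks get a large multiple of the identity added so that
    the matrix is comfortably invertible for this illustration.
    """
    n = n_blocks * bs
    B = np.zeros((n, n))
    for j in range(n_blocks):
        lo, hi = j * bs, (j + 1) * bs
        B[lo:hi, lo:hi] = rng.standard_normal((bs, bs)) + 4.0 * bs * np.eye(bs)
        if j + 1 < n_blocks:
            B[lo:hi, hi:hi + bs] = rng.standard_normal((bs, bs))  # B_{j,j+1}
            B[hi:hi + bs, lo:hi] = rng.standard_normal((bs, bs))  # B_{j+1,j}
    return B


def first_last_block_columns(B, bs):
    """Return the first and last block columns of inv(B) via one solve.

    The right-hand side holds an identity block in the first block row
    (left half) and in the last block row (right half).
    """
    n = B.shape[0]
    rhs = np.zeros((n, 2 * bs))
    rhs[:bs, :bs] = np.eye(bs)
    rhs[n - bs:, bs:] = np.eye(bs)
    X = np.linalg.solve(B, rhs)
    return X[:, :bs], X[:, bs:]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B = block_tridiagonal(n_blocks=5, bs=3, rng=rng)
    first, last = first_last_block_columns(B, bs=3)
    print(first.shape, last.shape)
```

A dense `np.linalg.solve` is used only to keep the sketch short; it ignores the sparsity that a real transport solver would exploit.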

Obtaining the first and last block columns of the inverse
• Recursive algorithm based on the Schwinger-Dyson equation
  for j = N:1
    Y_j ← (B_{j,j} − B_{j,j+1} Y_{j+1}) \ B_{j,j−1}
  for j = 2:N
    R_j ← −Y_j R_{j−1}
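The two loops can be sketched in NumPy as follows. This is an illustrative reading of the slide, not the authors' GPU implementation. Two details are assumptions that the slide does not state: the backward sweep starts with Y_{N+1} = 0, and the forward sweep starts from R_1 = (B_{1,1} − B_{1,2} Y_2)^{-1}, which follows from the first block row of B·X = E_1. The block lists `diag`, `upper` and `lower` and the helper `assemble` are hypothetical names; indices are 0-based.

```python
import numpy as np


def first_block_column(diag, upper, lower):
    """First block column of inv(B) for a block tridiagonal B.

    diag[j]  = B_{j,j}
    upper[j] = B_{j,j+1}
    lower[j] = B_{j+1,j}
    """
    N = len(diag)
    bs = diag[0].shape[0]
    Y = [None] * N
    # S holds B_{j,j} - B_{j,j+1} Y_{j+1}; for the last block Y_{N+1} = 0 (assumed).
    S = diag[N - 1]
    # Backward sweep: Y_j <- (B_{j,j} - B_{j,j+1} Y_{j+1}) \ B_{j,j-1}
    for j in range(N - 1, 0, -1):
        Y[j] = np.linalg.solve(S, lower[j - 1])
        S = diag[j - 1] - upper[j - 1] @ Y[j]
    # Assumed starting value: R_1 = (B_{1,1} - B_{1,2} Y_2)^{-1}
    R = [np.linalg.solve(S, np.eye(bs))]
    # Forward sweep: R_j <- -Y_j R_{j-1}
    for j in range(1, N):
        R.append(-Y[j] @ R[j - 1])
    return np.vstack(R)


def assemble(diag, upper, lower):
    """Assemble the dense matrix from its blocks (for checking only)."""
    N, bs = len(diag), diag[0].shape[0]
    B = np.zeros((N * bs, N * bs))
    for j in range(N):
        B[j * bs:(j + 1) * bs, j * bs:(j + 1) * bs] = diag[j]
        if j + 1 < N:
            B[j * bs:(j + 1) * bs, (j + 1) * bs:(j + 2) * bs] = upper[j]
            B[(j + 1) * bs:(j + 2) * bs, j * bs:(j + 1) * bs] = lower[j]
    return B


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    N, bs = 6, 2
    diag = [rng.standard_normal((bs, bs)) + 8.0 * np.eye(bs) for _ in range(N)]
    upper = [rng.standard_normal((bs, bs)) for _ in range(N - 1)]
    lower = [rng.standard_normal((bs, bs)) for _ in range(N - 1)]
    X = first_block_column(diag, upper, lower)
    reference = np.linalg.inv(assemble(diag, upper, lower))[:, :bs]
    print("matches dense inverse:", np.allclose(X, reference))
```

The last block column can be obtained the same way by running the sweeps in the opposite direction. Each step only touches blocks of size `bs`, so the work consists of dense block solves and multiplications.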
