Using Today’s Fastest Chips to Design the Chips of Tomorrow

  1. Using Today’s Fastest Chips to Design the Chips of Tomorrow. Mauro Calderara, Sascha Brück, Mathieu Luisier

  2. Overview. What we want to do → Quantum Transport: electrons and structures. How we do it → How GPUs saved the day. | Mauro Calderara, Apr 08 2016

  4. Probably you’re familiar with this

  5. Zooming in

  6. The future? (link to video: http://iis.ee.ethz.ch/~mauro/movie_SC15.avi)

  7. From a somewhat more abstract POV [Figure: electrons (e), a device, and a question mark]

  12. This is what we’re ultimately interested in! How do electrons behave w.r.t. the device? Change in parameters → change in device behavior? Applies not just to transistors: batteries, storage devices, ... [Figure: device with electrons (e); labels: Gate voltage, Material properties, Dimensions]

  18. How would we do that? The “easy” case: → device behaves like bulk material

  20. How would we do that? The “difficult” case: → device behaves like atomic structure

  22. The cost of going small. Why is this “easy” ... and this “difficult”?

  23. The cost of going small. “Easy” case: can assume it is “infinite” and use a semi-empirical model. “Difficult” case: very finite! Need to do it from first principles. [Figure: runtime for each case]

  30. The cost of going small: runtime. Semi-empirical → O(Hours). First principles → O(Months).

  33. Overview. What we want to do → Quantum Transport: electrons and structures. How we do it → How GPUs saved the day.

  34. Where does all that time go? Invert the matrix from before (selectively!) using a recursive algorithm. Solve an eigenvalue problem (not discussed here). [Figure: runtime, label “~ 40x”]

  37. Avoiding the inversion, use a sparse solver instead. Instead of trying to invert selectively, solve the system using a generic sparse solver package. Gain: speed, parallelism, capacity for somewhat larger systems. Cost: code is now memory-bandwidth bound. And: not such a good fit for GPUs ... [Figure: runtime, label “~ 40x”]

  41. Tackling the eigenvalue problem. We’ve been able to solve that one. [Figure: runtime, label “~ 200x”]

  42. Now what? Good speedup so far: ~ 70x overall (now: O(Days), still not quite there...). But: memory-bandwidth bound by the sparse solver. [Figure: runtime; cartoon labelled “Advisor” and “PhD student”]

  48. A Sparse Solver for Transport Problems running on GPUs. Inverting the sparse system is not feasible. In our case: also not necessary. Need first and last block rows only. If we can compute this fast, we can interleave the solving step with the BC (boundary condition) computation and obtain the full solution very efficiently. [Figure: matrix inverse, labels “-1” and “=”]

  52. Obtaining the first and last block columns of the inverse. Recursive algorithm based on the Schwinger-Dyson equation: for $i = N:1$: $Y_i \leftarrow (B_{i,i} - B_{i,i+1} Y_{i+1}) \backslash B_{i,i-1}$; for $i = 2:N$: $R_i \leftarrow -Y_i R_{i-1}$
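
The two sweeps on the last slide can be sketched in plain Python for a small block-tridiagonal matrix $B$. This is only an illustration under stated assumptions, not the authors' GPU implementation. The function names (`first_block_column`, `solve`, `matmul`, `matsub`, `identity`) and the test matrix are hypothetical. The boundary conventions are also assumptions that the slide does not spell out: the term $B_{N,N+1} Y_{N+1}$ is dropped at the last block, $B_{1,0}$ is taken to be the identity so that $R_1 = Y_1$, and all blocks are square and of equal size. Exact fractions are used so the result can be checked without rounding; a production code would use dense GPU kernels for the block solves and products.

```python
from fractions import Fraction

def matmul(a, b):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matsub(a, b):
    """Element-wise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def solve(a, b):
    """Return a^-1 b (the slide's 'a backslash b') by exact Gauss-Jordan elimination."""
    n = len(a)
    aug = [[Fraction(v) for v in list(ra) + list(rb)] for ra, rb in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def first_block_column(diag, upper, lower):
    """First block column of B^-1 for block-tridiagonal B (0-based lists).

    diag[i] = B[i][i], upper[i] = B[i][i+1], lower[i] = B[i+1][i].
    """
    n, size = len(diag), len(diag[0])
    Y = [None] * n
    # Backward sweep: Y_i <- (B_ii - B_i,i+1 Y_i+1)^-1 B_i,i-1
    for i in range(n - 1, -1, -1):
        schur = diag[i] if i == n - 1 else matsub(diag[i], matmul(upper[i], Y[i + 1]))
        rhs = lower[i - 1] if i > 0 else identity(size)  # assumed convention: B_1,0 = I
        Y[i] = solve(schur, rhs)
    # Forward sweep: R_1 = Y_1, then R_i <- -Y_i R_i-1
    R = [Y[0]]
    for i in range(1, n):
        R.append([[-v for v in row] for row in matmul(Y[i], R[i - 1])])
    return R

# Hypothetical 3x3 block matrix with 2x2 blocks.
diag = [[[4, 1], [1, 4]], [[5, 1], [0, 5]], [[6, 0], [1, 6]]]
upper = [[[1, 0], [0, 1]], [[1, 1], [0, 1]]]
lower = [[[1, 0], [1, 1]], [[0, 1], [1, 0]]]
R = first_block_column(diag, upper, lower)
```

Stacking the blocks in `R` gives the first block column of $B^{-1}$, so multiplying $B$ by that stack should return an identity block followed by zero blocks. The last block column follows from the mirrored recursion, sweeping forward first and then backward.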
