

SLIDE 1

Computation and inversion of the dielectric matrix

Derek Vigil-Fowler, UC-Berkeley and LBNL
Blue Waters Symposium, 05/12/15

Email: vigil@berkeley.edu

SLIDE 2

Materials Science for Energy and Technology

SLIDE 3

Materials Science for Energy and Technology

SLIDE 4

Dielectric response

SLIDE 5

Dielectric response: E&M

SLIDE 6

Dielectric response: E&M
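The figures on these two E&M slides are not preserved in the transcript. The relation they almost certainly build on is the standard linear-response statement that screening maps the external potential to the total potential in the material through the inverse dielectric function (a hedged reconstruction, not copied from the slides):

```latex
% Linear screening: the material's induced response is encoded in the
% inverse dielectric function, which maps external to total potential.
V_{\mathrm{tot}}(\mathbf{r};\omega)
  = \int \mathrm{d}\mathbf{r}'\,
    \epsilon^{-1}(\mathbf{r},\mathbf{r}';\omega)\,
    V_{\mathrm{ext}}(\mathbf{r}';\omega)
```

In a periodic crystal, Fourier transforming turns this kernel into a matrix indexed by reciprocal-lattice vectors G, G' for each wavevector q and frequency ω; that matrix is the object computed and inverted in this talk.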

SLIDE 7

Dielectric response: quantum mechanics

SLIDE 8

Dielectric response: quantum mechanics

SLIDE 9

Dielectric response: quantum mechanics
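The slide bodies are again lost; the standard quantum-mechanical construction (the RPA expression used in GW-style codes, given here as a reconstruction rather than a quote from the slides) builds the dielectric matrix from the independent-particle polarizability χ⁰:

```latex
% RPA dielectric matrix: identity minus the bare Coulomb interaction
% times the independent-particle polarizability at each (q, omega).
\epsilon_{\mathbf{G}\mathbf{G}'}(\mathbf{q};\omega)
  = \delta_{\mathbf{G}\mathbf{G}'}
  - v(\mathbf{q}+\mathbf{G})\,
    \chi^{0}_{\mathbf{G}\mathbf{G}'}(\mathbf{q};\omega),
\qquad
v(\mathbf{q}+\mathbf{G}) = \frac{4\pi}{\lvert\mathbf{q}+\mathbf{G}\rvert^{2}}
```

Assembling χ⁰ from sums over occupied and empty states is where the large matrix multiplications arise, and ε must then be inverted at every frequency; that is exactly the workload the following slides parallelize.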

SLIDE 10

Pictorially

SLIDE 11

Pictorially

SLIDE 12

How to do one big matrix multiplication + inversion?

SLIDE 13

How to do one big matrix multiplication + inversion?

Parallelism!

SLIDE 14

How to do one big matrix multiplication + inversion?

BLAS + ScaLAPACK + MPI/OpenMP
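As a concrete (if serial) schematic of the per-frequency linear algebra behind "BLAS + ScaLAPACK + MPI/OpenMP", here is a minimal NumPy sketch. The array names, shapes, and the rank-factored form of χ⁰ are illustrative assumptions; in the actual distributed code the product is a ScaLAPACK PZGEMM and the inverse comes from a ScaLAPACK LU factorization.

```python
import numpy as np

def invert_dielectric_matrix(M, denom, v_coul):
    """Build and invert eps(omega) for one frequency (serial sketch).

    M      : (nG, nt) complex transition matrix elements (assumed layout)
    denom  : (nt,)    energy denominators at this frequency
    v_coul : (nG,)    bare Coulomb potential v(q+G)
    """
    nG = M.shape[0]
    chi0 = (M * denom) @ M.conj().T             # the one big matrix multiply
    eps = np.eye(nG) - v_coul[:, None] * chi0   # eps = 1 - v*chi0 (RPA)
    return np.linalg.inv(eps)                   # LU factorize and solve
```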

SLIDE 15

Distributed matrix multiplication

SLIDE 16

Distributed matrix multiplication
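ScaLAPACK spreads each matrix over a 2D process grid in a block-cyclic layout, so every processor multiplies only the blocks it owns and communicates for the partner blocks it needs. A toy sketch of the ownership rule (our own illustrative function, not a ScaLAPACK routine):

```python
def block_owner(bi, bj, nprow, npcol):
    """Process-grid coordinates that own global block (bi, bj) in a
    2D block-cyclic distribution over an nprow x npcol process grid."""
    return (bi % nprow, bj % npcol)

# Example: block (5, 3) on a 4 x 2 process grid lives on process (1, 1).
print(block_owner(5, 3, 4, 2))
```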

SLIDE 17

Problem with this scheme: many frequencies are done serially

  • Lots of communication and array assignments
  • All processors work on one frequency
    – But ScaLAPACK doesn't scale past a few hundred processors!
    – Smaller problems can't utilize ScaLAPACK fully

→ Wasted processors

SLIDE 18

Solution: do many frequencies in parallel!

SLIDE 19

Solution: do many frequencies in parallel!

SLIDE 20

Solution: do many frequencies in parallel!
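A minimal sketch of the frequency-parallel layout, assuming the world communicator is split into nfreq_par groups, each running its own distributed matmul and inversion on a subset of frequencies. mpi4py is used for illustration; the round-robin assignment and every name other than nfreq_par are assumptions.

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
nfreq_par = 8   # number of groups working on frequencies concurrently
nfreq = 32      # total number of frequencies (illustrative value)

# Split the world communicator: each sub-communicator hosts an
# independent distributed matmul + inversion for its frequencies.
color = comm.Get_rank() % nfreq_par
freq_comm = comm.Split(color=color, key=comm.Get_rank())

# Round-robin assignment of frequencies to groups.
for ifreq in range(color, nfreq, nfreq_par):
    pass  # build chi0, form eps(omega), and invert it on freq_comm
```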

SLIDE 21

Results

SLIDE 22

Results

Timings for bulk Si on 288 processors and for CO on 144 processors, as a function of nfreq_par:

                 Bulk Si, 288 proc         CO, 144 proc
nfreq_par        1        2        8       1       2       8
Matmul total     13.12    8.934    4.395   9.31    6.89    2.13
Matmul prep      10.75    7.08     3.23    1.27    1.01    0.66
Matmul dgemm     2.17     1.75     1.135   1.85    1.60    0.90
Matmul comm      0.2      0.104    0.027   6.18    4.27    0.57
Invert total     0.744    0.26     0.064   5.28    2.60    0.93

SLIDE 23

Conclusions

  • Parallelizing over frequencies reduces communication and array assignment and keeps ScaLAPACK saturated, giving faster run times.
  • For big problems, it will also allow scaling to higher processor counts for the frequency-dependent inverse dielectric matrix, a quantity of wide interest.

SLIDE 24

Acknowledgments

  • Blue Waters Graduate Fellowship
  • Jack Deslippe – NERSC
  • Felipe Homrich da Jornada – UC-Berkeley

This research is part of the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation (awards OCI-0725070 and ACI-1238993) and the state of Illinois. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications.
