Disconnected diagrams with multi-level integration


Leonardo Giusti, Tim Harris, Alessandro Nada, Stefan Schaefer. Lattice 2018, 23.07.18.


SLIDE 1

Disconnected diagrams with multi-level integration

Leonardo Giusti, Tim Harris, Alessandro Nada, Stefan Schaefer Lattice 2018, 23.07.18

SLIDE 2

Outline

1 Motivation
2 Variance reduction for disconnected diagrams
3 Disconnected vector two-point function with multi-level

SLIDE 3

Motivation

Quark-line disconnected diagrams appear when we consider singlet fermion bilinears:

  • flavour-singlet currents, e.g. HVP
  • isosinglet channels in spectroscopy, e.g. f0
  • hadronic matrix elements, e.g. nucleon σ-term
  • quark condensates

These are usually evaluated with a noisy estimator, e.g. the Hutchinson trace

$$\langle \ldots \bar\psi(x)\Gamma\psi(x) \rangle = -\langle \ldots \mathrm{E}_\eta[\eta^\dagger(x)\,\Gamma D^{-1}(x,y)\,\eta(y)] \rangle \tag{1}$$

where η(x) are independent random white-noise fields. The variance, $\sigma_\eta^2$, of the estimator

$$\mathrm{tr}\,\Gamma D^{-1} \approx \frac{1}{N}\sum_{n=1}^{N} \eta_n^\dagger(x)\,\Gamma D^{-1}(x,y)\,\eta_n(y) \tag{2}$$

is determined by the off-diagonal elements

$$\sigma_\eta^2 = \frac{1}{N}\sum_{x,y;\,x\neq y} \left|\Gamma D^{-1}(x,y)\right|^2 + \ldots \tag{3}$$

Hutchinson, “A stochastic estimator of the trace of the influence matrix for laplacian smoothing splines”; Bernardson, McCarty, and Thron, “Monte Carlo methods for estimating linear combinations of inverse matrix entries in lattice QCD”.
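The statement that the variance of the Hutchinson estimator comes only from off-diagonal elements can be checked on a small toy problem. The sketch below is an illustration under stated assumptions, not the setup used in the talk: a random real symmetric matrix `A` stands in for ΓD⁻¹, the noise is real Z2, and the function name `hutchinson_trace` is hypothetical. For real Z2 noise and real symmetric `A` the single-sample variance is 2 Σ_{x≠y} A(x,y)²; the factor 2 is specific to this real-valued toy case.

```python
import numpy as np

def hutchinson_trace(A, n_samples, rng):
    """Estimate tr(A) as the mean of eta^T A eta over Z2 noise vectors."""
    dim = A.shape[0]
    samples = np.empty(n_samples)
    for n in range(n_samples):
        eta = rng.choice([-1.0, 1.0], size=dim)  # Z2 noise, E[eta eta^T] = identity
        samples[n] = eta @ A @ eta
    return samples.mean(), samples.var(ddof=1)

rng = np.random.default_rng(0)
dim = 64
M = rng.normal(size=(dim, dim))
# Toy symmetric positive-definite matrix standing in for Gamma D^{-1} (assumption)
A = np.linalg.inv(M @ M.T + dim * np.eye(dim))

estimate, sample_var = hutchinson_trace(A, 4000, rng)
exact = np.trace(A)

# Only off-diagonal elements enter the single-sample variance
offdiag_var = 2.0 * (np.sum(A**2) - np.sum(np.diag(A) ** 2))

print("exact trace      :", exact)
print("noisy estimate   :", estimate)
print("sample variance  :", sample_var)
print("off-diagonal sum :", offdiag_var)
```

Because the diagonal of `A` is reproduced exactly by every noise vector, making the matrix more diagonally dominant lowers the variance without changing the trace.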

SLIDE 4

Variance reduction – numerical tests

For certain currents we need to compute a difference, e.g. for the EM current

$$\eta^\dagger\,\Gamma\left(D_{\mathrm{light}}^{-1} - D_{\mathrm{strange}}^{-1}\right)\eta \tag{4}$$

whose variance is suppressed due to the covariance between light and strange.

Investigate using the CLS Nf = 2 O(a)-improved Wilson fermions E5 ensemble with n0 = 30 configurations, and study the saturation of the variance with N for

  • pseudoscalar
  • vector current

id        a (fm)   mPS (MeV)   amq      κ          mMS(2 GeV) (MeV)   NGCR
light     0.0658   450         0.0056   0.13625    32                 25
strange                        0.0175   0.135808   100                19

SLIDE 5

Variance reduction – mass differences

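As expected, the variance is reduced in the difference. Furthermore...

Similar to the ‘one-end’ trick for twisted mass (TM), we can use for the light-strange difference

$$D_l^{-1} - D_s^{-1} = D_l^{-1}\,(D_s - D_l)\,D_s^{-1} = (m_s - m_l)\,D_l^{-1} D_s^{-1},$$

to write a new estimator for the trace

$$\mathrm{tr}\,\Gamma\left(D_l^{-1} - D_s^{-1}\right) \approx (m_s - m_l)\,\mathrm{tr}\left(\Gamma D_l^{-1}\,\eta\eta^\dagger\,D_s^{-1}\right).$$

Note that sample-wise

$$\eta^\dagger\,\Gamma\left(D_l^{-1} - D_s^{-1}\right)\eta \neq (m_s - m_l)\,\mathrm{tr}\left(\Gamma D_l^{-1}\,\eta\eta^\dagger\,D_s^{-1}\right).$$

Both sides have the same expectation value over the noise; they differ sample by sample, and therefore in variance.

  • relevant for HVP
  • ms − ml smaller than at physical point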

[Figure: variance vs. number of quadratures, pseudoscalar. Curves: standard light, standard light-strange, one-end light-strange.]
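The one-end identity and the two estimators built from it can be compared on a toy operator. This is a minimal sketch under assumptions that are not from the talk: `K` is a random symmetric matrix playing the role of the massless operator, `Gamma` is a random diagonal sign matrix standing in for a Dirac structure, and the masses are arbitrary. It checks the operator identity, confirms that both estimators average to the same trace, and shows that they differ for individual noise vectors. Which estimator has the smaller variance depends on the operator, so the toy numbers say nothing about the QCD case.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, m_l, m_s = 48, 0.05, 0.30  # toy size and masses (assumptions)

K = rng.normal(size=(dim, dim)) / np.sqrt(dim)
K = K @ K.T  # symmetric positive semi-definite toy kernel
Gamma = np.diag(rng.choice([-1.0, 1.0], size=dim))  # toy stand-in for Gamma

D_l = K + m_l * np.eye(dim)
D_s = K + m_s * np.eye(dim)
Dl_inv, Ds_inv = np.linalg.inv(D_l), np.linalg.inv(D_s)

# Operator identity: D_l^{-1} - D_s^{-1} = (m_s - m_l) D_l^{-1} D_s^{-1}
identity_error = np.max(np.abs((Dl_inv - Ds_inv) - (m_s - m_l) * Dl_inv @ Ds_inv))

exact = np.trace(Gamma @ (Dl_inv - Ds_inv))
n_samples = 2000
standard = np.empty(n_samples)
one_end = np.empty(n_samples)
for n in range(n_samples):
    eta = rng.choice([-1.0, 1.0], size=dim)
    # standard difference estimator: eta^T Gamma (D_l^{-1} - D_s^{-1}) eta
    standard[n] = eta @ Gamma @ (Dl_inv - Ds_inv) @ eta
    # one-end estimator: (m_s - m_l) eta^T D_s^{-1} Gamma D_l^{-1} eta
    one_end[n] = (m_s - m_l) * (Ds_inv @ eta) @ Gamma @ (Dl_inv @ eta)

print("identity error   :", identity_error)
print("exact trace      :", exact)
print("standard mean/var:", standard.mean(), standard.var(ddof=1))
print("one-end mean/var :", one_end.mean(), one_end.var(ddof=1))
```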

SLIDE 6

Variance reduction – mass differences


[Figure: variance vs. number of quadratures, vector. Curves: standard light, standard light-strange, one-end light-strange.]

SLIDE 7

Variance reduction – frequency splitting

Perform a frequency splitting of the estimator to separate the UV and IR

$$\mathrm{tr}\,\Gamma D^{-1} = \mathrm{tr}\,\Gamma\left(D^{-1} - D_1^{-1}\right) + \mathrm{tr}\,\Gamma\left(D_1^{-1} - D_2^{-1}\right) + \ldots + \mathrm{tr}\,\Gamma D_n^{-1} \tag{5}$$

where we compute the differences with the noisy estimator from the ‘one-end’ trick

$$\mathrm{tr}\,\Gamma\left(D_i^{-1} - D_j^{-1}\right) \approx (m_j - m_i)\,\eta^\dagger D_j^{-1}\,\Gamma\,D_i^{-1}\eta \tag{6}$$

and use the hopping parameter expansion to order k for the largest mass

$$\mathrm{tr}\,\Gamma D_n^{-1} \approx \underbrace{\mathrm{tr}\,\Gamma \sum_{m=0}^{k-1} B^{-m}H^{m}B^{-1}}_{\text{hopping}} + \underbrace{\xi^\dagger\,\Gamma B^{-k}H^{k}D_n^{-1}\,\xi}_{\text{remainder}} \tag{7}$$

where B = 4 + m + σ·F contains the clover term.

  • use hierarchical probing or spatial dilution to compute the hopping contribution exactly with finite quadratures
  • remainder computed using the standard noisy estimator

Hasenbusch, “Exploiting the hopping parameter expansion in the hybrid Monte Carlo simulation of lattice QCD with two degenerate flavors of Wilson fermions”. Stathopoulos, Laeuchli, and Orginos, “Hierarchical probing for estimating the trace of the matrix inverse on toroidal lattices”.
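The telescoping structure of Eq. (5), with each difference rewritten through the mass factor of Eq. (6), can be verified with exact traces on a toy operator. The sketch below assumes a mass-shifted family D_i = K + m_i with a random symmetric `K`, a random diagonal `Gamma`, and three hypothetical masses; none of these values come from the talk. Noise averages are replaced by exact traces so that the check is deterministic.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 40
K = rng.normal(size=(dim, dim)) / np.sqrt(dim)
K = K @ K.T  # toy massless operator (assumption)
Gamma = np.diag(rng.choice([-1.0, 1.0], size=dim))
masses = [0.05, 0.5, 2.0]  # m_0 < m_1 < m_2, hypothetical toy values

def inverse(m):
    """Inverse of the toy operator D = K + m."""
    return np.linalg.inv(K + m * np.eye(dim))

# Left-hand side of Eq. (5): trace at the lightest mass
lhs = np.trace(Gamma @ inverse(masses[0]))

# Each difference written as (m_j - m_i) tr(D_j^{-1} Gamma D_i^{-1}),
# i.e. Eq. (6) with the noise average carried out exactly
differences = [
    (m_j - m_i) * np.trace(inverse(m_j) @ Gamma @ inverse(m_i))
    for m_i, m_j in zip(masses[:-1], masses[1:])
]
heavy = np.trace(Gamma @ inverse(masses[-1]))  # last term, largest mass
rhs = sum(differences) + heavy

print("tr Gamma D^{-1}        :", lhs)
print("sum of split terms     :", rhs)
print("individual differences :", differences)
```

In a stochastic calculation each entry of `differences` and the heavy term would get its own noise budget, which is what allows the splitting to be tuned per contribution.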

SLIDE 8

Variance reduction – frequency splitting

id   amq      κ          mMS(2 GeV) (MeV)   NGCR
m0   0.0056   0.13625    32                 25
m1   0.0876   0.133272   500                11
m2   0.34     0.125      1914               5

  • Contributions depend on the intermediate masses, the order of the hopping parameter expansion and the channel
  • Can be tuned with just a few noise samples
  • Hierarchical probing computes the exact estimator for the hopping H^{k−1} after k^4/8 quadratures
  • Spatial dilution with even-odd blocks does the same, also for k = 2m

[Figure: variance vs. cost (core-s), pseudoscalar, k = 4. Curves: standard, exact hopping, light doublet, heavy doublet, hopping remainder.]

SLIDE 9

Variance reduction – frequency splitting


[Figure: variance vs. cost (core-s), pseudoscalar, k = 8. Curves: standard, exact hopping, light doublet, heavy doublet, hopping remainder.]

SLIDE 10

Variance reduction – frequency splitting


[Figure: variance vs. cost (core-s), vector, k = 4. Curves: standard, exact hopping, light doublet, heavy doublet, hopping remainder.]

SLIDE 11

Variance reduction – frequency splitting


[Figure: variance vs. cost (core-s), vector, k = 8. Curves: standard, exact hopping, light doublet, heavy doublet, hopping remainder.]

SLIDE 12

Variance reduction – conclusions

  • Gain of ∼ 4 in the cost for the pseudoscalar, vector
  • Similar to hierarchical probing for pseudoscalar, faster for vector
  • Parameters can be further optimized
  • Expect efficient estimator at smaller quark mass
  • Scalar and axial vector similar to pseudoscalar and vector, respectively

[Figure, two panels: variance vs. cost (core-s), pseudoscalar, k = 4 and k = 8. Curves: standard, hierarchical probing, splitting estimator.]

SLIDE 13

Variance reduction – conclusions


[Figure, two panels: variance vs. cost (core-s), vector, k = 4 and k = 8. Curves: standard, hierarchical probing, splitting estimator.]

SLIDE 14

Recap domain decomposition

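As seen in the talk by A. Nada, there exists a representation

$$\langle O_0(x)\,O_1(y)\rangle_{\mathrm{QCD}} = \frac{\left\langle \langle O_0(x)\rangle_{0}\,\langle O_1(y)\rangle_{1}\,W_N \right\rangle_N}{\langle W_N\rangle_N}$$

where

$$\langle O_0(x)\rangle_{0} = \int_{U\in\Lambda_0} [dU_{\Lambda_0}]\; e^{-S_G[U_{\Lambda_0}]}\, \det Q_{\Omega_0^*}\; O_0(x)$$

A factorizable multi-level estimator has a variance

$$\sigma^2 \propto \frac{1}{n_0\,n_1^2}$$

To investigate multi-level scaling, our estimator’s variance should be dominated by the gauge noise.

[Figure: domain decomposition of the lattice into regions Λ0, Λ1, Λ2 and Ω*, with operators O1(x) and O2(y); n0 level-0 and n1 level-1 configurations.]

Cè, Giusti, and Schaefer, “A local factorization of the fermion determinant in lattice QCD”.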
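The 1/(n0 n1²) scaling of a factorized estimator can be illustrated with independent Gaussian fluctuations. The sketch below is a toy model under strong assumptions that are not from the talk: the two factors fluctuate independently with zero mean and unit variance within each level-0 configuration, so the estimator is entirely noise dominated. Each factor is averaged over its own n1 level-1 updates before the product is taken, and the variance is measured over many independent replicas.

```python
import numpy as np

rng = np.random.default_rng(3)
n0, n1, replicas = 50, 16, 2000  # n0 and n1 as in the talk; replicas is a toy choice

def estimator_variance(n1_updates):
    """Variance of the factorized estimator over independent replicas."""
    # Independent zero-mean, unit-variance fluctuations in the two regions
    A = rng.normal(size=(replicas, n0, n1_updates))
    B = rng.normal(size=(replicas, n0, n1_updates))
    # Average each factor over its own level-1 updates, multiply,
    # then average the product over the level-0 configurations
    estimates = (A.mean(axis=2) * B.mean(axis=2)).mean(axis=1)
    return estimates.var(ddof=1)

var_one_level = estimator_variance(1)     # expected close to 1 / n0
var_multi_level = estimator_variance(n1)  # expected close to 1 / (n0 * n1**2)
ratio = var_one_level / var_multi_level

print("one-level variance   :", var_one_level)
print("multi-level variance :", var_multi_level)
print("ratio, compare n1^2  :", ratio, n1**2)
```

If the product were formed update by update and then averaged, the same n1 updates would only give a 1/(n0 n1) reduction, which is the behaviour quoted for the remainder terms.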

SLIDE 15

Disconnected two-point function with quenched multilevel

Use the splitting estimator with hopping order k = 4 and hopping parameters

β     n0   n1   L/a   T/a   csw       κ        mPS (MeV)   a (fm)
6.2   50   16   32    96    1.61375   0.1352   580         0.068
                                      0.128
                                      0.115

An unbiased estimator for the disconnected contraction is

[Diagrammatic equation for the contraction between x0 and y0; not recoverable from the extraction.]

  • first line is fully factorized multi-level
  • remainder terms are computed on n0 × n1 configurations

SLIDE 16

Multilevel error scaling: factorized contribution

[Diagram: factorized contribution, a product of two terms.]

  • variance scales as 1/(n0 n1²)
  • frozen region (x0 − y0)/a < 8

[Figure: variance vs. (x0 − y0)/a, vector, factorized. Curves: one-level, no fact. (n1 = 1); one-level, fact. (n1 = 1); multi-level n1 = 16, fact.; one-level/n1².]

SLIDE 17

Multilevel error scaling: remainder 1

[Diagram: remainder 1 term.]

  • variance scales as 1/(n0 n1)

[Figure: variance vs. (x0 − y0)/a, vector, remainder 1. Curves: n0, n0 × n1.]

SLIDE 18

Error scaling: remainder 2

[Diagram: remainder 2 term.]

  • variance is highly suppressed, scales as 1/(n0 n1) and

[Figure: variance vs. (x0 − y0)/a, vector, remainder 2. Curves: n0, n0 × n1.]

SLIDE 19

Conclusions

Variance reduction

  • ‘one-end’ type trick for Wilson for light-strange
  • frequency-splitting of loop to split UV and IR

Multi-level disconnected diagrams

  • observed expected error scaling
  • need to investigate dependence of gain on distance

Expected combination of gains from both variance reduction and multi-level

[Figure: variance vs. (x0 − y0)/a, vector. Curves: no ML, fact., rem. 1, rem. 2.]

SLIDE 20

Backup

SLIDE 21

Variance reduction – light-strange two-point function

[Figure: C2(x0 − y0) vs. (x0 − y0)/a, vector, N = 256, light-strange.]

SLIDE 22

Variance reduction – hopping parameter expansion

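The hopping parameter expansion is based upon a polynomial approximation to D^{-1},

$$D_{\mathrm{sw}}^{-1} = \Big(\underbrace{(D_{oo}+D_{ee})}_{(2\kappa)^{-1}B} + \underbrace{(D_{eo}+D_{oe})}_{(2\kappa)^{-1}H}\Big)^{-1} = 2\kappa\,(1-\kappa B^{-1}H)^{-1}B^{-1} = \underbrace{\sum_{n=0}^{k-1}\kappa^{n}B^{-n}H^{n}B^{-1}}_{\text{cheap}} + \underbrace{\kappa^{k}B^{-k}H^{k}D_{\mathrm{sw}}^{-1}}_{\text{reduced variance}} \tag{8}$$

Dong and K. F. Liu, “Stochastic Estimation with Z2 Noise”; Q. Liu, Wilcox, and Morgan, “Polynomial Subtraction Method for Disconnected Quark Loops”.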
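The split into a truncated geometric series plus a remainder is an exact identity at any order k, which a small matrix check makes explicit. The sketch below assumes the convention D = B − H with all factors of κ absorbed into `H`, and uses a random diagonal `B` and a random off-diagonal `H`; these are toy choices, and the product written B^{-n}H^{n} on the slide is taken here as (B⁻¹H)ⁿ since the two matrices need not commute.

```python
import numpy as np

rng = np.random.default_rng(4)
dim, k = 30, 4

B = np.diag(4.0 + rng.uniform(0.0, 0.5, size=dim))  # toy site-diagonal term
H = rng.normal(size=(dim, dim)) / np.sqrt(dim)      # toy hopping term
np.fill_diagonal(H, 0.0)
D = B - H  # assumed sign convention, kappa absorbed into H

B_inv, D_inv = np.linalg.inv(B), np.linalg.inv(D)
X = B_inv @ H

# Truncated series: sum_{n=0}^{k-1} (B^{-1} H)^n B^{-1}
hopping = sum(np.linalg.matrix_power(X, n) @ B_inv for n in range(k))
# Remainder: (B^{-1} H)^k D^{-1}
remainder = np.linalg.matrix_power(X, k) @ D_inv

identity_error = np.max(np.abs(hopping + remainder - D_inv))
remainder_norm = np.linalg.norm(remainder) / np.linalg.norm(D_inv)

print("identity error          :", identity_error)
print("relative remainder norm :", remainder_norm)
```

The remainder carries k powers of B⁻¹H, so it shrinks as the diagonal term grows, which is in line with the statement that the expansion works well for large masses.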

SLIDE 23

Variance reduction – hierarchical probing

Probing for a sparse matrix can compute a trace with fewer quadratures

[Example (9): a sparse matrix with non-zero entries 1 7 9 4 4 9 7 9 1 4 9 4, probed with vectors of ones; the matrix layout is not recoverable from the extraction.]

Hierarchical probing for lattices chooses Hadamard vectors {h_i ⊙ η}, which allows nesting

  • contribution from d leading diagonals eliminated with 2d^4 quadratures
  • tr H^n exactly estimated with n^4/8 quadratures

Both hierarchical probing and hopping parameter expansion work well for large masses

Stathopoulos, Laeuchli, and Orginos, “Hierarchical probing for estimating the trace of the matrix inverse on toroidal lattices”.
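The idea behind probing can be shown on a one-dimensional chain. The sketch below is a toy illustration, not the four-dimensional construction of the cited paper: a tridiagonal matrix couples only neighbouring sites, so a two-colouring (even and odd sites) gives the trace exactly with two quadratures. The same result follows from the two Hadamard-type vectors (all ones and alternating signs). The helper `hopping_quadratures` simply evaluates the n^4/8 count quoted on the slide.

```python
import numpy as np

rng = np.random.default_rng(5)
dim = 12

# Tridiagonal toy matrix: only nearest-neighbour couplings are non-zero
A = np.diag(rng.normal(size=dim))
A += np.diag(rng.normal(size=dim - 1), k=1) + np.diag(rng.normal(size=dim - 1), k=-1)

# Probing vectors: ones on even sites, ones on odd sites.
# Sites of the same colour never couple, so v^T A v sums diagonal entries only.
probes = [np.where(np.arange(dim) % 2 == c, 1.0, 0.0) for c in (0, 1)]
probed_trace = sum(v @ A @ v for v in probes)

# Equivalent Hadamard form: h1 = (1, 1, ...), h2 = (1, -1, 1, -1, ...)
h1 = np.ones(dim)
h2 = np.where(np.arange(dim) % 2 == 0, 1.0, -1.0)
hadamard_trace = 0.5 * (h1 @ A @ h1 + h2 @ A @ h2)

# A single all-ones quadrature also picks up the off-diagonal entries
naive = h1 @ A @ h1

def hopping_quadratures(n):
    """Quadrature count n^4/8 quoted on the slide for tr H^n (n even)."""
    return n**4 // 8

print("exact trace    :", np.trace(A))
print("probed trace   :", probed_trace)
print("Hadamard trace :", hadamard_trace)
print("single vector  :", naive)
print("quadratures for n = 4, 8:", hopping_quadratures(4), hopping_quadratures(8))
```

With n = 2d the two counts on the slide agree, since 2d^4 = n^4/8.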

SLIDE 24

References

  • M. Hutchinson. “A stochastic estimator of the trace of the influence matrix for laplacian smoothing splines”. In: Communications in Statistics - Simulation and Computation 19.2 (1990), pp. 433–450. url: https://doi.org/10.1080/03610919008812866.
  • S. Bernardson, P. McCarty, and C. Thron. “Monte Carlo methods for estimating linear combinations of inverse matrix entries in lattice QCD”. In: Computer Physics Communications 78.3 (1994), pp. 256–264. issn: 0010-4655. url: http://www.sciencedirect.com/science/article/pii/0010465594900043.
  • M. Hasenbusch. “Exploiting the hopping parameter expansion in the hybrid Monte Carlo simulation of lattice QCD with two degenerate flavors of Wilson fermions”. In: Phys. Rev. D97.11 (2018), p. 114512. arXiv: 1805.03560 [hep-lat].
  • A. Stathopoulos, J. Laeuchli, and K. Orginos. “Hierarchical probing for estimating the trace of the matrix inverse on toroidal lattices”. In: (2013). arXiv: 1302.4018 [hep-lat].
  • M. Cè, L. Giusti, and S. Schaefer. “A local factorization of the fermion determinant in lattice QCD”. In: Phys. Rev. D95.3 (2017), p. 034503. arXiv: 1609.02419 [hep-lat].
  • S. J. Dong and K. F. Liu. “Stochastic Estimation with Z2 Noise”. In: Phys. Lett. B 328 (1994), pp. 130–136. eprint: hep-lat/9308015.
  • Q. Liu, W. Wilcox, and R. Morgan. “Polynomial Subtraction Method for Disconnected Quark Loops”. In: (2014). arXiv: 1405.1763 [hep-lat].
