Adversarial event generator tuning with Bayesian Optimization (PPT presentation transcript)


SLIDE 1

Adversarial event generator tuning with Bayesian Optimization

Maxim Borisyak, Andrey Ustyuzhanin National Research University Higher School of Economics (HSE)

July 7, 2018

SLIDE 2

Event Generator Tuning

SLIDE 3

Intro

We consider the problem of tuning the parameters of event generators to match ’real’ data:

  • generating samples is expensive;
  • the generator is non-differentiable.

Working example: Pythia 8 generator.

SLIDE 4

Approach I

  • two histograms for each parameter: data_i and MC_i;
  • Bayesian Optimization on the objective:

χ² = Σ_{i=1}^{nbins} (data_i − MC_i)² / (σ²_{data,i} + σ²_{MC,i})

  • additional assumptions on the distributions are required to guarantee convergence.
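As a toy illustration of this objective (the binning choice, the Poisson per-bin errors σ² ≈ counts, and the Gaussian samples are our assumptions, not from the slides), the χ² between a ’real’ sample and an MC sample can be computed as:

```python
import numpy as np

def chi2_distance(data_sample, mc_sample, bins=20, hist_range=(-5.0, 5.0)):
    # Histogram both samples on a common binning and take sigma^2 ~ counts
    # (Poisson errors), a common choice for event-count histograms.
    data, _ = np.histogram(data_sample, bins=bins, range=hist_range)
    mc, _ = np.histogram(mc_sample, bins=bins, range=hist_range)
    var = data + mc            # sigma^2_data,i + sigma^2_MC,i
    mask = var > 0             # skip bins that are empty in both samples
    return np.sum((data[mask] - mc[mask]) ** 2 / var[mask])

rng = np.random.default_rng(42)
real = rng.normal(0.0, 1.0, 10_000)
close = chi2_distance(real, rng.normal(0.0, 1.0, 10_000))  # matching generator
far = chi2_distance(real, rng.normal(0.5, 1.0, 10_000))    # shifted generator
```

A generator that matches the data gives χ² on the order of the number of bins, while a shifted one gives a much larger value, which is what the Bayesian Optimization loop minimizes.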

SLIDE 5

Approach II

  • an adversarial objective:

Wasserstein(F_real, F_θ) = sup_{d ∈ L_1} [ E_{x∼F_real} d(x) − E_{x∼F_θ} d(x) ]

  • Variational Optimization to search for a distribution over generator parameters.
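For one-dimensional samples of equal size, this supremum has a simple empirical form: sort both samples and average the absolute differences (the optimal coupling matches quantiles). A sketch with toy Gaussian data (the shift example is ours):

```python
import numpy as np

def wasserstein_1d(x, y):
    # For equal-size 1-D samples, the empirical W1 distance is the mean
    # absolute difference between the sorted samples (quantile coupling).
    return np.mean(np.abs(np.sort(x) - np.sort(y)))

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, 5000)
b = rng.normal(2.0, 1.0, 5000)
dist = wasserstein_1d(a, b)  # N(0,1) vs N(2,1) is a pure shift, so W1 = 2
```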

SLIDE 6

Assumptions and goals

We consider Adversarial Bayesian Optimization:

  • no additional restrictions on distribution shapes;
  • there is a configuration of the generator that perfectly matches ’real’ data.

Our primary concern is time complexity:

  • sampling from the target event generator is expensive;
  • the number of generator calls dominates the overall complexity;
  • hence, we minimize the number of event generator calls.

SLIDE 7

Adversarial Bayesian Optimization

SLIDE 8

Adversarial Objective

Jensen-Shannon distance:

JS(P, Q) = log 2 + (1/2) [ E_{x∼P} log( P(x) / (P(x) + Q(x)) ) + E_{x∼Q} log( Q(x) / (P(x) + Q(x)) ) ] = log 2 − min_f cross-entropy(f, P, Q)

  • the Jensen-Shannon distance can be approximated by a classifier.
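A minimal sketch of this approximation, assuming a held-out split and a logistic-regression discriminator (both our choices; the slides’ experiments use XGBoost):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

def pseudo_js(x_p, x_q, clf):
    # Train a classifier to separate samples of P (label 1) from samples
    # of Q (label 0); log 2 minus its held-out cross-entropy approximates
    # the JS distance between P and Q.
    X = np.concatenate([x_p, x_q])
    y = np.concatenate([np.ones(len(x_p)), np.zeros(len(x_q))])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
    clf.fit(X_tr, y_tr)
    return np.log(2) - log_loss(y_te, clf.predict_proba(X_te)[:, 1])

rng = np.random.default_rng(0)
same = pseudo_js(rng.normal(0, 1, (2000, 1)), rng.normal(0, 1, (2000, 1)),
                 LogisticRegression())
apart = pseudo_js(rng.normal(0, 1, (2000, 1)), rng.normal(3, 1, (2000, 1)),
                  LogisticRegression())
# identical distributions give a value near zero;
# well-separated distributions approach log 2
```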

SLIDE 9

Multi-Stage Adversarial Bayesian Optimization

  • a sequence of classifier families of increasing power:

F_1 ⊆ F_2 ⊆ · · · ⊆ F_m = F

  • each family F_i is associated with a ’pseudo’ JS distance:

pJS_i(P, Q) = log 2 − min_{f∈F_i} cross-entropy(f, P, Q)

pJS_1(P, Q) ≤ pJS_2(P, Q) ≤ · · · ≤ pJS_m(P, Q) = JS(P, Q);

pJS_i(P, Q) ≥ 0 ⟹ pJS_{i+1}(P, Q) ≥ 0

SLIDE 10

Multi-Stage Adversarial Bayesian Optimization

pJS_i(P, Q) ≥ 0 ⟹ pJS_{i+1}(P, Q) ≥ 0

  • ’weak’ classifiers tend to require fewer samples;
  • ’weak’ classifiers can be used to rapidly explore the search space;
  • their results become constraints for more powerful classifiers.

SLIDE 11

Multi-Stage Adversarial Bayesian Optimization

1: model_1 = unconstrained BO on pJS_1(data, generator_θ)
2: for k = 2, . . . , m do
3:     constraint_k(θ) = P( pJS_{k−1} ≤ 0 | θ, model_{k−1} )
4:     model_k = BO on pJS_k(data, ·) s.t. constraint_j(θ) > τ, j = 0, . . . , k − 1
5: end for
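The staged loop above can be sketched with a grid search standing in for Bayesian Optimization, a toy Gaussian ’generator’, and linear/boosted classifiers as the weak and strong families (all our choices; the threshold and sample sizes are illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

def generator(theta, n=1000):
    # Toy stand-in for an expensive event generator: N(theta, 1) samples.
    return rng.normal(theta, 1.0, (n, 1))

real = generator(0.0)  # 'real' data: the generator at the true parameter

def pjs(x_gen, x_real, clf):
    # 'Pseudo' JS: log 2 minus held-out cross-entropy of a classifier
    # trained to separate generated events from real ones.
    X = np.concatenate([x_gen, x_real])
    y = np.concatenate([np.ones(len(x_gen)), np.zeros(len(x_real))])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
    clf.fit(X_tr, y_tr)
    return np.log(2) - log_loss(y_te, clf.predict_proba(X_te)[:, 1])

# Stage 1: a weak (linear) classifier screens a grid of candidate thetas.
thetas = np.linspace(-3, 3, 13)
weak = np.array([pjs(generator(t), real, LogisticRegression()) for t in thetas])

# Keep only candidates the weak classifier cannot distinguish from real data.
tau = 0.05
survivors = thetas[weak < tau]

# Stage 2: a stronger classifier is evaluated only on the survivors.
strong = [pjs(generator(t), real,
              GradientBoostingClassifier(n_estimators=20, max_depth=3))
          for t in survivors]
best = survivors[int(np.argmin(strong))]
```

The weak stage eliminates most of the grid cheaply, so the expensive strong classifier runs only on the few candidates near the true parameter.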

SLIDE 12

Experiments

SLIDE 13

Experiment

We follow the problem statement of Ilten P., Williams M., Yang Y., “Event generator tuning using Bayesian optimization”, Journal of Instrumentation, 2017;12(04):P04028.

  • e+e− events modeled by Pythia 8;
  • values of the Monash tune used as parameters of the ’real’ distribution;
  • 2-stage Adversarial Bayesian Optimization;
  • the number of samples required to avoid overfitting of the classifier is measured.

SLIDE 14

Experiment 1

Target generator options:

  • alphaSvalue.

SLIDE 15

Experiment 1: stage 1

SLIDE 16

Experiment 1: stage 1

SLIDE 17

Experiment 1: stage 2

SLIDE 18

Experiment 1: stage 2

SLIDE 19

Experiment 1: single stage

SLIDE 20

Experiment 1: results

SLIDE 21

Experiment 2

Target generator options:

  • bLund;
  • sigma;
  • aExtraSQuark;
  • aExtraDiQuark;
  • rFactC;
  • rFactB.

Second group of variables from Ilten P., Williams M., Yang Y., “Event generator tuning using Bayesian optimization”, Journal of Instrumentation, 2017;12(04):P04028.

SLIDE 22

Experiment 2: results

SLIDE 23

Summary

SLIDE 24

Summary

  • Adversarial Bayesian Optimization is a promising tool for tuning event generators;
  • Multi-stage Adversarial Bayesian Optimization uses ’weak’ classifiers to incrementally constrain the search space:
    • rapid exploration of the search space in the early stages;
    • late stages search for a solution only among promising candidates;
    • reduced overall cost of optimization.

SLIDE 25

Backup

SLIDE 26

Adversarial Bayesian Optimization

1: initialize Bayesian Optimization
2: while not bored do
3:     θ ← askBO()
4:     Xθ_train, Xθ_test ← sample(θ)
5:     f ← train discriminator on Xθ_train and X_real_train
6:     L ← −(1 / 2m) [ Σ_{i=1}^{m} log f(Xθ_test,i) + Σ_{i=1}^{m} log(1 − f(X_real_test,i)) ]
7:     tellBO(θ, log 2 − L)

8: end while

SLIDE 27

Possible Caveats

  • constraints were observed by the authors to interfere with the GP model;
  • without the assumption ∃θ : JS(generator(θ), real) = 0:
    • the method would likely still work (with modified constraints) if the classifiers come from the same family of algorithms;
    • it is possible that BO with a weak classifier carries no information for BO with a strong classifier.

SLIDE 28

Expected Improvement with Constraints

Problem: f(x) → min, s.t. g(x) ≥ 0.

  • improvement is impossible if constraints are violated:

CEI(x) = P(g(x) ≥ 0) · EI(x) + P(g(x) < 0) · 0

  • constraints in our case: model_i(x) ≤ 0.

Gelbart, M.A., Snoek, J. and Adams, R.P., 2014. Bayesian optimization with unknown constraints. arXiv preprint arXiv:1403.5607.
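The weighting can be sketched in closed form under Gaussian GP posteriors for the objective and the constraint (the helper names and toy numbers are ours):

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    # Standard normal density.
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def expected_improvement(mu, sigma, best):
    # EI for minimization, given a GP posterior N(mu, sigma^2) at a point
    # and the best objective value observed so far.
    z = (best - mu) / sigma
    return (best - mu) * norm_cdf(z) + sigma * norm_pdf(z)

def constrained_ei(mu, sigma, best, mu_g, sigma_g):
    # Weight EI by the probability that g(x) >= 0 holds, where the
    # constraint value has GP posterior g(x) ~ N(mu_g, sigma_g^2).
    return norm_cdf(mu_g / sigma_g) * expected_improvement(mu, sigma, best)

# A point whose constraint is almost surely violated gets ~zero acquisition:
likely_ok = constrained_ei(0.0, 1.0, best=0.5, mu_g=2.0, sigma_g=1.0)
likely_bad = constrained_ei(0.0, 1.0, best=0.5, mu_g=-2.0, sigma_g=1.0)
```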

SLIDE 29

Technical details

  • the training set is incrementally extended until over-fitting becomes insignificant;
  • 2-stage ABO:
    • stage 1: XGBoost with 1 tree, max depth = 3;
    • stage 2: XGBoost with 20 trees, max depth = 6.

SLIDE 30

Experiment 1
