Some Comments on GD and IGD and Relations to the Hausdorff Distance

SLIDE 1

  • O. Schütze

Some Comments on GD and IGD and Relations to the Hausdorff Distance

  • O. Schütze, X. Esquivel, A. Lara, C. Coello

CINVESTAV-IPN Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional. Mexico City, Mexico

SLIDE 2

Outline

Introduction and Background

  • Trade-off in the design of indicators for the evaluation of MOEAs
  • Metric / Hausdorff distance

Investigation of the Indicators

  • GD
  • IGD

A ‘New’ Indicator

  • Metric properties
  • Extension to continuous models
SLIDE 3

Multi-Objective Optimization

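$$
\min_{x \in Q} F(x), \qquad F = (f_1, \dots, f_k), \qquad f_i : Q \subset \mathbb{R}^n \to \mathbb{R}, \quad i = 1, \dots, k \qquad \text{(MOP)}
$$

Multi-Objective Optimization Problem (MOP):

  • P_Q = set of optimal solutions (Pareto set)
  • F(P_Q) = the image of P_Q (Pareto front)

[Figure: the Pareto set in decision space (x) and the Pareto front in objective space (f1, f2).]

First we consider discrete (or discretized) models, i.e., |Q| < ∞.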

SLIDE 4

Outliers in Stochastic Search Algorithms

Example: Consider the MOP

$$
F : [0,1]^n \to \mathbb{R}^k, \qquad F(x) = \begin{pmatrix} x_1 \\ g(x) \end{pmatrix},
$$

where g: [0,1]^n → R^(k-1) (Okabe, ZDT). Assume a point x = (ε, z), z ∈ [0,1]^(n-1), is a member of the archive/population. Further, assume that new candidate solutions are chosen uniformly at random from the domain. Then the probability of finding a point that dominates x is at most ε (→ objective 1: a dominating point y needs y_1 ≤ ε). Nevertheless, the distance of x to P_Q can be ‘large’.

[Figure: the point (ε, x2).]
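The bound can be checked empirically. The following is a minimal Monte Carlo sketch, assuming ZDT1 as the concrete instance of the problem class; the function names, n = 5, ε = 0.05, z = (0.5, …, 0.5), and the sample size are illustrative choices that do not come from the slides.

```python
import math
import random

def zdt1(x):
    """ZDT1 objectives; the first objective is x[0], as in F(x) = (x1, g(x))."""
    f1 = x[0]
    s = 1.0 + 9.0 * sum(x[1:]) / (len(x) - 1)
    f2 = s * (1.0 - math.sqrt(f1 / s))
    return (f1, f2)

def dominates(fa, fb):
    """True if fa is no worse than fb everywhere and strictly better somewhere."""
    return (all(a <= b for a, b in zip(fa, fb))
            and any(a < b for a, b in zip(fa, fb)))

def estimate(eps, z, samples=100_000, seed=0):
    """Return (fraction of samples dominating x, fraction with y_1 <= eps)."""
    rng = random.Random(seed)
    x = [eps] + list(z)
    fx = zdt1(x)
    hits = 0
    small_first = 0
    for _ in range(samples):
        y = [rng.random() for _ in x]      # uniform candidate from [0,1]^n
        small_first += y[0] <= eps
        hits += dominates(zdt1(y), fx)
    return hits / samples, small_first / samples

p_dom, p_first = estimate(eps=0.05, z=[0.5] * 4)
print("estimated P(dominate x):", p_dom)
print("estimated P(y_1 <= eps):", p_first)
```

Every dominating sample must have y_1 ≤ ε, so the first estimate can never exceed the second. In this assumed setup the point x lies at Euclidean distance 1 from the ZDT1 Pareto set (where x_2 = … = x_n = 0), so it is a far-away outlier that is nevertheless hard to replace.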

SLIDE 5

Example

P: hypothetical Pareto front
X1: perfect approximation of P, except for one outlier
X2: none of the elements is ‘near’ P

Question: Which approximation is ‘better’?

Extreme situations:

  • pessimistic view (Hausdorff distance): dH(X1,P)=9, dH(X2,P)=2.83
  • averaged result (Generational Distance): GD(X1,P)=0.81, GD(X2,P)=2.83
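The two views can be reproduced on a small data set. The slides do not give the coordinates of P, X1, and X2, so the sets below are hypothetical, chosen only so that X1 has a single outlier at distance 9 and X2 is shifted by (2, 2); the averaged distance is taken as GD with p = 1, which is also an assumption.

```python
import math

def dist_point(u, A):
    """Distance from point u to the nearest element of the finite set A."""
    return min(math.dist(u, v) for v in A)

def dist_set(B, A):
    """One-sided distance: the largest dist(u, A) over all u in B."""
    return max(dist_point(u, A) for u in B)

def hausdorff(A, B):
    """Pessimistic view: the worse of the two one-sided distances."""
    return max(dist_set(A, B), dist_set(B, A))

def mean_dist(X, Y):
    """Averaged view: mean distance from the elements of X to Y (GD, p = 1)."""
    return sum(dist_point(x, Y) for x in X) / len(X)

# Hypothetical data (not from the slides): 11 points on a linear front.
P = [(i, 10 - i) for i in range(11)]
X1 = [(0, 19)] + P[1:]                  # equals P except for one outlier
X2 = [(a + 2, b + 2) for a, b in P]     # every point shifted away from P

for name, X in (("X1", X1), ("X2", X2)):
    print(name, round(hausdorff(X, P), 2), round(mean_dist(X, P), 2))
```

With these assumed coordinates the Hausdorff distances are 9 and 2.83, while the averaged distances are about 0.82 and 2.83, so the two views rank X1 and X2 in opposite order.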
SLIDE 6

Outlier Trade Off

Trade-off for the indicator D when measuring results of MOEAs (the design of MOEAs is influenced by D):

Use of a Metric

  • + Greedy search = shortest path to the set of interest (→ triangle inequality)
  • - Penalization of single outliers of the candidate set

Averaging the Results

  • + Single outliers do not have a major influence on the result
  • - The greedy search is not necessarily the shortest path to the set of interest

SLIDE 7

Metric

Definition: Suppose X is a set and d: X×X → R is a function. Then d is called a metric on X if, and only if, for each a, b, c ∈ X:

$$
\begin{aligned}
&\text{(a)} \quad d(a,b) \ge 0 \ \text{ and } \ d(a,b) = 0 \Leftrightarrow a = b && \text{(Positive Property)} \\
&\text{(b)} \quad d(a,b) = d(b,a) && \text{(Symmetric Property)} \\
&\text{(c)} \quad d(a,c) \le d(a,b) + d(b,c) && \text{(Triangle Inequality)}
\end{aligned}
$$

Variants:

  • d is called a semi-metric if properties (a) and (b) are satisfied
  • A pseudo-metric is a semi-metric that satisfies the relaxed triangle inequality:

$$
d(a,c) \le \sigma \, \bigl( d(a,b) + d(b,c) \bigr), \qquad \sigma \ge 1
$$

SLIDE 8

Hausdorff Distance

Definition: Let u, v ∈ R^n, A, B ⊂ R^n, and ||·|| be a vector norm. The Hausdorff distance dH is defined as follows:

$$
\begin{aligned}
&\text{(a)} \quad dist(u, A) := \inf_{v \in A} \| u - v \| \\
&\text{(b)} \quad dist(B, A) := \sup_{u \in B} dist(u, A) \\
&\text{(c)} \quad d_H(A, B) := \max\bigl( dist(A, B), \, dist(B, A) \bigr)
\end{aligned}
$$

Remarks:

(i) dist(A,B) is not symmetric: if B is a proper subset of A, then dist(B,A)=0 and dist(A,B)>0.
(ii) dH is a metric on the set of discrete sets. It can also be used for continuous spaces. In that case, dH(A,B)=0 ⇔ clos(A)=clos(B).
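For finite sets the three definitions translate directly into code, with inf and sup becoming min and max. The sketch below assumes the Euclidean norm and uses illustrative function names; the two small sets are made up to show the asymmetry from remark (i).

```python
import math

def dist_point_set(u, A):
    """(a) dist(u, A): smallest Euclidean distance from u to an element of A."""
    return min(math.dist(u, v) for v in A)

def dist_set_set(B, A):
    """(b) dist(B, A): largest dist(u, A) over all u in B."""
    return max(dist_point_set(u, A) for u in B)

def hausdorff(A, B):
    """(c) d_H(A, B): maximum of the two one-sided distances."""
    return max(dist_set_set(A, B), dist_set_set(B, A))

A = [(0.0, 0.0), (3.0, 4.0)]
B = [(0.0, 0.0)]               # B is a proper subset of A

print(dist_set_set(B, A))      # 0.0
print(dist_set_set(A, B))      # 5.0
print(hausdorff(A, B))         # 5.0
```

The one-sided distance from the subset B to A vanishes, while the reverse direction picks up the element of A that B misses.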

SLIDE 9

Discussion of GD (1)

GD as proposed by Van Veldhuizen, applied to general finite sets X, Y ⊂ R^k using dist:

$$
GD(X, Y) = \frac{1}{|X|} \left( \sum_{i=1}^{|X|} dist(x_i, Y)^p \right)^{1/p}
$$

Metric properties:

  • positive property: NO. It holds that GD(X,Y)=0 ⇔ X ⊂ Y (X can be a proper subset of Y (*))
  • symmetric property: NO. In case (*), GD(X,Y)=0 but GD(Y,X)>0
  • triangle inequality: NO (next slide)
SLIDE 10

Discussion of GD (2)

1.) Normalization strategy of GD: Let A_1 = {a} with dist(F(a), F(P_Q)) = 1, i.e., GD(F(A_1), F(P_Q)) = 1. Now let A_n be the multiset consisting of n copies of a, A_n = {a,…,a}. Then

$$
GD(F(A_n), F(P_Q)) = \frac{\| (1, \dots, 1)^T \|_p}{n} = \frac{\sqrt[p]{n}}{n} \to 0 \quad \text{for } n \to \infty \ (p > 1).
$$

2.) Investigate the (relaxed) triangle inequality: Let X, Z ⊂ R^k s.t. GD(X,Z) > 0. Let rhs(Y) := GD(X,Y) + GD(Y,Z) and define Y_n := X ∪ {y_1, y_2, …, y_n} such that Σ_i dist(y_i, Z) < ∞. Then GD(X,Y_n) = 0 and GD(Y_n,Z) → 0 for n → ∞.

⇒ GD does not satisfy any relaxed triangle inequality, since rhs(Y_n) → 0 while GD(X,Z) > 0.

Note: for p>1, any set {y_1,…,y_n} ⊂ F(Q) (if compact) can be taken!

SLIDE 11

New Variant of GD

An obvious modification: take the power mean of the distances:

$$
GD_p(X, Y) = \left( \frac{1}{|X|} \sum_{i=1}^{|X|} dist(x_i, Y)^p \right)^{1/p} = \frac{1}{\sqrt[p]{|X|}} \left( \sum_{i=1}^{|X|} dist(x_i, Y)^p \right)^{1/p}
$$

  • same (poor) metric properties, but
  • better averaging: GD_p(F(A_n), F(P_Q)) = 1 for all n ∈ N
  • (needed for the upcoming indicator)
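The effect of the two normalizations can be seen numerically on the multiset of n copies of a point at distance 1 from the front. This is a minimal sketch; both functions take the list of distances dist(x_i, Y) directly, and their names are illustrative.

```python
def gd_original(dists, p):
    """GD with the factor 1/|X| outside the root: (1/|X|) * (sum d_i^p)^(1/p)."""
    return sum(d ** p for d in dists) ** (1.0 / p) / len(dists)

def gd_power_mean(dists, p):
    """GD_p as a power mean: ((1/|X|) * sum d_i^p)^(1/p)."""
    return (sum(d ** p for d in dists) / len(dists)) ** (1.0 / p)

p = 2
for n in (1, 10, 100, 10_000):
    dists = [1.0] * n      # n copies of one point at distance 1 from the front
    print(n, gd_original(dists, p), gd_power_mean(dists, p))
```

For p = 2 the original GD shrinks like 1/sqrt(n) although every element is equally far from the front, whereas the power-mean variant stays at 1.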
SLIDE 12

Discussion IGD

IGD as proposed by Coello & Cruz, applied to general finite sets X, Y ⊂ R^k using dist:

$$
IGD(X, Y) = \frac{1}{|Y|} \left( \sum_{i=1}^{|Y|} dist(y_i, X)^p \right)^{1/p}
$$

  • same metric properties as GD since IGD(A,B) = GD(B,A)
  • same modification: take the power mean of the distances:

$$
IGD_p(X, Y) = \left( \frac{1}{|Y|} \sum_{i=1}^{|Y|} dist(y_i, X)^p \right)^{1/p} = \frac{1}{\sqrt[p]{|Y|}} \left( \sum_{i=1}^{|Y|} dist(y_i, X)^p \right)^{1/p}
$$

SLIDE 13

A “New” Indicator

Observation: GD(X,Y) is an ‘averaged version’ of dist(X,Y), and the same holds for IGD → combine GD and IGD as for dH:

$$
\Delta_p(X, Y) = \max\bigl( GD_p(X, Y), \, IGD_p(X, Y) \bigr)
$$

Proposition 1: ∆p is a semi-metric for 1≤p<∞ and a metric for p=∞.

Remark: for p=∞ the indicator ∆p coincides with dH.

Proposition 2: Let |X|, |Y|, |Z| ≤ N. Then

$$
\Delta_p(X, Z) \le \sqrt[p]{N} \, \bigl( \Delta_p(X, Y) + \Delta_p(Y, Z) \bigr)
$$
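The indicator is short to implement for finite sets. The sketch below assumes the Euclidean norm and the power-mean variants GD_p and IGD_p, with p = ∞ handled as the maximum; function names and the two example sets are illustrative.

```python
import math

def _dist(u, A):
    """Distance from point u to the nearest element of the finite set A."""
    return min(math.dist(u, v) for v in A)

def gd_p(X, Y, p):
    """Power mean of the distances dist(x, Y), x in X (maximum for p = inf)."""
    if math.isinf(p):
        return max(_dist(x, Y) for x in X)
    return (sum(_dist(x, Y) ** p for x in X) / len(X)) ** (1.0 / p)

def igd_p(X, Y, p):
    """IGD_p(X, Y) = GD_p(Y, X)."""
    return gd_p(Y, X, p)

def delta_p(X, Y, p):
    """Delta_p(X, Y) = max(GD_p(X, Y), IGD_p(X, Y))."""
    return max(gd_p(X, Y, p), igd_p(X, Y, p))

X = [(0.0, 0.0), (1.0, 0.0)]
Y = [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)]   # X is a proper subset of Y

for p in (1, 2, math.inf):
    print(p, gd_p(X, Y, p), delta_p(X, Y, p), delta_p(Y, X, p))
```

Here GD_p(X, Y) is 0 because X ⊂ Y, yet ∆p is positive and symmetric, and it grows with p from the plain average towards the Hausdorff distance at p = ∞.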

SLIDE 14

Interpretation of p for the Trade Off

Table: Percentage of triangle inequality violations (σ=1) for different values of p. For this, 100,000 different sets A, B, C with |A|=|B|=|C|=N and k=2 were taken, each entry randomly chosen within [0,1].

         p=1     p=2     p=5     p=10    p=∞
N=2      0.541   0.15    0.026   0.008
N=4      0.249   0.06    0.019   0.009
N=6      0.105   0.033   0.008   0.003
N=10     0.02    0.004   0.002   0.001
N=100

The larger the value of p, the ‘nearer’ ∆p is to a metric (but: how to choose p? What is the influence of N?)
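An experiment of this kind can be sketched as follows. This is not the authors' code: the way a violation is counted (only the inequality ∆p(A,C) ≤ ∆p(A,B) + ∆p(B,C) per random triple), the small floating-point tolerance, the seed, and the reduced number of trials are all assumptions, so the printed rates need not match the table.

```python
import math
import random

def _dist(u, A):
    return min(math.dist(u, v) for v in A)

def _gd_p(X, Y, p):
    if math.isinf(p):
        return max(_dist(x, Y) for x in X)
    return (sum(_dist(x, Y) ** p for x in X) / len(X)) ** (1.0 / p)

def delta_p(X, Y, p):
    return max(_gd_p(X, Y, p), _gd_p(Y, X, p))

def random_set(rng, n, k=2):
    """n points with coordinates drawn uniformly from [0, 1]."""
    return [tuple(rng.random() for _ in range(k)) for _ in range(n)]

def violation_rate(n, p, trials=2000, seed=0):
    """Fraction of random triples (A, B, C) violating the triangle inequality."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(trials):
        A, B, C = (random_set(rng, n) for _ in range(3))
        lhs = delta_p(A, C, p)
        rhs = delta_p(A, B, p) + delta_p(B, C, p)
        if lhs > rhs + 1e-12:          # tolerance against rounding noise
            violations += 1
    return violations / trials

for p in (1, 2, 5, 10, math.inf):
    print("N=4, p =", p, "->", violation_rate(n=4, p=p))
```

For p = ∞ the indicator is the Hausdorff distance, so no violations can occur; for finite p the rate depends on both p and N.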

SLIDE 15

Example

P: hypothetical Pareto front
X1: perfect approximation of P, except for one outlier
X2: none of the elements is ‘near’ P

Question: Which approximation is ‘better’?

             p=1      p=2     p=5     p=10    p=∞
∆p(P,X1)     0.8182   2.714   4.047   5.571   9
∆p(P,X2)     2.828    2.828   2.828   2.828   2.828

SLIDE 16

Extension to Continuous Models

Now consider continuous models.

In general: k objectives → P_Q is (k-1)-dimensional.

  • GD_p: A finite, P_Q compact → GD turns into a continuous SOP
  • IGD_p: P_Q continuous → the power mean of IGD_p turns into an integral

Example: k=2, F(P_Q) connected and parametrized by

$$
\gamma : [m_1, M_1] \to \mathbb{R}^2,
$$

then

$$
IGD_p(F(A), F(P_Q)) = \left( \frac{1}{M_1 - m_1} \int_{m_1}^{M_1} dist(\gamma(t), F(A))^p \, dt \right)^{1/p}
$$

[Figure: Pareto front in objective space (f1, f2), with the bounds m1, M1 on f1 and m2, M2 on f2.]
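For k = 2 the integral form of IGD_p can be approximated by numerical quadrature. The sketch below uses a plain midpoint rule; the function name, the number of steps, the choice of the ZDT1 front f2 = 1 − sqrt(f1) as the curve γ, and the archive image A are all assumptions made for illustration.

```python
import math

def igd_p_continuous(gamma, m1, M1, A, p, steps=10_000):
    """Midpoint-rule approximation of
    ( 1/(M1 - m1) * integral_{m1}^{M1} dist(gamma(t), A)^p dt )^(1/p)
    for a finite set A of objective vectors."""
    h = (M1 - m1) / steps
    total = 0.0
    for j in range(steps):
        t = m1 + (j + 0.5) * h                         # midpoint of cell j
        d = min(math.dist(gamma(t), a) for a in A)     # dist(gamma(t), A)
        total += d ** p * h
    return (total / (M1 - m1)) ** (1.0 / p)

def zdt1_front(t):
    """Hypothetical parametrization of the front: f2 = 1 - sqrt(f1), f1 in [0, 1]."""
    return (t, 1.0 - math.sqrt(t))

A = [(0.0, 1.0), (0.25, 0.5), (1.0, 0.0)]   # made-up archive image on the front
print(igd_p_continuous(zdt1_front, 0.0, 1.0, A, p=2))
```

Note that the weighting is uniform in the parameter t, as in the formula above, and not in the arc length of the front.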

SLIDE 17

Discretization of F(PQ)

Task: P_Q given analytically; compute an approximation Y of F(P_Q) with dH(Y, F(P_Q)) < δ (a priori defined approximation quality).

For k=2: use continuation-like methods: select the step size t such that ||F(x+tv) − F(x)||∞ ≈ Θδ, where Θ<1 is a safety factor (selection of t based on Lipschitz estimations).

[Figure: two discretizations of the Pareto front (PF) of OKA2 in objective space (f1, f2), for δ=0.01 and δ=0.4.]

SLIDE 18

Numerical Example

NSGA-II applied to ZDT1

[Figure: the populations pop1 to pop7 and the Pareto front in objective space.]

Y = F(P_Q), Y_i = F(pop_i)

  • ∆2(Y1,Y) = 3.03
  • ∆2(Y2,Y) = 2.71
  • ∆2(Y3,Y) = 1.43
  • ∆2(Y4,Y) = 0.77
  • ∆2(Y5,Y) = 0.31
  • ∆2(Y6,Y) = 0.12
  • ∆2(Y7,Y) = 0.007

SLIDE 19

Discussion

Conclusions

  • New indicator ∆p proposed for the evaluation of MOEAs.
  • ∆p is a semi-metric, and a pseudo-metric for bounded archive sizes

  • p can (in principle) be used to handle the ‘outlier trade off’

Open Questions

  • How to choose p?
  • How to measure the distance to a metric?
  • How to adapt the selection mechanisms in order to improve ∆p? (∆p is NOT compliant with the dominance relation!)

SLIDE 20

Thank you for your attention!