
Some Comments on GD and IGD and Relations to the Hausdorff Distance



  1. Some Comments on GD and IGD and Relations to the Hausdorff Distance
     O. Schütze, X. Esquivel, A. Lara, C. Coello
     CINVESTAV-IPN (Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional), Mexico City, Mexico
     O. Schütze

  2. Outline
     Introduction and Background
     • Trade-off for the design of indicators for the evaluation of MOEAs
     • Metric / Hausdorff distance
     Investigation of the Indicators
     • GD
     • IGD
     A 'New' Indicator
     • Metric properties
     • Extension to continuous models

  3. Multi-Objective Optimization
     Multi-Objective Optimization Problem (MOP):
     $$\min_{x \in Q} F(x), \qquad F = (f_1, \dots, f_k), \qquad f_i : Q \subset \mathbb{R}^n \to \mathbb{R}, \quad i = 1, \dots, k$$
     $P_Q$ = set of optimal solutions (Pareto set)
     $F(P_Q)$ = the image of $P_Q$ (Pareto front)
     First we consider discrete (or discretized) models, i.e., $|Q| < \infty$.
     [Figure: Pareto set in decision space $x$ and Pareto front in objective space $(f_1, f_2)$.]

  4. Outliers in Stochastic Search Algorithms
     Example: Consider the MOP
     $$F : [0,1]^n \to \mathbb{R}^k, \qquad F(x) = \begin{pmatrix} x_1 \\ g(x) \end{pmatrix},$$
     where $g : [0,1]^n \to \mathbb{R}^{k-1}$ (→ Okabe, ZDT).
     Assume a point $x = (\varepsilon, z)$, $z \in [0,1]^{n-1}$, is a member of the archive/population. Further, assume that new candidate solutions are chosen uniformly at random from the domain. Then the probability of finding a point that dominates $x$ is less than $\varepsilon$ (→ objective 1). The distance of $x$ to $P_Q$ can be 'large'.
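
The probability bound on this slide can be checked empirically. The following is a minimal sketch, assuming the ZDT1 test function as the concrete instance of $F(x) = (x_1, g(x))$ and the outlier $x = (\varepsilon, 1, \dots, 1)$; the names `zdt1`, `dominates` and `estimate` are illustrative and not taken from the slides.

```python
import random

def zdt1(x):
    """ZDT1 objectives on [0,1]^n: f1 = x_1, f2 = g * (1 - sqrt(x_1 / g))."""
    n = len(x)
    g = 1.0 + 9.0 * sum(x[1:]) / (n - 1)
    return (x[0], g * (1.0 - (x[0] / g) ** 0.5))

def dominates(fa, fb):
    """True if objective vector fa Pareto-dominates fb (minimization)."""
    return all(a <= b for a, b in zip(fa, fb)) and any(a < b for a, b in zip(fa, fb))

def estimate(eps, n=5, samples=20000, seed=0):
    """Return (fraction of samples dominating x, fraction with first coordinate <= eps)."""
    rng = random.Random(seed)
    fx = zdt1([eps] + [1.0] * (n - 1))  # assumed outlier x = (eps, z), z = (1, ..., 1)
    hits = small_f1 = 0
    for _ in range(samples):
        y = [rng.random() for _ in range(n)]
        small_f1 += y[0] <= eps          # necessary condition for dominating x
        hits += dominates(zdt1(y), fx)
    return hits / samples, small_f1 / samples

p_dom, p_f1 = estimate(eps=0.01)
print(f"P(dominates x) ~ {p_dom:.4f}, P(y_1 <= eps) ~ {p_f1:.4f}")
```

Any dominating sample must satisfy $y_1 \le \varepsilon$, so the first estimate can never exceed the second, which is itself close to $\varepsilon$ for uniform sampling.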

  5. Example
     $P$: hypothetical Pareto front
     $X_1$: perfect approximation of $P$, except for one outlier
     $X_2$: none of the elements are 'near' to $P$
     Question: Which approximation is 'better'?
     Extreme situations:
     -- pessimistic view (Hausdorff distance): $d_H(X_1,P) = 9$, $d_H(X_2,P) = 2.83$
     -- averaged result (generational distance): $GD(X_1,P) = 0.81$, $GD(X_2,P) = 2.83$

  6. Outlier Trade-Off
     Trade-off for the indicator $D$ when measuring results of MOEAs (the design of MOEAs is influenced by $D$):
     Use of a metric
     + greedy search = shortest path to the set of interest (→ triangle inequality)
     -- penalization of single outliers of the candidate set
     Averaging the results
     + single outliers do not have a major influence on the result
     -- the greedy search is not necessarily the shortest path to the set of interest

  7. Metric
     Definition: Suppose $X$ is a set and $d : X \times X \to \mathbb{R}$ is a function. Then $d$ is called a metric on $X$ if, and only if, for each $a, b, c \in X$:
     (a) $d(a,b) \ge 0$ and $d(a,b) = 0 \Leftrightarrow a = b$ (positive property)
     (b) $d(a,b) = d(b,a)$ (symmetric property)
     (c) $d(a,c) \le d(a,b) + d(b,c)$ (triangle inequality)
     Variants:
     -- $d$ is called a semi-metric if properties (a) and (b) are satisfied
     -- A pseudo-metric is a semi-metric that satisfies the relaxed triangle inequality: $d(a,c) \le \sigma \, (d(a,b) + d(b,c))$, $\sigma \ge 1$

  8. Hausdorff Distance
     Definition: Let $u, v \in \mathbb{R}^n$, $A, B \subset \mathbb{R}^n$, and $\|\cdot\|$ be a vector norm. The Hausdorff distance $d_H$ is defined as follows:
     (a) $\mathrm{dist}(u, A) := \inf_{v \in A} \|u - v\|$
     (b) $\mathrm{dist}(B, A) := \sup_{u \in B} \mathrm{dist}(u, A)$
     (c) $d_H(A, B) := \max(\mathrm{dist}(A, B), \mathrm{dist}(B, A))$
     Remarks:
     (i) $\mathrm{dist}(A,B)$ is not symmetric: if $B$ is a proper subset of $A$, then $\mathrm{dist}(B,A) = 0$ and $\mathrm{dist}(A,B) > 0$.
     (ii) $d_H$ is a metric on the set of discrete sets. It can also be used for continuous spaces; in that case $d_H(A,B) = 0 \Leftrightarrow \mathrm{clos}(A) = \mathrm{clos}(B)$.
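
For finite sets the three definitions translate directly into code. This is a minimal sketch assuming the Euclidean norm and finite point lists (so inf and sup become min and max); the function names and the example sets are illustrative.

```python
import math

def dist_point(u, A):
    """dist(u, A): distance from point u to its nearest point in the finite set A."""
    return min(math.dist(u, v) for v in A)

def dist_set(B, A):
    """dist(B, A): largest distance from a point of B to the set A (one-sided)."""
    return max(dist_point(u, A) for u in B)

def hausdorff(A, B):
    """d_H(A, B): maximum of the two one-sided distances."""
    return max(dist_set(A, B), dist_set(B, A))

# B is a proper subset of A, as in remark (i).
A = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0)]

print(dist_set(B, A))   # every point of B lies in A
print(dist_set(A, B))   # the point (4, 0) is far from B
print(hausdorff(A, B))
```

In this example the one-sided distance from `B` to `A` vanishes while the one from `A` to `B` does not, which is the asymmetry described in remark (i).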

  9. Discussion of GD (1)
     GD as proposed by Van Veldhuizen, applied on general finite sets $X, Y \subset \mathbb{R}^k$ using dist:
     $$GD(X,Y) = \frac{1}{|X|} \left( \sum_{i=1}^{|X|} \mathrm{dist}(x_i, Y)^p \right)^{1/p}$$
     Metric properties:
     -- positive property: NO, since $GD(X,Y) = 0 \Leftrightarrow X \subset Y$ ($X$ can be a proper subset of $Y$ (*))
     -- symmetric property: NO. In case (*), $GD(X,Y) = 0$ but $GD(Y,X) > 0$
     -- triangle inequality: NO (→ next slide)
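
The failure of the positive and symmetric properties can be reproduced on a tiny example. This is a minimal sketch of GD as written on this slide (factor $1/|X|$ outside the root), assuming the Euclidean norm; the names `gd` and `dist_point` and the sets below are illustrative.

```python
import math

def dist_point(u, A):
    """Distance from point u to its nearest point in the finite set A."""
    return min(math.dist(u, v) for v in A)

def gd(X, Y, p=2):
    """Generational distance with the 1/|X| factor outside the p-th root."""
    total = sum(dist_point(x, Y) ** p for x in X)
    return total ** (1.0 / p) / len(X)

# X is a proper subset of Y, i.e. case (*).
X = [(0.0, 0.0)]
Y = [(0.0, 0.0), (3.0, 4.0)]

print(gd(X, Y))  # zero although X != Y
print(gd(Y, X))  # positive, so GD is not symmetric
```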

  10. Discussion of GD (2)
      1.) Normalization strategy of GD: Let $A_1 = \{a\}$ with $\mathrm{dist}(F(a), F(P_Q)) = 1$, i.e., $GD(F(A_1), F(P_Q)) = 1$. Now let $A_n$ be the multiset consisting of $n$ copies of $a$, $A_n = \{a, \dots, a\}$. Then, for $p > 1$,
      $$GD(F(A_n), F(P_Q)) = \frac{\|(1, \dots, 1)^T\|_p}{n} = \frac{\sqrt[p]{n}}{n} \to 0 \quad (n \to \infty)$$
      2.) Investigate the (relaxed) triangle inequality: let $X, Z \subset \mathbb{R}^k$ s.t. $GD(X,Z) > 0$. Let $\mathrm{rhs}(Y) := GD(X,Y) + GD(Y,Z)$ and define $Y_n := X \cup \{y_1, y_2, \dots, y_n\}$ such that $\sum_i \mathrm{dist}(y_i, Z) < \infty$. Then $GD(X, Y_n) = 0$ and $GD(Y_n, Z) \to 0$ for $n \to \infty$.
      ⇒ GD does not satisfy any relaxed triangle inequality, since $\mathrm{rhs}(Y_n) \to 0$ while $GD(X,Z) > 0$ stays fixed.
      Note: for $p > 1$, any set $\{y_1, \dots, y_n\} \subset F(Q)$ (if compact) can be taken!

  11. New Variant of GD
      Natural modification: take the power mean of the distances:
      $$GD_p(X,Y) = \left( \frac{1}{|X|} \sum_{i=1}^{|X|} \mathrm{dist}(x_i, Y)^p \right)^{1/p} = \frac{1}{\sqrt[p]{|X|}} \left( \sum_{i=1}^{|X|} \mathrm{dist}(x_i, Y)^p \right)^{1/p}$$
      -- same (poor) metric properties, but
      -- better averaging: $GD_p(F(A_n), F(P_Q)) = 1$ for all $n \in \mathbb{N}$
      -- (needed for the upcoming indicator)
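
The difference between the two normalizations shows up directly on the multiset example with $n$ copies of a point at distance 1. This is a minimal sketch assuming the Euclidean norm; the reference set `P` and the point `a` are hypothetical stand-ins for $F(P_Q)$ and $F(a)$.

```python
import math

def dist_point(u, A):
    """Distance from point u to its nearest point in the finite set A."""
    return min(math.dist(u, v) for v in A)

def gd(X, Y, p=2):
    """Original GD: 1/|X| outside the p-th root."""
    return sum(dist_point(x, Y) ** p for x in X) ** (1.0 / p) / len(X)

def gd_p(X, Y, p=2):
    """Modified GD_p: power mean of the distances (1/|X| inside the root)."""
    return (sum(dist_point(x, Y) ** p for x in X) / len(X)) ** (1.0 / p)

P = [(0.0, 0.0)]   # hypothetical one-point reference front
a = (1.0, 0.0)     # image point at distance 1 from P

for n in (1, 4, 100):
    A_n = [a] * n  # multiset of n copies of a
    print(n, gd(A_n, P), gd_p(A_n, P))
```

With `p = 2`, the original GD shrinks like $\sqrt{n}/n$ as copies are added, while the power-mean variant stays at 1.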

  12. Discussion of IGD
      IGD as proposed by Coello & Cruz, applied on general finite sets $X, Y \subset \mathbb{R}^k$ using dist:
      $$IGD(X,Y) = \frac{1}{|Y|} \left( \sum_{i=1}^{|Y|} \mathrm{dist}(y_i, X)^p \right)^{1/p}$$
      -- same metric properties as GD, since $IGD(A,B) = GD(B,A)$
      -- same modification: take the power mean of the distances:
      $$IGD_p(X,Y) = \left( \frac{1}{|Y|} \sum_{i=1}^{|Y|} \mathrm{dist}(y_i, X)^p \right)^{1/p} = \frac{1}{\sqrt[p]{|Y|}} \left( \sum_{i=1}^{|Y|} \mathrm{dist}(y_i, X)^p \right)^{1/p}$$

  13. A "New" Indicator
      Observation: $GD(X,Y)$ is an 'averaged version' of $\mathrm{dist}(X,Y)$, and the same holds for IGD
      ⇒ combine $GD_p$ and $IGD_p$ as for $d_H$:
      $$\Delta_p(X,Y) = \max\left( GD_p(X,Y), \, IGD_p(X,Y) \right)$$
      Proposition 1: $\Delta_p$ is a semi-metric for $1 \le p < \infty$ and a metric for $p = \infty$.
      Remark: for $p = \infty$ the indicator $\Delta_p$ coincides with $d_H$.
      Proposition 2: let $|X|, |Y|, |Z| \le N$, then
      $$\Delta_p(X,Z) \le \sqrt[p]{N} \left( \Delta_p(X,Y) + \Delta_p(Y,Z) \right)$$
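
A compact way to see how $p$ interpolates between averaging and the Hausdorff distance is to compute $\Delta_p$ for several values of $p$ on one example. This is a minimal sketch assuming the Euclidean norm and finite sets; the sets `P` and `X` are hypothetical (a three-point front and an approximation with one outlier) and are not the data behind the slides.

```python
import math

def dist_point(u, A):
    """Distance from point u to its nearest point in the finite set A."""
    return min(math.dist(u, v) for v in A)

def power_mean(values, p):
    """Power mean of nonnegative values; p = inf gives the maximum."""
    if math.isinf(p):
        return max(values)
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)

def gd_p(X, Y, p):
    return power_mean([dist_point(x, Y) for x in X], p)

def igd_p(X, Y, p):
    return gd_p(Y, X, p)  # IGD_p(X, Y) = GD_p(Y, X)

def delta_p(X, Y, p):
    return max(gd_p(X, Y, p), igd_p(X, Y, p))

P = [(0.0, 2.0), (1.0, 1.0), (2.0, 0.0)]   # hypothetical front
X = P + [(5.0, 4.0)]                        # same points plus one outlier

for p in (1, 2, 5, 10, math.inf):
    print(p, delta_p(X, P, p))
```

Here the outlier lies at distance 5 from `P`, so the value grows from 1.25 at `p = 1` towards the Hausdorff distance 5 at `p = inf`.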

  14. Interpretation of p for the Trade-Off
      Table: Percentage of triangle inequality violations ($\sigma = 1$) for different values of $p$. We have taken 100,000 different sets $A, B, C$ with $|A| = |B| = |C| = N$, $k = 2$, each entry randomly chosen within $[0,1]$.

      |       | p=1   | p=2   | p=5   | p=10  | p=∞ |
      |-------|-------|-------|-------|-------|-----|
      | N=2   | 0.541 | 0.15  | 0.026 | 0.008 | 0   |
      | N=4   | 0.249 | 0.06  | 0.019 | 0.009 | 0   |
      | N=6   | 0.105 | 0.033 | 0.008 | 0.003 | 0   |
      | N=10  | 0.02  | 0.004 | 0.002 | 0.001 | 0   |
      | N=100 | 0     | 0     | 0     | 0     | 0   |

      ⇒ The larger the value of $p$, the 'nearer' $\Delta_p$ is to a metric (but: how to choose $p$? what is the influence of $N$?)
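
An experiment of this kind can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' setup: it uses the Euclidean norm, far fewer trials than 100,000, a fixed seed, and it tests only the single ordering $\Delta_p(A,C) \le \Delta_p(A,B) + \Delta_p(B,C)$ per triple, so its rates need not match the table.

```python
import math
import random

def dist_point(u, A):
    return min(math.dist(u, v) for v in A)

def power_mean(values, p):
    if math.isinf(p):
        return max(values)
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)

def delta_p(X, Y, p):
    gd = power_mean([dist_point(x, Y) for x in X], p)
    igd = power_mean([dist_point(y, X) for y in Y], p)
    return max(gd, igd)

def violation_rate(p, N, trials=1000, k=2, seed=0):
    """Fraction of random triples (A, B, C) violating the triangle inequality."""
    rng = random.Random(seed)

    def rand_set():
        return [tuple(rng.random() for _ in range(k)) for _ in range(N)]

    violations = 0
    for _ in range(trials):
        A, B, C = rand_set(), rand_set(), rand_set()
        # Small tolerance so rounding errors are not counted as violations.
        if delta_p(A, C, p) > delta_p(A, B, p) + delta_p(B, C, p) + 1e-12:
            violations += 1
    return violations / trials

for p in (1, 2, 5, 10, math.inf):
    print(p, violation_rate(p, N=4))
```

For `p = inf` the indicator is the Hausdorff distance, which is a metric, so no violations are expected in that case.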

  15. Example
      $P$: hypothetical Pareto front
      $X_1$: perfect approximation of $P$, except for one outlier
      $X_2$: none of the elements are 'near' to $P$
      Question: Which approximation is 'better'?

      |                   | p=1    | p=2   | p=5   | p=10  | p=∞   |
      |-------------------|--------|-------|-------|-------|-------|
      | $\Delta_p(P,X_1)$ | 0.8182 | 2.714 | 4.047 | 5.571 | 9     |
      | $\Delta_p(P,X_2)$ | 2.828  | 2.828 | 2.828 | 2.828 | 2.828 |

  16. Extension to Continuous Models
      Now consider continuous models.
      In general: $k$ objectives ⇒ $P_Q$ is $(k-1)$-dimensional.
      $GD_p$: $A$ finite, $P_Q$ compact ⇒ GD turns into a continuous SOP.
      $IGD_p$: $P_Q$ continuous ⇒ the power mean of $IGD_p$ turns into an integral.
      Example: $k = 2$, $F(P_Q)$ connected with parametrization $\gamma : [m_1, M_1] \to \mathbb{R}^2$, then
      $$IGD_p(F(A), F(P_Q)) = \left( \frac{1}{M_1 - m_1} \int_{m_1}^{M_1} \mathrm{dist}(\gamma(t), F(A))^p \, dt \right)^{1/p}$$
      [Figure: Pareto front in objective space $(f_1, f_2)$ with bounds $m_1, M_1$ on $f_1$ and $m_2, M_2$ on $f_2$.]
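
The integral form of $IGD_p$ can be approximated by numerical quadrature. This is a minimal sketch assuming the Euclidean norm, a composite midpoint rule, and the ZDT1 front $\gamma(t) = (t, 1 - \sqrt{t})$ on $[0, 1]$ as a hypothetical example of a connected front; the function names and the step count are illustrative choices.

```python
import math

def dist_point(u, A):
    """Distance from point u to its nearest point in the finite set A."""
    return min(math.dist(u, v) for v in A)

def igd_p_continuous(A, gamma, m1, M1, p=2, steps=1000):
    """Midpoint-rule approximation of the integral form of IGD_p.

    A      finite set of image points F(A)
    gamma  parametrization of the front on [m1, M1]
    """
    h = (M1 - m1) / steps
    acc = 0.0
    for j in range(steps):
        t = m1 + (j + 0.5) * h           # midpoint of the j-th subinterval
        acc += dist_point(gamma(t), A) ** p
    return (acc * h / (M1 - m1)) ** (1.0 / p)

def zdt1_front(t):
    """Assumed example front: ZDT1 Pareto front, f2 = 1 - sqrt(f1)."""
    return (t, 1.0 - math.sqrt(t))

# Eleven points sampled on the front itself serve as the archive image F(A).
A = [zdt1_front(j / 10) for j in range(11)]
print(igd_p_continuous(A, zdt1_front, 0.0, 1.0, p=2))
```

Since the points of `A` lie on the front, the printed value only reflects how coarsely they cover it.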

  17. Discretization of $F(P_Q)$
      Task: $P_Q$ given analytically, compute an approximation $Y$ of $F(P_Q)$ with $d_H(Y, F(P_Q)) < \delta$ (a priori defined approximation quality).
      For $k = 2$: use continuation-like methods: select the step size $t$ such that $\|F(x + tv) - F(x)\|_\infty \approx \Theta \delta$, with $\Theta < 1$ a safety factor (selection of $t$ based on Lipschitz estimations).
      [Figure: discretized Pareto front (PF) of OKA2 in objective space $(f_1, f_2)$, two panels: $\delta = 0.01$ and $\delta = 0.4$.]
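
The step-size rule can be illustrated on a front that is available as a parametrized curve. The sketch below is a simplification and not the continuation method of the slide: it works directly in objective space, replaces the Lipschitz-based step selection by bisection on the curve parameter, and assumes a monotone front (here the ZDT1 front rather than OKA2) so that the sup-norm distance grows along the curve. All names and the default `theta` are illustrative.

```python
import math

def sup_norm(a, b):
    """Infinity-norm distance between two points."""
    return max(abs(ai - bi) for ai, bi in zip(a, b))

def discretize(gamma, m1, M1, delta, theta=0.9):
    """Sample gamma on [m1, M1] so consecutive points are about theta*delta apart."""
    target = theta * delta
    t, points = m1, [gamma(m1)]
    while t < M1:
        if sup_norm(gamma(M1), gamma(t)) <= target:
            t_next = M1                      # the end of the curve is within reach
        else:
            lo, hi = t, M1
            for _ in range(60):              # bisection on the curve parameter
                mid = 0.5 * (lo + hi)
                if sup_norm(gamma(mid), gamma(t)) > target:
                    hi = mid
                else:
                    lo = mid
            t_next = lo
        if t_next <= t:                      # safeguard against a stalled step
            break
        t = t_next
        points.append(gamma(t))
    return points

def zdt1_front(t):
    """Assumed example front: ZDT1 Pareto front, f2 = 1 - sqrt(f1)."""
    return (t, 1.0 - math.sqrt(t))

coarse = discretize(zdt1_front, 0.0, 1.0, delta=0.4)
fine = discretize(zdt1_front, 0.0, 1.0, delta=0.01)
print(len(coarse), len(fine))
```

A smaller `delta` produces a denser sample of the front, mirroring the two panels of the figure.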

  18. Numerical Example
      NSGA-II applied on ZDT1, with $Y = F(P_Q)$ and $Y_i = F(\mathrm{pop}_i)$:
      $\Delta_2(Y_1, Y) = 3.03$
      $\Delta_2(Y_2, Y) = 2.71$
      $\Delta_2(Y_3, Y) = 1.43$
      $\Delta_2(Y_4, Y) = 0.77$
      $\Delta_2(Y_5, Y) = 0.31$
      $\Delta_2(Y_6, Y) = 0.12$
      $\Delta_2(Y_7, Y) = 0.007$
      [Figure: populations pop1 to pop7 and the Pareto front in objective space.]

  19. Discussion
      Conclusions
      • New indicator $\Delta_p$ proposed for the evaluation of MOEAs.
      • $\Delta_p$ is a semi-metric, and a pseudo-metric for bounded archive sizes.
      • $p$ can (in principle) be used to handle the 'outlier trade-off'.
      Open Questions
      • How to choose $p$?
      • How to measure the distance to a metric?
      • How to adapt the selection mechanisms in order to improve $\Delta_p$? ($\Delta_p$ is NOT compliant with the dominance relation!)

  20. Thank you for your attention!
