On the Landscape of a Problem of Finding Satisfactory Metaheuristics


SLIDE 1

Motivation The underlying problems The hyperheuristic Experiments Conclusions and future work

On the Landscape of a Problem of Finding Satisfactory Metaheuristics

José M. Cecilia, Baldomero Imbernón
Bioinformatics and High Performance Computing Research Group (BIO-HPC), Polytechnic School, Universidad Católica San Antonio de Murcia (UCAM), Spain

José-Matías Cutillas-Lozano, Domingo Giménez
Department of Computing and Systems, University of Murcia, Spain

XIII Congreso Español de Metaheurísticas, Algoritmos Evolutivos y Bioinspirados, Granada, 24 October 2018

SLIDE 2

Outline

1. Motivation
2. The underlying problems
3. The hyperheuristic
4. Experiments
5. Conclusions and future work

SLIDE 3

Motivation

Hyperheuristic on top of a metaheuristic scheme: repeated application of metaheuristics to the optimization problem to be solved → high computational cost. Landscape analysis is used to guide the hyperheuristic. Two problems as case studies:

- Molecule-Docking Problem (MDP).
- Determination of Kinetic Constants in a chemical reaction (KCP).

SLIDE 4

Molecule-Docking Problem

Virtual screening processes are based on the calculation of a scoring function that measures the interaction between a set of chemical compounds (ligands) and a protein (receptor). The receptor has several points (spots) where ligands may independently couple.

SLIDE 5

Molecule-Docking Problem

The fitness or scoring function calculates the binding energy between the atoms of the protein and the ligand, e.g. the Lennard-Jones potential:

V(i, j) = 4ε [ (σ / r(i, j))^12 − (σ / r(i, j))^6 ]

where σ and ε are empirical constants of the model, and r(i, j) is the distance between atoms i and j. The search space is determined by the degrees of freedom of the protein and the ligand (six: three for translation and three for the rotation movements of the ligand) plus the flexibility junctions of the ligand.
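As a minimal sketch (not the slides' actual docking code), the Lennard-Jones term can be evaluated directly from the formula above; the default σ and ε values are illustrative only:

```python
def lennard_jones(r, sigma=1.0, epsilon=1.0):
    """Lennard-Jones potential V(i, j) for two atoms at distance r > 0."""
    sr6 = (sigma / r) ** 6          # (sigma / r)^6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)
```

The potential crosses zero at r = σ and reaches its minimum of −ε at r = 2^(1/6) · σ, which is a quick sanity check for any implementation.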

SLIDE 6

Determination of Kinetic Constants

The search for the kinetic parameters of a chemical reaction in heterogeneous phase. Depending on the pH, there are three main ways in which the dissolution of calcium carbonate occurs:

- By reaction with acetic acid: CaCO3 + H3O+ ↔ Ca2+ + HCO3− + H2O
- By reaction with carbonic acid: CaCO3 + H2CO3 ↔ Ca2+ + 2 HCO3−
- By the hydrolysis reaction: CaCO3 + H2O ↔ Ca2+ + HCO3− + OH−

SLIDE 7

Determination of Kinetic Constants

When the reaction occurs through several independent pathways, the overall rate is simply the sum of the individual rates. So, the kinetics of dissolution of calcium carbonate is a function of the concentration of carbonic acid in the solution, the pH and the mass-transfer area:

(1/V) · dN_Ca2+/dt = −k1 · a^n1 · [H3O+]^n2 − k2 · a^n3 · [H2CO3]^n4 − k3 · a

k1, k2 and k3 are the combined reaction rate constants; n1, n2, n3 and n4 are the reaction orders; a is the area of the tablet.
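The rate expression can be sketched as a small function over the seven kinetic constants (k1, k2, k3, n1, n2, n3, n4) that form an individual on the next slide; note that the exact form of the hydrolysis term (here taken proportional to the area, k3 · a) is an assumption of this sketch:

```python
def dissolution_rate(a, h3o, h2co3, k1, k2, k3, n1, n2, n3, n4):
    """Overall rate (1/V) dN_Ca2+/dt as the sum of the three pathway rates.
    a: tablet area; h3o, h2co3: concentrations [H3O+] and [H2CO3].
    The hydrolysis term k3 * a is an assumption of this sketch."""
    return (-k1 * a ** n1 * h3o ** n2
            - k2 * a ** n3 * h2co3 ** n4
            - k3 * a)
```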

SLIDE 8

Determination of Kinetic Constants

An individual is represented by a real vector of size seven, which contains the set of kinetic constants. The ranges of values for the constants are set according to empirical criteria. Every time the fitness of an individual is calculated, the whole chemical system is solved:

for i = 0 → N do
    Calculate at instant i: [Ca2+], a, [H3O+], [HCO3−], [H2CO3], pHcal, Δ[Ca2+], [CH3COOH], [CH3COO−]
    Fitness = Fitness + (pHexp,i − pHcal,i)^2
end for
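The fitness loop above can be sketched as follows; `simulate_ph` is a hypothetical stand-in for solving the whole chemical system at instant i (concentrations, area, calculated pH), which the slides do not detail:

```python
def fitness(individual, ph_exp, simulate_ph):
    """Sum of squared differences between the experimental pH series and
    the pH calculated by solving the chemical system for this individual.
    `simulate_ph(individual, i)` is a hypothetical solver callback."""
    total = 0.0
    for i, ph in enumerate(ph_exp):
        ph_cal = simulate_ph(individual, i)   # solve the system at instant i
        total += (ph - ph_cal) ** 2
    return total
```

A perfect individual (calculated pH equal to the experimental series at every instant) yields a fitness of zero, which is why the hyperheuristic minimizes this value.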

SLIDE 9

Parameterized metaheuristic schema

Initialize(S, ParamIni)
while not EndCondition(S, ParamEnd) do
    SS  = Select(S, ParamSel)
    SS1 = Combine(SS, ParamCom)
    SS2 = Improve(SS1, ParamImp)
    S   = Include(SS2, ParamInc)
end while

The values of the metaheuristic parameters determine the metaheuristic or combination of metaheuristics. Hyperheuristics are implemented with the same schema, and search for satisfactory metaheuristics implemented with the schema (i.e., satisfactory values of the metaheuristic parameters).
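A minimal Python rendering of the schema, with the six basic functions passed in as arguments, since their concrete behavior depends on the problem and on the parameter values:

```python
def metaheuristic_scheme(initialize, end_condition, select, combine,
                         improve, include, params):
    """Parameterized metaheuristic schema: the callbacks play the roles of
    Initialize/EndCondition/Select/Combine/Improve/Include, and `params`
    carries ParamIni, ParamEnd, ParamSel, ParamCom, ParamImp, ParamInc."""
    S = initialize(params["ini"])
    while not end_condition(S, params["end"]):
        SS = select(S, params["sel"])
        SS1 = combine(SS, params["com"])
        SS2 = improve(SS1, params["imp"])
        S = include(SS2, params["inc"])
    return S
```

Plugging in different basic functions (or the same functions with different parameter values) instantiates different metaheuristics, which is exactly the search space the hyperheuristic explores.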

SLIDE 10

Structure of the hyperheuristic

SLIDE 11

Metaheuristic parameters

routine      parameter   meaning                                                      MDP   KCP
Initialize   INEIni      Initial Number of Elements                                    X     X
             PEIIni      Percentage of Elements to Improve                             X     X
             IIEIni      Intensification in the Improvement of Elements                X     X
             IIEFlex     Intensification due to Flexibility                            X
             NBEIni      Number of Best Elements for the next iterations               X     X
             NWEIni      Number of Worst Elements for the next iterations              X
Select       NBESel      Number of Best Elements for combination                       X     X
             NWESel      Number of Worst Elements for combination                      X     X
Combine      NBBCom      Number of Best-Best elements combinations                     X     X
             NBWCom      Number of Best-Worst elements combinations                    X     X
             NWWCom      Number of Worst-Worst elements combinations                   X     X
Improve      PEIImp      Percentage of Elements generated to be Improved               X     X
             IIEImp      Intensification in the Improvement of Elements generated      X     X
             IIEFlex     same value as in Initialize                                   X
             PEDImp      Percentage of Elements to be Diversified and improved         X     X
             IIDImp      Intensification in the Improvement of Elements Diversified    X     X
             IIEFlex     same value as in Initialize                                   X
Include      NBEInc      Number of Best Elements to include in the reference set       X     X

SLIDE 12

Basic functions

Initialize: in MDP, one set per spot, and selection of the best and worst elements for the next iterations; in KCP, selection of the best elements, with the reference set completed with elements selected randomly.
Combine: MDP: an element obtained as the mean of the parameters of the parents; KCP: crossing at a middle point.
Improvement functions: the same parameter IIEFlex is used in the three improvement functions of MDP to search for neighbors by rotation on the junctions.

SLIDE 13

Hyperheuristic implementation

Metaheuristic parameters: MDP: INEIni between 20 and 200, other parameters between 0 and 100. KCP: INEIni and FNEIni between 20 and 200, intensification parameters between 0 and 50, other parameters between 0 and 100.
Fitness obtained through the application of each metaheuristic to just one instance of the problem (to keep experimentation time low).
The hyperheuristic uses the same parameterized schema, with smaller values for the parameters.
Combination: crossing at a middle point; when an invalid configuration is generated, it is discarded.
Neighbors: obtained by adding or subtracting 1 in one position of the vector of metaheuristic parameters.
Diversification: generating a random value for one position of the vector.
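The neighbor and diversification moves just described can be sketched as follows, assuming each parameter's range is given as a (low, high) pair; values outside the range correspond to the invalid configurations that are discarded:

```python
import random

def neighbors(vector, ranges):
    """All neighbors obtained by adding or subtracting 1 in one position of
    the vector of metaheuristic parameters; moves leaving the allowed range
    (invalid configurations) are discarded."""
    result = []
    for i, (low, high) in enumerate(ranges):
        for delta in (-1, 1):
            value = vector[i] + delta
            if low <= value <= high:
                result.append(vector[:i] + [value] + vector[i + 1:])
    return result

def diversify(vector, ranges, rng=random):
    """Diversification: a random value for one randomly chosen position."""
    i = rng.randrange(len(vector))
    low, high = ranges[i]
    return vector[:i] + [rng.randint(low, high)] + vector[i + 1:]
```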

SLIDE 14

Experiments set-up

For each problem, analysis of the fitness for three instances of the problem and 100 metaheuristics. Instances for MDP:

pair   #atoms receptor   #atoms ligand   #junctions
ACE    9198               59              13
GPB    13261              29              1
PARP   5588               32              3

Fitness recorded at intervals of 30 seconds (from 30 to 600) for MDP, and at intervals of 5 seconds (from 5 to 100) for KCP.

SLIDE 15

Aspects to be analyzed

We want to reduce the execution time of the hyperheuristic. We analyze:

- Influence of each parameter on the fitness (could reduce the search interval of the parameters).
- Variation of the influence of the parameters with the execution time (could limit the application time of metaheuristics).
- Comparison of the results with different problems (the number of training problems could be reduced).
SLIDE 16

Evolution of the fitness with the distance to the best metaheuristic

Variation of the fitness with the distance to the optimum vector:

d(v, w) = (1/p) · Σ_{i=1}^{p} |v_i − w_i| / (u_i − l_i)

where p is the number of parameters, and u_i and l_i are the upper and lower values of the range for the generation of parameter i. Dotted lines: single metaheuristics; thick lines: groups of ten metaheuristics. MDP with PARP, after 600 seconds; KCP with EXP1, after 100 seconds.
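A direct sketch of the normalized distance d(v, w), with each parameter's generation range given as a (low, high) pair:

```python
def normalized_distance(v, w, ranges):
    """d(v, w): mean over the p parameters of |v_i - w_i| normalized by the
    width (u_i - l_i) of parameter i's generation range, so 0 <= d <= 1."""
    p = len(v)
    return sum(abs(vi - wi) / (u - l)
               for vi, wi, (l, u) in zip(v, w, ranges)) / p
```

Identical vectors give d = 0, and vectors at opposite ends of every range give d = 1, so distances are comparable across parameters with very different scales.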

SLIDE 17

Evolution of the correlation coefficient of the parameters with the fitness, MDP

SLIDE 21

Evolution of the correlation coefficient of the parameters with the fitness, MDP

The most influential parameters:

- IIEFlex for the configurations with more flexible junctions;
- NBEInc (include the best elements);
- PEDImp (diversification preferable to intensification);
- and, surprisingly, NWWCom (maybe the combination of the worst elements helps to diversify).

SLIDE 22

Evolution of the correlation coefficient of the parameters with the fitness, KCP

SLIDE 25

Evolution of the correlation coefficient of the parameters with the fitness, KCP

The most influential parameters:

- Initialization plays a relevant role, but not FNEIni;
- IIDImp has a negative influence;
- Intensification (not diversification) has a positive influence (PEIImp, IIEImp).

SLIDE 26

Correlation coefficient of the fitness at different time-steps with respect to the final fitness

[Figure: correlation coefficient of the fitness at each time-step with the final fitness. Left panel: MDP (ACE, GPB, PARP), 50 to 550 seconds. Right panel: KCP (EXP1, EXP2, EXP3), 10 to 90 seconds.]

Correlation higher than 0.9 after 250 seconds for MDP and after 60 seconds for KCP. The training time could be reduced to approximately half the time used in the experiments.

SLIDE 27

Mean of the final fitness of the metaheuristics discarded at different time-steps

[Figure: final mean fitness of the metaheuristics discarded at each time-step. Left panel: MDP (ACE, GPB, PARP), 50 to 600 seconds. Right panel: KCP (EXP1, EXP2, EXP3), 10 to 100 seconds.]

If the five metaheuristics with the worst fitness are discarded at each step, the total training time is halved, and no difference is appreciated in the final fitness.
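This pruning step can be sketched as follows; both case studies minimize the fitness, so the worst metaheuristics are those with the largest current value (`current_fitness` is a hypothetical evaluation callback):

```python
def prune_worst(population, current_fitness, k=5):
    """Keep all but the k metaheuristics with the worst (largest) current
    fitness; both case studies minimize, so 'worst' means largest value."""
    ranked = sorted(population, key=current_fitness)  # best (smallest) first
    return ranked[:max(len(ranked) - k, 0)]
```

Applying this at every recording step, with k = 5 out of 100 metaheuristics, is what roughly halves the total training time in the experiment above.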

SLIDE 28

Conclusions

- Hyperheuristics on top of parameterized metaheuristics can be used to determine satisfactory metaheuristics for optimization problems.
- A fitness landscape analysis can help to reduce the search space of the hyperheuristic and to reduce its execution time.
- The most influential metaheuristic parameters are determined, and the guided search is centered on them.
- The influence of the parameters depends on their meaning and on how the basic functions are implemented, so the results differ for the two case studies considered.
- The methodology can be applied to other problems.

SLIDE 29

Future work

- Analysis of the reduction of the execution time and of the goodness of the solution for guided hyperheuristics.
- Application of the methodology to other optimization problems.
- Alternatively, statistical studies to determine the influence of the parameters.