Metaheuristic Applications to Telecoms, Bioinf, Software, - - PowerPoint PPT Presentation


Metaheuristic Applications to Telecoms, Bioinf, Software, and other Domains


slide-1
SLIDE 1
  • Metaheuristic Applications to Telecoms, Bioinf, Software, and other Domains

University of Málaga, Spain

slide-2
SLIDE 2
  • Objective of a global optimization problem: find $x^* = (x_1^*, \dots, x_n^*) \in S$ such that $f(x) \le f(x^*)$ for all $x \in S$
  • Vectors can map to other data structures
  • Minimizing is also possible
slide-3
SLIDE 3
  • Where can optimization problems be found?
slide-4
SLIDE 4
slide-5
SLIDE 5
slide-6
SLIDE 6
  • Evolutionary Algorithm

while (termination condition not met) do
    ...
end while
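The body of the loop did not survive on this slide. As a rough illustration only, a generational evolutionary algorithm could look like the sketch below. The operators (binary tournament, one-point crossover, bit-flip mutation), the parameter values, and the name `evolve` are assumptions for the example, not the slide's content.

```python
import random

def evolve(fitness, n_bits=20, pop_size=30, generations=50, seed=0):
    """Generational EA that maximizes `fitness` over bit strings (illustrative)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    gen = 0
    while gen < generations:  # termination condition: generation budget
        offspring = []
        while len(offspring) < pop_size:
            # Selection: binary tournament for each parent
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            # Recombination: one-point crossover
            cut = rng.randrange(1, n_bits)
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with probability 1/n_bits
            child = [1 - b if rng.random() < 1.0 / n_bits else b for b in child]
            offspring.append(child)
        pop = offspring  # generational replacement
        best = max(pop + [best], key=fitness)  # remember the best solution so far
        gen += 1
    return best

# Usage: OneMax, where the fitness is the number of ones in the string
best = evolve(sum)
```

Any bit-string fitness function can be passed in place of `sum`; the loop structure stays the same.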

slide-7
SLIDE 7
[Figure: convex combination in a metric space]

slide-8
SLIDE 8
Four main ways of making an algorithm more efficient and accurate:

  • Parallel: clusters, grid computing, multicore, FPGAs, GPUs…
  • Hybrid: combining algorithms, operators, representations (problem knowledge)
  • Multiobjective: explicitly modelling several conflicting objective functions with Pareto's concept of dominance
  • Dynamic: solving a problem that changes over time and adapting previous solutions to the new scenarios

slide-9
SLIDE 9
Parallelism and Metaheuristics:

The increasing availability of new kinds of CPUs and the parallel nature of metaheuristics have allowed the fast development of parallel metaheuristics

Advantages:

  • They make it possible to tackle more complex problems and/or larger instances
  • They reduce the execution time
  • They improve the quality of the solutions found

Examples

  • E. Alba (ed.), Parallel Metaheuristics: A New Class of Algorithms, 2005
slide-10
SLIDE 10
Hybridization is the inclusion of problem-dependent information in the algorithm

Types:

  • Strong
  • Weak

Examples

slide-11
SLIDE 11
Most real-world optimization problems require optimizing more than one function

  • Multiobjective Optimization Problems (MOPs)

Multiobjective optimization searches for a set of solutions

  • Pareto optimal set
  • Its representation in the objective space is known as the Pareto front

Metaheuristics provide a subset of the Pareto optimal set.

Two goals

  • Convergence to the true Pareto front
  • Diversity of the solutions along the true Pareto front
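To make the notion of a Pareto optimal set concrete, here is a minimal sketch of the dominance test and a brute-force non-dominated filter. It assumes all objectives are minimized; the function names and the sample points are made up for the example.

```python
def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (minimization assumed)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def nondominated(points):
    """Keep the points that no other point dominates (quadratic-time filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Usage: five hypothetical two-objective solutions
points = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
front = nondominated(points)  # [(1, 5), (2, 3), (4, 1)]
```

Here `(3, 4)` is dominated by `(2, 3)` and `(5, 5)` by `(1, 5)`, so only three points remain in the approximated front.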

slide-12
SLIDE 12
Fitness assignment

  • Multiobjective metaheuristics assign a single value to each solution, used as its "fitness" to compare solutions
  • E.g.: ranking in NSGA-II or strength in SPEA2

Maintaining diversity

  • Additional information is needed to estimate the density of solutions around a given one
  • E.g.: hypercube in PAES, crowding distance in NSGA-II

Elitism

  • The general approach uses an auxiliary population, sometimes called an archive

Quality indicators

  • Metrics are needed to measure convergence and/or diversity
  • Hypervolume (convergence and diversity)
  • Generational Distance (convergence)
  • Spread (diversity)
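As an example of a density estimator, the following sketch computes the crowding distance in the way it is usually described for NSGA-II: boundary solutions get an infinite distance and interior ones accumulate the normalized gap between their neighbours in each objective. The function name and the sample data are illustrative assumptions.

```python
def crowding_distance(front):
    """Crowding distance of each objective vector in `front` (list of tuples)."""
    n = len(front)
    dist = [0.0] * n
    if n == 0:
        return dist
    for m in range(len(front[0])):
        # Indices of the solutions sorted by objective m
        order = sorted(range(n), key=lambda i: front[i][m])
        lo, hi = front[order[0]][m], front[order[-1]][m]
        # Boundary solutions are always preferred
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue  # all values equal: no contribution from this objective
        for k in range(1, n - 1):
            gap = front[order[k + 1]][m] - front[order[k - 1]][m]
            dist[order[k]] += gap / (hi - lo)
    return dist

# Usage: three mutually non-dominated points
pts = [(1, 5), (2, 3), (4, 1)]
d = crowding_distance(pts)
```

A larger value means a less crowded region, so such solutions are favoured when the population has to be truncated.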

slide-13
SLIDE 13
MOCell

  • Multiobjective cellular genetic algorithm

Main features

  • Use of an external archive
  • Two-dimensional toroidal grid
  • Archive feedback

Comparison against NSGA-II and SPEA2

  • Competitive results in terms of convergence and hypervolume
  • Best results concerning spread

slide-14
SLIDE 14
AbYSS

  • Archive-based hYbrid Scatter Search

Basic idea

  • Redefining the scatter search template to adapt it to multiobjective optimization

Main features

  • External archive to maintain good solutions
  • Individuals of the external archive are moved to the initial set in the restart loop

Comparison against NSGA-II and SPEA2

  • Competitive results in terms of convergence and hypervolume
  • Best results concerning spread

slide-15
SLIDE 15
Motivation:

  • There are many computers in the labs of the Computer Science Department of the University of Málaga
  • Currently we directly control up to 400 processors
  • Question: how can we use them together to solve multiobjective optimization problems?
  • Using known message-passing libraries (sockets, PVM, MPI) is not a solution
  • Machines are idle at night and during weekends (and holidays)
  • Variable availability during the day
  • OUR APPROACH: using grid technologies

Issues of interest:

  • Ease of installation and administration
  • Parallel programming models offered
  • Programming languages available
  • Use of idle CPU cycles (opportunistic computing)
  • Parallel performance

Grid computing systems used:

  • Condor
  • Globus
  • Others: ProActive, BOINC
slide-16
SLIDE 16
MANETs

  • Stations are usually laptops, handhelds, PDAs, or mobile phones
  • Mobility of stations → dynamic topology of the network

Metropolitan MANETs

  • High Density Areas (HDAs): areas with high station density
  • HDAs can appear and disappear from the network

Optimization problem

  • Fine-tuning of a broadcasting strategy called DFCN
  • Target: metropolitan MANETs

Multiobjective metaheuristics

  • EAs: NSGA-II, SPEA2, cMOGA
  • Scatter Search: AbYSS
  • PSO: MOPSO

slide-17
SLIDE 17
Problem definition

  • Allocate frequencies (a few dozen) to the elementary transceivers (TRXs) of the network (thousands)
  • Frequency reuse is mandatory → this provokes interference → QoS degradation
  • Real-world instances of GSM networks currently in use

GSM architecture

  • Base Transceiver Stations
  • Sectors
  • TRXs

Metaheuristics used

  • (1,λ) EA, ACO

Ongoing:

  • ssGA
  • Grid computing with Condor

slide-18
SLIDE 18
Problem definition

  • Positioning the sites of the network
  • Dimensioning a set of parameters for each antenna
  • Real-world instances of cellular networks
  • Highly demanding computational costs

Objectives

  • Number of sites
  • Quality of service
  • Interference
  • Traffic hold

Metaheuristics used

  • PAES, ssGA

New work in:

  • AbYSS
  • MOCell
  • Gridifying with Condor/MW

slide-19
SLIDE 19
Task:

Select geographical locations for cellular antennae from a list of 149 to 349 available locations

Objectives:

Maximize the covered area (coverage)
Minimize the number of antennae

Fitness:

[formula not recoverable]

Models:

Terrain → square grid with 287×287 points
Antenna coverage → three shapes

slide-20
SLIDE 20
SA:

Trajectory technique
Uses mutation to generate S' from S
Replaces S with S' with probability [formula not recoverable]
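The acceptance probability itself is missing from the slide text. A common choice in simulated annealing is the Metropolis criterion, sketched below for a maximization problem. This is an assumption about what the slide showed, and `accept_probability`, `sa_step`, `mutate`, and `temperature` are illustrative names.

```python
import math
import random

def accept_probability(f_current, f_candidate, temperature):
    """Metropolis criterion for maximization: improvements are always accepted."""
    if f_candidate >= f_current:
        return 1.0
    # Worse candidates are accepted with a probability that shrinks as the
    # fitness loss grows or the temperature drops
    return math.exp((f_candidate - f_current) / temperature)

def sa_step(s, fitness, mutate, temperature, rng):
    """One SA move: generate S' from S by mutation, then accept or reject it."""
    s_new = mutate(s, rng)
    p = accept_probability(fitness(s), fitness(s_new), temperature)
    return s_new if rng.random() < p else s

# Usage: a candidate that is worse by 1.0 at temperature 1.0
p_worse = accept_probability(2.0, 1.0, 1.0)  # exp(-1)
```

Lowering `temperature` over the run makes the search progressively greedier.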

CHC:

Evolutionary algorithm
No mutation, HUX crossover with incest prevention
Elitist selection, cataclysmic mutation (restarting procedure)
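A minimal sketch of HUX crossover with incest prevention, as it is usually described for CHC: two parents mate only if half their Hamming distance exceeds a threshold, and the children are produced by swapping exactly half of the differing bits, chosen at random. The function name, the way the threshold is passed in, and the example values are assumptions.

```python
import random

def hux(p1, p2, threshold, rng):
    """Return two children, or None when incest prevention blocks the mating."""
    differing = [i for i, (a, b) in enumerate(zip(p1, p2)) if a != b]
    if len(differing) / 2 <= threshold:
        return None  # parents are too similar
    c1, c2 = list(p1), list(p2)
    # Swap exactly half of the differing positions, chosen uniformly at random
    for i in rng.sample(differing, len(differing) // 2):
        c1[i], c2[i] = c2[i], c1[i]
    return c1, c2

# Usage: two maximally different 8-bit parents
rng = random.Random(0)
children = hux([0] * 8, [1] * 8, threshold=2, rng=rng)
blocked = hux([0] * 8, [1, 1, 0, 0, 0, 0, 0, 0], threshold=2, rng=rng)
```

In CHC the threshold is typically decreased when no offspring enters the population, and the cataclysmic restart is triggered once it reaches zero.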

GA:

Evolutionary algorithm
Mutation, two-point crossover
Selection: ranking (parents) + tournament (new population)
slide-21
SLIDE 21
[Figures: number of evaluations (×10^3) versus instance size (149, 199, 249, 299, 349) for SA, CHC, gGA, ssGA, and dssGA8 (ref)]

  • CHC finds the optimal solution with less effort than SA and GA for the five instance sizes using both antenna types
  • The geometry of the problem affects the complexity

slide-22
SLIDE 22
Task:

Select geographical locations for sensor nodes
Locations may be chosen from a list, or freely

Objectives:

Nodes must form a connected network
Maximize the coverage
Minimize the number of transmissions (energy)

Models:

Node → (RSENS, RCOMM)
Network → [definition not recoverable]

slide-23
SLIDE 23
Algorithms:

CHC gets the best results in binary-coded instances
SA gets the best results in real-coded instances

Networks:

Network layouts obtained for different sensor nodes

[Figures: network layouts for RCOMM 20, RCOMM 40, RSENS 20, and RSENS 40]

slide-24
SLIDE 24
slide-25
SLIDE 25
slide-26
SLIDE 26
slide-27
SLIDE 27
  • The main phases performed by assemblers are: (1) overlap, (2) layout, (3) consensus
  • Output of assemblers:

Our approaches:

  • Parallel metaheuristics
  • Specialized heuristics (PALS)

Our results:

  • Able to solve very large sequences
  • High efficacy and efficiency

slide-28
SLIDE 28
Problem:

Gene selection and cancer classification of DNA microarrays
Feature selection definition:

Objectives:

Maximize accuracy of prediction
Minimize the number of selected genes
Maximize ROC factors (sensitivity and specificity)

Phases: feature selection, training, validation, fitness calculation

slide-29
SLIDE 29
Fitness:

  • Monobjective: aggregative ( )
  • Multiobjective: 2 objectives ( ), 3 objectives ( )

Classification: 2 classifiers

  • Support Vector Machines
  • k-Nearest Neighbors

Validation: cross-validation

  • Leave-one-out CV
  • 10-fold CV

Algorithms: metaheuristics

  • Binary Geometric PSO for feature selection
  • GA with SSOCF crossover for feature selection

slide-30
SLIDE 30
Instances:

Available large-scale datasets of well-known cancer DNA microarrays: Leukemia AML-ALL, Colon, Prostate, Lung, Ovarian, Breast (e.g., Breast: 24481 genes and 78 patient samples)

Results:

Comparison against other techniques in the literature

[Table: results per dataset for PSO and GA compared with Huerta et al., Juliusdottir et al., Deb et al., Guyon et al., Yu et al., Liu et al., and Shen et al. Most cell values are garbled in the extracted text; the legible entries are 100(2), 90.64(4), 100(3), 97.38(3), 100(3), 100(4), and 100(4).]
slide-31
SLIDE 31
At present: using a stable grid with 300 computers at UMA
In progress: stable grid of 600 computers at UMA
Next step: connect to global grids in Europe

slide-32
SLIDE 32
  • Objective: find a counterexample for a safety property in a concurrent model
  • Safety properties are those expressed by an LTL formula of the form $\varphi = \Box\,\rho$, where $\rho$ is a past formula (with only past operators)
  • Finding one counterexample ≡ finding one accepting state in the intersection Büchi automaton (graph exploration problem)

[Figure: intersection automaton]

Safety properties: deadlocks, invariants, assertions, …

slide-33
SLIDE 33
Memory

  • The number of states is very large even for small models
  • For example: Dijkstra's Dining Philosophers
  • n philosophers → $3^n$ states
  • 20 philosophers → 1039 GB for storing the states
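The growth of the state space can be checked with a little arithmetic. The sketch below assumes that a model with n philosophers has 3^n states and, as a purely hypothetical figure, that each stored state takes 320 bytes. That size is not given on the slide; it is chosen here only because it reproduces the quoted 1039 GB when gigabytes are counted as 2^30 bytes.

```python
def num_states(n_philosophers):
    """Number of states, assuming 3^n states for n philosophers."""
    return 3 ** n_philosophers

BYTES_PER_STATE = 320  # hypothetical size of one stored state (assumption)

states = num_states(20)                     # 3,486,784,401 states
gib = states * BYTES_PER_STATE / 2 ** 30    # memory in GiB, about 1039.1
```

Adding a single philosopher triples the requirement, which is why exhaustive exploration quickly becomes infeasible.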

slide-34
SLIDE 34
  • ACOhg is a new Ant Colony Optimization model that can be applied to optimization problems with an unknown and/or very large construction graph

Who can really find errors?

  • ACOhg is a very robust algorithm for this problem, and it outperforms traditional algorithms from the model checking domain

slide-35
SLIDE 35
  • Objective: propose a good set of test cases for a program
  • This objective is too vague; one concrete objective is: find a set of test cases fulfilling a test adequacy criterion
  • Some examples of test adequacy criteria, from weaker to stronger:

Statement coverage: all the statements executed
Branch coverage: all the branches taken
Condition-decision coverage: all the atomic conditions and decisions true and false

slide-36
SLIDE 36
  • After coding, software products require a test phase
  • The objective is to find errors and to ensure software correctness
  • Software companies dedicate 50% of resources to this task
  • We propose an automatic tool to generate the input data for the test

[Figure: test input (1.0, 2.3) produces 10.5, OK; test input (2.7, 5.4) produces 15.0, wrong]

slide-37
SLIDE 37
  • The global objective is broken down into small partial objectives

[Figure: under condition-decision coverage, a program with conditions c1, c2, and c3 yields six partial objectives: c1 true, c1 false, c2 true, c2 false, c3 true, c3 false]

[Figure: each partial objective (e.g., c3 true) is treated as a function minimization problem in which the current input data is evaluated to obtain a fitness value]
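One common way to turn a partial objective into a function to minimize is a branch distance: it is zero when the condition already takes the desired value and otherwise grows with how far the input is from flipping it. The sketch below does this for a hypothetical condition `a < b`; the functions, the constant `K`, and the example values are assumptions, since the slide does not give the exact fitness definition.

```python
K = 1.0  # hypothetical penalty added when the condition is not yet satisfied

def distance_lt_true(a, b):
    """Distance to making `a < b` true; 0 means the objective is covered."""
    return 0.0 if a < b else (a - b) + K

def distance_lt_false(a, b):
    """Distance to making `a < b` false (that is, a >= b)."""
    return 0.0 if a >= b else float(b - a)

# Usage: inputs (5, 2) do not make `a < b` true, and are 4.0 away from doing so
d_true = distance_lt_true(5, 2)
d_false = distance_lt_false(1, 3)
```

A search algorithm such as PSO, ES, or a GA can then minimize this distance over the program's input data until it reaches zero.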

slide-38
SLIDE 38
  • Some results with PSO, ES, and GA (corrected condition coverage)

[Table: results of PSO, ES, and GA for each test program; most cell values are garbled in the extracted text]

  • PSO and ES have similar efficacy
  • In every case, the coverage obtained by GA is matched or exceeded by PSO or ES

slide-39
SLIDE 39
  • Given: $n$ rectangular pieces $i$, each with a height $h_i$ and a width $w_i$, and a rectangular container (the strip) with width $W$ and unbounded height.
  • Objective: to allocate all the pieces in the strip without overlapping, without rotation, with their edges parallel to the edges of the strip, and bottom-up, minimizing the height of the strip used (equivalently: to find a packing pattern that fulfils all these requirements).
  • Restriction: three-stage guillotine patterns.
  • Scientific interest: NP-hard problem.
  • Applications: paper, cloth, wood, and glass industries.

Representation

  • Chromosomes: sequences (permutations) of pieces, which define the input for a layout algorithm.
  • Layout algorithm: a next-fit heuristic that generates three-stage guillotine patterns.
  • Fitness function: [formula not recoverable]
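To illustrate how a permutation of pieces is decoded into a layout, here is a simplified next-fit level heuristic. It is a sketch only: it packs pieces left to right into horizontal levels and does not build the three-stage guillotine patterns that the layout algorithm above generates. The function name and the sample pieces are made up.

```python
def next_fit_height(pieces, order, strip_width):
    """Strip height used when pieces (width, height) are packed in `order`."""
    total_height = 0.0   # height of all closed levels
    level_height = 0.0   # height of the level currently being filled
    used_width = 0.0     # width already occupied in the current level
    for idx in order:
        w, h = pieces[idx]
        if w > strip_width:
            raise ValueError("piece is wider than the strip")
        if used_width + w > strip_width:
            # Next fit: close the current level and never return to it
            total_height += level_height
            level_height, used_width = 0.0, 0.0
        used_width += w
        level_height = max(level_height, h)
    return total_height + level_height

# Usage: four hypothetical pieces in a strip of width 8
pieces = [(4, 3), (3, 2), (5, 4), (2, 1)]
h_a = next_fit_height(pieces, [0, 1, 2, 3], 8)  # 7.0
h_b = next_fit_height(pieces, [0, 2, 1, 3], 8)  # 8.0
```

The two orderings give different heights, which is what lets an evolutionary search over permutations improve the packing.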

slide-40
SLIDE 40
Best Inherited Level Recombination: transmits the levels with the highest filling rate from one parent to the child.

Mutation: Best and Worst Stripe Exchange (BW_SE). The pieces of the best level are allocated to the first positions, while the pieces of the worst level are assigned to the last positions.

Evolution Step

Adjustment Operator: applies a first-fit heuristic, and the layout obtained is encoded in a chromosome.

Initial Seeding

slide-41
SLIDE 41
slide-42
SLIDE 42
slide-43
SLIDE 43
slide-44
SLIDE 44