Maximal Component detection in graphs using swarm-based and genetic algorithm. Antonio Gonzalez-Pardo, David Camacho



slide-1
SLIDE 1

Applied Intelligence & Data Analysis http://aida.ii.uam.es Universidad Autónoma de Madrid

Maximal Component detection in graphs using swarm-based and genetic algorithm

Antonio Gonzalez-Pardo, David Camacho

David Camacho david.camacho@uam.es

slide-2
SLIDE 2

Outline

• Introduction
• Goals
• Bio-inspired approaches applied to subgraph detection
    • The ACO approach
    • The GA approach
• Experiments
• Problems observed
• Why have we done this work?

slide-3
SLIDE 3

Introduction

slide-4
SLIDE 4

Introduction

• Past decades have shown a growing interest in Collective Intelligence and Evolutionary Algorithms.
• Collective Intelligence (CI) uses concepts extracted from the social behavior observed in self-organizing communities such as ants, bees or bacteria, amongst others.
• Evolutionary Algorithms (EA) are based on evolution: new individuals are generated from the characteristics of their parents.

slide-5
SLIDE 5

Introduction

• Both families of algorithms share some similarities:
    • They are based on heuristic search.
    • The population explores the solution space of the modelled problem.
    • They are usually applied to NP-complete or NP-hard problems.
• Some problems where EA and CI are applied:
    • Routing problems
    • Constraint Satisfaction Problems
    • Scheduling problems

slide-6
SLIDE 6

Goals

slide-7
SLIDE 7

Goals

1. In this work we have made an experimental comparison between a classical Genetic-based approach and a Swarm-based strategy applied to the detection of the maximal connected component of a graph.
2. The maximal connected component is composed of the maximum number of nodes of a graph such that from any node there exists a path to any other node of the same set.
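The definition above can be checked exactly on small graphs with a breadth-first search. The sketch below is not part of the presented work; it is only a reference baseline, assuming an undirected graph stored as a dictionary from node to neighbours (the name `largest_connected_component` is illustrative).

```python
from collections import deque

def largest_connected_component(adjacency):
    """Return the largest set of mutually reachable nodes.

    `adjacency` maps each node to an iterable of its neighbours
    (undirected graph, so every edge appears in both directions).
    """
    seen = set()
    best = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency.get(node, ()):
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best

example = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": ["e"], "e": ["d"]}
print(sorted(largest_connected_component(example)))  # prints ['a', 'b', 'c']
```

A baseline like this is useful for verifying what the ACO and GA runs report on graphs small enough to enumerate.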

slide-8
SLIDE 8

Bio-inspired approaches applied to subgraph detection

The ACO approach

slide-9
SLIDE 9

The ACO approach

• In this approach, ants (agents) travel through the network trying to visit all the nodes.
• Each ant starts at a random node.
• Initially, all neighbours of the current node have the same probability of being selected, but this probability changes during execution because of the pheromone values deposited by the ants.

slide-10
SLIDE 10

The ACO approach

• Once an ant decides which node to visit next, it deposits pheromone in the graph.
• Pheromones are sensed by ants while travelling through the network and attract them to follow trails with high pheromone concentrations (values).
• There is also an evaporation process that reduces the pheromone concentrations in the graph. This process is very useful for forgetting bad decisions taken previously.
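The deposit and evaporation steps described above can be sketched as follows. The slides give neither the deposited amount nor the evaporation rate, so `amount` and `rate` are assumed values, and storing pheromone in a dictionary keyed by edge is only one possible representation.

```python
def deposit(pheromone, i, j, amount=1.0):
    """Add pheromone on edge (i, j) after an ant decides to move from i to j."""
    pheromone[(i, j)] = pheromone.get((i, j), 0.0) + amount

def evaporate(pheromone, rate=0.1):
    """Reduce every pheromone concentration by the fraction `rate`."""
    for edge in pheromone:
        pheromone[edge] *= (1.0 - rate)

trails = {}
deposit(trails, "a", "b")
deposit(trails, "a", "b")
deposit(trails, "a", "c")
evaporate(trails, rate=0.5)
print(trails)  # prints {('a', 'b'): 1.0, ('a', 'c'): 0.5}
```

Edges that stop receiving deposits decay towards zero, which is how earlier bad decisions are gradually forgotten.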

slide-11
SLIDE 11

The ACO approach

• Given an ant located in node i, the probability of travelling from node i to node j is:

$$p_{ij} = \frac{\tau_{ij}\,\eta_{ij}}{\sum_{k \in N_i} \tau_{ik}\,\eta_{ik}}$$

Where:
• $\tau_{ij}$ is the pheromone concentration between nodes i and j
• $\eta_{ij}$ is 0 if node j has already been visited by the ant
• $N_i$ is the neighbourhood of node i
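A minimal sketch of this transition rule, under stated assumptions: the weight of a neighbour is its pheromone concentration multiplied by a term that is 0 for visited nodes and 1 otherwise (the value 1 is an assumption), and edges without pheromone default to 1.0 so that all neighbours start with equal probability. The function names are illustrative.

```python
import random

def transition_probabilities(i, neighbours, pheromone, visited):
    """Probability of moving from node i to each of its neighbours."""
    weights = {}
    for j in neighbours[i]:
        unvisited = 0.0 if j in visited else 1.0  # assumed 0/1 term
        weights[j] = pheromone.get((i, j), 1.0) * unvisited
    total = sum(weights.values())
    if total == 0:
        return {}  # every neighbour was already visited: the ant is stuck
    return {j: w / total for j, w in weights.items()}

def choose_next(i, neighbours, pheromone, visited, rng=random):
    """Sample the next node according to the transition probabilities."""
    probs = transition_probabilities(i, neighbours, pheromone, visited)
    if not probs:
        return None
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[j] for j in nodes], k=1)[0]

graph = {"a": ["b", "c", "d"]}
tau = {("a", "b"): 3.0, ("a", "c"): 1.0}
print(transition_probabilities("a", graph, tau, visited={"d"}))
# prints {'b': 0.75, 'c': 0.25, 'd': 0.0}
```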

slide-12
SLIDE 12

The ACO approach

slide-13
SLIDE 13

Bio-inspired approaches applied to subgraph detection

The GA approach

slide-14
SLIDE 14

The GA approach

• In this approach, individuals (agents) represent possible solutions to the problem.
• Each individual has a phenotype that contains a set of genes holding different node names. Both the length of the phenotype and the gene values are randomly selected.
• The phenotype represents a possible path that will be evaluated against the graph. This evaluation provides a fitness value, which is used in the generation of the next population.

slide-15
SLIDE 15

The GA approach

• The fitness function is the same as the one used in ACO to determine the goodness of a path:
• The validation process is performed with a greedy algorithm that starts by visiting the first node in the phenotype. From its neighbours, the algorithm discards the nodes that have already been visited and selects one of the remaining nodes at random, with nodes belonging to the phenotype having a higher probability of being selected.
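The greedy validation walk can be sketched as below. The slides do not say how much more likely a phenotype node is to be chosen, so the `bias` weight is an assumption. The fitness formula is not reproduced here, so the function only returns the path it visited.

```python
import random

def validate(phenotype, neighbours, bias=3.0, rng=random):
    """Greedy walk starting at the first gene; returns the list of visited nodes.

    Unvisited neighbours that belong to the phenotype get weight `bias`
    (an assumed value); other unvisited neighbours get weight 1.
    """
    if not phenotype:
        return []
    genes = set(phenotype)
    current = phenotype[0]
    path = [current]
    visited = {current}
    while True:
        candidates = [n for n in neighbours.get(current, ()) if n not in visited]
        if not candidates:
            return path  # dead end: no unvisited neighbour remains
        weights = [bias if n in genes else 1.0 for n in candidates]
        current = rng.choices(candidates, weights=weights, k=1)[0]
        path.append(current)
        visited.add(current)

line = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(validate(["a", "c"], line))  # prints ['a', 'b', 'c']
```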

slide-16
SLIDE 16

The GA approach

• The generation process is composed of two operators:
    • Crossover. This operator is used to generate new individuals based on the parents' phenotypes. For each parent a random crossover point is selected and the corresponding parts are interchanged. If a new individual would inherit the same gene from both parents, only one copy of the gene is inherited by the child.
    • Mutation. This second operator is used to escape from local optima. In this case the value of a gene is changed randomly.
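A hedged sketch of the two operators, assuming phenotypes are Python lists of node names. The mutation `rate` and the decision to mutate at most one gene per call are assumptions; the slides only state that a gene value is changed randomly.

```python
import random

def crossover(parent_a, parent_b, rng=random):
    """Pick one crossover point per parent, swap the tails, drop repeated genes."""
    cut_a = rng.randint(0, len(parent_a))
    cut_b = rng.randint(0, len(parent_b))
    child_1 = parent_a[:cut_a] + parent_b[cut_b:]
    child_2 = parent_b[:cut_b] + parent_a[cut_a:]
    # dict.fromkeys keeps the first occurrence of each gene, preserving order.
    # A child can come out empty when one cut is at the start and the other at the end.
    return list(dict.fromkeys(child_1)), list(dict.fromkeys(child_2))

def mutate(phenotype, node_names, rate=0.1, rng=random):
    """With probability `rate`, replace one randomly chosen gene by a random node name."""
    mutated = list(phenotype)
    if mutated and rng.random() < rate:
        mutated[rng.randrange(len(mutated))] = rng.choice(node_names)
    return mutated

rng = random.Random(42)
kids = crossover(["n1", "n2", "n3"], ["n3", "n4", "n5", "n1"], rng=rng)
print(kids)
print(mutate(kids[0], ["n1", "n2", "n3", "n4", "n5"], rate=1.0, rng=rng))
```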

slide-17
SLIDE 17

Experiments

slide-18
SLIDE 18

Network used

• ACO and GA have been executed on Random graphs and Small World graphs.
    • Random Graph. In this type of graph the creation of each edge depends on a probability (p).
    • Small World Graph. This type of graph has a connectivity degree (k) and a redirection probability (p).
• The Random graph is selected because it is easy to implement and convenient for testing our software.
• The Small World graph is studied because its characteristics make it the most commonly used model for communication networks.
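Both graph types can be generated with a few lines of standard-library Python. The slides do not describe the generators, so this is a sketch under assumptions: the small-world construction follows the usual ring-then-redirect idea, with k read as the number of neighbours on each side of a node, which matches the statement that degree 1 gives each node 2 connections.

```python
import random

def random_graph(n, p, rng=random):
    """Undirected random graph: each possible edge is created with probability p."""
    adjacency = {node: set() for node in range(n)}
    for a in range(n):
        for b in range(a + 1, n):
            if rng.random() < p:
                adjacency[a].add(b)
                adjacency[b].add(a)
    return adjacency

def small_world_graph(n, k, p, rng=random):
    """Ring where each node links to its k nearest neighbours on each side;
    each such edge is redirected to a random node with probability p."""
    adjacency = {node: set() for node in range(n)}
    for a in range(n):
        for offset in range(1, k + 1):
            b = (a + offset) % n
            if rng.random() < p:
                # Redirect to a node that is neither a itself nor already linked to a.
                options = [c for c in range(n) if c != a and c not in adjacency[a]]
                if options:
                    b = rng.choice(options)
            if b != a:
                adjacency[a].add(b)
                adjacency[b].add(a)
    return adjacency

ring = small_world_graph(10, 1, 0.0)
print(sorted(ring[0]))  # prints [1, 9]
```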

slide-19
SLIDE 19

Discovering the Connected Components

• Goal: to compare which algorithm discovers a greater number of different connected components.
• Each experiment is run on graphs composed of 10 nodes for 10 generations, and is repeated 10 times.
• Small World networks have a connectivity degree of 1 (i.e. each node has 2 output connections).

slide-20
SLIDE 20

Discovering the Connected Components

Graph        Pop. Size  p    # Comp (ACO)  # Comp (GA)
Small World  5          0.1  21            87
Small World  5          0.9  25            124
Small World  10         0.1  14            114
Small World  10         0.9  18            163
Random       5          0.1  10            25
Random       5          0.9  39            549
Random       10         0.1  10            18
Random       10         0.9  74            1093

Conclusions

• GA finds more distinct connected components, but only because it does not account for the order in which the nodes appear: the same set of nodes in a different order is counted as a different component.
• That is, (Na, Nb, Nc) is counted as different from (Nb, Na, Nc).
• In the following experiments only ACO is used, because this algorithm does not show this behaviour.
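The order problem can be illustrated by counting node sets instead of node sequences. This is only a sketch of one possible normalisation, not something the experiments above applied; `count_distinct_components` is an illustrative name.

```python
def count_distinct_components(paths):
    """Count components by their node sets, ignoring the order of appearance."""
    return len({frozenset(path) for path in paths})

found = [("Na", "Nb", "Nc"), ("Nb", "Na", "Nc"), ("Nd", "Ne")]
print(len(set(found)))                   # prints 3 (order-sensitive count)
print(count_distinct_components(found))  # prints 2 (order-insensitive count)
```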

slide-21
SLIDE 21

Graphical result

[Figure panels: Initial Network; Connected Components]

slide-22
SLIDE 22

Influence of the number of edges

Experimental Set-up

Graph           Small World
# Nodes         100
Redirec. Prob.  0.15
# Ants          50
# Iterations    100

[Figure: Results]

Conclusion

• The higher the number of edges, the more steps are needed to find the connected components.

slide-23
SLIDE 23

Influence of the number of steps

Experimental Set-up

Graph            Small World
# Nodes          1000
Connect. Degree  50
Redirec. Prob.   0.15
# Ants           100

[Figure: Results]

Conclusion

• As ants do not transmit information about their path, the number of steps must be equal to or greater than the number of nodes.

slide-24
SLIDE 24

Problems observed

slide-25
SLIDE 25

Problems observed

• The Genetic Algorithm needs some grouping algorithm to discover partial solutions that are included in bigger ones.
• The Genetic Algorithm does not have any mechanism to ensure that good phenotype blocks will be transmitted.
• Ant Colony Optimization discovers connected components without branches, because ants do not communicate directly with each other.

[Figure: diagram with blocks labelled P1 and P2]

slide-26
SLIDE 26

Why have we done this work?

Thanks for your attention… But… is that all?

slide-27
SLIDE 27

Why have we done this work?

• We have applied a classic Genetic Algorithm and a classic Ant Colony Optimization.
    • But very little that is new has been contributed!
• The application domain is sub-graph detection.
    • Yes, really, but there are (maybe) millions of algorithms, models, techniques and tools to study this problem!

slide-28
SLIDE 28

Why have we done this work?

• This was the initial step in a larger research effort whose main goal is to apply Collective Intelligence algorithms to Constraint Satisfaction Problems.

slide-29
SLIDE 29

CSP and ACO (Khan et al 2009)

• Khan et al. 2009 solve the n-queens problem with ACO.
• For the n-queens problem, they modelled a graph composed of n layers, where each layer has n² nodes.
• This means that the layers represent the queens, and each layer represents the whole board.

slide-30
SLIDE 30

CSP and ACO (Khan et al 2009)

 Ants travel from a node in layer X to a node in layer (X+1) indicating in which square each queen is located.

slide-31
SLIDE 31

CSP and ACO (Solnon 2002)

• In this case, the graph is fully connected and each node represents a tuple <variable, value>.
• With this approach, we have n queens, each with 2 variables that define the queen's coordinates, and each variable can take n different values.

slide-32
SLIDE 32

CSP and ACO (Solnon 2002)

[Figure panels: 3-Queens; 5-Queens; 4-Queens; 8-Queens]

slide-33
SLIDE 33

Our approach

• Our approach is based on a graph where each node represents one element of the problem. In the case of the N-queens problem, the graph will have only N nodes.
• Each node holds all the variables of its element (e.g. in the N-queens problem, each node has only 2 variables, which represent the position of the queen).
• The number of edges depends on the problem, because two nodes are connected if there is at least one constraint that involves any variable of the two elements.
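The graph model described above can be sketched as follows. The data layout and the names `build_constraint_graph` and `n_queens_graph` are hypothetical. Counting each connection once per direction is an assumption, chosen because it reproduces the 56 edges reported for 8 queens in the comparison table on slide 36.

```python
from itertools import permutations

def build_constraint_graph(elements, variables, constrained):
    """One node per problem element, each holding its own variables.

    A directed edge (a, b) is added whenever `constrained(a, b)` reports
    at least one constraint involving variables of both elements.
    """
    nodes = {e: {v: None for v in variables} for e in elements}
    edges = [(a, b) for a, b in permutations(elements, 2) if constrained(a, b)]
    return nodes, edges

def n_queens_graph(n):
    queens = [f"Q{i}" for i in range(1, n + 1)]
    # Every pair of queens is constrained (no shared row, column or diagonal),
    # so every pair of nodes is connected.
    return build_constraint_graph(queens, ("row", "column"), lambda a, b: True)

nodes, edges = n_queens_graph(8)
print(len(nodes), len(edges))  # prints 8 56
```

For problems where only some elements constrain each other, `constrained` would return False for the unrelated pairs and the graph would be sparser.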

slide-34
SLIDE 34

Our approach

• With our approach, ants do not only navigate through the network indicating in which square a queen is located, as in Khan and Solnon; they also assign values to the variables contained in the nodes, taking into account the constraints stored in the edges.
• With this approach the network is drastically reduced and the system is scalable.

slide-35
SLIDE 35

Our approach (8-queens)

[Figure panels: Khan et al. 2009; Solnon 2002; Our approach]

slide-36
SLIDE 36

Our approach

         Khan et al. 2009            Solnon 2002              Our approach
Queens   # Nodes      # Edges        # Nodes    # Edges       # Nodes  # Edges
8        512          28672          128        16256         8        56
10       1000         90000          200        39800         10       90
50       125000       306250000      5000       24995000      50       2450
100      1000000      9900000000     20000      399980000     100      9900
1000     1000000000   9.99E+14       2000000    4E+12         1000     999000
5000     1.25E+11     3.1244E+18     50000000   2.5E+15       5000     24995000
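The table values are consistent with simple closed forms. These formulas are not stated in the slides; they are inferred from the graph descriptions and checked against the table rows, so treat them as an assumption: n layers of n² nodes with full connections between consecutive layers for Khan et al., 2n² fully connected nodes for Solnon, and n fully connected nodes for our approach, with each connection counted once per direction.

```python
def khan_size(n):
    """n layers of n*n nodes; every node links to every node of the next layer."""
    nodes = n * n**2
    edges = (n - 1) * n**2 * n**2
    return nodes, edges

def solnon_size(n):
    """n queens x 2 variables x n values, fully connected."""
    nodes = 2 * n * n
    edges = nodes * (nodes - 1)
    return nodes, edges

def our_size(n):
    """One node per queen, fully connected."""
    return n, n * (n - 1)

for n in (8, 10, 50, 100):
    print(n, khan_size(n), solnon_size(n), our_size(n))
```

For 8 queens this gives (512, 28672), (128, 16256) and (8, 56), matching the first row of the table.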

slide-37
SLIDE 37

And currently...

• We have a full version based on ACO to model basic CSP problems, with very “friendly and smart” ants.
• We are testing this approach with other computational intelligence algorithms, such as bacterial foraging or bee colony algorithms.
• We have some promising results for classical CSP problems (e.g. N-Queens).
• Those results appear to be similar (in performance) to classical CSP approaches.
• And of course! It is fully and absolutely distributed.

slide-38
SLIDE 38


A question, but this time to the audience… Why does our Swarm model look so good?
