SLIDE 1 Applied Intelligence & Data Analysis http://aida.ii.uam.es Universidad Autónoma de Madrid
Maximal Component detection in graphs using swarm-based and genetic algorithm
Antonio Gonzalez-Pardo, David Camacho
David Camacho david.camacho@uam.es
SLIDE 2 Outline
Introduction Goals Bio-inspired approaches applied to subgraphs
The ACO approach The GA approach
Experiments Problems observed Why have we done this work?
SLIDE 3
Introduction
SLIDE 4
Introduction
The past decades have shown a growing interest in Collective Intelligence and Evolutionary Algorithms. Collective Intelligence (CI) uses concepts extracted from the social behavior observed in self-organizing communities such as ants, bees or bacteria, amongst others. Evolutionary Algorithms (EA) are based on evolution: new individuals are generated from the characteristics of their parents.
SLIDE 5 Introduction
Both families of algorithms share some similarities:
They are based on heuristic search. Their population explores the solution space of the modelled problem. They are usually applied to NP-complete or NP-hard problems.
Some problems where EA and CI are applied:
Routing problems, Constraint Satisfaction Problems, scheduling problems.
SLIDE 6
Goals
SLIDE 7
Goals
1. In this work we have made an experimental comparison between a classical genetic approach and a swarm-based strategy, both applied to the detection of the maximal connected component of a graph.
2. The maximal connected component is composed of the maximum number of nodes of a graph such that from any node there exists a path to any other node of the same set.
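As a point of reference for the definition above, the largest connected component can also be computed exactly with a breadth-first search. The sketch below is only an illustrative baseline, not the method of this talk; the function name and the small example graph are assumptions made for the example.

```python
from collections import deque


def largest_connected_component(adjacency):
    """Return the largest set of mutually reachable nodes (undirected graph)."""
    seen = set()
    best = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Hypothetical graph with two components: {a, b, c} and {d, e}.
graph = {
    "a": ["b"], "b": ["a", "c"], "c": ["b"],
    "d": ["e"], "e": ["d"],
}
largest = largest_connected_component(graph)
```

The bio-inspired approaches compared in the talk search for the same kind of node set heuristically, with a population of agents rather than an exhaustive traversal.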
SLIDE 8
Bio-inspired approaches applied to subgraphs detection
The ACO approach
SLIDE 9
The ACO approach
In this approach, ants (agents) travel through the network trying to visit all the nodes. Each ant starts at a random node. Initially, all neighbours of the current node have the same probability of being selected, but this probability changes during the execution of the system due to the pheromone values deposited by the ants.
SLIDE 10
The ACO approach
Once an ant decides which node to visit next, it deposits pheromone in the graph. Pheromones are sensed by ants while travelling through the network and attract ants to follow trails with high pheromone concentrations (values). There is also an evaporation process that reduces the pheromone concentrations in the graph. This process is very useful to forget bad decisions taken previously.
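The deposit and evaporation steps can be sketched as follows. This is a minimal illustration under assumptions: the deposit amount, the evaporation rate, the multiplicative evaporation rule and the undirected edge keys are all choices made for the example, since the slides give no values.

```python
def deposit(pheromone, i, j, amount=1.0):
    """Add pheromone to the undirected edge between i and j (amount is assumed)."""
    edge = frozenset((i, j))
    pheromone[edge] = pheromone.get(edge, 0.0) + amount


def evaporate(pheromone, rate=0.1):
    """Reduce every concentration by a fixed fraction (rate is assumed)."""
    for edge in pheromone:
        pheromone[edge] *= 1.0 - rate


pheromone = {}
deposit(pheromone, "a", "b")
deposit(pheromone, "a", "b")   # a popular edge accumulates pheromone
deposit(pheromone, "b", "c")
evaporate(pheromone, rate=0.5)  # old decisions fade over time
```

Because evaporation shrinks every value each iteration, edges that are no longer reinforced gradually lose their attraction.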
SLIDE 11
The ACO approach
Given an ant located in node i, the probability of travelling from node i to node j depends on three terms:
the pheromone concentration between nodes i and j;
a term that is 0 if node j has already been visited by the ant;
the neighborhood of node i.
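One common ACO rule that uses exactly these three terms is p_ij = (tau_ij * eta_ij) / sum over k in N_i of (tau_ik * eta_ik), where tau is the pheromone concentration, eta is the visited indicator and N_i is the neighborhood of node i. This form, the symbol names and the initial pheromone value are assumptions for illustration, not necessarily the formula used in the talk.

```python
def transition_probabilities(current, adjacency, pheromone, visited, initial=1.0):
    """Probability of moving from `current` to each neighbour (assumed rule)."""
    weights = {}
    for j in adjacency[current]:
        eta = 0.0 if j in visited else 1.0  # 0 for already visited nodes
        tau = pheromone.get(frozenset((current, j)), initial)
        weights[j] = tau * eta
    total = sum(weights.values())
    if total == 0.0:
        return {}  # every neighbour was already visited
    return {j: w / total for j, w in weights.items()}


# Hypothetical example: edge a-b carries three times the initial pheromone.
adjacency = {"a": ["b", "c", "d"]}
pheromone = {frozenset(("a", "b")): 3.0}
probs = transition_probabilities("a", adjacency, pheromone, visited={"d"})
```

With no pheromone recorded, every unvisited neighbour gets the same weight, which matches the uniform initial choice described earlier.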
SLIDE 12
The ACO approach
SLIDE 13
Bio-inspired approaches applied to subgraphs detection
The GA approach
SLIDE 14
The GA approach
In this approach, individuals (agents) represent possible solutions to the problem. Each individual contains a phenotype made of a set of genes holding different node names. Both the length of the phenotype and the gene values are randomly selected. The phenotype represents a possible path that will be evaluated against the graph. This evaluation provides a fitness value that is used in the generation of the next population.
SLIDE 15
The GA approach
The fitness function is the same as the one used in ACO to determine the goodness of a path:
The validation process is performed with a greedy algorithm that starts by visiting the first node in the phenotype. From its neighbours, the algorithm discards those nodes that have already been visited and selects one, randomly, from the remaining nodes, taking into account that nodes belonging to the phenotype have a higher probability of being selected.
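The greedy validation walk described above can be sketched as follows. The bias factor given to phenotype nodes and the stopping rule (stop when no unvisited neighbour remains) are assumptions, as the slides do not specify them.

```python
import random


def validate(phenotype, adjacency, bias=3.0, rng=random):
    """Greedy walk starting at the first gene; returns the path actually found."""
    if not phenotype or phenotype[0] not in adjacency:
        return []
    preferred = set(phenotype)
    path = [phenotype[0]]
    visited = {phenotype[0]}
    while True:
        # Discard neighbours that were already visited.
        candidates = [n for n in adjacency[path[-1]] if n not in visited]
        if not candidates:
            return path
        # Nodes named in the phenotype are more likely to be picked (assumed bias).
        weights = [bias if n in preferred else 1.0 for n in candidates]
        chosen = rng.choices(candidates, weights=weights, k=1)[0]
        path.append(chosen)
        visited.add(chosen)


# Hypothetical chain graph a-b-c-d: the walk has a single possible route.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
path = validate(["a", "c"], chain)
```

The returned path is what would then be scored by the fitness function.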
SLIDE 16 The GA approach
The generation process is composed of two operators:
Crossover. This operator is used to generate new individuals based on the parents' phenotypes. For each parent a random crossover point is selected and the corresponding parts are interchanged. If a new individual would inherit the same gene from both parents, only one copy of the gene is inherited by the child.
Mutation. This second operator is used to escape from local optima. In this case the value of a gene is changed randomly.
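A minimal sketch of the two operators, assuming phenotypes are lists of node names. The helper names, the way the cut points are drawn and the choice to keep the first copy of a duplicated gene are assumptions for illustration.

```python
import random


def dedupe(genes):
    """Keep only the first occurrence of each gene, preserving order."""
    return list(dict.fromkeys(genes))


def crossover(parent_a, parent_b, rng=random):
    """Cut each parent at its own random point and swap the tails."""
    cut_a = rng.randint(0, len(parent_a))
    cut_b = rng.randint(0, len(parent_b))
    child_1 = parent_a[:cut_a] + parent_b[cut_b:]
    child_2 = parent_b[:cut_b] + parent_a[cut_a:]
    # A gene inherited from both parents is kept only once.
    return dedupe(child_1), dedupe(child_2)


def mutate(genes, node_names, rng=random):
    """Replace one randomly chosen gene with a random node name."""
    mutated = list(genes)
    if mutated:
        mutated[rng.randrange(len(mutated))] = rng.choice(node_names)
    return mutated


rng = random.Random(42)  # seeded only to make the example repeatable
parent_a = ["n1", "n2", "n3", "n4"]
parent_b = ["n3", "n5", "n1"]
child_1, child_2 = crossover(parent_a, parent_b, rng)
mutant = mutate(child_1, ["n1", "n2", "n3", "n4", "n5"], rng)
```

Because each parent has its own cut point, children can be longer or shorter than their parents, which fits the variable-length phenotypes described earlier.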
SLIDE 17
Experiments
SLIDE 18 Network used
ACO and GA have been executed on random graphs and small-world graphs.
Random graph: the creation of each edge depends on a probability (p). Small-world graph: there is a connectivity degree (k) and a redirection probability (p).
The random graph was selected because it is easy to implement and useful for testing our software. The small-world graph is studied because it is the most commonly used model in communication networks due to its characteristics.
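Both graph types can be sketched with the standard library. The small-world generator below follows a Watts-Strogatz-style construction (a ring where each node links to its k nearest neighbours on each side, then each edge is redirected with probability p); naming that model is an assumption, as is the requirement n > 2k.

```python
import random


def random_graph(n, p, rng=random):
    """Each possible undirected edge is created with probability p."""
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def small_world_graph(n, k, p, rng=random):
    """Ring lattice with k neighbours per side, edges redirected with probability p.

    Assumes n > 2 * k so that the lattice has no self-loops.
    """
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for offset in range(1, k + 1):
            j = (i + offset) % n
            adjacency[i].add(j)
            adjacency[j].add(i)
    for i in range(n):
        for offset in range(1, k + 1):
            j = (i + offset) % n
            if j in adjacency[i] and rng.random() < p:
                targets = [t for t in range(n) if t != i and t not in adjacency[i]]
                if targets:
                    new_j = rng.choice(targets)
                    adjacency[i].discard(j)
                    adjacency[j].discard(i)
                    adjacency[i].add(new_j)
                    adjacency[new_j].add(i)
    return adjacency


rng = random.Random(7)  # seeded only to make the example repeatable
sparse = random_graph(10, 0.1, rng)
ring = small_world_graph(10, 1, 0.1, rng)
```

Redirection moves an edge rather than adding one, so the small-world graph keeps n * k edges whatever the value of p.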
SLIDE 19
Discovering the Connected Components
Goal: to compare which algorithm discovers a greater number of different connected components. Each experiment is executed on graphs composed of 10 nodes for 10 generations, and it has been repeated 10 times. Small-world networks have a connectivity degree of 1 (i.e. each node has 2 output connections).
SLIDE 20 Discovering the Connected Components
Graph        Pop. Size  p    # Comp (ACO)  # Comp (GA)
Small World  5          0.1  21            87
Small World  5          0.9  25            124
Small World  10         0.1  14            114
Small World  10         0.9  18            163
Random       5          0.1  10            25
Random       5          0.9  39            549
Random       10         0.1  10            18
Random       10         0.9  74            1093
Conclusions
GA finds more different connected components, but this is because GA does not account for node order: the same set of nodes listed in a different order is counted as a different component.
That is, (Na, Nb, Nc) is treated as different from (Nb, Na, Nc). In the following experiments only ACO is used, because this algorithm does not show this behaviour.
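The ordering effect can be illustrated by counting the same results twice, once as ordered tuples and once as unordered node sets. This is only a sketch of the counting issue with made-up paths; it is not the post-processing used in the experiments.

```python
def count_distinct_components(paths):
    """Count components as unordered node sets, ignoring visiting order."""
    return len({frozenset(path) for path in paths})


# Hypothetical results: the first two paths cover the same three nodes.
found = [("Na", "Nb", "Nc"), ("Nb", "Na", "Nc"), ("Nd", "Ne")]
ordered_count = len(set(found))
unordered_count = count_distinct_components(found)
```

Counting ordered tuples reports three components here, while counting node sets reports two.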
SLIDE 21
Graphical result
Initial Network Connected Components
SLIDE 22 Influence of the number of edges
Experimental Set-up
Graph: Small World
# Nodes: 100
# Ants: 50
# Iterations: 100
Results
Conclusion
The higher the number of edges, the more steps are needed to find the connected components.
SLIDE 23 Influence of the number of steps
Experimental Set-up
Graph: Small World
# Nodes: 1000
50
0.15
# Ants: 100
Results
Conclusion
As ants do not transmit information about their path, the number of steps must be equal to or greater than the number of nodes.
SLIDE 24
Problems observed
SLIDE 25 Problems observed
The Genetic Algorithm needs some grouping algorithm to discover partial solutions included in bigger ones.
The Genetic Algorithm does not have any mechanism to ensure that good phenotype blocks will be transmitted.
Ant Colony Optimization discovers connected components without branches, because ants do not communicate directly with each other.
SLIDE 26
Why have we done this work?
Thanks for your attention… But… is that all?
SLIDE 27 Why have we done this work?
We have applied a classic Genetic Algorithm and a classic Ant Colony Optimization.
But very little new has been contributed!
The application domain is sub-graph detection.
Yes, really, but there are (maybe) millions of algorithms, models, techniques and tools to study this problem!
SLIDE 28
Why have we done this work?
This was the initial step in a larger research effort whose main goal is to apply Collective Intelligence algorithms to Constraint Satisfaction Problems.
SLIDE 29
CSP and ACO (Khan et al 2009)
Khan et al. (2009) solve the n-queens problem with ACO. For the n-queens problem, they modelled a graph composed of n layers, where each layer has n² nodes. That is, the layers represent the queens, and each layer represents the whole board.
SLIDE 30 CSP and ACO (Khan et al 2009)
Ants travel from a node in layer X to a node in layer (X+1) indicating in which square each queen is located.
SLIDE 31
CSP and ACO (Solnon 2002)
In this case, the graph is fully connected and each node represents a tuple <variable, value>. With this approach, we have n queens, each with 2 variables that define the queen's coordinates, and each variable can take n different values.
SLIDE 32
CSP and ACO (Solnon 2002)
3-Queens 5-Queens 4-Queens 8-Queens
SLIDE 33 Our approach
Our approach is based on a graph where each node represents one element of the problem. In the case of the N-queens problem, the graph will have only N nodes. Each node holds all the variables that involve the element (i.e. in the case of the N-queens problem, each node will have only 2 variables, which represent the position).
The number of edges depends on the problem, because two nodes are connected if there is at least one restriction that involves any variable of the two elements.
SLIDE 34
Our approach
With our approach, ants do not only navigate through the network to indicate which square they are located in, as in the approaches of Khan and Solnon; our ants also assign values to the variables contained in the nodes, taking into account the restrictions stored in the edges. With this approach the network is drastically reduced and the system is scalable.
SLIDE 35 Our approach (8-queens)
Khan et al. 2009 Solnon 2002 Our approach
SLIDE 36 Our approach
        Khan et al. 2009        Solnon 2002          Our approach
Queens  # Nodes     # Edges     # Nodes   # Edges    # Nodes  # Edges
8       512         28672       128       16256      8        56
10      1000        90000       200       39800      10       90
50      125000      306250000   5000      24995000   50       2450
100     1000000     9900000000  20000     399980000  100      9900
1000    1000000000  9.99E+14    2000000   4E+12      1000     999000
5000    1.25E+11    3.1244E+18  50000000  2.5E+15    5000     24995000
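The sizes in this comparison follow simple closed forms. The formulas below are inferred from the tabulated values rather than stated on the slides, and the function names are hypothetical; the edge counts are consistent with counting each ordered pair of connected nodes.

```python
def khan_size(n):
    """n layers of n*n nodes; consecutive layers fully connected (inferred)."""
    return n ** 3, (n - 1) * n ** 4


def solnon_size(n):
    """One node per <variable, value> pair, fully connected (inferred)."""
    nodes = 2 * n * n
    return nodes, nodes * (nodes - 1)


def our_size(n):
    """One node per queen, every pair of queens constrained (inferred)."""
    return n, n * (n - 1)


for queens in (8, 10, 50, 100):
    print(queens, khan_size(queens), solnon_size(queens), our_size(queens))
```

Under these forms the node count grows as N³, 2N² and N respectively, which is the reduction the comparison is meant to show.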
SLIDE 37
And currently...
We have a full version based on ACO to model basic CSP problems, with very “friendly and smart” ants.
We are testing this approach with other computational intelligence algorithms, such as bacterial foraging or bee colony algorithms.
We have some promising results for classical CSP problems (e.g. N-Queens).
Those results appear to be similar (in performance) to classical CSP approaches.
And of course! It is fully and absolutely distributed.
SLIDE 38
A question, but this time to the audience… Why does our swarm model look so good?