LECTURE 19: SWARM INTELLIGENCE 5 / ANT COLONY OPTIMIZATION 1
15-382 COLLECTIVE INTELLIGENCE - S18
INSTRUCTOR: GIANNI A. DI CARO
SHORTEST PATHS WITH PHEROMONE LAYING-FOLLOWING
[Figure: snapshots at t = 0, 1, 2, 3 of the paths between Nest and Food, shaded by the Pheromone Intensity Scale]
Amount of pheromone on a branch ∝ frequency of forward/backward crossings ∝ length (quality) of the paths: shorter paths are crossed more often, so they accumulate more pheromone
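The reinforcement effect described above can be illustrated with a deterministic toy model of two branches of different length. This is only a sketch under assumed parameters: the branch lengths, the unit flow of ants per tick, the quadratic choice rule (`exponent=2.0`), and the name `simulate_two_branches` are all hypothetical choices, not taken from the lecture.

```python
def simulate_two_branches(len_short=1, len_long=2, steps=200, exponent=2.0):
    """Mean-field toy model: one unit of ant flow leaves the nest at every tick."""
    tau = {"short": 1.0, "long": 1.0}  # assumed equal initial pheromone
    horizon = steps + max(len_short, len_long)
    # pending[t] holds the pheromone that will land on each branch at tick t
    pending = [{"short": 0.0, "long": 0.0} for _ in range(horizon)]
    p_short = 0.5
    for t in range(steps):
        # Ants that finish a crossing at this tick have marked their branch.
        for branch in tau:
            tau[branch] += pending[t][branch]
        # Assumed nonlinear choice rule: probability grows with pheromone^exponent.
        w_short = tau["short"] ** exponent
        w_long = tau["long"] ** exponent
        p_short = w_short / (w_short + w_long)
        # The flow splits; each share deposits once its crossing is complete,
        # so the short branch is reinforced earlier than the long one.
        pending[t + len_short]["short"] += p_short
        pending[t + len_long]["long"] += 1.0 - p_short
    return tau, p_short


tau, p_short = simulate_two_branches()
print(tau, p_short)
```

In this model the only asymmetry is the crossing time: deposits on the short branch arrive one tick earlier, and the positive feedback amplifies that head start.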
LET’S ABSTRACT A MORE COMPLEX SCENARIO
[Figure: decision graph from Source (Nest) to Target (Food), with edges shaded by the Pheromone Intensity Scale]
- Multiple decision nodes: n decision states/nodes, 𝒚1, 𝒚2, …, 𝒚n ∈ 𝒀
- Set 𝑩 of decisions / actions, 𝒃1, 𝒃2, …, 𝒃m, such that at each state 𝒚 a subset 𝑩(𝒚) ⊆ 𝑩 of actions is available or feasible
- A path (ant solution) is constructed through a sequence of decisions, one for each visited state
- Multiple ants iterate path construction (i.e., foraging) in parallel
- A traveling cost is associated with each state transition: the colony's goal is to have the ants move along the minimum-cost path between nest and food
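The abstraction above (states, feasible actions per state, and a cost per transition) can be written down as a small data structure. The instance below is invented purely for illustration: the states, actions, costs, and the helper name `path_cost` are hypothetical and do not reproduce the graph on the slide.

```python
# Hypothetical toy instance: feasible[y] maps each action available in state y
# to a pair (next_state, transition_cost).
feasible = {
    "y1": {"b11": ("y2", 2.0), "b12": ("y3", 1.0)},
    "y2": {"b21": ("y4", 1.0)},
    "y3": {"b31": ("y2", 0.5), "b32": ("y4", 3.0)},
    "y4": {},  # target state: no further decisions
}


def path_cost(start, actions):
    """Apply a sequence of actions from `start`; return (final_state, total_cost)."""
    state, total = start, 0.0
    for action in actions:
        if action not in feasible[state]:
            raise ValueError(f"action {action} is not feasible in state {state}")
        state, cost = feasible[state][action]
        total += cost
    return state, total


print(path_cost("y1", ["b11", "b21"]))         # direct route through y2
print(path_cost("y1", ["b12", "b31", "b21"]))  # longer in hops, cheaper in cost
```

An ant solution is then just a feasible action sequence, and its quality is the accumulated transition cost.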
[Figure labels: decision nodes 𝒚1, …, 𝒚9; actions 𝒃11, 𝒃12, 𝒃13, 𝒃21, 𝒃22, 𝒃31, 𝒃32]
LET’S ABSTRACT A MORE COMPLEX SCENARIO
- Distributed Optimization Problem
- At each state 𝒚k, only local information / constraints (plus some ant memory) are available for making a (possibly optimized) decision 𝒃 ∈ 𝑩(𝒚k)
- Pheromone information (dynamic), parametrized as a vector τk (stigmergic variables)
- Heuristic information (static, scenario-related), parametrized as a vector ηk
- Ant behavior: stochastic decision policy πε(𝒚k; τk, ηk), πε : 𝒀 ⟼ 𝑩
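One common way to instantiate a stochastic decision policy over pheromone and heuristic values is a random-proportional rule, where each feasible action is weighted by pheromone^alpha × heuristic^beta. The slide does not fix a functional form, so the rule itself, the exponents `alpha` and `beta`, and the function names below are assumptions made for this sketch.

```python
import random


def action_probabilities(tau, eta, alpha=1.0, beta=2.0):
    """tau, eta: dicts mapping each feasible action at one state to a positive value."""
    weights = {a: (tau[a] ** alpha) * (eta[a] ** beta) for a in tau}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def choose_action(tau, eta, rng=random, **kwargs):
    """Sample one feasible action according to the probabilities above."""
    probs = action_probabilities(tau, eta, **kwargs)
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]


# Hypothetical local values at a single decision node.
tau_k = {"b1": 1.0, "b2": 1.0}
eta_k = {"b1": 1.0, "b2": 2.0}
print(action_probabilities(tau_k, eta_k, alpha=1.0, beta=1.0))
print(choose_action(tau_k, eta_k, rng=random.Random(0)))
```

Because the choice is sampled rather than greedy, actions with low pheromone keep a nonzero chance of being explored.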
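[Figure: graph from Source to Destination with nodes 1 to 9; edges are labeled with their pheromone and heuristic values (τ12; η12, τ13; η13, τ14; η14, τ58; η58, τ59; η59) and shaded by the Pheromone Intensity Scale. At a node, the stochastic decision rule π(τ, η) combines pheromone (τ) with terrain morphology (η) to choose among the outgoing edges]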
How do ant colonies solve the distributed minimum-cost path (MCP) problem? By exploiting pheromone to learn the best parameters of the decision policy
ANT COLONIES: INGREDIENTS FOR SHORTEST PATHS
- A number of concurrent autonomous (simple?) agents (ants)
- Forward-backward constructive path sampling based on the stochastic policy πε
- Local laying and sensing of pheromone → Pheromone is dynamically updated
- Step-by-step stochastic decisions biased by local pheromone intensity and by other
local heuristic aspects (e.g., terrain)
- Multiple paths are concurrently tried out and implicitly evaluated
- Positive feedback effect (local reinforcement of good decisions)
- Iteration over time of the path sampling actions
- Persistence (exploitation) and evaporation (exploration) of pheromone
[Figure: Source-to-Destination graph (nodes 1 to 9, edges labeled τij; ηij) with ants moving Forward and Backward; edges shaded by the Pheromone Intensity Scale]
FROM ANTS TO ACO:
- Let's mimic ant colonies, with some pragmatic modifications ...
- Once a solution / path has been completed:
  - The sampled solution is evaluated (e.g., sum of the individual costs)
  - "Credit" is assigned to each individual decision belonging to the solution
  - Pheromone updating: the values of the pheromone variables τk associated with each decision in the solution are modified according to the "credit"
- Pheromone values can also decay/change for other reasons (e.g., evaporation)
- Pheromone values locally encode how good it is to take decision i vs. j, as collectively estimated/learned by the agent population through repeated solution sampling
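The evaluate, credit, and update steps listed above can be sketched as a single function. The specific choices here are assumptions for illustration, in the style of Ant System: the credit is taken as `q / cost`, evaporation multiplies every value by `(1 - rho)`, and the names `update_pheromone`, `rho`, and `q` are hypothetical.

```python
def update_pheromone(tau, solutions, rho=0.1, q=1.0):
    """tau: dict decision -> pheromone. solutions: list of (decisions, cost) pairs."""
    # Evaporation: every pheromone value decays by a factor (1 - rho).
    new_tau = {decision: (1.0 - rho) * value for decision, value in tau.items()}
    # Credit assignment: each decision in a sampled solution receives q / cost,
    # so decisions belonging to cheaper solutions are reinforced more.
    for decisions, cost in solutions:
        credit = q / cost
        for decision in decisions:
            new_tau[decision] = new_tau.get(decision, 0.0) + credit
    return new_tau


# Hypothetical example: one sampled path a -> b -> c with total cost 2.
tau = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0}
sampled = [([("a", "b"), ("b", "c")], 2.0)]
print(update_pheromone(tau, sampled, rho=0.5))
```

Decisions that were not part of any sampled solution only evaporate, which is how the colony gradually forgets poor choices.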
[Figure: Source-to-Destination graph (nodes 1 to 9, edges labeled τij; ηij), shaded by the Pheromone Intensity Scale]
Feedback loop τ → π → paths → τ: the pheromone distribution biases path construction, and the outcomes of path construction modify the pheromone distribution
ANT COLONY OPTIMIZATION METAHEURISTIC: (VERY) GENERAL ARCHITECTURE
- Solution construction
- Monte Carlo path sampling using N (= number of states) joint probability distributions, parametrized by the τ and η variable arrays
- Sequential learning by Generalized Policy Iteration (GPI)
PHEROMONE AND HEURISTIC ARRAYS
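[Figure: Source-to-Destination graph with nodes 1 to 9; each edge (i, j) carries an entry τij of the pheromone array and an entry ηij of the heuristic array (shown: τ12; η12, τ13; η13, τ14; η14, τ58; η58, τ59; η59); edges shaded by the Pheromone Intensity Scale]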
ACO FOR THE TRAVELING SALESMAN PROBLEM (TSP)
[Figure: weighted graph with 7 nodes (1 to 7); edge costs shown: 17, 12, 11, 8, 16, 19, 5, 9, 21, 3, 11, 10, 11, 10]
- Given G(V, E), find the Hamiltonian tour of minimal cost: NP-hard
- Every cyclic permutation of n integers is a feasible solution
- It's easier to consider fully connected graphs, |E| = |V| (|V| - 1): if two nodes are not connected, d is infinite
- π1 = (1, 3, 4, 2, 6, 5, 7, 1), π2 = (2, 3, 4, 5, 6, 7, 1, 2)
- c(π2) = d23 + d34 + d45 + d56 + d67 + d71 + d12 = 93
- Read also as a set of edges: {(2,3), (3,4), (4,5), (5,6), (6,7), (7,1), (1,2)}
- "Related" combinatorial optimization problems: VRPs, SOP, TO, QAP, …
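The two readings of a tour (as a node sequence and as a set of edges) and its cost can be sketched as follows. The helper names `tour_edges` and `tour_cost` and the 3-city distance table are hypothetical; the slide's own edge costs cannot be mapped back to edges from the extracted figure, so they are not used here.

```python
import math


def tour_edges(tour):
    """Edges of a closed tour given as a node sequence (v0, v1, ..., v0)."""
    return list(zip(tour[:-1], tour[1:]))


def tour_cost(tour, d):
    """Sum of d[i][j] over the tour's edges; a missing edge counts as infinite."""
    return sum(d.get(i, {}).get(j, math.inf) for i, j in tour_edges(tour))


# The second tour from the slide, read as a list of edges.
print(tour_edges((2, 3, 4, 5, 6, 7, 1, 2)))

# Hypothetical 3-city instance (not the graph in the figure).
d = {1: {2: 3, 3: 4}, 2: {1: 3, 3: 5}, 3: {1: 4, 2: 5}}
print(tour_cost((1, 2, 3, 1), d))
```

Treating a missing edge as infinitely costly mirrors the fully connected formulation: infeasible tours are never preferred over feasible ones.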
ACO FOR THE TRAVELING SALESMAN PROBLEM (TSP)
- Pheromone variables: τij ∈ ℝ+ expresses how beneficial it is (as estimated so far) to have edge (i,j) in the solution in order to optimize the final tour length → |E| variables
- Heuristic values ηij ∈ ℝ+: problem costs cij ∈ ℝ+ for traveling from i to j → |E| variables
- Solution construction strategies (no repair, no look-ahead):
  - Extension: when ant k is in city i, how good is it expected to be to include (feasible) city j next in the solution sequence xk(t)? → f(τij, ηij)
  - Insertion: how good is it expected to be to insert (feasible) edge (m,p) in the partial solution xk(t)? → f(τmp, ηmp)
(META-)ACO FOR CO PROBLEMS (CENTRALIZED SCHEDULE)
Initialize τij(0) to small random values and let t = 0;
repeat
    Place nk ants on randomly chosen origin nodes;
    foreach ant k = 1, . . . , nk do
        Construct a tour xk(t) [Update pheromone step-by-step];
        Evaluate tour xk(t);
    end
    foreach [selected] edge (i, j) of the graph do
        Pheromone evaporation;
    end
    foreach [selected] ant k = 1, . . . , nk do
        foreach [selected] edge (i, j) of xk(t) do
            Update τij using tour evaluation results;
        end
    end
    Daemon actions [Local search];
    t = t + 1;
until stopping condition is true;
return best solution generated;
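The centralized schedule above can be turned into a short runnable sketch for the TSP. Several details are assumptions filled in for illustration and are not specified on the slide: the random-proportional extension rule with exponents `alpha` and `beta`, the heuristic taken as 1 / distance, the deposit `q / cost`, the pheromone floor of 1e-12, a fixed iteration count as stopping condition, and the function name `aco_tsp`. Daemon actions (local search) and step-by-step updates are omitted.

```python
import math
import random


def aco_tsp(dist, n_ants=10, n_iters=50, alpha=1.0, beta=2.0,
            rho=0.5, q=1.0, seed=0):
    """Ant-System-style sketch. dist: symmetric n x n matrix, positive off-diagonal."""
    rng = random.Random(seed)
    n = len(dist)
    # Initialize pheromone to small random values.
    tau = [[rng.uniform(0.01, 0.1) for _ in range(n)] for _ in range(n)]
    best_tour, best_cost = None, math.inf
    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            start = rng.randrange(n)  # randomly chosen origin node
            tour, unvisited = [start], set(range(n)) - {start}
            while unvisited:  # extension: append one feasible city at a time
                i = tour[-1]
                cand = sorted(unvisited)
                weights = [tau[i][j] ** alpha * (1.0 / dist[i][j]) ** beta
                           for j in cand]
                j = rng.choices(cand, weights=weights, k=1)[0]
                tour.append(j)
                unvisited.remove(j)
            # Evaluate the closed tour (the last city connects back to the first).
            cost = sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))
            tours.append((tour, cost))
            if cost < best_cost:
                best_tour, best_cost = tour, cost
        # Pheromone evaporation on every edge (floored to stay positive).
        for i in range(n):
            for j in range(n):
                tau[i][j] = max(tau[i][j] * (1.0 - rho), 1e-12)
        # Update pheromone on the edges of each ant's tour.
        for tour, cost in tours:
            for k in range(n):
                i, j = tour[k], tour[(k + 1) % n]
                tau[i][j] += q / cost
                tau[j][i] += q / cost
    return best_tour, best_cost


# Hypothetical instance: four cities on the corners of a unit square.
pts = [(0, 0), (1, 0), (1, 1), (0, 1)]
dist = [[math.dist(a, b) for b in pts] for a in pts]
print(aco_tsp(dist, n_ants=5, n_iters=20))
```

The returned tour lists each city once; the closing edge back to the first city is implied and included in the cost.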