LECTURE 19: SWARM INTELLIGENCE 5 / ANT COLONY OPTIMIZATION 1




SLIDE 1

15-382 COLLECTIVE INTELLIGENCE - S18

LECTURE 19: SWARM INTELLIGENCE 5 / ANT COLONY OPTIMIZATION 1

INSTRUCTOR: GIANNI A. DI CARO

SLIDE 2

SHORTEST PATHS WITH PHEROMONE LAYING-FOLLOWING

[Figure: four snapshots (t = 0, 1, 2, 3) of ants moving between Nest and Food over two branches, shaded by the Pheromone Intensity Scale]

Amount of pheromone on a branch ∝ frequency of forward/backward crossings ∝ quality of the path (shorter paths are crossed more often)
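The proportionality above can be illustrated with a minimal sketch. It assumes a deterministic "mean-field" model that is not on the slide: two branches, a fraction of ants on each branch proportional to its pheromone, a deposit per step of (fraction / length), and an evaporation rate `rho`. The function name, the branch lengths, and all parameter values are hypothetical.

```python
def simulate_two_branches(len_short=1.0, len_long=2.0, rho=0.1, steps=50):
    """Mean-field pheromone dynamics on two branches between nest and food."""
    tau = {"short": 1.0, "long": 1.0}  # equal pheromone at t = 0
    length = {"short": len_short, "long": len_long}
    history = []  # fraction of ants on the short branch at each step
    for _ in range(steps):
        total = tau["short"] + tau["long"]
        # Assumption: ants split between branches in proportion to pheromone.
        p = {b: tau[b] / total for b in tau}
        for b in tau:
            # Evaporation, then deposit. A shorter branch is crossed more
            # often per unit time, modeled here as a deposit of p / length.
            tau[b] = (1.0 - rho) * tau[b] + p[b] / length[b]
        history.append(p["short"])
    return tau, history


tau, history = simulate_two_branches()
print("share on short branch: first step", history[0], "last step", history[-1])
```

In this toy model the share of ants on the short branch starts at one half and never decreases, which is the positive feedback the snapshots depict.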

SLIDE 3

LET’S ABSTRACT A MORE COMPLEX SCENARIO

[Figure: network of paths between Nest (Source) and Food (Target), shaded by the Pheromone Intensity Scale]

  • Multiple decision nodes: n decision states/nodes, 𝒙1, 𝒙2, …, 𝒙n ∈ 𝑿
  • Set 𝑨 of decisions / actions, 𝒂1, 𝒂2, …, 𝒂m, such that at each state 𝒙 a subset 𝒜(𝒙) of actions is available or feasible
  • A path (ant solution) is constructed through a sequence of decisions, one for each visited state
  • Multiple ants iterate path construction (i.e., foraging) in parallel
  • A traveling cost is associated with each state transition: the colony's goal is to have the ants move over the minimum-cost path between nest and food

[Figure labels: decision nodes 𝒙1, 𝒙2, …, 𝒙9; actions 𝒂11, 𝒂12, 𝒂13, 𝒂21, 𝒂22, 𝒂31, 𝒂32]
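One way to make the abstraction above concrete is a small data-structure sketch: each state maps to its feasible actions, and each action leads to a next state at some traveling cost. The graph, the costs, and the helper names (`GRAPH`, `feasible_actions`, `follow`) are hypothetical and are not taken from the slide's figure.

```python
# Hypothetical decision graph: state -> {action: (next_state, traveling_cost)}.
GRAPH = {
    "x1": {"a11": ("x2", 2.0), "a12": ("x3", 1.0), "a13": ("x4", 4.0)},
    "x2": {"a21": ("x5", 3.0), "a22": ("x4", 1.0)},
    "x3": {"a31": ("x4", 1.5), "a32": ("x5", 5.0)},
    "x4": {"a41": ("x5", 1.0)},
    "x5": {},  # target (food): no further decisions
}


def feasible_actions(state):
    """Return the subset of actions available at the given state."""
    return sorted(GRAPH[state])


def follow(start, actions):
    """Apply a sequence of actions from start; return (visited states, total cost)."""
    state, cost, visited = start, 0.0, [start]
    for action in actions:
        if action not in GRAPH[state]:
            raise ValueError(f"action {action} is not feasible in state {state}")
        state, step_cost = GRAPH[state][action]
        cost += step_cost
        visited.append(state)
    return visited, cost


print(feasible_actions("x1"))
print(follow("x1", ["a12", "a31", "a41"]))
```

A path is then just the sequence of actions an ant takes, and its cost is the sum of the transition costs along the way.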

SLIDE 4

LET’S ABSTRACT A MORE COMPLEX SCENARIO

  • Distributed Optimization Problem
  • At each state 𝒙k only local information / constraints (+ some ant memory) are available for taking a (possibly optimized) decision 𝒂 ∈ 𝒜(𝒙k)
  • Pheromone information (dynamic), parametrized as a vector 𝜏k (stigmergic variables)
  • Heuristic information (static, scenario-related), parametrized as a vector 𝜂k
  • Ant behavior: stochastic decision policy 𝜋ε(𝒙k; 𝜏k, 𝜂k), 𝜋ε : 𝑿 ⟼ 𝑨

[Figure: stochastic decision rule π(τ, η), combining pheromone τ with terrain morphology η; graph from Source to Destination over nodes 1 to 9, with edges labeled τ12; η12, τ13; η13, τ14; η14, τ58; η58, τ59; η59 and shaded by the Pheromone Intensity Scale]

How do ant colonies solve the distributed minimum-cost path (MCP) problem? By exploiting pheromone to learn the best parameters of the decision policy.
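The slide leaves the functional form of the decision rule π(τ, η) open. As a hedged illustration, the sketch below assumes the random-proportional rule commonly used in ACO, where the probability of an action is proportional to τ^α · η^β. The exponents `alpha` and `beta`, the function names, and the sample values are assumptions made for this example.

```python
import random


def decision_probabilities(tau, eta, alpha=1.0, beta=2.0):
    """Map each feasible action to a probability proportional to tau^alpha * eta^beta.

    tau, eta: dicts keyed by the feasible actions of the current state.
    """
    weights = {a: (tau[a] ** alpha) * (eta[a] ** beta) for a in tau}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def sample_action(tau, eta, rng, alpha=1.0, beta=2.0):
    """Draw one action according to the stochastic policy."""
    probs = decision_probabilities(tau, eta, alpha, beta)
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]


# Hypothetical local values at node 1 for its three outgoing edges.
tau_1 = {"1->2": 1.0, "1->3": 1.0, "1->4": 2.0}
eta_1 = {"1->2": 1.0, "1->3": 0.5, "1->4": 0.5}
print(decision_probabilities(tau_1, eta_1))
print(sample_action(tau_1, eta_1, random.Random(0)))
```

Only local quantities enter the rule, which matches the requirement that each decision uses information available at the current state.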

SLIDE 5

ANT COLONIES: INGREDIENTS FOR SHORTEST PATHS

  • A number of concurrent autonomous (simple?) agents (ants)
  • Forward-backward constructive path sampling based on the stochastic policy 𝜋ε
  • Local laying and sensing of pheromone → Pheromone is dynamically updated
  • Step-by-step stochastic decisions biased by local pheromone intensity and by other local heuristic aspects (e.g., terrain)
  • Multiple paths are concurrently tried out and implicitly evaluated
  • Positive feedback effect (local reinforcement of good decisions)
  • Iteration over time of the path sampling actions
  • Persistence (exploitation) and evaporation (exploration) of pheromone

[Figure: Source-to-Destination graph over nodes 1 to 9, with edges labeled τij; ηij and shaded by the Pheromone Intensity Scale; ants travel Forward and Backward]

SLIDE 6

FROM ANTS TO ACO:

  • Let’s mimic ant colonies, with some pragmatic modifications …
  • Once a solution / path has been completed:
      • The sampled solution is evaluated (e.g., sum of the individual costs)
      • “Credit” is assigned to each individual decision belonging to the solution
      • Pheromone updating: the values of the pheromone variables 𝜏k associated with each decision in the solution are modified according to the “credit”
  • Pheromone values can also decay/change for other reasons (e.g., evaporation)
  • Pheromone values locally encode how good it is to take decision i vs. j, as collectively estimated/learned by the agent population through repeated solution sampling

[Figure: Source-to-Destination graph over nodes 1 to 9, with edges labeled τij; ηij and shaded by the Pheromone Intensity Scale; feedback loop between pheromone τ, policy π, and sampled Paths]

Pheromone distribution biases path construction. Outcomes of path construction modify pheromone distribution.
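The evaluate / credit / update cycle can be sketched as follows. The slide does not fix the formulas, so this example assumes that credit is the inverse of the solution cost and that evaporation multiplies every pheromone value by (1 − ρ). The function name `update_pheromone`, the rate `rho`, and the sample data are hypothetical.

```python
def update_pheromone(tau, solutions, rho=0.1):
    """Return new pheromone values after evaporation and credit assignment.

    tau: dict mapping each edge (i, j) to its pheromone value.
    solutions: list of (edges, cost) pairs, one per completed ant path.
    """
    # Evaporation: every pheromone value decays, used or not.
    new_tau = {edge: (1.0 - rho) * value for edge, value in tau.items()}
    for edges, cost in solutions:
        credit = 1.0 / cost  # assumption: cheaper solutions earn more credit
        for edge in edges:
            # Each decision belonging to the solution receives the credit.
            new_tau[edge] += credit
    return new_tau


tau = {(1, 2): 1.0, (2, 5): 1.0, (1, 3): 1.0, (3, 5): 1.0}
solutions = [([(1, 2), (2, 5)], 2.0), ([(1, 3), (3, 5)], 4.0)]
print(update_pheromone(tau, solutions, rho=0.5))
```

With these numbers the edges of the cheaper path end up with more pheromone than the edges of the costlier one, so the next round of path construction is biased toward the cheaper path.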

SLIDE 7

ANT COLONY OPTIMIZATION METAHEURISTIC: (VERY) GENERAL ARCHITECTURE

  • Solution construction
  • Monte Carlo path sampling by N (# states) joint probability distributions parametrized by the 𝜏 and 𝜂 variable arrays

  • Sequential learning by Generalized Policy Iteration (GPI)

SLIDE 8

PHEROMONE AND HEURISTIC ARRAYS

[Figure: Source-to-Destination graph over nodes 1 to 9; each edge (i, j) carries a pheromone value τij and a heuristic value ηij, e.g. τ12; η12, τ13; η13, τ14; η14, τ58; η58, τ59; η59; Pheromone Intensity Scale]

SLIDE 9

ACO FOR THE TRAVELING SALESMAN PROBLEM (TSP)

[Figure: weighted graph on 7 nodes (1 to 7) with edge costs 17, 12, 11, 8, 16, 19, 5, 9, 21, 3, 11, 10, 11, 10]

  • Given G(V, E), find the Hamiltonian tour of minimal cost: NP-hard
  • Every cyclic permutation of n integers is a feasible solution
  • It’s easier to consider fully connected graphs, |E| = |V|(|V| − 1): if two nodes are not connected, d is infinite
  • π1 = (1, 3, 4, 2, 6, 5, 7, 1), π2 = (2, 3, 4, 5, 6, 7, 1, 2)
  • c(π2) = d23 + d34 + d45 + d56 + d67 + d71 + d12 = 93
  • Read also as a set of edges: {(2,3), (3,4), (4,5), (5,6), (6,7), (7,1), (1,2)}
  • “Related” combinatorial optimization problems: VRPs, SOP, TO, QAP, …
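A short sketch of how a tour reads as a set of edges and how its cost is summed. The per-edge costs of the slide's graph cannot be recovered from this transcript, so the matrix `d` below is hypothetical (d[i][j] = |i − j|) and its totals are unrelated to the value 93 above; the helper names are also assumptions.

```python
def tour_edges(tour):
    """Return the consecutive (i, j) edges of a closed tour such as (2, 3, ..., 2)."""
    return list(zip(tour[:-1], tour[1:]))


def tour_cost(tour, d):
    """Sum the edge costs d[i][j] along a closed tour."""
    if tour[0] != tour[-1]:
        raise ValueError("a tour must return to its starting city")
    return sum(d[i][j] for i, j in tour_edges(tour))


# Hypothetical costs on 7 cities, NOT the costs of the slide's figure.
d = {i: {j: abs(i - j) for j in range(1, 8)} for i in range(1, 8)}

pi1 = (1, 3, 4, 2, 6, 5, 7, 1)
pi2 = (2, 3, 4, 5, 6, 7, 1, 2)
print(tour_edges(pi2))
print(tour_cost(pi1, d), tour_cost(pi2, d))
```

For a missing edge, a fully connected representation would store an infinite cost (for example `float("inf")`), so any tour using that edge is never the minimum.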

SLIDE 10

ACO FOR THE TRAVELING SALESMAN PROBLEM (TSP)

[Figure: the 7-node weighted graph, nodes 1 to 7, with edge costs 17, 12, 11, 8, 16, 19, 5, 9, 21, 3, 11, 10, 11, 10]

  • Pheromone variables: 𝜏ij ∈ ℝ+ expresses how beneficial it is (as estimated so far) to have edge (i,j) in the solution in order to optimize the final tour length → |E| variables
  • Heuristic values 𝜂ij ∈ ℝ+: problem costs cij ∈ ℝ+ for traveling from i to j → |E| variables
  • Solution construction strategies (no repair, no look-ahead):
      • Extension: when ant k is in city i, how good is it expected to be to include (feasible) city j next in the solution sequence xk(t)? → f(𝜏ij, 𝜂ij)
      • Insertion: how good is it expected to be to insert (feasible) edge (m,p) in the partial solution xk(t)? → f(𝜏mp, 𝜂mp)

SLIDE 11

(META-)ACO FOR CO PROBLEMS (CENTRALIZED SCHEDULE)

Initialize τij(0) to small random values and let t = 0;
repeat
    Place nk ants on randomly chosen origin nodes;
    foreach ant k = 1, . . . , nk do
        Construct a tour xk(t) [Update pheromone step-by-step];
        Evaluate tour xk(t);
    end
    foreach [selected] edge (i, j) of the graph do
        Pheromone evaporation;
    end
    foreach [selected] ant k = 1, . . . , nk do
        foreach [selected] edge (i, j) of xk(t) do
            Update τij using tour evaluation results;
        end
    end
    Daemon actions [Local search];
    t = t + 1;
until stopping condition is true;
return best solution generated;
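The schedule above can be turned into a small runnable sketch for the TSP. Several choices below are assumptions and are not fixed by the slide: construction by extension with weights τ^α · (1/d)^β, evaporation on every edge, a deposit of 1/cost by every ant on a symmetric graph, a fixed iteration count as stopping condition, and no daemon actions. The function `aco_tsp` and its parameters are hypothetical names.

```python
import math
import random


def aco_tsp(dist, n_ants=10, n_iters=50, alpha=1.0, beta=2.0, rho=0.5, seed=0):
    """Minimal ACO sketch for a symmetric TSP given a distance matrix."""
    rng = random.Random(seed)
    n = len(dist)
    # Initialize pheromone to small random values.
    tau = [[rng.uniform(0.01, 0.1) for _ in range(n)] for _ in range(n)]
    best_tour, best_cost = None, math.inf
    for _ in range(n_iters):  # stopping condition: fixed number of iterations
        tours = []
        for _ in range(n_ants):
            start = rng.randrange(n)  # randomly chosen origin node
            tour, unvisited = [start], set(range(n)) - {start}
            while unvisited:
                i = tour[-1]
                cand = sorted(unvisited)  # feasible cities only
                w = [tau[i][j] ** alpha * (1.0 / dist[i][j]) ** beta for j in cand]
                j = rng.choices(cand, weights=w, k=1)[0]
                tour.append(j)
                unvisited.remove(j)
            # Evaluate the closed tour, including the return to the start.
            cost = sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))
            tours.append((tour, cost))
            if cost < best_cost:
                best_tour, best_cost = tour, cost
        # Pheromone evaporation on every edge.
        for i in range(n):
            for j in range(n):
                tau[i][j] *= 1.0 - rho
        # Every ant deposits credit on the edges of its tour.
        for tour, cost in tours:
            for k in range(n):
                i, j = tour[k], tour[(k + 1) % n]
                tau[i][j] += 1.0 / cost
                tau[j][i] += 1.0 / cost
    return best_tour, best_cost


# Four cities on the corners of a unit square (hypothetical instance).
points = [(0, 0), (1, 0), (1, 1), (0, 1)]
dist = [[math.dist(p, q) for q in points] for p in points]
print(aco_tsp(dist))
```

The bracketed options of the schedule (step-by-step pheromone update, selected edges or ants only, local search) are where ACO variants differ; this sketch takes the simplest choice at each of them.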