

  1. 15-382 COLLECTIVE INTELLIGENCE - S18
     LECTURE 19: SWARM INTELLIGENCE 5 / ANT COLONY OPTIMIZATION
     INSTRUCTOR: GIANNI A. DI CARO

  2. SHORTEST PATHS WITH PHEROMONE LAYING-FOLLOWING
     [Figure: nest-to-food double bridge at t = 0, 1, 2, 3, with a pheromone intensity scale]
     Pheromone on a branch ∝ frequency of forward/backward crossing ∝ length (quality) of the paths
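The positive-feedback dynamics on the two branches can be illustrated with a minimal mean-field model of the binary bridge (a sketch; the branch lengths, deposit rule, and step count are illustrative assumptions, not from the slides):

```python
def simulate_bridge(n_steps=50, short_len=1.0, long_len=2.0):
    """Deterministic mean-field model of the binary bridge.

    At each step a unit 'flow' of ants splits across the two branches
    in proportion to pheromone; each branch then receives a deposit
    proportional to (flow crossing it) / (branch length), because ants
    on the shorter branch complete forward/backward trips more often.
    """
    tau_short, tau_long = 1.0, 1.0  # equal pheromone at t = 0
    for _ in range(n_steps):
        p_short = tau_short / (tau_short + tau_long)
        tau_short += p_short / short_len          # reinforced more often
        tau_long += (1.0 - p_short) / long_len    # reinforced less often
    return tau_short, tau_long
```

Running this shows the short branch accumulating pheromone faster at every step, so the choice probability drifts toward it: the positive-feedback loop the slide describes.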

  3. LET’S ABSTRACT A MORE COMPLEX SCENARIO
     [Figure: a graph of decision nodes y1, …, y9 between the Nest (source) and the Food (target), with actions b11, b12, b13, b21, b22, b31, b32 on the edges and a pheromone intensity scale]
     • Multiple decision nodes: n decision states/nodes y1, y2, …, yn ∈ Y
     • A set B of decisions/actions b1, b2, …, bm, such that at each state y a subset B(y) of actions is available (feasible)
     • A path (ant solution) is constructed through a sequence of decisions, one for each visited state
     • Multiple ants iterate path construction (i.e., foraging) in parallel
     • A traveling cost is associated with each state transition: the colony’s goal is to make the ants move over the minimum-cost path between nest and food

  4. LET’S ABSTRACT A MORE COMPLEX SCENARIO
     [Figure: the same source-destination graph annotated with pheromone/heuristic pairs on each edge (e.g., τ12; η12), terrain morphology, a stochastic decision rule π(τ, η), and a pheromone intensity scale]
     • A distributed optimization problem
     • At each state yk, only local information/constraints (plus some ant memory) are available for taking a (possibly optimized) decision b ∈ B(yk)
     • Pheromone information (dynamic), parametrized as a vector τk (stigmergic variables)
     • Heuristic information (static, scenario-related), parametrized as a vector ηk
     • Ant behavior: a stochastic decision policy πε(yk; τk, ηk), πε: Y ⟼ B
     How do ant colonies solve this distributed minimum-cost path (MCP) problem? By exploiting pheromone to learn the best parameters of the decision policy
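One common instantiation of such a stochastic decision policy is the random-proportional rule of Ant System, where exponents alpha and beta (assumed here, not in the slide) weight pheromone against heuristic information; a minimal sketch:

```python
import random

def choose_action(tau, eta, feasible, alpha=1.0, beta=2.0, rng=random):
    """Random-proportional decision rule (sketch): pick action j from
    the locally feasible set B(y_k) with probability proportional to
    tau[j]**alpha * eta[j]**beta."""
    weights = [tau[j] ** alpha * eta[j] ** beta for j in feasible]
    total = sum(weights)
    # Roulette-wheel selection over the feasible actions
    r = rng.random() * total
    acc = 0.0
    for j, w in zip(feasible, weights):
        acc += w
        if r <= acc:
            return j
    return feasible[-1]  # numerical-safety fallback
```

With zero pheromone on an action its selection probability vanishes, while equal pheromone and heuristic values give a uniform choice, matching the intended bias of the policy.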

  5. ANT COLONIES: INGREDIENTS FOR SHORTEST PATHS
     [Figure: forward and backward ants on the graph, with pheromone/heuristic pairs (τ; η) on the edges and a pheromone intensity scale]
     • A number of concurrent, autonomous (simple?) agents (ants)
     • Forward-backward constructive path sampling based on the stochastic policy πε
     • Local laying and sensing of pheromone → pheromone is dynamically updated
     • Step-by-step stochastic decisions biased by local pheromone intensity and by other local heuristic aspects (e.g., terrain)
     • Multiple paths are concurrently tried out and implicitly evaluated
     • Positive feedback effect (local reinforcement of good decisions)
     • Iteration of the path sampling actions over time
     • Persistence (exploitation) and evaporation (exploration) of pheromone

  6. FROM ANTS TO ACO
     Let’s mimic ant colonies, with some pragmatic modifications…
     • Once a solution/path is completed:
       • The sampled solution is evaluated (e.g., as the sum of the individual costs)
       • “Credit” is assigned to each individual decision belonging to the solution
       • Pheromone updating: the values of the pheromone variables τk associated with each decision in the solution are modified according to the “credit”
     • Pheromone values can also decay/change for other reasons (e.g., evaporation)
     • Pheromone values locally encode how good it is to take decision i vs. j, as collectively estimated/learned by the agent population through repeated solution sampling
     [Figure: feedback loop: the pheromone distribution biases path construction, and the outcomes of path construction modify the pheromone distribution]
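The evaluation / credit-assignment / evaporation steps above can be sketched as an Ant System-style update (the rate rho, the deposit constant q, and the data layout are illustrative assumptions):

```python
def update_pheromone(tau, solutions, rho=0.1, q=1.0):
    """Ant System-style update (sketch): first evaporate all pheromone
    values, then assign 'credit' q / cost to every decision (edge) of
    each sampled solution, so cheaper solutions reinforce their edges
    more strongly.

    tau: dict mapping edge -> pheromone value
    solutions: list of (edges, cost) pairs from the last sampling round
    """
    for edge in tau:                      # evaporation (exploration)
        tau[edge] *= (1.0 - rho)
    for edges, cost in solutions:         # reinforcement (exploitation)
        for edge in edges:
            tau[edge] += q / cost
    return tau
```

Edges belonging to no sampled solution only evaporate, so decisions that stop being used gradually lose their accumulated credit.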

  7. ANT COLONY OPTIMIZATION METAHEURISTIC: (VERY) GENERAL ARCHITECTURE
     • Solution construction
     • Monte Carlo path sampling by N (# of states) joint probability distributions parametrized by the τ and η variable arrays
     • Sequential learning by Generalized Policy Iteration (GPI)

  8. PHEROMONE AND HEURISTIC ARRAYS
     [Figure: the source-destination graph with a pheromone/heuristic pair (τ; η) on each edge, e.g., τ12; η12, and a pheromone intensity scale]

  9. ACO FOR THE TRAVELING SALESMAN PROBLEM (TSP)
     Given G(V, E), find the Hamiltonian tour of minimal cost: NP-Hard
     [Figure: a 7-city weighted graph]
     Every cyclic permutation of n integers is a feasible solution:
     π1 = (1, 3, 4, 2, 6, 5, 7, 1), π2 = (2, 3, 4, 5, 6, 7, 1, 2)
     c(π2) = d23 + d34 + d45 + d56 + d67 + d71 + d12 = 93
     π2 can also be read as a set of edges: {(2,3), (3,4), (4,5), (5,6), (6,7), (7,1), (1,2)}
     It is easier to consider fully connected graphs, |E| = |V| (|V| − 1): if two nodes are not connected, d is infinite
     “Related” combinatorial optimization problems: VRPs, SOP, TO, QAP, …
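The tour cost c(π) can be computed directly from the permutation; a small sketch, using a hypothetical 3-city symmetric distance matrix rather than the 7-city graph in the figure:

```python
def tour_cost(tour, d):
    """Cost of a cyclic tour given as a permutation that repeats the
    start city at the end, e.g. (2, 3, 4, 5, 6, 7, 1, 2):
    c(pi) = sum of d[i][j] over consecutive city pairs."""
    return sum(d[i][j] for i, j in zip(tour, tour[1:]))

# Hypothetical 3-city instance (illustrative distances)
d = {1: {2: 5, 3: 7}, 2: {1: 5, 3: 3}, 3: {1: 7, 2: 3}}
```

Note that any cyclic rotation of the same tour, e.g. (1, 2, 3, 1) vs. (2, 3, 1, 2), traverses the same edge set and therefore has the same cost.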

  10. ACO FOR THE TRAVELING SALESMAN PROBLEM (TSP)
     • Pheromone variables τij ∈ ℝ+: express how beneficial it is (as estimated so far) to have edge (i, j) in the solution, with respect to optimizing the final tour length → |E| variables
     • Heuristic values ηij ∈ ℝ+: based on the problem costs cij ∈ ℝ+ for traveling from i to j → |E| variables
     [Figure: the same 7-city weighted graph]
     Solution construction strategies (no repair, no look-ahead):
     • Extension: when ant k is in city i, how good is it expected to be to include a (feasible) city j next in the solution sequence xk(t)? → f(τij, ηij)
     • Insertion: how good is it expected to be to insert a (feasible) edge (m, p) into the partial solution xk(t)? → f(τmp, ηmp)
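The extension strategy can be sketched as follows, assuming the common choice ηij = 1/cij and illustrative exponents alpha and beta (none of which are fixed by the slide):

```python
import random

def construct_tour(cities, tau, cost, alpha=1.0, beta=2.0, rng=random):
    """'Extension' construction (sketch): starting from a random city,
    repeatedly append a feasible (unvisited) city j, chosen with
    probability proportional to tau[(i, j)]**alpha * (1/cost[(i, j)])**beta,
    then close the tour back to the start."""
    start = rng.choice(cities)
    tour, unvisited = [start], set(cities) - {start}
    while unvisited:
        i = tour[-1]
        feas = list(unvisited)  # feasibility = not yet visited
        w = [tau[(i, j)] ** alpha * (1.0 / cost[(i, j)]) ** beta for j in feas]
        r = rng.random() * sum(w)
        acc = 0.0
        for j, wj in zip(feas, w):
            acc += wj
            if r <= acc:
                break
        tour.append(j)
        unvisited.remove(j)
    tour.append(start)  # close the Hamiltonian cycle
    return tour
```

Because the feasible set shrinks by one city per step, every constructed sequence is a valid Hamiltonian tour with no repair needed.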

  11. (META-)ACO FOR CO PROBLEMS (CENTRALIZED SCHEDULE)
     Initialize τij(0) to small random values and let t = 0;
     repeat
         Place nk ants on randomly chosen origin nodes;
         foreach ant k = 1, ..., nk do
             Construct a path xk(t) [Update pheromone step-by-step];
             Evaluate path xk(t);
         end
         foreach [selected] edge (i, j) of the graph do
             Pheromone evaporation;
         end
         foreach [selected] ant k = 1, ..., nk do
             foreach [selected] edge (i, j) of path xk(t) do
                 Update τij using path evaluation results;
             end
         end
         Daemon actions [Local search];
         t = t + 1;
     until stopping condition is true;
     return best solution generated;
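The centralized schedule above can be turned into a minimal runnable sketch for the TSP (no daemon actions / local search; uniform initial pheromone and all parameter defaults are illustrative assumptions):

```python
import random

def aco_tsp(d, n_ants=10, n_iters=50, alpha=1.0, beta=2.0, rho=0.1, q=1.0, seed=0):
    """Minimal centralized-schedule ACO for the TSP (sketch).

    d: dict-of-dicts symmetric distance matrix, d[i][j] > 0.
    Returns (best_tour, best_cost) found over all iterations."""
    rng = random.Random(seed)
    cities = list(d)
    tau = {(i, j): 1.0 for i in cities for j in cities if i != j}
    best_tour, best_cost = None, float("inf")
    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):           # construct and evaluate paths
            start = rng.choice(cities)
            tour, unvisited = [start], set(cities) - {start}
            while unvisited:
                i = tour[-1]
                feas = list(unvisited)
                w = [tau[(i, j)] ** alpha * (1.0 / d[i][j]) ** beta for j in feas]
                r = rng.random() * sum(w)
                acc = 0.0
                for j, wj in zip(feas, w):
                    acc += wj
                    if r <= acc:
                        break
                tour.append(j)
                unvisited.remove(j)
            tour.append(start)
            cost = sum(d[a][b] for a, b in zip(tour, tour[1:]))
            tours.append((tour, cost))
            if cost < best_cost:
                best_tour, best_cost = tour, cost
        for e in tau:                     # pheromone evaporation
            tau[e] *= (1.0 - rho)
        for tour, cost in tours:          # update tau using evaluations
            for a, b in zip(tour, tour[1:]):
                tau[(a, b)] += q / cost
                tau[(b, a)] += q / cost   # symmetric TSP: both directions
    return best_tour, best_cost
```

On a toy instance where four cities form a unit square with expensive diagonals, the sketch reliably recovers the perimeter tour.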
