www.gc.cuny.edu
Parallel Traveling Salesman
PhD Student: Viet Anh Trinh
Advisor: Professor Feng Gu
Agenda
- 1. Traveling salesman introduction
- 2. Genetic Algorithm for TSP
- 3. Tree Search for TSP
Travelling Salesman
- Set of N cities
- Find the shortest closed path that covers all the cities
- No city may be visited more than once
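The GA example later in the deck works out tour lengths by hand; those numbers imply an asymmetric 4-city distance matrix. A runnable sketch of the tour-length computation, with the matrix values inferred from the slide's arithmetic (not given explicitly in the deck):

```python
# Asymmetric distance matrix reconstructed from the slide's worked example
# (e.g. tour 0-1-2-3-0 has length 1 + 2 + 10 + 7 = 20).
DIST = [
    [0, 1, 3, 8],
    [5, 0, 2, 6],
    [1, 18, 0, 10],
    [7, 4, 12, 0],
]

def tour_length(tour, dist=DIST):
    """Length of the closed path: each leg plus the final leg back home."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))

print(tour_length([0, 1, 2, 3]))  # 20, matching the slide's 1 + 2 + 10 + 7
```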
Travelling Salesman First Parallel Approach: Genetic Algorithm
Travelling Salesman – Sequential Genetic Algorithm

Initialization: 0123, 0231, 1320, 0321
Fitness evaluation:
f(0123) = 1/(1 + 2 + 10 + 7) = 0.050
f(0231) = 1/(3 + 10 + 4 + 5) = 0.045
f(1320) = 1/(6 + 12 + 1 + 1) = 0.050
f(0321) = 1/(8 + 12 + 18 + 5) = 0.023
Selection → Cross-over → Mutation → Termination
Sequential GA Travelling Salesman
- Individuals
  - Closed non-looping paths across all cities
- Initial population
  - Set of randomly generated paths
- Evaluation
  - Assess the fitness of each individual; fitness is 1 / total distance of a given path
- Selection
  - Select the fittest individuals (biggest fitness, smallest distance)
- Offspring production
  - Cross-over + mutation
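The initialization and evaluation steps can be sketched as follows. The distance matrix is the one implied by the slide's arithmetic, and the function names are illustrative:

```python
import random

# Asymmetric distance matrix implied by the slide's worked fitness values.
DIST = [
    [0, 1, 3, 8],
    [5, 0, 2, 6],
    [1, 18, 0, 10],
    [7, 4, 12, 0],
]

def fitness(tour, dist=DIST):
    """Fitness = 1 / total tour length, so shorter tours are fitter."""
    n = len(tour)
    return 1.0 / sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))

def initial_population(n_cities, size, rng):
    """A set of randomly generated closed paths (each a permutation)."""
    pop = []
    for _ in range(size):
        tour = list(range(n_cities))
        rng.shuffle(tour)
        pop.append(tour)
    return pop

rng = random.Random(0)
pop = initial_population(4, 4, rng)
scores = [fitness(t) for t in pop]
```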
Selection
Roulette wheel selection:
P(choose 0123) = 0.050/(0.050 + 0.045 + 0.050 + 0.023) ≈ 0.30
P(choose 0231) ≈ 0.27
P(choose 1320) ≈ 0.30
P(choose 0321) ≈ 0.13
Draw a random number s uniformly in [0, 1):
0 ≤ s < 0.30: choose 0123
0.30 ≤ s < 0.57: choose 0231
0.57 ≤ s < 0.87: choose 1320
0.87 ≤ s < 1: choose 0321
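A minimal roulette-wheel implementation of this scheme (names illustrative). Instead of precomputing the interval boundaries, it walks the cumulative fitness, which is equivalent:

```python
import random

pop = ["0123", "0231", "1320", "0321"]
fit = [0.050, 0.045, 0.050, 0.023]   # fitness values from the slide

def roulette_select(population, fitnesses, rng):
    """Spin the wheel: each individual is chosen with probability
    proportional to its fitness (0123 gets 0.050/0.168 ≈ 0.30 here)."""
    total = sum(fitnesses)
    s = rng.random() * total          # same as normalising s into [0, 1)
    cumulative = 0.0
    for individual, f in zip(population, fitnesses):
        cumulative += f
        if s < cumulative:
            return individual
    return population[-1]             # guard against floating-point rounding

rng = random.Random(0)
picks = [roulette_select(pop, fit, rng) for _ in range(10000)]
```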
Specialized Crossover Operator
- Order Crossover (OX): never produces an invalid path
- Naive crossover: invalid paths can appear (cities duplicated or missing)
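A sketch of one common OX variant (several exist in the literature): the child keeps a slice of the first parent in place and fills the remaining positions with the missing cities in the second parent's order, so no city is duplicated or lost:

```python
def order_crossover(p1, p2, i, j):
    """Order Crossover (OX): the child keeps p1[i:j] in place and fills the
    remaining positions (starting after the cut, wrapping around) with the
    missing cities in the order they appear in p2 -- always a valid tour."""
    n = len(p1)
    child = [None] * n
    child[i:j] = p1[i:j]
    kept = set(p1[i:j])
    fill = [c for c in p2[j:] + p2[:j] if c not in kept]
    for pos in list(range(j, n)) + list(range(i)):
        child[pos] = fill.pop(0)
    return child

# A naive one-point crossover of 0123 and 3210 could yield 0110, an
# invalid path; OX cannot:
child = order_crossover([0, 1, 2, 3], [3, 2, 1, 0], 1, 3)
```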
Mutation
- Select two random positions and swap their cities
- This guarantees the path stays valid
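The two-point swap can be sketched as (names illustrative):

```python
import random

def swap_mutation(tour, rng):
    """Pick two distinct random positions and swap their cities. The result
    is still a permutation of the cities, so the path remains valid."""
    mutated = tour[:]
    i, j = rng.sample(range(len(tour)), 2)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

rng = random.Random(0)
mutant = swap_mutation([0, 1, 2, 3], rng)
```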
Sequential Genetic Algorithm
Parallel Genetic Algorithm
Master: Initialization 0123, 0231, 1320, 0321
Slave 1: Fitness evaluation f(1320) = 1/(6 + 12 + 1 + 1) = 0.050, f(0321) = 1/(8 + 12 + 18 + 5) = 0.023; then Selection, Cross-over, Mutation, Termination
Slave 2: Fitness evaluation f(0123) = 1/(1 + 2 + 10 + 7) = 0.050, f(0231) = 1/(3 + 10 + 4 + 5) = 0.045; then Selection, Cross-over, Mutation, Termination
Parallel Travelling Salesman - Master
- Master
  - Initializes the population
  - Sends paths to the slaves
  - Examines the best paths from the slaves' returned results
- Slave
  - Signals the master that it is ready for work
  - Waits for paths to be sent by the master until a termination message is received
  - Evaluates the paths' fitness
  - Selection
  - Crossover
  - Mutation
  - Sends its best c paths to nearby neighbors after every k generations
  - When finished, sends its best paths and their lengths to the master
Time Complexity Sequential
Let n = population size, l = length of a path (number of cities), g = number of generations.
Sequential Genetic Algorithm:
- Initialization: O(n)
- Evaluation: O(nl)
- Selection: O(nl)
- Crossover: c1 × O(nl)
- Mutation: c2 × O(nl)
Total time: O(nl) + g·O(nl) = O(gnl)
Time Complexity Parallel
- Isolated subpopulations
- Stepping-stone model: only send the best individuals to a neighbor processor
- Communication time
  - Master sends data to the slaves using scatter: tcomm1 = O(nl/p)
  - Each slave sends its best c paths to a neighbor processor after every k generations: tcomm2 = (g/k)·O(cl) = (g/k)·O(l)
  - Slaves send their c best paths and their length values to the master: tcomm3 = O(cl)
- Computation time
  - Master initialization: tcomp1 = O(n)
  - Slave evaluation, selection, crossover, mutation: tcomp2 = O(gnl/p)
  - Master final evaluation: tcomp3 = O(pc)
- Parallel time: tp = O(gnl/p)
- Speedup = ts/tp = p
- Efficiency = ts/(p·tp) = 1
Travelling Salesman Second Parallel Approach: Tree Search
Travelling Salesman – Tree Search
Travelling Salesman – Sequential Algorithm
- City count: checks whether the partial tour already contains all n cities.
- Best tour: checks whether the completed tour has a lower cost than the current "best tour".
- Update best tour: replaces the current best tour with this tour.
- Feasible: checks whether the city (vertex) has already been visited.
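These four tests fit naturally into a depth-first tree search with pruning. A sketch, using the distance matrix implied by the earlier GA slides (function names are illustrative):

```python
# Asymmetric distance matrix implied by the slide's worked fitness values.
DIST = [
    [0, 1, 3, 8],
    [5, 0, 2, 6],
    [1, 18, 0, 10],
    [7, 4, 12, 0],
]

def tsp_tree_search(dist):
    """Depth-first tree search mirroring the slide's four tests:
    city count, best-tour check, best-tour update, and feasibility."""
    n = len(dist)
    best = [float("inf"), None]

    def dfs(tour, cost):
        if cost >= best[0]:              # prune: already worse than best
            return
        if len(tour) == n:               # "city count": all n cities on tour
            total = cost + dist[tour[-1]][tour[0]]   # close the tour
            if total < best[0]:          # "best tour" check
                best[0], best[1] = total, tour[:]    # "update best tour"
            return
        for city in range(n):
            if city not in tour:         # "feasible": not yet visited
                dfs(tour + [city], cost + dist[tour[-1]][city])

    dfs([0], 0)
    return best[0], best[1]
```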
Travelling Salesman Parallel
Static load balancing (illustrated in the slide's picture) → imbalanced load. Solution → dynamic load balancing.
Travelling Salesman Parallel
Terminologies
- Donor process: the process that sends work
- Recipient process: the process that requests/receives work
- Half-split: ideally, the stack is split into two equal pieces such
that the search space of each stack is the same
- Cutoff depth: to avoid sending very small amounts of work,
nodes beyond a specified stack depth are not given away
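A sketch of a half-split that respects the cutoff depth, assuming each stack entry carries its depth; giving away alternating shallow nodes is one simple splitting choice:

```python
def split_stack(stack, cutoff_depth):
    """Half-split with a cutoff: alternate shallow nodes go to the recipient;
    nodes at or beyond the cutoff depth always stay with the donor, since
    they represent too little work to be worth sending."""
    keep, give, toggle = [], [], True
    for node, depth in stack:
        if depth < cutoff_depth and toggle:
            give.append((node, depth))
            toggle = False
        elif depth < cutoff_depth:
            keep.append((node, depth))
            toggle = True
        else:
            keep.append((node, depth))   # beyond the cutoff: never donated
    return keep, give
```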
Travelling Salesman Parallel
Some possible strategies
- 1. Send nodes near the bottom of the stack
- Works well with uniform search space; has low splitting cost
- 2. Send nodes near the cutoff depth
- Performs better with a strong heuristic (tries to distribute the
parts of the search space likely to contain a solution)
- 3. Send half the nodes between the bottom and the cutoff
depth
- Works well with uniform and irregular search space
Travelling Salesman Parallel
The entire search space is initially assigned to the master.
- When a slave runs out of work, it gets more work from another slave using work requests and responses.
- Unexplored states can be conveniently stored as local stacks at the processors.
- Slaves terminate when the final state is reached.
Travelling Salesman Parallel
- Load balancing scheme: Random polling (RP)
When a processor becomes idle, it randomly selects a donor. Each processor is selected as a donor with equal probability, ensuring that work requests are evenly distributed.
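The donor-selection step of random polling can be sketched as (rank arguments are illustrative):

```python
import random

def pick_donor(my_rank, num_procs, rng):
    """Random polling: an idle processor asks a uniformly random *other*
    processor for work, so requests spread evenly over all processors."""
    donor = rng.randrange(num_procs)
    while donor == my_rank:              # never poll yourself
        donor = rng.randrange(num_procs)
    return donor

rng = random.Random(0)
picks = [pick_donor(0, 4, rng) for _ in range(1000)]
```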
Travelling Salesman Parallel
- Let W be the serial work and pWp the total parallel work.
- The search overhead factor s is defined as pWp/W.
- To compute scalability, quantify the total overhead To in terms of W: To = pWp − W.
- The upper bound on speedup is p × 1/s.
Travelling Salesman Parallel
Assumptions:
- Search overhead factor = one.
- Work at any processor can be partitioned into independent pieces as long as its size exceeds a threshold ε.
- A reasonable work-splitting mechanism is available:
  - If work w at a processor is split into two parts ψw and (1−ψ)w, there exists an arbitrarily small constant α (0 < α ≤ 0.5) such that ψw > αw and (1−ψ)w > αw.
  - The constant α sets a lower bound on the load imbalance from work splitting.
Travelling Salesman Parallel
- If processor Pi initially had work wi, then after a single request by processor Pj and a split, neither Pi nor Pj has more than (1−α)wi work.
- For each load-balancing strategy, define V(p) as the total number of work requests after which each processor has received at least one work request (note that V(p) ≥ p).
- Assume that the largest piece of work at any point is W.
- After V(p) requests, the maximum work remaining at any processor is less than (1−α)W; after 2V(p) requests, it is less than (1−α)²W; and so on.
- After (log_{1/(1−α)}(W/ε))·V(p) requests, the maximum work remaining at any processor is below the threshold ε.
- The total number of work requests is therefore O(V(p) log W).
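Spelling out why that many requests suffice, assuming each batch of V(p) requests shrinks the largest remaining piece by at least a factor of 1−α:

```latex
(1-\alpha)^{k} W < \epsilon
\quad\Longleftrightarrow\quad
k > \log_{1/(1-\alpha)}\frac{W}{\epsilon}
  = \frac{\log(W/\epsilon)}{\log\bigl(1/(1-\alpha)\bigr)}
  = O(\log W),
```

so k·V(p) = O(V(p) log W) requests in total, since α is a fixed constant and the base of the logarithm contributes only a constant factor.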
Travelling Salesman Parallel
- If tcomm is the time required to communicate a piece of work, then the communication overhead To is To = tcomm·V(p)·log W.
- The corresponding efficiency E is given by E = 1/(1 + To/W).
Travelling Salesman Parallel
- Random Polling
  - Worst case: V(p) is unbounded.
  - We therefore do an average-case analysis.
- Let F(i,p) represent the state in which i of the processors have been requested and p−i have not.
- Let f(i,p) denote the average number of trials needed to change from state F(i,p) to F(p,p); then V(p) = f(0,p).
Travelling Salesman Parallel
- We have f(i,p) = p/(p−i) + f(i+1,p), since a uniformly random request hits one of the p−i still-unrequested processors with probability (p−i)/p. Hence V(p) = f(0,p) = Σ_{i=0}^{p−1} p/(p−i) = p·Hp, where Hp is the p-th harmonic number.
- As p becomes large, Hp ≃ 1.69 ln p. Thus V(p) = O(p log p) and To = O(p log p log W). Therefore the isoefficiency function is W = O(p log² p).
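The Hp in this bound comes from the coupon-collector identity V(p) = f(0,p) = p·Hp, which can be checked exactly with rational arithmetic:

```python
from fractions import Fraction

def expected_requests(p):
    """V(p) = f(0,p): summing the recurrence f(i,p) = p/(p-i) + f(i+1,p),
    where p/(p-i) is the expected number of trials for a uniformly random
    request to reach one of the p-i still-unrequested processors."""
    return sum(Fraction(p, p - i) for i in range(p))

def harmonic(p):
    """H_p = 1 + 1/2 + ... + 1/p."""
    return sum(Fraction(1, k) for k in range(1, p + 1))
```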