Parallel Traveling Salesman
PhD Student: Viet Anh Trinh
Advisor: Professor Feng Gu


SLIDE 1

Parallel Traveling Salesman
PhD Student: Viet Anh Trinh
Advisor: Professor Feng Gu

SLIDE 2

Agenda

  • 1. Traveling salesman introduction
  • 2. Genetic Algorithm for TSP
  • 3. Tree Search for TSP
SLIDE 3

SLIDE 4

Travelling Salesman

  • Set of N cities
  • Find the shortest closed, non-looping path that covers all the cities
  • No city is visited more than once
SLIDE 5

Travelling Salesman First Parallel Approach: Genetic Algorithm

SLIDE 6

Travelling Salesman - Sequential Genetic Algorithm

Initialization: 0123, 0231, 1320, 0321
Fitness Evaluation:
  (0123) = 1/(1 + 2 + 10 + 7) = 0.050
  (0231) = 1/(3 + 10 + 4 + 5) = 0.045
  (1320) = 1/(6 + 12 + 1 + 1) = 0.050
  (0321) = 1/(8 + 12 + 18 + 5) = 0.023
Selection → Cross-over → Mutation → Termination
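
As a concrete illustration of this fitness rule, here is a small Python sketch; the asymmetric distance matrix is back-filled from the leg sums shown above and is only illustrative (the original slide's matrix is not in this extract):

```python
def fitness(path, dist):
    """Fitness of a tour = 1 / total length of the closed path, as defined on the slide."""
    legs = (dist[path[i]][path[(i + 1) % len(path)]] for i in range(len(path)))
    return 1.0 / sum(legs)

# Asymmetric distances reconstructed from the sums above (illustrative values only).
dist = [[0, 1, 3, 8],
        [5, 0, 2, 6],
        [1, 18, 0, 10],
        [7, 4, 12, 0]]

for tour in ([0, 1, 2, 3], [0, 2, 3, 1], [1, 3, 2, 0], [0, 3, 2, 1]):
    print(tour, round(fitness(tour, dist), 3))   # 0.05, 0.045, 0.05, 0.023
```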

SLIDE 7

Sequential GA Travelling Salesman

  • Individuals: closed, non-looping paths across all cities
  • Initial population: a set of randomly generated paths
  • Evaluation: assess the fitness of each individual; fitness is 1 / total distance of a given path
  • Selection: select the fittest individuals (biggest fitness, smallest distance)
  • Offspring production: cross-over + mutation
SLIDE 8

Selection

Roulette Wheel Selection:
P(choose 0123) = 0.05 / (0.05 + 0.045 + 0.05 + 0.023) = 0.30
P(choose 0231) = 0.27
P(choose 1320) = 0.30
P(choose 0321) = 0.13
If the random number s falls in:
  0 ≤ s < 0.30: choose 0123
  0.30 ≤ s < 0.57: choose 0231
  0.57 ≤ s < 0.87: choose 1320
  0.87 ≤ s < 1: choose 0321

(Figure: selection shares of the 1st, 2nd, 3rd and 4th paths)
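
A short sketch of the roulette wheel draw described above, using the fitness values from the previous slide (Python's standard random module is assumed):

```python
import random

def roulette_select(paths, fitnesses):
    """Pick one path with probability proportional to its fitness share."""
    total = sum(fitnesses)
    s = random.random() * total        # equivalent to drawing s in [0, 1) on the normalized wheel
    cumulative = 0.0
    for path, fit in zip(paths, fitnesses):
        cumulative += fit
        if s < cumulative:
            return path
    return paths[-1]                   # guard against floating-point round-off

paths = ["0123", "0231", "1320", "0321"]
fitnesses = [0.050, 0.045, 0.050, 0.023]   # shares ≈ 0.30, 0.27, 0.30, 0.13
print(roulette_select(paths, fitnesses))
```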

SLIDE 9

Specialized Crossover Operator

Order Crossover (OX): no invalid path appears
Normal crossover: invalid paths can appear
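
A hedged sketch of one common OX variant (keep a random slice of the first parent, fill the remaining positions with the second parent's cities in order); every child it produces is a valid tour:

```python
import random

def order_crossover(parent1, parent2):
    """OX: copy a random slice of parent1, then fill the gaps from parent2 in order."""
    n = len(parent1)
    a, b = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[a:b] = parent1[a:b]                            # inherited slice
    remaining = [c for c in parent2 if c not in child]   # unused cities, in parent2's order
    for i in range(n):
        if child[i] is None:
            child[i] = remaining.pop(0)
    return child                                         # no city appears twice

print(order_crossover([0, 1, 2, 3], [0, 2, 3, 1]))
```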

SLIDE 10

Mutation

  • Select 2 random points and swap them (see the sketch below)
  • Ensures the path remains valid
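
A minimal sketch of this swap mutation; since it only exchanges two positions, the result is still a permutation of the cities:

```python
import random

def swap_mutation(path):
    """Pick two random positions and swap them; the tour stays valid."""
    i, j = random.sample(range(len(path)), 2)
    path[i], path[j] = path[j], path[i]
    return path

print(swap_mutation([0, 1, 2, 3]))
```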
SLIDE 11

Sequential Genetic Algorithm

SLIDE 12

Parallel Genetic Algorithm

Master: Initialization: 0123, 0231, 1320, 0321 (the paths are split between the slaves)

Slave 1:
  Fitness Evaluation: (1320) = 1/(6 + 12 + 1 + 1) = 0.050, (0321) = 1/(8 + 12 + 18 + 5) = 0.023
  Selection, Cross-over, Mutation, Termination

Slave 2:
  Fitness Evaluation: (0123) = 1/(1 + 2 + 10 + 7) = 0.050, (0231) = 1/(3 + 10 + 4 + 5) = 0.045
  Selection, Cross-over, Mutation, Termination

SLIDE 13

Parallel Travelling Salesman - Master

  • Master
    • Initializes the population
    • Sends paths to the slaves
    • Examines the best paths among the slaves' returned results
  • Slave
    • Signals the master that it is ready for work
    • Waits for paths to be sent by the master until a termination message is received
    • Evaluates the paths' fitness
    • Selection
    • Crossover
    • Mutation
    • Sends the best c paths to nearby neighbors after every k generations
    • When finished, sends the best paths and their lengths to the master (see the mpi4py sketch below)
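
A minimal master/slave sketch using mpi4py (assumed to be installed). The tour_length and evolve functions here are toy stand-ins for the real GA steps, and the migration of the best c paths between neighboring slaves every k generations is omitted for brevity:

```python
import random
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

N_CITIES, POP_SIZE = 10, 40                               # hypothetical parameters

def tour_length(path):
    """Toy stand-in for a real distance function."""
    return sum(abs(path[i] - path[(i + 1) % len(path)]) for i in range(len(path)))

def evolve(paths, generations=100):
    """Stand-in for the per-slave GA loop (evaluation, selection, crossover, mutation)."""
    return min(paths, key=tour_length)

if rank == 0:
    population = [random.sample(range(N_CITIES), N_CITIES) for _ in range(POP_SIZE)]
    chunks = [population[i::size] for i in range(size)]   # one slice of paths per process
else:
    chunks = None

local = comm.scatter(chunks, root=0)                      # master sends paths out
local_best = evolve(local)                                # each process works on its own paths
best_paths = comm.gather(local_best, root=0)              # best paths return to the master

if rank == 0:
    best = min(best_paths, key=tour_length)               # master examines the returned results
    print(best, tour_length(best))
```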
SLIDE 14

Time Complexity Sequential

Time complexity notation:
  n: population size
  l: length of a path (number of cities)
  g: number of generations

Sequential Genetic Algorithm:
  Initialization: O(n)
  Evaluation: O(nl)
  Selection: O(nl)
  Crossover: c1 × O(nl)
  Mutation: c2 × O(nl)
  Time: O(nl) + g·O(nl) = O(gnl)
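
Written out as a single worked equation (consistent with the per-step counts above; c1 and c2 are the constant crossover and mutation factors):

```latex
T_s \;=\; O(n) \;+\; g\,\bigl[\,O(nl) + O(nl) + c_1\,O(nl) + c_2\,O(nl)\,\bigr] \;=\; O(gnl)
```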

SLIDE 15

Time Complexity Parallel

  • Isolated subpopulations
  • Stepping stone model: only the best individuals are sent to the neighbor processor
  • Communication time
    • Master sends data to the slaves using scatter: tcomm1 = O(nl/p)
    • Slaves send their best c paths to the neighbor processor after every k generations: tcomm2 = (g/k)·O(cl) = (g/k)·O(l)
    • Slaves send their c best paths and their lengths to the master: tcomm3 = O(cl)
  • Computation time
    • Master initialization: tcomp1 = O(n)
    • Slave evaluation, selection, crossover, mutation: tcomp2 = O(gnl/p)
    • Master final evaluation: tcomp3 = O(pc)
  • Parallel time: tp = O(gnl/p)
  • Speedup = ts/tp = p
  • Efficiency = ts/(p·tp) = 1 (see the worked equations below)
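
The speedup and efficiency claims above written out, assuming the computation term dominates the communication terms:

```latex
t_p \;=\; O\!\left(\frac{gnl}{p}\right), \qquad
S \;=\; \frac{t_s}{t_p} \;=\; \frac{O(gnl)}{O(gnl/p)} \;=\; p, \qquad
E \;=\; \frac{t_s}{p\,t_p} \;=\; 1
```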
SLIDE 16

Travelling Salesman Second Parallel Approach: Tree Search

SLIDE 17

Travelling Salesman

SLIDE 18

Travelling Salesman – Tree Search

SLIDE 19

Travelling Salesman Sequential Algorithm

SLIDE 20

Travelling Salesman Sequential Algorithm

  • City count: examines whether the partial tour already contains all n cities.
  • Best tour: checks whether the complete tour has a lower cost than the current "best tour".
  • Update best tour: replaces the current best tour with this tour.
  • Feasible: checks whether the city (vertex) has already been visited (see the sketch below).
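
A minimal sketch of this sequential search as a plain recursive depth-first traversal (a real implementation would add branch-and-bound pruning); the four operations above are marked in the comments, and dist reuses the illustrative matrix from the GA slides:

```python
def tsp_tree_search(dist):
    """Depth-first tree search from city 0; returns the best closed tour and its cost."""
    n = len(dist)
    best_tour, best_cost = None, float("inf")

    def search(tour, cost):
        nonlocal best_tour, best_cost
        if len(tour) == n:                                   # city count: all n cities on the tour
            total = cost + dist[tour[-1]][tour[0]]           # close the tour back to the start
            if total < best_cost:                            # best tour: cheaper than the current best?
                best_tour, best_cost = tour[:], total        # update best tour
            return
        for city in range(n):
            if city not in tour:                             # feasible: city not visited yet
                search(tour + [city], cost + dist[tour[-1]][city])

    search([0], 0)
    return best_tour, best_cost

dist = [[0, 1, 3, 8],
        [5, 0, 2, 6],
        [1, 18, 0, 10],
        [7, 4, 12, 0]]
print(tsp_tree_search(dist))
```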

SLIDE 21

Travelling Salesman Sequential

SLIDE 22

Travelling Salesman Sequential

SLIDE 23

Travelling Salesman Sequential

SLIDE 24

Travelling Salesman Parallel

Static load balancing (picture) → load imbalance. Solution → dynamic load balancing.

SLIDE 25

Travelling Salesman Parallel

SLIDE 26

Travelling Salesman Parallel

Terminologies

  • Donor process: the process that sends work
  • Recipient process: the process that requests/receives work
  • Half-split: ideally, the stack is split into two equal pieces such that the search space of each stack is the same
  • Cutoff depth: to avoid sending very small amounts of work, nodes beyond a specified stack depth are not given away

SLIDE 27

Travelling Salesman Parallel

Some possible strategies

  • 1. Send nodes near the bottom of the stack
    • Works well with a uniform search space; has low splitting cost
  • 2. Send nodes near the cutoff depth
    • Performs better with a strong heuristic (tries to distribute the parts of the search space likely to contain a solution)
  • 3. Send half the nodes between the bottom and the cutoff depth
    • Works well with uniform and irregular search spaces (see the sketch below)
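
A hedged sketch of strategy 3. Each stack entry is assumed to be a (node, depth) pair; the donor gives away every other entry that lies above the cutoff depth and keeps everything deeper:

```python
def split_stack(stack, cutoff):
    """Half-split: donate roughly half of the splittable entries, keep the rest."""
    kept, donated, seen = [], [], 0
    for node, depth in stack:
        if depth < cutoff:
            (donated if seen % 2 else kept).append((node, depth))  # alternate splittable nodes
            seen += 1
        else:
            kept.append((node, depth))       # beyond the cutoff depth: never given away
    return kept, donated

stack = [("a", 0), ("b", 1), ("c", 2), ("d", 3), ("e", 4)]
print(split_stack(stack, cutoff=3))
```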
SLIDE 28

Travelling Salesman Parallel

SLIDE 29

Travelling Salesman Parallel

The entire search space is assigned to the master.

  • When a slave runs out of work, it gets more work from another slave using work requests and responses
  • Unexplored states can be conveniently stored as local stacks at the processors
  • Slaves terminate when the final state is reached
SLIDE 30

Travelling Salesman Parallel

  • Load balancing scheme: Random polling (RP)

When a processor becomes idle, it randomly selects a donor. Each processor is selected as a donor with equal probability, ensuring that work requests are evenly distributed.
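
A tiny sketch of the donor choice; ranks are assumed to be 0..p−1 and each of the other p − 1 processes is picked with equal probability:

```python
import random

def pick_donor(my_rank, p):
    """Random polling: choose a donor uniformly among all processes except ourselves."""
    donor = random.randrange(p - 1)
    return donor if donor < my_rank else donor + 1   # shift past our own rank

print(pick_donor(my_rank=2, p=8))
```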

SLIDE 31

Travelling Salesman Parallel

  • Let W be the serial work and pWp be the parallel work.
  • The search overhead factor s is defined as s = pWp/W
  • Quantify the total overhead To in terms of W to compute scalability: To = pWp - W
  • The upper bound on speedup is p × 1/s
SLIDE 32

Travelling Salesman Parallel

Assumptions:

  • Search overhead factor = 1
  • Work at any processor can be partitioned into independent pieces as long as its size exceeds a threshold ε
  • A reasonable work-splitting mechanism is available:
    • If work w at a processor is split into two parts ψw and (1 − ψ)w, there exists an arbitrarily small constant α (0 < α ≤ 0.5) such that ψw > αw and (1 − ψ)w > αw
    • The constant α sets a lower bound on the load imbalance from work splitting

SLIDE 33

Travelling Salesman Parallel

  • If processor Pi initially had work wi, then after a single request by processor Pj and the split, neither Pi nor Pj has more than (1 − α)wi work.
  • For each load balancing strategy, define V(p) as the total number of work requests after which each processor has received at least one work request (note that V(p) ≥ p).
  • Assume that the largest piece of work at any point is W.
  • After V(p) requests, the maximum work remaining at any processor is less than (1 − α)W; after 2V(p) requests, it is less than (1 − α)²W; …
  • After (log_{1/(1−α)}(W/ε))·V(p) requests, the maximum work remaining at any processor is below the threshold ε.
  • The total number of work requests is O(V(p) log W) (see the worked bound below).
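
The same bound as a worked equation (a standard geometric-decay argument using only the quantities defined above; α and ε are treated as constants):

```latex
(1-\alpha)^{k} W < \epsilon
\;\Longleftrightarrow\;
k > \log_{1/(1-\alpha)}\frac{W}{\epsilon}
\quad\Longrightarrow\quad
\text{total requests} \;=\; O\!\left(V(p)\,\log_{1/(1-\alpha)}\frac{W}{\epsilon}\right) \;=\; O\!\bigl(V(p)\,\log W\bigr)
```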
SLIDE 34

Travelling Salesman Parallel

  • If tcomm is the time required to communicate a piece of work, then the communication overhead To is: To = tcomm · V(p) · log W
  • The corresponding efficiency E is given by the expression below.
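
The efficiency formula itself does not survive in this text extract; assuming the slide uses the standard overhead-based expression for efficiency with the To above, it would read:

```latex
E \;=\; \frac{W}{W + T_o} \;=\; \frac{1}{1 + \dfrac{t_{comm}\,V(p)\,\log W}{W}}
```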
SLIDE 35

Travelling Salesman Parallel

  • Random Polling
    • Worst case: V(p) is unbounded.
    • We therefore do an average-case analysis.
  • Let F(i,p) represent a state in which i of the processors have been requested and p − i have not.
  • Let f(i,p) denote the average number of trials needed to go from state F(i,p) to F(p,p); then V(p) = f(0,p).

SLIDE 36

Travelling Salesman Parallel

  • We have V(p) = p·Hp, where Hp is the harmonic number (see the reconstruction below).

As p becomes large, Hp ≃ 1.69 ln p. Thus V(p) = O(p log p) and To = O(p log p log W). Therefore W = O(p log² p).
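
The equation behind "We have" is missing from this extract; under the standard random-polling analysis the rest of the slide follows (a request reaches a not-yet-requested processor with probability (p − i)/p), the reconstruction would be:

```latex
f(i,p) \;=\; \frac{p}{p-i} + f(i+1,p), \qquad f(p,p)=0
\quad\Longrightarrow\quad
V(p) \;=\; f(0,p) \;=\; \sum_{i=0}^{p-1}\frac{p}{p-i} \;=\; p\sum_{j=1}^{p}\frac{1}{j} \;=\; p\,H_p
```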

SLIDE 37

END OF PRESENTATION

THANK YOU !