Building a Distributed Genetic Algorithm with the Jini Network Technology




SLIDE 1

Building a Distributed Genetic Algorithm with the Jini Network Technology

Brian Zorman

(Gregory M. Kapfhammer and Robert Roos)

Sixth Annual Jini Community Meeting Boston • June 17-20, 2002

SLIDE 2

Problem Analysis

  • Genetic Algorithms:

– Pros: robust and efficient
– Cons: execution cost and Quality of Solution (QoS)

  • Possible solution: how can we harness the benefits of distributed computing frameworks?
  • Can we reduce the cost of execution and improve the quality of solution with a distributed genetic algorithm (DGA)?

SLIDE 3

Bridging the Gap: Distributed Genetic Algorithms

Genetic Algorithms:
1.) Execution cost
2.) Lack of diversity

Distributed Systems:
1.) Resource Sharing
2.) Concurrency
3.) Scalability
4.) Openness

SLIDE 4

Exploring Punctuated Equilibrium

  • The theory of punctuated equilibrium:

– An isolated environment can reach a point of stability
– The injection of new individuals could cause rapid evolution

  • Could we design a distributed system to simulate this theory?
  • How can the Jini network technology and the JavaSpaces object repository help us to build this distributed system?
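
The injection idea can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: `ToySpace` is an in-memory stand-in for a shared object repository (it is not the JavaSpaces API), chromosomes are plain strings, and diversity is measured as the number of distinct chromosomes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Toy illustration: migrants injected into a converged island raise its diversity. */
final class PunctuatedEquilibriumSketch {

    /** Hypothetical in-memory stand-in for a shared space; NOT the JavaSpaces API. */
    static final class ToySpace {
        private final BlockingQueue<String> entries = new LinkedBlockingQueue<>();

        void write(String individual) {
            entries.add(individual);
        }

        /** Returns null when the space is empty instead of blocking. */
        String takeIfExists() {
            return entries.poll();
        }
    }

    /** Assumed diversity measure: the number of distinct chromosomes. */
    static int distinctCount(List<String> population) {
        return new HashSet<>(population).size();
    }

    /** Replaces up to maxMigrants residents with individuals taken from the space. */
    static List<String> injectFrom(ToySpace space, List<String> residents, int maxMigrants) {
        List<String> next = new ArrayList<>(residents);
        for (int i = 0; i < maxMigrants && i < next.size(); i++) {
            String migrant = space.takeIfExists();
            if (migrant == null) {
                break; // nothing left to inject
            }
            next.set(i, migrant);
        }
        return next;
    }

    public static void main(String[] args) {
        ToySpace space = new ToySpace();
        space.write("0110"); // emigrants from some other island
        space.write("1111");

        List<String> island = List.of("1010", "1010", "1010", "1010"); // converged
        List<String> mixed = injectFrom(space, island, 2);

        // prints: distinct before: 1, after: 3
        System.out.println("distinct before: " + distinctCount(island)
                + ", after: " + distinctCount(mixed));
    }
}
```

In a real deployment the space would be a remote service and the individuals serializable entries; the sketch only shows the effect of injection on a stagnant population.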

SLIDE 5

Designing the Models

  • Examined two popular models: master-worker and island
  • Chose a combination of the master-worker and island models

– Master-worker: parallel execution and simplicity
– Island model (punctuated equilibrium): parallel execution and additional diversity

[Diagram: master-worker model (a Master and several Workers, with arrows labeled "parents" and "evaluated offspring") and island model (islands I1 to I5)]
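
The master-worker half of the design can be sketched on a single machine. This is a minimal illustration under stated assumptions: a local thread pool stands in for the remote workers, and the fitness function (counting 1-bits) is a placeholder, not the one used in the talk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Toy master-worker evaluation: the master hands out chromosomes, workers score them. */
final class MasterWorkerSketch {

    /** Placeholder fitness (an assumption): the number of 1-bits in the chromosome. */
    static int fitness(boolean[] chromosome) {
        int ones = 0;
        for (boolean gene : chromosome) {
            if (gene) {
                ones++;
            }
        }
        return ones;
    }

    /** Evaluates every chromosome on a pool of worker threads; result order matches input order. */
    static List<Integer> evaluateAll(List<boolean[]> population, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (boolean[] chromosome : population) {
                Callable<Integer> task = () -> fitness(chromosome);
                pending.add(pool.submit(task));
            }
            List<Integer> scores = new ArrayList<>();
            for (Future<Integer> result : pending) {
                try {
                    scores.add(result.get()); // master waits for each worker's answer
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException("evaluation failed", e);
                }
            }
            return scores;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<boolean[]> population = List.of(
                new boolean[] {true, false, true},
                new boolean[] {false, false, false});
        System.out.println(evaluateAll(population, 2)); // prints: [2, 0]
    }
}
```

The island half differs in that each worker evolves its own population and only exchanges individuals occasionally.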
SLIDE 6

High Level Architecture: Entities in the “Simple” Model

[Diagram: Initial Machine, DistributionSpace, DiversitySpace, and remote machines RM1, RM2, RM3, ..., RMn]

SLIDE 7

“Simple” Model: Distribution Phase

[Diagram: Initial Machine, DistributionSpace, DiversitySpace, RM1, RM2, RM3, ..., RMn]

SLIDE 8

“Simple” Model: Pre-migration

[Diagram: Initial Machine, DistributionSpace, DiversitySpace, RM1, RM2, RM3, ..., RMn]

SLIDE 9

“Simple” Model: Migration

[Diagram: Initial Machine, DistributionSpace, DiversitySpace, RM1, RM2, RM3, ..., RMn]

SLIDE 10

“Simple” Model: Post-convergence

[Diagram: Initial Machine, DistributionSpace, DiversitySpace, RM1, RM2, RM3, ..., RMn]

SLIDE 11

Simple Model Performance Bottleneck

  • No explicit synchronization between remote machines
  • Potentially, every remote machine could try to migrate with the JavaSpace at the same time!
  • In effect, each worker has to “wait in line” to perform its migration!
  • While a worker is waiting, it performs no computation!
  • Designed the “Complex” Distributed System Model (CDSM) in an attempt to reduce this bottleneck
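
A back-of-the-envelope model makes the cost of "waiting in line" concrete. It rests on assumptions that are not from the talk: all workers arrive at the shared space at the same moment, the space serves one migration at a time, and every migration takes the same (hypothetical) 100 ms.

```java
/** Toy model of idle time when workers queue up at a single shared space. */
final class MigrationQueueModel {

    /**
     * Total idle time, summed over all workers, when they arrive together
     * and are served strictly one after another.
     */
    static long totalWaitMillis(int workers, long migrationMillis) {
        long total = 0;
        for (int position = 0; position < workers; position++) {
            // The worker at this position waits for everyone ahead of it.
            total += position * migrationMillis;
        }
        return total;
    }

    public static void main(String[] args) {
        long migrationMillis = 100; // hypothetical cost of one migration
        for (int workers : new int[] {2, 4, 6, 8}) {
            System.out.println(workers + " workers: "
                    + totalWaitMillis(workers, migrationMillis) + " ms idle in total");
        }
        // prints 100, 600, 1500, and 2800 ms for 2, 4, 6, and 8 workers
    }
}
```

Under these assumptions the total idle time grows quadratically with the number of workers, which is why spreading migrations over time or over several spaces is attractive.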

SLIDE 12

High Level Architecture: Entities in the “Complex” Model

[Diagram: Initial Machine, DistributionSpace, MigrationMachines MM1, MM2, ..., MMn, MigrationSpaces MS1, MS2, ..., MSn, and remote machines RM1, RM2, ..., RMn]

SLIDE 13

“Complex” Model: Distribution Phase

[Diagram: Initial Machine, DistributionSpace, MM1 ... MMn, MS1 ... MSn, RM1 ... RMn]

SLIDE 14

“Complex” Model: Pre-migration

[Diagram: Initial Machine, DistributionSpace, MM1 ... MMn, MS1 ... MSn, RM1 ... RMn]

SLIDE 15

“Complex” Model: First Migration Phase

[Diagram: Initial Machine, DistributionSpace, MM1 ... MMn, MS1 ... MSn, RM1 ... RMn]

SLIDE 16

“Complex” Model: Subsequent Migration Phases

[Diagram: Initial Machine, DistributionSpace, MM1 ... MMn, MS1 ... MSn, RM1 ... RMn]

SLIDE 17

“Complex” Model: Post-convergence

[Diagram: Initial Machine, DistributionSpace, MM1 ... MMn, MS1 ... MSn, RM1 ... RMn]

SLIDE 18

“Complex” Model Observations

  • Maintains the functionality of the “Simple” model
  • Requires dedicated MigrationMachines and MigrationSpaces
  • An explicit synchronization mechanism greatly reduces the chance that more than one remote machine migrates with the same JavaSpace at the same time
  • Multiple MigrationSpaces slightly reduce the overall diversity that any given remote machine has access to; however, this cost is small compared to the other gains!
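
One way to picture the combination of several spaces and explicit synchronization is sketched below. The talk does not describe the actual mechanism, so all of this is hypothetical: remote machines are assigned to spaces by index modulo the number of spaces, and an in-process `ReentrantLock` per space stands in for whatever distributed coordination a real deployment would need.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Toy sketch: several migration spaces, each guarded by its own lock. */
final class MigrationSpaceSketch {

    private final ReentrantLock[] spaceLocks;

    MigrationSpaceSketch(int spaceCount) {
        spaceLocks = new ReentrantLock[spaceCount];
        for (int i = 0; i < spaceCount; i++) {
            spaceLocks[i] = new ReentrantLock();
        }
    }

    /** Hypothetical static assignment: remote machine i uses space i mod spaceCount. */
    int spaceFor(int remoteMachine) {
        return remoteMachine % spaceLocks.length;
    }

    /**
     * Runs the migration only if the assigned space is free. Returns false
     * otherwise, so the caller can keep evolving instead of waiting in line.
     */
    boolean tryMigrate(int remoteMachine, Runnable migration) {
        ReentrantLock lock = spaceLocks[spaceFor(remoteMachine)];
        if (!lock.tryLock()) {
            return false;
        }
        try {
            migration.run();
            return true;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        MigrationSpaceSketch spaces = new MigrationSpaceSketch(3);
        System.out.println("RM4 uses space " + spaces.spaceFor(4)); // prints: RM4 uses space 1
        boolean migrated = spaces.tryMigrate(4, () -> System.out.println("migrating"));
        System.out.println("migrated: " + migrated); // prints: migrated: true
    }
}
```

The trade-off mirrors the slide: with more spaces, fewer machines contend for each one, but each machine sees only the individuals deposited in its own space.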

SLIDE 19

Experimental Framework

  • Goal: analyze the design and performance of the two models, and then compare the best version to a sequential GA (SGA)
  • Selected an open source GA written in Java that “solves” the Knapsack Problem

– The decision version of the Knapsack Problem is NP-complete

  • Knapsack Problem Statement: Given a set of weights and a knapsack capacity, find the best combination of weights that fits inside the knapsack
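
For the problem as stated above, a fitness function might look like the following sketch. The encoding (one boolean per weight) and the penalty (a fitness of 0 for any selection that overflows the knapsack) are assumptions; the open source GA used in the talk may score candidates differently.

```java
/** Toy fitness function for the weights-only knapsack problem. */
final class KnapsackFitness {

    /**
     * Returns the total selected weight, or 0 if the selection exceeds the
     * capacity (assumed penalty for infeasible candidates).
     */
    static long fitness(int[] weights, boolean[] selected, long capacity) {
        if (weights.length != selected.length) {
            throw new IllegalArgumentException("one gene per weight is required");
        }
        long total = 0;
        for (int i = 0; i < weights.length; i++) {
            if (selected[i]) {
                total += weights[i];
            }
        }
        return total <= capacity ? total : 0;
    }

    public static void main(String[] args) {
        int[] weights = {12, 7, 30, 5};
        long capacity = 40;

        // 12 + 7 + 5 = 24 fits; prints: 24
        System.out.println(fitness(weights, new boolean[] {true, true, false, true}, capacity));
        // 12 + 7 + 30 = 49 overflows; prints: 0
        System.out.println(fitness(weights, new boolean[] {true, true, true, false}, capacity));
    }
}
```

A higher fitness means the knapsack is filled more completely, so the GA is rewarded for approaching the capacity without exceeding it.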

SLIDE 20

Testbench Description

  • 8 test sets of increasing difficulty
  • Range of weight values: 0 – 5000
  • Number of weights: 500 – 1200
  • Number of machines:

– SDSM: {2,4,6,8} (requires RemoteMachines)
– CDSM: {2,4,6,8} (requires RemoteMachines, MigrationMachines, MigrationSpaces)

  • GA parameters:

– Termination condition: best solution remains unchanged for 75 generations
– Crossover: at every generation
– Mutation: at every generation
– Migration: 30% of population every 30 generations, starting at generation 60
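
The termination and migration parameters above can be written down as a small schedule. This is a sketch under stated interpretations: the termination rule is read as "no improvement for 75 generations", migration is taken to occur at generations 60, 90, 120, and so on, and the migrant count is rounded down. The class and method names are illustrative only.

```java
/** Toy encoding of the GA schedule: when to migrate, how many, and when to stop. */
final class GaSchedule {

    static final int STAGNATION_LIMIT = 75;   // generations without improvement
    static final int FIRST_MIGRATION = 60;    // first generation with migration
    static final int MIGRATION_INTERVAL = 30; // generations between migrations
    static final int MIGRATION_PERCENT = 30;  // share of the population that migrates

    /** True at generations 60, 90, 120, ... */
    static boolean isMigrationGeneration(int generation) {
        return generation >= FIRST_MIGRATION
                && (generation - FIRST_MIGRATION) % MIGRATION_INTERVAL == 0;
    }

    /** Number of migrants for a population, rounded down (assumed rounding). */
    static int migrantCount(int populationSize) {
        return populationSize * MIGRATION_PERCENT / 100;
    }

    /** True once the best solution has stayed unchanged for the stagnation limit. */
    static boolean shouldTerminate(int generationsSinceImprovement) {
        return generationsSinceImprovement >= STAGNATION_LIMIT;
    }

    public static void main(String[] args) {
        System.out.println(isMigrationGeneration(60)); // prints: true
        System.out.println(isMigrationGeneration(75)); // prints: false
        System.out.println(migrantCount(50));          // prints: 15
        System.out.println(shouldTerminate(74));       // prints: false
    }
}
```

Because migration starts at generation 60 and termination requires 75 stagnant generations, a run always sees at least one migration before it can stop under this reading.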

SLIDE 21

Measurements and General Observations

  • Execution time: The CDSM reduces the execution time of the DGA compared to the SDSM. Generally, overall execution time increases as we add machines to the CDSM.
  • Computation-to-communication ratio: The CDSM increases this ratio compared to the SDSM. Adding machines to the CDSM reduces this ratio.
  • Diversity: The potential for a higher quality solution increases as we move from the SGA to the CDSM, and then as we add more machines to the CDSM.
  • Quality of Solution: The QoS for the CDSM is always higher than for the SGA. Generally, the QoS is higher in the CDSM as we add machines.
  • Generations per second: The CDSM can compute more Gen/Sec than the SDSM. Generally, adding more machines to the CDSM increases the Gen/Sec.
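
Two of these measurements are simple ratios, sketched below. The slides do not give exact definitions or units, so the formulas are assumptions: the ratio is taken as time spent computing divided by time spent communicating, and times are taken to be in milliseconds.

```java
/** Toy helpers for two of the reported metrics, under assumed definitions. */
final class RunMetrics {

    /** Assumed definition: time spent computing divided by time spent communicating. */
    static double computationToCommunication(long computeMillis, long communicationMillis) {
        if (communicationMillis <= 0) {
            throw new IllegalArgumentException("communication time must be positive");
        }
        return (double) computeMillis / communicationMillis;
    }

    /** Generations completed per second of wall-clock time. */
    static double generationsPerSecond(long generations, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsed time must be positive");
        }
        return generations * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        // Hypothetical timings, not measurements from the talk.
        System.out.println(computationToCommunication(300, 600)); // prints: 0.5
        System.out.println(generationsPerSecond(500, 250_000));   // prints: 2.0
    }
}
```

A ratio below 1 under this definition means a run spends more time communicating than computing, which is the situation the CDSM is meant to improve.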

SLIDE 22

SDSM vs. CDSM: Execution time

[Chart: execution time (axis from 200000 to 2000000) for 2, 4, 6, and 8 machines; series: SDSM, CDSM]

SLIDE 23

SDSM vs. CDSM: Computation-to-Communication Ratio

[Chart: computation-to-communication ratio (axis from 0.1 to 0.9) for 2, 4, 6, and 8 machines; series: SDSM, CDSM]

SLIDE 24

SDSM vs. CDSM: Generations/Second

[Chart: generations per second (axis from 0.5 to 5) for 2, 4, 6, and 8 machines; series: SDSM, CDSM]

SLIDE 25

CDSM vs. SGA: Quality of Solution

[Chart: quality of solution (axis from 10 to 100) for test sets 1 to 8; series: SGA and CDSM with 2, 4, 6, and 8 machines]

SLIDE 26

CDSM vs. SGA: Execution Time

[Chart: execution time (axis from 100000 to 700000) for test sets 1 to 8; series: SGA and CDSM with 2, 4, 6, and 8 machines]

SLIDE 27

CDSM vs. SGA: Computation-to-Communication

[Chart: computation-to-communication ratio (axis from 0.2 to 1.6) for test sets 1 to 8; series: CDSM with 2, 4, 6, and 8 machines]

SLIDE 28

CDSM vs. SGA: Population Diversity

[Chart: population diversity (axis from 500000 to 5000000) for test sets 1 to 8; series: SGA and CDSM with 2, 4, 6, and 8 machines]

SLIDE 29

CDSM vs. SGA: Generations-per-Second

[Chart: generations per second (axis from 1 to 6) for test sets 1 to 8; series: SGA and CDSM with 2, 4, 6, and 8 machines]

SLIDE 30

Future Possibilities: Distributed GA Framework

  • Potential advantages of a DGA framework:

– Could be integrated into existing Java GA frameworks
– Java provides GA portability across operating systems
– Jini and JavaSpaces offer openness, scalability, fault tolerance
– GA developers could easily distribute their GA just to “see what happens”

  • A DGA framework would require an approach for automatically and transparently starting and terminating remote workers
  • Various users should be able to donate their resources; our DGA can make use of “idle time” on various university machines
  • Potentially, we could develop a simple applet for visibility and learning
SLIDE 31

Concluding Remarks

  • Investigated the feasibility of using Jini and JavaSpaces to build a distributed genetic algorithm
  • Proposed, implemented, and empirically evaluated a simple and a complex distributed system model (SDSM and CDSM)
  • The SDSM bottleneck was a serious concern that prompted the investigation of a new model that removed JavaSpaces interaction bottlenecks
  • The CDSM outperformed the SGA in quality of solution, diversity, and generations per second
  • The SGA outperformed the CDSM only in execution time (mostly due to early convergence)