Distributed Bingo: 17-654 Analysis of Software Artifacts, 18-841 Dependability Analysis of Middleware



SLIDE 1

Distributed Bingo

17-654: Analysis of Software Artifacts
18-841: Dependability Analysis of Middleware

Team 5:

Jack Yao, Bubba Beasley, Kai Liao, Alex Berendeyev

http://www.ece.cmu.edu/~ece846/team5

SLIDE 2

Team Members

Bubba Beasley, Alex Berendeyev, Kai Liao

SLIDE 3

Real-World Demonstration

SLIDE 4

Real-World Characteristics

  • Each player gets a Bingo card to start
  • A player joining mid-game can catch up with knowledge of previous draws
  • The host announces each draw
  • The winning player announces Bingo
  • The host verifies the win
  • The host announces the win

SLIDE 5

Baseline Application

  • A distributed, online version of Bingo
  • The clients pull data from the servers and the servers push data to the clients
  • DB: SQL Server 2000, Windows XP, MSE Cave machine
  • Servers: JBoss (JNDI, JMS) on Linux, ECE Game cluster
  • Clients: Windows, Linux (theoretically anywhere)
      • Automated command-line client
      • Interactive GUI-based client

SLIDE 6

Publish/Subscribe with JMS

[Diagram: two Server nodes and three Client nodes connected through JMS. Key: JMS Connection, Server, Client, JMS.]
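The publish/subscribe pattern named on this slide can be sketched without a JMS provider. The block below is a minimal in-memory stand-in, not the JMS API: the `Topic` class, its `subscribe` and `publish` methods, and the `PubSubSketch` driver are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// In-memory stand-in for a JMS topic (hypothetical; not the javax.jms API).
final class Topic<T> {
    private final List<Consumer<T>> subscribers = new CopyOnWriteArrayList<>();

    // A client registers a callback once; the publisher never needs to know who listens.
    void subscribe(Consumer<T> subscriber) {
        subscribers.add(subscriber);
    }

    // A server pushes one message; every current subscriber receives it.
    void publish(T message) {
        for (Consumer<T> subscriber : subscribers) {
            subscriber.accept(message);
        }
    }
}

final class PubSubSketch {
    // Publishes the given draws to two subscribed clients and returns what each one saw.
    static List<List<Integer>> run(int... draws) {
        Topic<Integer> drawTopic = new Topic<>();
        List<Integer> clientA = new ArrayList<>();
        List<Integer> clientB = new ArrayList<>();
        drawTopic.subscribe(clientA::add);
        drawTopic.subscribe(clientB::add);
        for (int draw : draws) {
            drawTopic.publish(draw);
        }
        return List.of(clientA, clientB);
    }

    public static void main(String[] args) {
        System.out.println(run(17, 42, 88));
    }
}
```

Because the clients only hold a subscription, the publisher can change behind the topic without the clients issuing a new request, which is the property the later fail-over slides rely on.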

SLIDE 7

Baseline Architecture

[Architecture diagram. Labels:]

  • BingoServer: AS, HS, JMS, JNDI
  • BingoClient
  • DBServer
  • Remote methods: Join(), GetSnapshot(), DeclareBingo()
  • Key: Remote Method Call, JMS Connection, DB Call; Server, Client, DBServer

SLIDE 8

From the Real to the App

  • Each player gets a Bingo card to start
  • A player joining mid-game can catch up with knowledge of previous draws

The Client asks the AnsweringServer to join and receives a Bingo card and all previous draws.

  • The host announces each draw

At regular intervals (5 seconds), the HostServer broadcasts the draws via JMS.
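The join and catch-up step could look roughly like the sketch below. It is not the team's code: `BingoGame`, `JoinResult`, `drawNext`, and `join` are hypothetical names, and the card layout (a 5x5 card, column c drawn from 20c+1 to 20c+20, a free center marked 0) is an assumption. The 5-second timer and the JMS broadcast are left out so that the snapshot logic stays visible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the join/catch-up step; not the team's actual classes.
final class BingoGame {
    static final int MAX_NUMBER = 100; // the deck states 100 draws per game
    static final int FREE = 0;         // assumed marker for a free center square
    private final List<Integer> remaining = new ArrayList<>();
    private final List<Integer> drawn = new ArrayList<>();
    private final Random rng;

    BingoGame(long seed) {
        rng = new Random(seed);
        for (int n = 1; n <= MAX_NUMBER; n++) remaining.add(n);
        Collections.shuffle(remaining, rng);
    }

    // HostServer role: draw the next number (the deck broadcasts every 5 seconds).
    // Throws IndexOutOfBoundsException once all 100 numbers have been drawn.
    synchronized int drawNext() {
        int n = remaining.remove(remaining.size() - 1);
        drawn.add(n);
        return n;
    }

    // AnsweringServer role: a joining client gets a card plus a copy of all previous draws.
    synchronized JoinResult join() {
        return new JoinResult(newCard(), new ArrayList<>(drawn));
    }

    // Assumed layout: column c holds distinct numbers from 20c+1..20c+20.
    private int[][] newCard() {
        int[][] card = new int[5][5];
        for (int c = 0; c < 5; c++) {
            List<Integer> column = new ArrayList<>();
            for (int n = 20 * c + 1; n <= 20 * c + 20; n++) column.add(n);
            Collections.shuffle(column, rng);
            for (int r = 0; r < 5; r++) card[r][c] = column.get(r);
        }
        card[2][2] = FREE;
        return card;
    }
}

// Value returned to a joining client.
final class JoinResult {
    final int[][] card;
    final List<Integer> previousDraws;

    JoinResult(int[][] card, List<Integer> previousDraws) {
        this.card = card;
        this.previousDraws = previousDraws;
    }
}
```

A client that joins after two draws receives both of them in `previousDraws`, so it can mark its card before the next broadcast arrives.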

SLIDE 9

From the Real to the App

  • The winning player announces Bingo
  • The host verifies the win
  • The host announces the win

The Client asks the AnsweringServer to verify Bingo.

The AnsweringServer verifies the win and stores the winner's ID in the database.

The HostServer checks the DB for a winner and broadcasts that there is a win.
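The verification step might be sketched as follows. The deck does not list the winning patterns, so this assumes any full row, column, or diagonal wins and that a free center is encoded as 0. `BingoVerifier` and `isBingo` are hypothetical names, and storing the winner's ID in the database is omitted.

```java
import java.util.Set;

// Hypothetical verification helper; the winning patterns are an assumption
// (any full row, column, or diagonal), since the deck does not list them.
final class BingoVerifier {
    static final int FREE = 0; // assumed marker for a free center square

    // Returns true if the square card has a fully marked row, column, or diagonal.
    static boolean isBingo(int[][] card, Set<Integer> drawn) {
        int n = card.length;
        boolean diagonal = true;
        boolean antiDiagonal = true;
        for (int i = 0; i < n; i++) {
            boolean row = true;
            boolean column = true;
            for (int j = 0; j < n; j++) {
                row &= marked(card[i][j], drawn);
                column &= marked(card[j][i], drawn);
            }
            if (row || column) {
                return true;
            }
            diagonal &= marked(card[i][i], drawn);
            antiDiagonal &= marked(card[i][n - 1 - i], drawn);
        }
        return diagonal || antiDiagonal;
    }

    // A square counts as marked if it is the free square or its number was drawn.
    private static boolean marked(int value, Set<Integer> drawn) {
        return value == FREE || drawn.contains(value);
    }
}
```

Checking the claim on the server against the server's own record of draws, rather than trusting the client, mirrors the real-world step in which the host verifies the win.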

SLIDE 10

GUI Interface

Java GUI on top of a Java command-line interface

SLIDE 11

Miscellany

  • Each game: 100 draws, rather than 75
  • Approximately 1.4 × 10^30 card combinations (1.4 million trillion trillion cards)
  • Duplicate cards are not a problem, so theoretically there is no limit on the number of players in a single game
  • No guarantee of fairness in declaring a winner
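The deck does not say how the card count was derived. One layout that reproduces a figure of about 1.4 × 10^30, assumed here rather than taken from the slides, is 100 numbers split into five columns of 20, a 5x5 card, and a free center square. `CardCount` and its methods are hypothetical names.

```java
import java.math.BigInteger;

// Counts distinct cards under an assumed layout: 100 numbers split into
// 5 columns of 20, a 5x5 card, and a free center square.
final class CardCount {
    // Ordered selections of k distinct values out of n: n * (n-1) * ... * (n-k+1).
    static BigInteger permutations(int n, int k) {
        BigInteger result = BigInteger.ONE;
        for (int i = 0; i < k; i++) {
            result = result.multiply(BigInteger.valueOf(n - i));
        }
        return result;
    }

    static BigInteger cardCount() {
        BigInteger fullColumn = permutations(20, 5);   // four columns hold 5 numbers each
        BigInteger centerColumn = permutations(20, 4); // the middle column has a free square
        return fullColumn.pow(4).multiply(centerColumn);
    }

    public static void main(String[] args) {
        System.out.println(cardCount()); // a 31-digit number, about 1.39 x 10^30
    }
}
```

Under the same assumptions a 75-number game with columns of 15 yields a much smaller count, which is consistent with the slide contrasting 100 draws with 75.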

SLIDE 12

Fault Tolerance Goals (1)

Server Faults

  • JBoss process crashes
  • Machine crashes

Network Faults

"Sacred Machine" Assumptions

  • Replication Manager
  • Global Fault Detector
  • DB Server

Replicas

  • N replicas
  • Tested with 3 replicas
  • Round-robin recovery of JBoss servers
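Round-robin recovery across N replicas can be illustrated with a small selector. This is a sketch under assumptions, not the team's Replication Manager: `RoundRobinRecovery`, `primary`, and `failOver` are hypothetical names, and starting or stopping a JBoss process is left out.

```java
import java.util.List;

// Hypothetical replication-manager helper: picks the next replica in
// round-robin order when the current primary is reported as failed.
final class RoundRobinRecovery {
    private final List<String> replicas;
    private int primaryIndex = 0;

    RoundRobinRecovery(List<String> replicas) {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("need at least one replica");
        }
        this.replicas = List.copyOf(replicas);
    }

    // The replica currently acting as primary.
    String primary() {
        return replicas.get(primaryIndex);
    }

    // Called when the fault detector reports the primary as down;
    // advances to the next replica and wraps around after the last one.
    String failOver() {
        primaryIndex = (primaryIndex + 1) % replicas.size();
        return replicas.get(primaryIndex);
    }
}
```

With three replicas, as in the team's tests, three consecutive faults bring the primary role back to the first machine, which by then is expected to have been restarted.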

SLIDE 13

FT Baseline Architecture

[Architecture diagram. Labels:]

  • BingoServer (two shown): AS, HS, JMS, JNDI
  • BingoClient (three shown)
  • DBServer
  • RepMan
  • FD (HealthChecker class)
  • One JBoss server up
  • JBoss servers: n replicas are not running
  • One connection to the primary JMS server
  • Key: Remote Method Call, JMS Connection, DB Call, SSH Connection; Server, Client, DBServer

SLIDE 14

Baseline Fail-Over Measurements

[Chart: FT Baseline Fail-Over. Axes: Number of faults injected, Time (seconds). Legend: FD, SN. Annotations: Slow Server, Fast Server.]

SLIDE 15

Baseline Fail-Over Measurements

[Chart: FT Baseline Client. Axes: Number of faults injected, Time (seconds). Legend: Time between messages. Annotations: Slow Server, Fast Server.]

SLIDE 16

RT-FT Optimized 1 Architecture

[Architecture diagram. Labels:]

  • BingoServer (two shown): AS, HS, JMS, JNDI
  • BingoClient (three shown)
  • DBServer
  • RepMan
  • FD (HealthChecker class)
  • All JBoss servers on replicas are up
  • Pre-established connections to all JMS servers
  • Key: Remote Method Call, JMS Connection, DB Call, SSH Connection; Server, Client, DBServer

SLIDE 17

RT Optimized 1 Fail-Over Measurements

[Chart: RT Optimization 1 Fail-Over. Axes: Number of faults injected, Time (seconds). Legend: FD, SN. Annotations: Slow Server, Fast Server.]

SLIDE 18

RT-FT Optimized 2 Architecture

[Architecture diagram. Labels:]

  • BingoServer (two shown): AS, HS, JMS, JNDI
  • BingoClient (three shown)
  • DBServer
  • RM__FD
  • FD runs as a daemon
  • Key: Remote Method Call, JMS Connection, DB Call, SSH Connection; Server, Client, DBServer
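A fault detector that runs as a daemon, as this architecture describes, could be structured along the following lines. This is only a sketch: `HealthCheckerDaemon`, `tick`, and `start` are hypothetical names, the probe is a placeholder for whatever liveness check the team used, and the callback stands in for notifying the Replication Manager. The measurements for this optimization quote a HealthCheckerDaemon period of 0.5 s.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical periodic fault detector; the probe and callback are placeholders.
final class HealthCheckerDaemon {
    private final BooleanSupplier probe; // returns true while the server is healthy
    private final Runnable onFault;      // e.g. notify the replication manager
    private boolean faultReported = false;

    HealthCheckerDaemon(BooleanSupplier probe, Runnable onFault) {
        this.probe = probe;
        this.onFault = onFault;
    }

    // One detection cycle; reports a fault once per healthy-to-failed transition.
    synchronized void tick() {
        boolean healthy = probe.getAsBoolean();
        if (healthy) {
            faultReported = false;
        } else if (!faultReported) {
            faultReported = true;
            onFault.run();
        }
    }

    // Runs tick() on a daemon thread at a fixed period, for example 500 ms.
    ScheduledExecutorService start(long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "health-checker");
            thread.setDaemon(true);
            return thread;
        });
        scheduler.scheduleAtFixedRate(this::tick, 0, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

With a periodic check of this kind, a crash that happens just after a probe is not noticed until the next one, so the period bounds the detection delay contributed by the daemon. That is one reason the later slides treat the periods as tuning parameters.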

SLIDE 19

RT Optimized 2 Fail-Over Measurements

[Chart: RT Optimization 2 Fail-Over (Repman = 1 s, HealthCheckerDaemon = 0.5 s). Axes: Number of faults injected, Time (seconds). Legend: FD, SN.]

SLIDE 20

RT-FT Optimized 3 Architecture

[Architecture diagram. Labels:]

  • BingoServer (two shown): AS, HS, JMS, JNDI
  • BingoClient (three shown)
  • DBServer
  • Local FD
  • RM__FD
  • FD runs as a daemon + Local FD
  • Key: Remote Method Call, JMS Connection, DB Call, SSH Connection; Server, Client, DBServer

SLIDE 21

RT Optimized 3 Fail-Over Measurements

[Chart: RT Optimization 3 Fail-Over (Repman = 0.1 s, Local Checker = 0.4 s). Axes: Number of faults injected, Time (seconds). Legend: FD, SN.]

SLIDE 22

Measurements Insights (1)

[Chart: FD (seconds) against Repman Period (seconds) and Local Checker Period (seconds). Series: MAX, AVG, and MIN at 0.4, 0.7, and 1.0.]

SLIDE 23

Measurements Insights (2)

[Chart: Average Fault Detection Time Comparison. Axes: Repman Period (seconds), Local FD Period (seconds). Series: 0.4 (AVG), 0.7 (AVG), 1.0 (AVG). Annotations: Local FD = 1.0 s, Repman = 0.1 s; Local FD = 0.7 s, Repman = 1.0 s; FD = 3.1 s.]

Real Time Tuning:

  • Repman period
  • Local FD period
  • Broadcast period

SLIDE 24

Lessons Learned

  • Publish/Subscribe (JMS) can potentially hide server errors from clients
      • Our push architecture: FD, start new replica
      • Typical pull architecture: FD, start new replica, reconnection
  • The JBoss server should be run in the minimal configuration (the default configurations are not suited for RT)

SLIDE 25

Questions?