SLIDE 1

Out-of-band Flow Control for Reliable Multicast

00S-SIW-071

Harry Wolfson

<HarryWolfson@LL.MIT.EDU>

MIT Lincoln Laboratory

March 30, 2000

SLIDE 2

Outline

  • Introduction
  • Time Management Issues
  • Review of RTI 1.3 Multicast
  • Throughput Processing Imbalance
  • Flow Control Design in RTI 1.3
  • Summary

SLIDE 3

Introduction

  • RTI Reliable Multicast began with the STOW Program
    – Design emphasis on low-latency, high-throughput performance
    – Large buffers accommodated bursty traffic
    – “Last resort” / anomalous behavior
      ◦ Drop messages instead of locking up

SLIDE 4

Introduction (cont)

  • RTI 1.3 Reliable Multicast required:
    – 100% reliability
      ◦ No dropped messages
    – Adherence to tick(min, max)
      ◦ Limited time permitted for processing messages
      ◦ Receive Queues added for temporary storage (see the sketch below)
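
A minimal sketch of the idea behind tick(min, max) and the Receive Queue: drain queued messages only for as long as the tick time budget allows, and leave the rest queued for a later tick. The class and names (ReceiveQueue, processMessage, Message) are illustrative assumptions, not the RTI 1.3 API.

```cpp
#include <chrono>
#include <deque>
#include <iostream>
#include <string>

// Illustrative only -- not the RTI 1.3 API.
struct Message { std::string payload; };

class ReceiveQueue {
public:
    void enqueue(Message m) { queue_.push_back(std::move(m)); }

    // Deliver queued messages to the federate, but stop once maxSeconds of
    // wall-clock time has been spent; anything left stays queued.
    std::size_t drain(double maxSeconds) {
        using clock = std::chrono::steady_clock;
        const auto deadline = clock::now() +
            std::chrono::duration_cast<clock::duration>(
                std::chrono::duration<double>(maxSeconds));
        std::size_t delivered = 0;
        while (!queue_.empty() && clock::now() < deadline) {
            processMessage(queue_.front());   // hand one message to the federate
            queue_.pop_front();
            ++delivered;
        }
        return delivered;
    }

    std::size_t depth() const { return queue_.size(); }

private:
    void processMessage(const Message& m) {
        std::cout << "delivered: " << m.payload << '\n';
    }
    std::deque<Message> queue_;
};

int main() {
    ReceiveQueue rq;
    for (int i = 0; i < 5; ++i) rq.enqueue({"update " + std::to_string(i)});
    rq.drain(/*maxSeconds=*/0.002);           // the tick(min, max) time budget
    std::cout << rq.depth() << " messages remain queued\n";
}
```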

SLIDE 5

Time Management Issues

  • Flow Control has no impact on Time Managed Federations
    – Wall-clock time might slow down / pause
  • Real Time Federations (Hardware in the Loop)
    – People responsible for planning the Execution need to provide adequate communication / computational resources to match the Scenario

SLIDE 6

Review of RTI 1.3 Multicast

  • Multicast based on Reliable Distributor
    – Client / Server
    – Similar to “Exploder”
  • No “production / release” quality Reliable Multicast available during STOW and 1.3 development
    – Many-to-many multicast
    – Dynamic Join / Leave
    – Support DDM Interest Management

SLIDE 7

Review of RTI 1.3 Multicast (cont)

  • Built on top of TCP / IP
    – Sequential message delivery
    – Reliable point-to-point message transfer
  • TCP does NOT provide Application-to-Application Flow Control
  • Robust, fault tolerant
  • Interest Filtering (DDM) implemented at sender, exploder, and receiver

SLIDE 8

Review of RTI 1.3 Multicast (cont)

  [Figure: Federates on LAN #1, LAN #2, and LAN #3, each connected by a TCP client (tcp) through its RTI to a Reliable Distributor (RD), with the Reliable Distributors interconnected across the LANs]

  RD = Reliable Distributor Server (i.e., Exploder)
  tcp = TCP Connection Client

SLIDES 9–11

Review of RTI 1.3 Multicast (cont)

  [Figure: the same Federate / Reliable Distributor topology and legend as Slide 8, repeated]

SLIDE 12

TCP Alone is NOT Sufficient

  • TCP / IP does NOT provide true Application-to-Application Flow Control
    – A single connected pair would require a “Blocking Send”
    – The Exploder breaks any end-to-end flow control provided by TCP

  [Figure: four Federate / RTI nodes connected through a central Exploder]

SLIDE 13

Throughput Imbalance (a)

  • Many Federation Scenarios can lead to an Overloaded Network

  [Figure: a high-speed processor Federate sending high-volume data to a slow-processor Federate]

SLIDE 14

Throughput Imbalance (b)

  • Many Federation Scenarios can lead to an Overloaded Network

  [Figure: a single Federate receiving data from numerous high-volume Federates]

SLIDE 15

Throughput Imbalance (c)

  • Many Federation Scenarios can lead to an Overloaded Network

  [Figure: a receiving Federate on LAN #2 at the end of a slow network link from sending Federates on LAN #1]

SLIDE 16

Throughput Imbalance (d)

  • An Overloaded Execution is the typical scenario without “Application to Application” Flow Control
    – Receiver can’t process incoming messages
    – Buffers in LRCs and the kernel begin to fill up
    – Federates, or the entire Execution, slow to a crawl as Federates try to clear buffers
    – Deadlock / deadly embrace

SLIDE 17

Flow Control Design in RTI 1.3

  • Regulate message throughput level
    – Prevent Federates from sending new data based on RTI internal state
      ◦ Squelch / ClearToSend
      ◦ LRC grabs control of tick from the Federate (see the sketch below)
    – Hysteresis in the system prevents “thrashing”
  • Out-of-band handshake protocol
    – Full TCP buffers do not impede the protocol
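
A minimal sketch of the send-side behavior described above, assuming a hypothetical Lrc class (not the RTI 1.3 API): while the LRC is in Squelch Mode, the federate's send call does not post new data; instead the LRC keeps ticking internally until a ClearToSend arrives.

```cpp
#include <functional>
#include <iostream>
#include <string>

// Hypothetical LRC wrapper -- illustrative only, not the RTI 1.3 API.
class Lrc {
public:
    explicit Lrc(std::function<bool()> pollClearToSend)
        : clearToSend_(std::move(pollClearToSend)) {}

    void setSquelched(bool s) { squelched_ = s; }

    // Federate-visible send: if the LRC is squelched, "grab control of tick"
    // by servicing the RTI internally until ClearToSend, then transmit.
    void sendMessage(const std::string& payload) {
        while (squelched_) {
            internalTick();                  // keep servicing the RTI
            if (clearToSend_()) squelched_ = false;
        }
        transmit(payload);
    }

private:
    void internalTick() { /* process incoming and out-of-band messages */ }
    void transmit(const std::string& p) { std::cout << "sent: " << p << '\n'; }

    std::function<bool()> clearToSend_;
    bool squelched_ = false;
};

int main() {
    int polls = 0;
    // Pretend ClearToSend arrives after three internal ticks.
    Lrc lrc([&polls] { return ++polls >= 3; });
    lrc.setSquelched(true);
    lrc.sendMessage("attribute update");     // ticks until CTS, then sends
}
```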

SLIDE 18

Monitor Internal Queues

  • Each Federate’s LRC has two message queues
    – Receive Queue stores messages:
      ◦ After tick(min, max) expires
      ◦ During save / restore
    – Send Queue stores messages:
      ◦ After tick(min, max) expires
      ◦ As remote LRCs’ buffers fill across the Execution

SLIDE 19

Monitor Internal Queues (cont)

  • Reliable Distributor message queues
    – One Send Queue for each Federate Client
    – One Send Queue for each remote Reliable Distributor
    – Stores messages if clients’ buffers are full
  • Squelch Mode activated when any Queue exceeds its threshold (see the sketch below)
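
A minimal sketch of the queue monitoring described above, using hypothetical names (SendQueue, QueueMonitor) rather than the actual RTI 1.3 types: the Reliable Distributor keeps one send queue per client and per remote distributor and activates Squelch Mode as soon as any of them exceeds its threshold.

```cpp
#include <deque>
#include <iostream>
#include <map>
#include <string>

// Hypothetical types -- illustrative only, not the RTI 1.3 implementation.
using SendQueue = std::deque<std::string>;

class QueueMonitor {
public:
    explicit QueueMonitor(std::size_t squelchThreshold)
        : threshold_(squelchThreshold) {}

    SendQueue& queueFor(const std::string& peer) { return queues_[peer]; }

    // Squelch Mode is activated when ANY queue exceeds the threshold.
    bool shouldSquelch() const {
        for (const auto& entry : queues_) {
            if (entry.second.size() > threshold_) return true;
        }
        return false;
    }

private:
    std::size_t threshold_;
    std::map<std::string, SendQueue> queues_;   // one per client / remote RD
};

int main() {
    QueueMonitor monitor(/*squelchThreshold=*/3);
    monitor.queueFor("federate-A").push_back("msg");
    for (int i = 0; i < 5; ++i)                 // a slow client backs up
        monitor.queueFor("federate-B").push_back("msg");
    std::cout << std::boolalpha << monitor.shouldSquelch() << '\n';  // true
}
```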

SLIDE 20

Out-of-Band Messaging Link

  • Squelch / ClearToSend internal state communicated between RTI pairs
    – Federate to Reliable Distributor
    – Reliable Distributor to Reliable Distributor
  • UDP “point to point” link between pairs (see the sketch below)
    – Best effort
    – Not subject to TCP congestion
    – Heartbeat Status (fail-safe time out)
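
A minimal POSIX-socket sketch of the out-of-band idea, assuming a hypothetical one-byte status message ('S' for Squelch, 'C' for ClearToSend) and a hypothetical port: status travels on a separate best-effort UDP socket, so it gets through even when the TCP data connections are backed up, and a receiver that hears no heartbeat within the fail-safe timeout treats the silence as Squelch.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>
#include <cstdio>
#include <iostream>

// Illustrative only: a one-byte out-of-band status ('S' = Squelch,
// 'C' = ClearToSend) on a UDP socket parallel to the TCP data connection.
int main() {
    const int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0) { perror("socket"); return 1; }

    sockaddr_in peer{};
    peer.sin_family = AF_INET;
    peer.sin_port = htons(9999);                       // hypothetical port
    inet_pton(AF_INET, "127.0.0.1", &peer.sin_addr);

    // Best-effort send of the current status; no retransmission, no TCP queueing.
    const char status = 'C';
    sendto(sock, &status, 1, 0,
           reinterpret_cast<const sockaddr*>(&peer), sizeof(peer));

    // Fail-safe: if no heartbeat arrives within the timeout, assume Squelch.
    timeval timeout{};
    timeout.tv_sec = 2;                                // hypothetical heartbeat timeout
    setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout));

    char incoming = 0;
    const ssize_t n = recvfrom(sock, &incoming, 1, 0, nullptr, nullptr);
    const bool clearToSend = (n == 1 && incoming == 'C');
    std::cout << (clearToSend ? "ClearToSend" : "Squelch (status missing or explicit)")
              << '\n';
    close(sock);
}
```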

SLIDE 21

Out-of-Band Link (cont)

  • UDP link in parallel to every TCP connection

  [Figure: Federate / RTI nodes connected to the Reliable Distributor by paired TCP and UDP links]

SLIDE 22

Squelch Mode

  • A Federate’s LRC enters Squelch Mode if (see the sketch below):
    – Rcv Msg Queue over threshold, or
    – Snd Msg Queue over threshold, or
    – a Remote Squelch message is received
      ◦ or: ClearToSend times out

  [Figure: a Federate’s LRC with its Snd and Rcv Msg Queues, the TCP connection, and the Remote ClearToSend input]
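
A minimal sketch of the entry conditions listed above, combined into one predicate; the names and thresholds are hypothetical, not taken from the RTI 1.3 code.

```cpp
#include <chrono>
#include <iostream>

// Hypothetical snapshot of an LRC's state -- illustrative only.
struct LrcState {
    std::size_t rcvQueueDepth = 0;
    std::size_t sndQueueDepth = 0;
    bool remoteSquelchReceived = false;
    std::chrono::steady_clock::time_point lastClearToSend;
};

// Enter Squelch Mode if either queue is over threshold, a remote Squelch
// arrived, or the periodic ClearToSend heartbeat has timed out.
bool shouldEnterSquelch(const LrcState& s,
                        std::size_t queueThreshold,
                        std::chrono::milliseconds ctsTimeout) {
    const bool ctsTimedOut =
        std::chrono::steady_clock::now() - s.lastClearToSend > ctsTimeout;
    return s.rcvQueueDepth > queueThreshold ||
           s.sndQueueDepth > queueThreshold ||
           s.remoteSquelchReceived ||
           ctsTimedOut;
}

int main() {
    LrcState state;
    state.lastClearToSend = std::chrono::steady_clock::now();
    state.sndQueueDepth = 500;                    // a backed-up send queue
    std::cout << std::boolalpha
              << shouldEnterSquelch(state, /*queueThreshold=*/256,
                                    std::chrono::milliseconds(2000))
              << '\n';                            // true
}
```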

SLIDE 23

Squelch Mode (cont)

  • A Reliable Distributor enters Squelch Mode if, for any of its Clients:
    – the Snd Msg Queue is over threshold
    – a Remote Squelch message is received
      ◦ or: ClearToSend times out

  [Figure: a Reliable Distributor with per-client TCP connections, Snd Msg Queues, and the Remote ClearToSend input]

SLIDE 24

Squelch Mode (cont)

  • Federate prevented from sending new messages when its LRC is in Squelch Mode
    – LRC grabs control of tick until ClearToSend
  • Built-in hysteresis / latency prevents “thrashing” (see the sketch below)

  [Figure: on/off Squelch Threshold buffer states for the Receive Message Queue and the Send Message Queue(s), driven by Remote RTI Squelch and ClearToSend (CTS / SQ) messages]
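
A minimal sketch of the hysteresis idea for one queue, with hypothetical high/low watermarks: Squelch is asserted when the depth climbs past a high threshold but is not cleared until it falls below a lower one, so the state cannot toggle on every message.

```cpp
#include <iostream>

// Hypothetical hysteresis band for one message queue -- illustrative only.
class SquelchHysteresis {
public:
    SquelchHysteresis(std::size_t highWater, std::size_t lowWater)
        : high_(highWater), low_(lowWater) {}

    // Update with the current queue depth; the squelch state only changes
    // when the depth crosses the high watermark (on) or the low watermark
    // (off), which prevents rapid on/off "thrashing" near a single threshold.
    bool update(std::size_t depth) {
        if (!squelched_ && depth > high_) squelched_ = true;
        else if (squelched_ && depth < low_) squelched_ = false;
        return squelched_;
    }

private:
    std::size_t high_, low_;
    bool squelched_ = false;
};

int main() {
    SquelchHysteresis h(/*highWater=*/100, /*lowWater=*/25);
    std::cout << std::boolalpha;
    std::cout << h.update(120) << '\n';   // true  -- over the high watermark
    std::cout << h.update(60)  << '\n';   // true  -- still squelched in the band
    std::cout << h.update(10)  << '\n';   // false -- cleared below the low watermark
}
```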

SLIDE 25

Summary

  • End-to-end, “Application to Application”, coordinated Flow Control implemented in RTI 1.3v7
  • Prevents new messages from being sent when slower Federates can’t keep up
    – Allows sustainable message throughput

SLIDE 26

Summary (cont)

  • Demonstrated system-wide improvement in large Simulations
    – JTC, JTLS
      ◦ 9–11 Federates, ~15,000 objects
    – “Reduced” Load Test
      ◦ 5 Federates; 10,000 objects; updated every tick
  • Time Managed Simulations not impacted
  • Real Time requires adequate resources