IT2EC 2020 IT2EC Extended Abstract Template Presentation/Panel

DAAPR: The Dapper Way to Debrief Together

Alexander Streit¹

¹ Director of Advanced Technologies, PLEXSYS Interface Products, Orlando, USA

Abstract — Simulation events are increasingly taking place across multiple platforms, sites, and organisations. The Distributed Debrief Control Architecture (DDCA) is the industry standard for distributed debriefing. However, it has a number of issues. We describe the issues and their workarounds in the paper. We have designed Distributed After Action Playback and Review (DAAPR) as a fault-tolerant, easy to implement, and extensible alternative to DDCA. We have produced a prototype implementation and conducted numerous experiments. This paper presents observations and results from our experience developing and testing DAAPR. We invite others to join in implementing and standardising DAAPR for global simulation use.

1 Introduction

The UK Land Forces are designing their training system to enable a “Train, Reflect, Learn, and Train Again (TRLTA)” cycle [1]. Debriefing is a critical part of this cycle because it allows participants in an exercise to reflect on what occurred and to learn from it. Simulation exercises increasingly take place across multiple sites and across different training systems. This makes debriefing difficult, as the tools available differ between systems and often do not coordinate with one another. This is a simulation and training challenge for commercial and defence users alike.

One solution is the Distributed Debrief Control Architecture (DDCA), an architecture for coordinating a distributed debrief [2]. DDCA does not specify a network protocol, and its messages can be transported inside Distributed Interactive Simulation (DIS), High-Level Architecture (HLA), or Test and Training Enabling Architecture (TENA) envelopes. The Boeing Company and others developed DDCA with inspiration drawn from the Distributed Debrief Control Protocol (DDCP), an earlier protocol defined by Boeing [3]. DDCP has been adapted to work on the US Air Force’s Combat Air Force Distributed Mission Operations Network (CAF DMON) [4].

We initially pursued DDCA to meet our need to debrief across heterogeneous environments. However, we encountered several issues, which we describe below.

1.1 Issues with DDCA

There are several issues with the DDCA specification:

• Network instability has a strongly adverse effect on the ability to conduct a debriefing. For example, dropped packets can cause the master application to wait on response messages to seek requests before transitioning to a play state.

• Excess (non-critical) information is transferred, making it problematic for sensitive environments or where multiple levels of security are involved.

• Transfer of replay control is a required feature that is not part of the specification.

• Playback speeds are specified using integers, so fractional speeds cannot be expressed.

1.2 DDCA Benefits

There are several aspects of the DDCA design that are useful in distributed and interconnected use cases. These include:

• Zulu timestamping of message generation time enables dead reckoning and provides a helpful sanity check.

• The use of 64-bit milliseconds from Epoch at Zulu time is an appropriate mechanism for time exchange. A 64-bit signed integer provides an almost 585 million year time range with specificity to the millisecond.

• The SYNC message includes enough information for a client to establish the correct playback point and speed in the presence of lag through dead reckoning.

• DDCA is an architecture and can be embedded inside another protocol, such as DIS or HLA.
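The time-range figure can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of DDCA; the helper names `zulu_ms` and `span_years` are hypothetical, and a Julian year of 365.25 days is assumed. It converts a Zulu datetime to milliseconds since the epoch and computes the total span of a 64-bit millisecond counter, which works out to roughly 584.5 million years (about 292 million years on either side of the epoch for a signed value).

```python
from datetime import datetime, timezone

# Assumption: a Julian year (365.25 days), expressed in milliseconds.
MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000


def zulu_ms(moment: datetime) -> int:
    """Milliseconds since 1970-01-01 00:00:00 Zulu for a timezone-aware datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    delta = moment - epoch
    # Integer arithmetic avoids float rounding for dates far from the epoch.
    return (delta.days * 86_400 + delta.seconds) * 1000 + delta.microseconds // 1000


def span_years(bits: int = 64) -> float:
    """Total span, in years, covered by a millisecond counter of the given width."""
    return (2 ** bits) / MS_PER_YEAR
```

A client could call `zulu_ms(datetime.now(timezone.utc))` to stamp an outgoing message.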

2 Approach

Our initial approach was to adapt DDCA to our needs. This included the following workarounds:

• Respond to LIST commands with a single infinite recording with no description, to maintain privacy controls.

• Ignore LOAD commands, as the data file would always be the single infinite file. The workaround is to perform on-demand loading of data files using the playback time from commands such as SEEK or SYNC.

• Ignore ACKNOWLEDGE messages, allowing state to transition directly between PAUSE, SYNC, and SEEK. Devices would be responsible for catching up using dead reckoning. This reduces the effect of network lag and instability.

• Use the DDCA-X mechanism to add a command to specify fractional playback speeds.

• Use the DDCA-X mechanism to add a “current leader” concept that allows a device to transmit controls to the master application. Devices can request the leadership status.

• Filter ERROR messages to strip the free-form text field, to maintain privacy.

The number of workarounds needed proved to be a significant implementation burden, leading us to reconsider the problem from first principles. The result is the design of the DAAPR protocol.

2.1 Design of the DAAPR protocol

We approached the DAAPR protocol [5] with three main factors in mind: privacy, simplicity, and extensibility. A focus on privacy means minimising exposure of unnecessary data and avoiding variable-length or free-form fields where possible. Simplicity encourages adoption by making the protocol easy to implement and understand. Extensibility allows the system to be expanded and adapted to evolving needs.

Our approach involved an iterative process of prototype development followed by testing and feedback. During each phase, feedback was incorporated back into the design and prototype, which were then retested.

We chose a client-server architecture. A DAAPR server distributes the heartbeat messages it receives from clients. Clients process these messages when they wish to follow, and send these messages to the server when they wish to lead. Clients are responsible for loading their own data files in preparation for debrief.

The heartbeat message is an adaptation of the SYNC message from DDCA. The playback speed is changed from an 8-bit integer to a 32-bit IEEE float and the state field is removed. Thus the new SYNC message contains the time of message generation, the current playback time, and the playback speed. The client must be able to interpret Zulu time in milliseconds from epoch (1970-01-01 00:00:00). Where a client has loaded debriefing files that do not cover the current debriefing time, it shall perform appropriate behaviour, such as on-demand loading or displaying a message indicating that no data is available.

To prevent clashes over leadership of the debriefing session, a leadership token is used. There is only a single leadership token in a playback system. The leadership token can be held by the server, in which case there is no leader, or by a client, in which case that client is leading the playback. A client may have one of three states:

• Leader: the client is leading the debriefing, meaning that it controls the shared time.

• Follower: the client is following the debriefing, meaning that it is observing the shared time.

• Independent: the client is not following the debriefing, meaning that it is controlling its own independent time and is ignoring the shared time.

We considered using a textual format, such as JSON or XML. This would have fit with our other APIs, which are typically REST-like. However, a textual format makes it far more computationally expensive to validate and filter packets. For this reason the message definition is strictly defined and binary. Malformed messages are easily identified. Additionally, the core set of messages explicitly avoids textual fields. This makes deep packet inspection filtering easier.
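To illustrate why a fixed-size binary heartbeat is cheap to validate, the following is a minimal Python sketch. It is not the DAAPR wire format: the field order, the network byte order, the absence of any header, and the names `Heartbeat` and `dead_reckon` are all assumptions made for illustration. Only the three fields and their widths (two 64-bit millisecond times and a 32-bit IEEE float speed) come from the description above.

```python
import math
import struct
from dataclasses import dataclass

# Assumed layout (not taken from the DAAPR specification): network byte order,
# int64 generation time, int64 playback time, float32 playback speed.
_HEARTBEAT = struct.Struct("!qqf")


@dataclass(frozen=True)
class Heartbeat:
    generated_ms: int  # Zulu time the message was generated, ms since epoch
    playback_ms: int   # shared playback time at generation, ms since epoch
    speed: float       # playback speed multiplier (1.0 is real time)

    def encode(self) -> bytes:
        return _HEARTBEAT.pack(self.generated_ms, self.playback_ms, self.speed)

    @classmethod
    def decode(cls, payload: bytes) -> "Heartbeat":
        # A fixed-size message with no text fields is rejected on length alone.
        if len(payload) != _HEARTBEAT.size:
            raise ValueError(f"expected {_HEARTBEAT.size} bytes, got {len(payload)}")
        generated_ms, playback_ms, speed = _HEARTBEAT.unpack(payload)
        if not math.isfinite(speed):
            raise ValueError("playback speed must be finite")
        return cls(generated_ms, playback_ms, speed)


def dead_reckon(hb: Heartbeat, now_ms: int) -> int:
    """Estimate the shared playback time at local Zulu time now_ms."""
    elapsed_ms = now_ms - hb.generated_ms
    return hb.playback_ms + round(elapsed_ms * hb.speed)
```

Under these assumptions every heartbeat is exactly 20 bytes, so a filter can discard anything of a different length without parsing it. A follower that receives a heartbeat late can still call `dead_reckon` with its own Zulu clock to find the current shared playback time.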

3 Results and Discussion

Testing included simulated faults, such as lag spikes and unexpected disconnects. Playback was correctly synchronised in the presence of lag, with the lag being felt only in the response time to commands. In some cases this results in seeking to catch up in time, such as after a delayed response to pause or fast forward commands.

We tested both hard and soft network disconnects. Hard disconnects include physically pulling the network connection such that the TCP FIN messages are not received. This type of disconnect can take up to the TCP timeout to detect. Soft disconnects include forcibly closing the process, causing the host operating system to cleanly close the connection. Under both disconnect conditions the leadership token was correctly handled and the other participants could continue the session.

The system assumes that clients are using the correct time. We deliberately set incorrect times on some clients by disabling NTP and setting the clock manually. This resulted in their debriefing sessions being offset by the clock difference. It was easily corrected by restoring the correct time.

Clients were able to “drop-in drop-out”, joining and leaving sessions without affecting the other participants. The leadership token reverts to the server when the leader leaves the session, allowing another client to become the leader. If another client leaves or joins there is no effect on existing clients.

Some debriefing systems require time to spool up. This is particularly true where many video, audio, and data streams are being played simultaneously. In these instances the client may apply a spool time term to the dead reckoning system. This allows the system to spool up at a future playback time and then wait to play until the shared playback time has caught up.

Some debriefing systems do not support all playback speeds. For example, playback of raw DIS traffic should only be performed at real-time speed (1.0x) to avoid conflicts with DIS dead reckoning algorithms and to prevent overloading DIS clients. In these instances there are multiple techniques that can be applied:

• Play at the nearest supported speed and regularly seek forward/back to catch up. While this results in skipping, it can provide good context for what is happening. A note of caution here is that seeking with a DIS stream must be done carefully, as certain messages, such as transfer of control, cannot be skipped.

• Suspend playback until it returns to a supported speed. This results in a paused stream until a supported speed is set.

• Pause playback and regularly seek forward/back to catch up. Particularly for visual systems, this can be an effective way of communicating the current position without overloading a system that cannot support the playback speed.

• A combination of the above methods. A combination may be appropriate where, for example, at 1/100th speed it suspends playback, but at 100x speed it pauses with seek.

DAAPR itself allows for any speed to be defined within 32-bit floating point precision. It does not specify how the client must accommodate these speeds.
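The techniques above can be sketched as a small client-side policy. This is an illustrative Python sketch only, since DAAPR leaves this behaviour to the client: the names `Fallback`, `local_speed`, `needs_seek`, and `combined_fallback`, and the drift tolerance parameter, are hypothetical.

```python
import math
from enum import Enum
from typing import Sequence


class Fallback(Enum):
    NEAREST_WITH_SEEK = "nearest_with_seek"  # play at nearest speed, seek to catch up
    SUSPEND = "suspend"                      # stay paused until a supported speed is set
    PAUSE_WITH_SEEK = "pause_with_seek"      # stay paused, seek periodically


def local_speed(requested: float, supported: Sequence[float], fallback: Fallback) -> float:
    """Return the speed the local player should actually run at."""
    if any(math.isclose(requested, s) for s in supported):
        return requested
    if fallback is Fallback.NEAREST_WITH_SEEK:
        return min(supported, key=lambda s: abs(s - requested))
    return 0.0  # both remaining fallbacks hold the local player paused


def needs_seek(local_ms: int, shared_ms: int, tolerance_ms: int, fallback: Fallback) -> bool:
    """Decide whether to jump to the dead-reckoned shared playback time."""
    if fallback is Fallback.SUSPEND:
        return False  # a suspended stream does not chase the shared time
    return abs(shared_ms - local_ms) > tolerance_ms


def combined_fallback(requested: float) -> Fallback:
    """One possible combination: suspend when slow, pause with seek when fast."""
    return Fallback.SUSPEND if abs(requested) < 1.0 else Fallback.PAUSE_WITH_SEEK
```

For a raw DIS player that supports only real-time speed, `local_speed(4.0, [1.0], Fallback.NEAREST_WITH_SEEK)` keeps the stream at 1.0x, and `needs_seek` signals when the local time has drifted far enough from the shared time to justify a jump.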

4 Conclusions and Future Work

We required a multi-site, multi-system debriefing protocol that also respected limited release information requirements for commercial and defence customers globally. We first investigated DDCA and found several issues for our use case. We designed DAAPR as a fault-tolerant, easy to implement, and extensible alternative. We have produced a prototype reference implementation and conducted numerous experiments. DAAPR meets our immediate distributed debriefing needs while remaining minimally restrictive.

For future work, we aim to open source the prototype implementation as a reference implementation. We invite interested parties to join in implementing and standardising DAAPR for global simulation use.

Acknowledgements

Peter McThompson, Sr Software Engineer, and Todd DeCosta, Sr Architect at PLEXSYS, have provided valuable feedback and input to this work.

References

[1] ITEC Ltd, “Conference Theme – IT2EC 2020 – Training and education technologies for military and civil protection communities”, [Online] Available: https://www.itec.co.uk/key-themes. [Accessed 12 Oct 2019] (2019)

[2] SISO, “Standard for Distributed Debrief Control Architecture (SISO-STD-015-2016)”, SISO, Inc., Orlando (2016)

[3] R. Pitz and C. Armstrong, “Advanced Distributed Debrief for Joint and Coalition Training”, in Interservice/Industry Training Simulation and Education Conference (I/ITSEC), Orlando (2007)

[4] T. McDermott, L. Ashkar, T. Knight, and R. Pitz, “Distributed Synchronized Playback Protocol and Implementation”, in Interservice/Industry Training Simulation and Education Conference (I/ITSEC), Orlando (2010)

[5] A. Streit, “DAAPR: Distributed Debriefing in Heterogeneous Environments”, Simulation Innovation Workshop, 10-14 Feb, 2020, Orlando (to be published)

Author/Speaker Biographies

Alexander Streit is the Director of Advanced Technologies at PLEXSYS Interface Products, Inc. He received his Ph.D. from the Queensland University of Technology in Brisbane, Australia in 2007. In 2006 he co-founded ImmersaView and led the creation of the VADAAR debriefing software suite. ImmersaView merged with PLEXSYS in 2016. Alexander’s role is to foster innovation and seek out the next challenges in simulation and training.