timePlot: Text mining support for analysis of aeronautical incident reports


Showcase of Electronic Tools (SET13), September 23rd 2013. Safety-Data by CFH Group.


SLIDE 1

Safety-Data by CFH Group

timePlot

Text mining support for analysis of aeronautical incident reports

September 23rd 2013

Showcase of Electronic Tools (SET13)

SLIDE 2

Outline

  • I. Partners and clients
  • II. Context and Stakes
  • III. Proposed approach
  • IV. Demo
  • V. timePlot analysis platform
  • VI. Other services
  • VII. About us

SLIDE 3

I - Partners

  • Tools designed and tuned in close collaboration with experts in the field of aviation safety:
    – Administrations: ICAO, KOTSA, DSNA (ATC)
    – Manufacturers: ATR, Airbus
  • The timePlot platform was designed in collaboration with:
    – DGAC (tool currently deployed and in service; 200 users)
    – Air France (currently in testing and validation)
    – EASA (ECCAIRS integration, currently in testing)

SLIDE 4

II - Context & Stakes

  • Context: incident report databases
    – Large and constantly growing repositories of safety-related data.
    – Much of the information is still in free-text form.
    – Coding/classification-based strategies are complex and reductive.
    – Processing this information by human experts is time-consuming and costly.

SLIDE 5

II - Context & Stakes

  • The stakes:
    – How to make sense of the data as a whole?
    – How to fully process information-rich content?
    – How to identify and extract recurrent events without relying on coded data?
    – How to visualize specific, rare or complex situations?
  • Our approach: text mining tools based on linguistic analysis of all textual content in safety reports.

SLIDE 6

III - Proposed approach

  • Issue: use text-mining techniques to process and organize narrative data in incident reports.
  • Basic principle: model inter-occurrence similarity based on the narrative content, and visualize it as a function of time.
    – Content similarity: occurrences describing the same phenomenon.
    – Context similarity: events occurring in similar contexts (airport, aircraft model, route), based on available coded data.
    – Custom factors: occurrences sharing similar causes (crew fatigue, situational awareness…).
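The slides do not say how content similarity is computed. As a rough illustration only, the sketch below scores narratives against a pivot report with TF-IDF weights and cosine similarity, a common text-mining baseline that stands in for the linguistic analysis mentioned above. The function names and the sample narratives are hypothetical and are not taken from timePlot.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a narrative and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(narratives):
    """Return one sparse TF-IDF vector (a dict) per narrative."""
    token_lists = [tokenize(n) for n in narratives]
    n_docs = len(token_lists)
    # Document frequency: number of narratives containing each term.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed IDF keeps every weight strictly positive.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_similar(pivot_index, narratives):
    """Rank all other narratives by decreasing similarity to the pivot."""
    vectors = tfidf_vectors(narratives)
    pivot = vectors[pivot_index]
    scored = [
        (cosine(pivot, vec), idx)
        for idx, vec in enumerate(vectors)
        if idx != pivot_index
    ]
    return sorted(scored, reverse=True)


# Hypothetical narratives, invented for illustration only.
NARRATIVES = [
    "Bird strike on takeoff, engine vibration, returned to the airport.",
    "Engine ingested a bird during takeoff roll, rejected takeoff.",
    "Passenger became ill in cruise, crew diverted for medical assistance.",
]
```

Calling `rank_similar(0, NARRATIVES)` returns `(score, index)` pairs with the second bird-related narrative ranked first, since the third narrative shares no word with the pivot. Context similarity and custom factors would need the coded fields and are not covered by this sketch.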

SLIDE 7

IV - Demo

  • Online demo with an ASRS dataset:
    – From 1988 to 2012,
    – About 167,000 occurrences.

https://services.safety-data.com/timeplot/prod/asrs/login.php

Login: demouser
Password: demoASRS2013

SLIDE 8

V.1 - Overview

  • Example: for a given report, visualization of similar reports through time
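To make the idea of viewing similar reports through time concrete, here is a minimal sketch that keeps the reports whose similarity to the pivot reaches a threshold and counts them per year. The threshold, the yearly granularity, the text rendering, and the sample data are all assumptions chosen for illustration; the actual timePlot display is not described in the slides.

```python
from collections import Counter
from datetime import date


def yearly_counts(scored_reports, threshold):
    """Count reports per year whose similarity score reaches the threshold."""
    counts = Counter(
        report_date.year
        for report_date, score in scored_reports
        if score >= threshold
    )
    return dict(sorted(counts.items()))


def text_timeline(counts):
    """Render yearly counts as one text bar per year."""
    return "\n".join(f"{year} {'#' * n} ({n})" for year, n in counts.items())


# Hypothetical (date, similarity to pivot) pairs, invented for illustration.
SCORED = [
    (date(1995, 3, 2), 0.81),
    (date(1995, 7, 19), 0.64),
    (date(2001, 11, 5), 0.12),
    (date(2008, 1, 30), 0.77),
]
```

With a threshold of 0.5, `yearly_counts(SCORED, 0.5)` gives `{1995: 2, 2008: 1}`, which `text_timeline` renders as two bars.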

SLIDE 9

V.2 - Features (1): Select the pivot report

  • Selection of the reference report in the reference database:
    – Search by query (keywords, type of aircraft, date, etc.),
    – Via report id.
  • Report given by the user via a free text field.

SLIDE 10

V.2 - Features (2): Visualization

  • “Post-similarity” ordering

SLIDE 11

V.2 - Features (3): Analysis assistance

  • Selection of reports via customized factors

SLIDE 12

V.2 - Features (4): Alerts

  • Principle: automatic analysis of incoming reports in order to identify specific incidents.
  • Alert selection:
    – via a report describing an incident that one is looking to trace;
    – via keywords;
    – via the frequency of reports received for a particular type of equipment.
  • Basics:
    – When the database is updated, a similarity analysis is automatically run on the new reports to identify those matching the configured alerts;
    – The user receives an alert message indicating the alert concerned and the corresponding reports.
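The three alert types listed above can be sketched as a single check run on each batch of new reports. This is an assumption-laden illustration, not the timePlot implementation: the dictionary layout of alerts and reports, the field names (`kind`, `threshold`, `min_count`, `equipment`), and the use of plain Jaccard word overlap as the similarity measure are all hypothetical choices.

```python
import re


def tokens(text):
    """Set of lowercase word tokens in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def check_alerts(new_reports, alerts):
    """Return {alert name: [ids of matching new reports]} for one update batch."""
    hits = {}
    for alert in alerts:
        if alert["kind"] == "report":
            # Alert defined by a reference report: match on narrative overlap.
            ref = tokens(alert["text"])
            matched = [r["id"] for r in new_reports
                       if jaccard(ref, tokens(r["narrative"])) >= alert["threshold"]]
        elif alert["kind"] == "keywords":
            # Alert defined by keywords: every keyword must appear.
            wanted = {k.lower() for k in alert["keywords"]}
            matched = [r["id"] for r in new_reports
                       if wanted <= tokens(r["narrative"])]
        elif alert["kind"] == "frequency":
            # Alert on volume: fire only when enough reports share the equipment.
            same = [r["id"] for r in new_reports
                    if r["equipment"] == alert["equipment"]]
            matched = same if len(same) >= alert["min_count"] else []
        else:
            raise ValueError(f"unknown alert kind: {alert['kind']}")
        if matched:
            hits[alert["name"]] = matched
    return hits


# Hypothetical batch of new reports and alert definitions.
NEW_REPORTS = [
    {"id": "R1", "equipment": "ATR 72",
     "narrative": "Smoke in the cabin after takeoff, crew returned to land."},
    {"id": "R2", "equipment": "ATR 72",
     "narrative": "Hydraulic leak found during preflight inspection."},
    {"id": "R3", "equipment": "A320",
     "narrative": "Smoke smell reported in the cabin during cruise."},
]
ALERTS = [
    {"name": "cabin smoke", "kind": "keywords", "keywords": ["smoke", "cabin"]},
    {"name": "ATR 72 volume", "kind": "frequency",
     "equipment": "ATR 72", "min_count": 2},
    {"name": "like R1", "kind": "report",
     "text": "Smoke in cabin shortly after takeoff", "threshold": 0.3},
]
```

The returned dictionary plays the role of the alert message: it names each alert that fired and lists the corresponding reports. In this sketch the frequency alert counts reports within one batch only; counting over a time window would need report dates.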

SLIDE 13

V.3 - timePlot platform description

  • Hosted web service (accessible over the internet, used in a web browser, with login via a user management module).
  • Available for English and French data, more languages to come.
  • Data exchange interfaces with other environments (ECCAIRS, ASRS-like databases, custom in-house solutions).
  • User-side integration with the ECCAIRS Browser.

SLIDE 14

VI.1 - Other services (1)

  • Data migration:
    – Conversion of any source data format (doc, xls, xml…) to the format of a target database (e.g. ECCAIRS e4f/e5f).
    – Taxonomy migration (e.g. ASRS → ADREP).
  • Database quality & coherence analysis:
    – Factual data completion based on the narratives and/or external resources (e.g. type of aircraft → mass group).
    – Duplicate identification through content analysis.
    – Integration of a data flow, with checks on the quality of the data.
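Duplicate identification through content analysis can be illustrated with a simple near-duplicate check. The sketch below compares reports pairwise on overlapping three-word sequences (shingles) and flags pairs above a threshold. The shingle size, the threshold, and the sample reports are assumptions for illustration, and the technique is a generic one rather than the method used by CFH-Safety Data.

```python
import re
from itertools import combinations


def shingles(text, size=3):
    """Set of consecutive word n-grams (shingles) from a narrative."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def find_duplicates(reports, threshold=0.6):
    """Return (id_a, id_b, score) for report pairs whose shingle overlap is high."""
    sets = {rid: shingles(text) for rid, text in reports.items()}
    pairs = []
    for a, b in combinations(sorted(sets), 2):
        union = sets[a] | sets[b]
        score = len(sets[a] & sets[b]) / len(union) if union else 0.0
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    return pairs


# Hypothetical reports, invented for illustration.
REPORTS = {
    "A1": "Aircraft landed long on runway 27 and overran the runway end by ten meters.",
    "A2": "Aircraft landed long on runway 27 and overran the runway end by 10 meters.",
    "A3": "Ground vehicle crossed the taxiway without clearance from the tower.",
}
```

Here `find_duplicates(REPORTS)` flags only the pair A1/A2, which differ by a single word. The pairwise loop is quadratic in the number of reports, so a large database would need an indexing step first (for example hashing shingles into buckets).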

SLIDE 15

VI.1 - Other services (2)

  • Categorization assistance:
    – Analysis of the taxonomy in use and learning of the codification rules from the existing data.
    – Integration of categorization assistance features in the user environment for entering/managing reports.
    – Verification of a report's internal coherence (text vs. categorization).
    – Analysis of codification coherence and quality in a report database (batch mode).
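As a toy illustration of checking a report's text against its categorization, the sketch below learns term-to-category counts from already coded reports, suggests a category for a narrative by simple term voting, and flags reports whose assigned category disagrees with the suggestion. The voting scheme, the category labels, and the sample data are hypothetical; real codification rules over a taxonomy such as ADREP would be far richer.

```python
import re
from collections import Counter, defaultdict


def words(text):
    """Lowercase alphabetic word tokens of a narrative."""
    return re.findall(r"[a-z]+", text.lower())


def learn_term_categories(coded_reports):
    """Count, for each term, how many reports of each category contain it."""
    table = defaultdict(Counter)
    for narrative, category in coded_reports:
        for term in set(words(narrative)):
            table[term][category] += 1
    return table


def suggest_category(narrative, table):
    """Suggest the category with the most term votes, or None if no term is known."""
    votes = Counter()
    for term in set(words(narrative)):
        votes.update(table.get(term, {}))
    return votes.most_common(1)[0][0] if votes else None


def incoherent_reports(reports, table):
    """Return ids of reports whose assigned category differs from the suggestion."""
    flagged = []
    for rid, narrative, category in reports:
        suggestion = suggest_category(narrative, table)
        if suggestion is not None and suggestion != category:
            flagged.append(rid)
    return flagged


# Hypothetical coded reports used as training data.
CODED = [
    ("Bird strike on the windshield during climb", "birdstrike"),
    ("Flock of birds hit the engine on takeoff", "birdstrike"),
    ("Runway excursion after landing on a wet runway", "runway excursion"),
    ("Aircraft veered off the runway during landing roll", "runway excursion"),
]
# Hypothetical reports to verify: (id, narrative, assigned category).
TO_CHECK = [
    ("N1", "Birds hit the windshield on takeoff", "runway excursion"),
    ("N2", "Aircraft left the runway after landing", "runway excursion"),
]
```

Running `incoherent_reports(TO_CHECK, learn_term_categories(CODED))` flags N1, whose narrative describes a bird strike but is coded as a runway excursion. The same loop over a whole database corresponds to the batch mode mentioned above.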

SLIDE 16

VI.2 - Database life cycle, e.g. ECCAIRS

  • Phases:
    – Database initialization
    – Database update
    – Database quality check
    – Database usage
  • Activities:
    – Environment initialization
    – Taxonomy migration set-up
    – Existing database translation
    – Database quality check
    – New reports integration: expert manpower
    – Experts check-up: data quality check software from EASA
    – ECCAIRS user activities
  • CFH-SD supports:
    – ECCAIRS deployment
    – ECCAIRS/ADREP proficiency
    – Specific software development
    – Database quality check tools
    – Categorization assistance (with the SD ECCAIRS Add-in, or in batch mode)
    – Quality & completion analysis software for large databases
    – timePlot environment for sorting and exporting data, and for similarity analysis

SLIDE 17

VII - About us

  • CFH-Safety Data is a French SME:
    – We are based in Toulouse,
    – We are an Aerospace Valley member (World Competitiveness Cluster in Aeronautics & Space),
    – We develop tools and applications and provide services related to aviation safety data.
  • Website: www.safety-data.com
  • Contacts:
    – Mika Andreou: mika@safety-data.com
    – Eric Hermann: hermann@safety-data.com
    – Céline Raynal: raynal@safety-data.com