

slide-1
SLIDE 1

Distributed Common Ground System – Army (DCGS-A)

The Role of Ontology in the Era of Big (Military) Data

Barry Smith, Director, National Center for Ontological Research

slide-2
SLIDE 2

Distributed Development of a Shared Semantic Resource (SSR)

in support of US Army’s Distributed Common Ground System Standard Cloud (DSC) initiative

with thanks to: Tanya Malyuta, Ron Rudnicki

Background materials: http://x.co/yYxN

slide-3
SLIDE 3


slide-4
SLIDE 4

Making data (re-)usable through common controlled vocabularies

  • Allow multiple databases to be treated as if they were a single data source by eliminating terminological redundancy in the ways data are described

    – not ‘Person’, and ‘Human’, and ‘Human Being’, and ‘Pn’, and ‘HB’, but simply: person

  • Allow development and use of common tools and techniques, common training, single validation of data, focused around

    – semantic technology
    – coordinated ontology development and use
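A minimal sketch of the first point, assuming a hand-written synonym table: the labels come from the slide, while the table name, the function names, and the two sample databases are hypothetical. It shows how records from two sources can be queried as one once every local label resolves to the single term person.

```python
# Hypothetical synonym table: every local label maps to one preferred term.
PREFERRED_TERM = {
    "person": "person",
    "human": "person",
    "human being": "person",
    "pn": "person",
    "hb": "person",
}


def normalize_label(label: str) -> str:
    """Return the controlled-vocabulary term for a local label."""
    key = label.strip().lower()
    if key not in PREFERRED_TERM:
        raise KeyError(f"no controlled term for label {label!r}")
    return PREFERRED_TERM[key]


def merge_sources(*sources):
    """Combine records from several sources under the shared vocabulary."""
    merged = []
    for records in sources:
        for record in records:
            # Copy the record so the source data stay untouched.
            merged.append({**record, "type": normalize_label(record["type"])})
    return merged


# Two invented databases that describe the same kind of thing differently.
db_a = [{"id": 1, "type": "Human"}, {"id": 2, "type": "Pn"}]
db_b = [{"id": 7, "type": "Human Being"}, {"id": 8, "type": "HB"}]
combined = merge_sources(db_a, db_b)
```

After the merge, a single filter on `type == "person"` reaches all four records, which is the sense in which the databases behave as one source.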

slide-5
SLIDE 5

Ontology =def.

  • controlled vocabulary organized as a graph
  • nodes in the graph are terms representing types in reality
  • each node is associated with definition and synonyms
  • edges in the graph represent well-defined relations between these types
  • the graph is structured hierarchically via subtype relations
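The definition above can be sketched as a small data structure. This is an illustration only: the class names, the `is_a` relation label, and the two sample terms with their definitions are assumptions, not taken from any published ontology.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A node: a term representing a type, with definition and synonyms."""
    label: str
    definition: str
    synonyms: list = field(default_factory=list)


class Ontology:
    def __init__(self):
        self.terms = {}   # label -> Term
        self.edges = []   # (subject label, relation, object label)

    def add_term(self, term):
        self.terms[term.label] = term

    def relate(self, subject, relation, obj):
        """Add an edge; both ends must already be terms in the graph."""
        if subject not in self.terms or obj not in self.terms:
            raise KeyError("both terms must be defined before relating them")
        self.edges.append((subject, relation, obj))

    def supertypes(self, label):
        """Follow is_a edges upward and return all ancestors, nearest first."""
        found, frontier = [], [label]
        while frontier:
            current = frontier.pop(0)
            for subject, relation, obj in self.edges:
                if subject == current and relation == "is_a" and obj not in found:
                    found.append(obj)
                    frontier.append(obj)
        return found


# Illustrative terms; the definitions are placeholders written for this sketch.
onto = Ontology()
onto.add_term(Term("organism", "illustrative definition of organism"))
onto.add_term(Term("person", "illustrative definition of person",
                   synonyms=["human", "human being"]))
onto.relate("person", "is_a", "organism")
```

Keeping relations as labelled edges, rather than only a parent pointer, leaves room for relations other than subtype while still letting the hierarchy be read off the `is_a` edges.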

slide-6
SLIDE 6

Ontologies

  • computer-tractable representations of types in specific areas of reality
  • divided into more and less general

    – upper = organizing ontologies, provide common architecture and thus promote interoperability
    – lower = domain ontologies, provide grounding in reality

  • reflecting top-down and bottom-up strategy

slide-7
SLIDE 7

Success story in biomedicine

Goal: integration of biological and clinical data

– across different species
– across levels of granularity (organ, organism, cell, molecule)
– across different perspectives (physical, biological, clinical)
– within and across domains (growth, aging, environment, genetic disease, toxicity …)

slide-8
SLIDE 8

The Open Biomedical Ontologies (OBO) Foundry

Columns (RELATION TO TIME): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
Rows (GRANULARITY):

ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO)
  Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL), Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function (GO)
MOLECULE
  Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  Dependent continuant: Molecular Function (GO)
  Occurrent: Molecular Process (GO)

slide-9
SLIDE 9

Population-level ontologies

Columns (RELATION TO TIME): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
Rows (GRANULARITY):

COMPLEX OF ORGANISMS
  Independent continuant: Family, Community, Population
  Dependent continuant: Population Phenotype
  Occurrent: Population Process
ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO)
  Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL), Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function (GO)
MOLECULE
  Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  Dependent continuant: Molecular Function (GO)
  Occurrent: Molecular Process (GO)

slide-10
SLIDE 10

Environment Ontology

Columns (RELATION TO TIME): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
Rows (GRANULARITY):

ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO)
  Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL), Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function (GO)
MOLECULE
  Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  Dependent continuant: Molecular Function (GO)
  Occurrent: Molecular Process (GO)

slide-11
SLIDE 11

rationale of OBO Foundry coverage

Columns (RELATION TO TIME): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
Rows (GRANULARITY):

ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO)
  Occurrent: Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL), Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function (GO)
  Occurrent: Cellular Process (GO)
MOLECULE
  Independent continuant: Molecule (ChEBI, SO, RNAO, PRO)
  Dependent continuant: Molecular Function (GO)
  Occurrent: Molecular Process (GO)

slide-12
SLIDE 12

OBO Foundry approach extended into other domains

  • NIF Standard: Neuroscience Information Framework
  • ISF Ontologies: Integrated Semantic Framework
  • OGMS and Extensions: Ontology for General Medical Science
  • IDO Consortium: Infectious Disease Ontology
  • cROP: Common Reference Ontologies for Plants

slide-13
SLIDE 13

Modular organization + Extension strategy

top level: Basic Formal Ontology (BFO)

domain level:

  • Anatomy Ontology (FMA*, CARO)
  • Environment Ontology (EnvO)
  • Infectious Disease Ontology (IDO*)
  • Biological Process Ontology (GO*)
  • Cell Ontology (CL)
  • Cellular Component Ontology (FMA*, GO*)
  • Phenotypic Quality Ontology (PaTO)
  • Subcellular Anatomy Ontology (SAO)
  • Sequence Ontology (SO*)
  • Molecular Function (GO*)
  • Protein Ontology (PRO*)

slide-14
SLIDE 14

~100 ontologies using BFO

  • US Army Biometrics Ontology
  • Brucella Ontology (IDO-BRU)
  • eagle-i and VIVO (NCRR)
  • Financial Report Ontology (to support SEC through XBRL)
  • IDO Infectious Disease Ontology (NIAID)
  • Malaria Ontology (IDO-MAL)
  • Nanoparticle Ontology (NPO)
  • Ontology for Risks Against Patient Safety (RAPS/REMINE)
  • Parasite Experiment Ontology (PEO)
  • Subcellular Anatomy Ontology (SAO)
  • Vaccine Ontology (VO)
  • …

slide-15
SLIDE 15

Basic Formal Ontology

Thursday, April 18, 2013

BFO:Entity
  BFO:Continuant
    BFO:Independent Continuant
    BFO:Dependent Continuant
      BFO:Disposition
  BFO:Occurrent
    BFO:Process

slide-16
SLIDE 16

Basic Formal Ontology and Mental Functioning Ontology (MFO)

[Diagram: MFO types placed beneath BFO upper-level types]

BFO types: BFO:Entity, BFO:Continuant, BFO:Independent Continuant, BFO:Dependent Continuant, BFO:Quality, BFO:Disposition, BFO:Occurrent, BFO:Process

MFO types: Organism, Behaviour inducing state, Mental Functioning Related Anatomical Structure, Cognitive Representation, Affective Representation, Mental Process, Bodily Process

slide-17
SLIDE 17

Emotion Ontology extends MFO

[Diagram: Emotion Ontology (MFO-EM) types placed beneath MFO and BFO types]

BFO types: BFO:Entity, BFO:Continuant, BFO:Independent Continuant, BFO:Dependent Continuant, BFO:Disposition, BFO:Occurrent, BFO:Process

MFO types: Organism, Cognitive Representation, Affective Representation, Mental Process, Bodily Process

MFO-EM types: Emotion Occurrent, Emotional Action Tendencies, Appraisal, Subjective Emotional Feeling, Physiological Response to Emotion Process, Emotional Behavioural Process, Appraisal Process

Relations shown: inheres_in, is_output_of, has_part, agent_of

slide-18
SLIDE 18

Sample from Emotion Ontology: Types of Feeling


slide-19
SLIDE 19

The problem of joint / coalition operations

  • Fire Support
  • Logistics
  • Air Operations
  • Intelligence
  • Civil-Military Operations
  • Targeting
  • Maneuver & Blue Force Tracking

slide-20
SLIDE 20

US DoD Civil Affairs strategy for non-classified information sharing


slide-21
SLIDE 21

Ontologies / semantic technology can help to solve this problem

  • Fire Support
  • Logistics
  • Air Operations
  • Intelligence
  • Civil-Military Operations
  • Targeting
  • Maneuver & Blue Force Tracking

slide-22
SLIDE 22

But if each community produces its own ontology, this will merely create new, semantic siloes

  • Fire Support
  • Logistics
  • Air Operations
  • Intelligence
  • Civil-Military Operations
  • Targeting
  • Maneuver & Blue Force Tracking

slide-23
SLIDE 23

What we are doing to avoid the problem of semantic siloes

Distributed Development of a Shared Semantic Resource

Pilot testing to demonstrate feasibility

slide-24
SLIDE 24

creating the analog of this in the military domain

top level: Basic Formal Ontology (BFO)

domain level:

  • Anatomy Ontology (FMA*, CARO)
  • Environment Ontology (EnvO)
  • Infectious Disease Ontology (IDO*)
  • Biological Process Ontology (GO*)
  • Cell Ontology (CL)
  • Cellular Component Ontology (FMA*, GO*)
  • Phenotypic Quality Ontology (PaTO)
  • Subcellular Anatomy Ontology (SAO)
  • Sequence Ontology (SO*)
  • Molecular Function (GO*)
  • Protein Ontology (PRO*)

slide-25
SLIDE 25

Semantic Enhancement

Annotation (tagging) of source data models using terms from coordinated ontologies

– data remain in their original state (are treated at arm’s length)
– tagged using interoperable ontologies created in tandem
– can be as complete as needed, lossless, long-lasting because flexible and responsive
– big bang for buck
– measurable benefit even from first small investments

Coordination through shared governance and training
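One way to picture annotation at arm’s length is a separate lookup table that points from source columns to ontology terms. The sketch below assumes invented table and column names; nothing in it reflects the actual DCGS-A data models. The source models are never modified, and coverage can grow one annotation at a time.

```python
# Two hypothetical source data models; table and column names are invented.
SOURCE_MODELS = {
    "hr_db.staff": ["Pn", "unit_code"],
    "intel_db.sightings": ["HumanBeing", "veh_type"],
}

# Annotations live beside the sources: (table, column) -> ontology term.
ANNOTATIONS = {
    ("hr_db.staff", "Pn"): "person",
    ("intel_db.sightings", "HumanBeing"): "person",
    ("intel_db.sightings", "veh_type"): "vehicle",
}


def columns_tagged_with(term):
    """Find every source column annotated with the given ontology term."""
    return sorted(col for col, tag in ANNOTATIONS.items() if tag == term)


def untagged_columns():
    """List columns that no annotation covers yet (work still to do)."""
    return sorted(
        (table, column)
        for table, columns in SOURCE_MODELS.items()
        for column in columns
        if (table, column) not in ANNOTATIONS
    )
```

Because the annotations sit outside the sources, even the first few tags already make cross-source queries possible, and `untagged_columns()` shows what remains to be covered.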

slide-26
SLIDE 26

Main challenge: Will it scale?

The problem of scalability turns on

  • the ability to accommodate ever increasing volumes and types of data and numbers of users
  • can we preserve coordination (consistency, non-redundancy) as ever more domains become involved?
  • can we respond in agile fashion to ever changing bodies of source data?

slide-27
SLIDE 27

Strategy for agile ontology creation

  • Identify or create carefully validated general purpose plug-and-play reference ontology modules for principal domains
  • Develop a method whereby these reference ontologies can be extended very easily to cope with specific, local data through creation of application ontologies

slide-28
SLIDE 28

Reference Ontology / Application Ontology

vehicle =def. an object used for transporting people or goods
tractor =def. a vehicle that is used for towing
crane =def. a vehicle that is used for lifting and moving heavy objects
vehicle platform =def. means of providing mobility to a vehicle
wheeled platform =def. a vehicle platform that provides mobility through the use of wheels
tracked platform =def. a vehicle platform that provides mobility through the use of continuous tracks
artillery vehicle =def. vehicle designed for the transport of one or more artillery weapons
wheeled tractor =def. a tractor that has a wheeled platform
Russian wheeled tractor type T33 =def. a wheeled tractor of type T33 manufactured in Russia
Ukrainian wheeled tractor type T33 =def. a wheeled tractor of type T33 manufactured in Ukraine
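The definitions on this slide each name a parent type and then say what distinguishes the child, so an application ontology can be built by adding children to existing reference terms. A minimal sketch under stated assumptions: the slide does not mark where the reference ontology ends and the application ontology begins, so the split between the two dictionaries below is a guess, and the function names are hypothetical.

```python
# Reference-ontology terms: label -> (parent label or None, definition).
REFERENCE = {
    "vehicle": (None, "an object used for transporting people or goods"),
    "tractor": ("vehicle", "a vehicle that is used for towing"),
    "wheeled tractor": ("tractor", "a tractor that has a wheeled platform"),
}

# Application-ontology terms extend the reference terms for local data.
APPLICATION = {
    "Russian wheeled tractor type T33": (
        "wheeled tractor",
        "a wheeled tractor of type T33 manufactured in Russia",
    ),
    "Ukrainian wheeled tractor type T33": (
        "wheeled tractor",
        "a wheeled tractor of type T33 manufactured in Ukraine",
    ),
}


def extend(reference, application):
    """Merge application terms in, refusing any whose parent is undefined."""
    merged = dict(reference)
    for label, (parent, definition) in application.items():
        if parent not in merged and parent not in application:
            raise ValueError(f"{label!r} has unknown parent {parent!r}")
        merged[label] = (parent, definition)
    return merged


def lineage(terms, label):
    """Return the chain of parent terms from nearest to most general."""
    chain = []
    parent = terms[label][0]
    while parent is not None:
        chain.append(parent)
        parent = terms[parent][0]
    return chain


combined_terms = extend(REFERENCE, APPLICATION)
```

The parent check in `extend` is the point of the design: a local term can only be added by attaching it to something the reference ontology already defines, so everything known about vehicles and tractors carries over to the T33 terms.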

slide-29
SLIDE 29


slide-30
SLIDE 30

  • Basic Formal Ontology (BFO)
  • Extended Relation Ontology
  • Time Ontology
  • Quality Ontology
  • Information Entity Ontology
  • Geospatial Ontology
  • Event Ontology
  • Artifact Ontology
  • Agent Ontology

slide-31
SLIDE 31
slide-32
SLIDE 32
slide-33
SLIDE 33
slide-34
SLIDE 34


http://milportal.org

slide-35
SLIDE 35


slide-36
SLIDE 36


slide-37
SLIDE 37


slide-38
SLIDE 38

An example of agile application ontology development:

The Bioweapons Ontology (BWO)

slide-39
SLIDE 39

Kinds of chemical and biological weapons

Chemical

  • Nerve agents (sarin gas)
  • Blister agents (mustard gas)
  • Blood agents (cyanide gas)

Biological

  • Infectious agents – BWO(I)
  • Toxic agents (botulinum toxin, ricin) – BWO(T)

slide-40
SLIDE 40

We focus here on BWO(I) Infectious agents

– Bacterial (anthrax, bubonic plague, tularemia, brucellosis, cholera …)
– Viral (Ebola, Marburg …)

slide-41
SLIDE 41

Examples of ontology terms

BFO: Independent Continuant
  IDO: Infectious disorder
  StaphIDO: Staph. aureus disorder
BFO: Dependent Continuant
  IDO: Infectious disease; Protective resistance
  StaphIDO: MRSA; Methicillin resistance
BFO: Occurrent
  IDO: Infectious disease course
  StaphIDO: MRSA course

slide-42
SLIDE 42

Infectious Disease Ontology (IDO)

IDO Core (Reference Ontology)

  • General terms in the ID domain.

IDO Extensions (Application Ontologies)

  • Disease-, host-, pathogen-specific.
  • Developed by subject matter experts.

The hub-and-spokes strategy ensures that logical content of IDO Core is automatically inherited by the IDO Extensions

with thanks to Lindsay Cowell (University of Texas SW Medical Center) and Albert Goldfain (Blue Highway, Inc.)

slide-43
SLIDE 43

IDO Core

  • Contains general terms in the ID domain:

    – E.g., ‘colonization’, ‘pathogen’, ‘infection’

  • A contract between IDO extension ontologies and the datasets that use them.
  • Intended to represent information along several dimensions:

    – biological scale (gene, cell, organ, organism, population)
    – discipline (clinical, immunological, microbiological)
    – organisms involved (host, pathogen, and vector types)

slide-44
SLIDE 44

Examples of ontology terms

BFO: Independent Continuant
  IDO: Infectious disorder
  StaphIDO: Staph. aureus disorder
BFO: Dependent Continuant
  IDO: Infectious disease; Protective resistance
  StaphIDO: MRSA; Methicillin resistance
BFO: Occurrent
  IDO: Infectious disease course
  StaphIDO: MRSA course

slide-45
SLIDE 45

IDO Extensions

  • IDO – Brucellosis
  • IDO – Dengue Fever
  • IDO – Influenza
  • IDO – Malaria
  • IDO – Staphylococcus Aureus Bacteremia
  • IDO – Vector Surveillance and Management
  • IDO – Plant
  • VO – Vaccine Ontology
  • BWO(I) – Bioweapons Ontology (Infectious Agents)

slide-46
SLIDE 46

How IDO evolves: the case of Staph. aureus

[Diagram: IDOCore at the hub, with extension ontologies built out from it]

Ontologies shown: IDOCore, IDOSa, IDOHumanSa, IDORatSa, IDOStrep, IDORatStrep, IDOHumanStrep, IDOMRSa, IDOHumanBacterial, IDOAntibioticResistant, IDOMAL, IDOHIV, IDOFLU

HUB and SPOKES: Domain ontologies

SEMI-LATTICE: By subject matter experts in different communities of interest.

slide-47
SLIDE 47
slide-48
SLIDE 48


slide-49
SLIDE 49
slide-50
SLIDE 50

BWO:disease by infectious agent = def. a disease that is the consequence of the presence of pathogenic microbial agents, including pathogenic viruses, pathogenic bacteria, fungi, protozoa, multicellular parasites, and aberrant proteins known as prions

slide-51
SLIDE 51

Strategy used to build BWO(I)

with thanks to Lindsay Cowell and Oliver He (Michigan)

  1. Start with a glossary such as: http://www.emedicinehealth.com/biological_warfare/
  2. From IDO Core and related ontologies, such as the ChEBI chemistry ontology, select the terms needed to describe bioweapons
  3. All ontology terms keep their original definitions and IDs.
  4. The result is a spreadsheet

slide-52
SLIDE 52
  5. Where glossary terms have no ontology equivalent, create BWO ontology terms and definitions as needed

[Screenshot callout: no corresponding ontology term]
slide-53
SLIDE 53
  6. Use the Ontofox tool to create the first version of the BWO(I) application ontology (http://ontofox.hegroup.org/)
  7. Use BWO(I) in annotations, and where gaps are identified create extension terms, for instance

    – weaponized brucella
    – aerosol anthrax
    – smallpox incubation period

This establishes a virtuous cycle between ontology development and use in annotations
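The core of steps 1 to 5 can be sketched as a lookup that reuses existing terms where they exist and mints new ones where they do not. This is only an illustration: the term index, the placeholder IDs, the placeholder definitions, and the function name are all hypothetical, and the sketch does not call Ontofox or use any real IDO or ChEBI identifiers.

```python
# Hypothetical index of existing ontology terms: label -> (ID, definition).
# The IDs are placeholders, not real IDO or ChEBI identifiers.
EXISTING_TERMS = {
    "pathogen": ("IDO:PLACEHOLDER_1", "definition kept from the source ontology"),
    "infection": ("IDO:PLACEHOLDER_2", "definition kept from the source ontology"),
}

# Step 1: labels harvested from a glossary (a tiny invented sample).
GLOSSARY = ["pathogen", "infection", "weaponized brucella"]


def build_term_sheet(glossary, existing, prefix="BWO"):
    """Return spreadsheet-like rows: reused terms keep ID and definition,
    glossary terms with no ontology equivalent get a new local ID."""
    rows = []
    next_local_id = 1
    for label in glossary:
        if label in existing:
            term_id, definition = existing[label]
            rows.append({"label": label, "id": term_id,
                         "definition": definition, "source": "reused"})
        else:
            # The definition is left empty for a subject matter expert to write.
            rows.append({"label": label, "id": f"{prefix}:NEW_{next_local_id}",
                         "definition": None, "source": "new"})
            next_local_id += 1
    return rows


sheet = build_term_sheet(GLOSSARY, EXISTING_TERMS)
```

Rows marked `new` are the gaps that drive the cycle described above: each one is a term that annotation work has shown to be needed and that the ontology does not yet provide.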

slide-54
SLIDE 54

Potential uses of BWO

– semantic enhancement of bioweapons intelligence data
– results will be automatically interoperable with relevant bioinformatics and public health IT tools for dealing with infections, epidemics, vaccines, forensics, …
– to annotate research literature and research data on bioweapons
– to create computable definitions to substitute for definitions in free text glossaries

slide-55
SLIDE 55

Why do people think they need lexicons?

  • Training
  • Compiling lessons learned
  • Compiling results of testing, e.g. of proposed new doctrine
  • Collective inferencing
  • Official reporting
  • Doctrinal development
  • Standard operating procedures
  • Sharing of data
  • People need to (ensure that they) understand each other