ONSTR: Ontology for Newborn Screening Follow-up and Translational Research



slide-1
SLIDE 1

ONSTR: Ontology for Newborn Screening Follow-up and Translational Research

Prabhu Shankar, MD, MS; Snežana Nikolić, MA; Sivaram Arabandi, MD, MS; Shamakant Navathe; Kunal Malhotra; Rani H. Singh, PhD, RD

2013 Joint Meeting of the Newborn Screening and Genetic Testing Symposium
Thursday, May 9th, Atlanta

slide-2
SLIDE 2

Objective

  • NBS and follow-up workflows and data complexity
  • Ontology and Semantic Web technologies
  • ONSTR
  • Newborn Screening Follow-up Data Integration Collaborative (NBSDC)
  • Semantic Web technology success stories in healthcare (if time permits!)

slide-3
SLIDE 3
slide-4
SLIDE 4

External Computational Support!

“OMICS”

slide-5
SLIDE 5

In Summary, NBS Data is:

  • Geographically distributed (data silos!)
  • Intersects clinical as well as many biomedical domains, e.g., biochemistry, pathways, metabolomics, genomics, proteomics, pharmacogenomics
  • Various formats: structural, schematic and semantic variability
  • Of rare diseases!

slide-6
SLIDE 6

Semantic Web Technologies

"The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries." World Wide Web Consortium (W3C)

…technology stack to support a “Web of data”

slide-7
SLIDE 7

Triples: Resource Description Framework (RDF)

Simple, Dynamic, Extensible, Interoperable

RDF Schema (RDFS), Web Ontology Language (OWL): 'Ontology'

SUBJECT            PREDICATE         OBJECT
Phenylalanine      is_a              Amino Acid
Phenylketonuria    material_basis    PAH Mutation
Phenylketonuria    is_a              Inherited Metabolic Disease

Relationships/Properties

Directed, edge-labeled graph

Infers!
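A minimal sketch of how the slide's triples form a directed, edge-labeled graph from which new facts can be inferred. It uses plain Python tuples rather than an RDF library. The direction of material_basis is read off the slide layout, and the extra is_a triple linking Inherited Metabolic Disease to Genetic Disease is an assumption added only to show a two-step inference.

```python
# Triples stored as (subject, predicate, object) tuples.
TRIPLES = {
    ("Phenylalanine", "is_a", "Amino Acid"),
    ("Phenylketonuria", "is_a", "Inherited Metabolic Disease"),
    # Direction assumed from the slide layout.
    ("Phenylketonuria", "material_basis", "PAH Mutation"),
    # Assumed extra triple, added only to demonstrate inference.
    ("Inherited Metabolic Disease", "is_a", "Genetic Disease"),
}


def infer_is_a(triples):
    """Return all is_a pairs, including those implied by transitivity."""
    facts = {(s, o) for s, p, o in triples if p == "is_a"}
    changed = True
    while changed:
        changed = False
        for a, b in list(facts):
            for c, d in list(facts):
                # a is_a b and b is_a d implies a is_a d
                if b == c and (a, d) not in facts:
                    facts.add((a, d))
                    changed = True
    return facts


inferred = infer_is_a(TRIPLES)
# This fact was never stated directly; it follows from two is_a edges.
print(("Phenylketonuria", "Genetic Disease") in inferred)  # prints True
```

Only is_a is treated as transitive here; material_basis is stored but not used for inference.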

slide-8
SLIDE 8

What is ontology?

  • 1. A branch of philosophy, studying categories and types of beings existing in the universe.
  • 2. In Informatics, explicit formal specifications of the terms in the domain and relationships among them.
  • Consensus based
  • Associated with documentation and definitions
  • Expressed in formal logic to support automated reasoning
  • Interpretable by humans and computers
slide-9
SLIDE 9

Semantic Methods and Characteristics
Deanna Pennington, LTER DataBits, Spring 2006

Method | Definition | Synonyms | Classification (isa) | Properties (has) | Other relations
Keywords
Dictionary X
Controlled vocabulary (X) X
Thesaurus X X
Taxonomy (X) X X
  Parrot is a bird (X)
  Parrot has a beak X
Ontology X X X X X
  You can search by a term's properties
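A small sketch of what "search by a term's properties" could look like once classification (isa) and properties (has) are both recorded. "Parrot" comes from the slide; "Sparrow", "Dog", "Mammal" and "fur" are hypothetical additions for illustration, as are the function names.

```python
# Each term records its parent class (is_a) and its properties (has).
# Only "Parrot" is from the slide; the other entries are hypothetical.
TERMS = {
    "Parrot": {"is_a": "Bird", "has": {"beak"}},
    "Sparrow": {"is_a": "Bird", "has": {"beak"}},
    "Dog": {"is_a": "Mammal", "has": {"fur"}},
}


def search_by_property(terms, prop):
    """Return the sorted names of terms that have the given property."""
    return sorted(name for name, info in terms.items() if prop in info["has"])


def search_by_class(terms, parent):
    """Return the sorted names of terms directly classified under parent."""
    return sorted(name for name, info in terms.items() if info["is_a"] == parent)


print(search_by_property(TERMS, "beak"))  # ['Parrot', 'Sparrow']
print(search_by_class(TERMS, "Mammal"))   # ['Dog']
```

A plain taxonomy would support only the second kind of lookup; recording properties is what makes the first one possible.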

slide-10
SLIDE 10

[Figure: directed graph of knowledge about PKU]

Node labels: PKU; PAH Mutation; Genetic Disease; Autosomal Recessive Genetic Disease; Inherited Metabolic Disorder; Developmental Neurotoxin; White Matter; Corpus Callosum; Periventricular White Matter; Large Neutral Amino Acids; Sapropterin; Phe Intake/Day; Gene Analysis; Plasma Amino Acids; Fair skin & hair; eczema; seizures; cognitive performance; Increased Blood Phe levels

Edge labels: is_a (hierarchical); associative relations: material_basis, has_finding, has_MRI_finding, has_drug, has_diet_analyses, has_diagnostic_procedure, Measured_by

Relational? Schema-flexible!

slide-11
SLIDE 11

Ontology Applications

Ontologies:

  • Information Integration
  • Data Analysis, e.g., Gene Ontology (GO)
  • Data Exchange, e.g., Biological Pathway Exchange (BioPAX)
  • Knowledgebase, e.g., Foundational Model of Anatomy (FMA)
  • Naming "Things"
  • Natural Language Processing (NLP)

slide-12
SLIDE 12

Natural Language Understanding!

http://sc6.blogspot.com/2011/02/cartoon-of-week_20.html

slide-13
SLIDE 13
slide-14
SLIDE 14

What is ONSTR?

An application ontology representing the processes, entities and knowledge in the Newborn Screening and follow-up system (Domain):

  • Newborn screening Dried Blood Spot (NDBS) covering Inherited Metabolic Diseases (IMDs).
  • Genetic basis of IMDs.
  • Positive tested cases follow-up practice including: medical/clinical confirmatory testing (biochemical and molecular).
  • Medical and nutritional treatment (dietary analysis monitoring)
  • Outcomes, e.g., physical and cognitive growth and development evaluation.
  • Research related to IMDs and NBS.

slide-15
SLIDE 15

Why are we building ONSTR?

  • To provide a basis for standardization of data annotation in the NBS domain.
  • To provide a knowledge base for integrating, aggregating and reasoning over data collected from different NBS sources.
  • To develop tools for knowledge and data sharing, to be used by the greater IMD/NBS community.

slide-16
SLIDE 16

Not Alone……

Open Biomedical Ontologies (OBO) Foundry principles and framework.

slide-17
SLIDE 17

ONSTR building process

  • 1. Use case Definition

'Of all the diagnosis-confirmed patients who were newborn screening positive between 2005 and 2010, matched on age and mutation (R408W), did good nutritional management or Kuvan + nutritional management have the better outcome with regard to MRI white matter changes at five years?'

  • 2. Identification of key entities and relationships holding between these entities

Methodology:

  • Top Down and Bottom Up
  • Survey of relevant literature
  • Identifying the common data elements (CDEs)
  • Follow OBO Foundry best practices
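To make the use case concrete, the sketch below expresses it as a cohort filter followed by a per-treatment comparison. Everything here is an assumption for illustration: the field names are not ONSTR terms, the white matter score is a made-up numeric stand-in for "MRI white matter changes", the records are synthetic, and age matching is left out for brevity.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Patient:
    # All field names are hypothetical; they are not ONSTR terms.
    nbs_positive: bool
    diagnosis_confirmed: bool
    screening_year: int
    mutation: str
    treatment: str           # "diet" or "diet+Kuvan"
    mri_wm_score_5y: float   # made-up white matter change score at five years


def in_cohort(p):
    """Apply the use-case criteria (age matching omitted in this sketch)."""
    return (
        p.nbs_positive
        and p.diagnosis_confirmed
        and 2005 <= p.screening_year <= 2010
        and p.mutation == "R408W"
    )


def mean_score_by_treatment(patients):
    """Group cohort members by treatment and average their scores."""
    groups = {}
    for p in filter(in_cohort, patients):
        groups.setdefault(p.treatment, []).append(p.mri_wm_score_5y)
    return {treatment: mean(scores) for treatment, scores in groups.items()}


# Synthetic records for illustration only; they carry no clinical meaning.
patients = [
    Patient(True, True, 2006, "R408W", "diet", 2.0),
    Patient(True, True, 2008, "R408W", "diet", 4.0),
    Patient(True, True, 2007, "R408W", "diet+Kuvan", 3.0),
    Patient(True, True, 2012, "R408W", "diet+Kuvan", 9.0),  # outside 2005-2010
    Patient(True, False, 2009, "R408W", "diet", 9.0),       # diagnosis not confirmed
]
print(mean_score_by_treatment(patients))
```

The filter is easy to write once every source uses the same field names and value codes, which is the kind of shared annotation the key entities and relationships in step 2 are meant to supply.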
slide-18
SLIDE 18

Top Down and Bottom Up

Common Data Elements (CDEs)

slide-19
SLIDE 19

All Common Data Elements (CDEs)

slide-20
SLIDE 20

Modeling with Relations

[Figure: class-and-relation diagram for newborn screening specimen collection and testing]

Classes: New Born Screening (NBS); Health Screening Procedure; Procedure; Pulse Oximetry Testing; Hearing Testing; State Specific II Screening; Genetic Testing; Biochemical Testing; Biomarker Assay; Biomarker Concentration; Blood Tyrosine Concentration; Blood Phenylalanine Concentration; Newborn; Blood Collection Procedure; Finger Prick Blood Collection Procedure; Heel Prick Blood Collection Procedure; Specimen; Blood Specimen; Specimen Container; Annotated Filter Paper; Dried Blood Spot (DBS); Dried Blood Spot Elute; BFO: Process; BFO: Process Aggregate

Relations: Is_a; part_of; hasOutput; outPut_of; collected_from; has_Participant; preparedFrom; isTestSpecimenFor; hasTestSpecimen

slide-21
SLIDE 21

ONSTR building process contd.

  • 3. Ontology coding
  • ONSTR is formally encoded as an RDF/XML serialization of OWL2 (W3C Semantic Web standards)
  • 4. Ontology integration
  • Mappings between ONSTR and other relevant ontologies/vocabularies (Future work).
  • 5. Ontology evaluation
  • In progress, concomitant with ONSTR development.
  • 6. Ontology documentation
  • Available on the ONSTR project page: http://code.google.com/p/onstr/source/docs
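For readers unfamiliar with the encoding named in step 3, here is a minimal sketch of what an RDF/XML serialization of two OWL classes looks like, built with Python's standard library rather than an ontology editor. The base IRI and the class identifiers are hypothetical (the real ONSTR IRIs are not given on the slides), and the subclass link between the two procedures is an assumption based on the class names.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
# Hypothetical base IRI, used only for this sketch.
BASE = "http://example.org/onstr-sketch#"

# Use the conventional prefixes instead of auto-generated ns0, ns1, ...
for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def owl_class(parent_elem, name, superclass=None):
    """Append an owl:Class element, optionally with one rdfs:subClassOf."""
    cls = ET.SubElement(
        parent_elem, f"{{{OWL}}}Class", {f"{{{RDF}}}about": BASE + name}
    )
    if superclass is not None:
        ET.SubElement(
            cls, f"{{{RDFS}}}subClassOf", {f"{{{RDF}}}resource": BASE + superclass}
        )
    return cls


root = ET.Element(f"{{{RDF}}}RDF")
owl_class(root, "BloodCollectionProcedure")
# Assumed is_a link, expressed in OWL as rdfs:subClassOf.
owl_class(root, "HeelPrickBloodCollectionProcedure",
          superclass="BloodCollectionProcedure")

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The is_a edges drawn in the modeling diagrams correspond to rdfs:subClassOf statements in this serialization.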

slide-22
SLIDE 22

ONSTR graph and Logical Definitions

slide-23
SLIDE 23

ONSTR Statistics

Total number of classes: 1842
ONSTR native classes: 1100
Imported classes: 742

slide-24
SLIDE 24

BioPortal

http://bioportal.bioontology.org/ontologies/49978

National Center for Biomedical Ontology (NCBO), Stanford University

slide-25
SLIDE 25

Newborn Screening Follow-up Data Integration Collaborative (NBSDC)

slide-26
SLIDE 26

Acknowledgements:

Funded by:

  • 2009-10 HSI Seed Grant, a Clinical Outcomes Research and Public Health (CORPH) Pilot Grants Program, jointly supported by Georgia Tech and Children's Healthcare of Atlanta, and
  • The Southeast NBS & Genetics Collaborative (SERC) Grant from the Maternal and Child Health Bureau, HRSA Grant U22MC10979.

Special thanks to:

  • Dr. Barry Smith, National Center for Ontological Research (NCOR), University @ Buffalo, Buffalo.

slide-27
SLIDE 27

Thank You


Questions: PRSHANK@emory.edu

slide-28
SLIDE 28

Semantic technologies in action….

Cross‐Species Biomarkers

Reducing Animal Testing

Result: Semantic integration (large animals to small animals to cell culture) to discover cross‐species biomarkers applicable to human adverse events and diseases

Courtesy: Erich Gombocz, VP & CSO, IO Informatics, Inc.

slide-29
SLIDE 29

Semantic technologies in action….

Combination Treatment Effectiveness in Prostate Cancer

Result: effectiveness comparison of different combination treatments based on multi‐platform genomic and proteomic marker profiles and patient match

slide-30
SLIDE 30
slide-31
SLIDE 31

Tools already being developed…

slide-32
SLIDE 32

Challenges

  • Time consuming
  • Domain knowledge, Multi-disciplinary
  • Computing Capacity to process Graphs
  • Skilled personnel
  • Funding
  • Issues with data sharing
  • Buy in
  • Policy
  • HIPAA

slide-33
SLIDE 33
slide-34
SLIDE 34

Interoperability

slide-35
SLIDE 35

[Figure: interoperability example for phenylalanine (Phe) levels]

Classes: Disorder; Disease; Lab finding; Procedure; Enzyme (Phenylalanine Hydroxylase); PAH deficiency; BH4 deficiency; abnormal BH4; PAH Mutations; Phenylalanine; Phenylalanine Level (datatype, units); Plasma Phenylalanine level; Blood Phenylalanine level; CSF Phenylalanine level; Plasma Phenylalanine level test; Heel Prick Dried Blood spot test; CSF Phenylalanine level test; Plasma Phenylalanine level transformation

Relations: isAbout; hasNumericValue; hasUnit; outputOf; derivedFrom

Example: Phe Level hasNumericValue 110, hasUnit µmol/L

Hyperphenylalaninemia = Phe level > 120 µmol/L

PKU Level
Mild        360-900
Moderate    900-1200
Classic     >1200
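Once a Phe level carries an explicit numeric value and unit, the bands on this slide can be applied by a program. The sketch below is one way to do that, under two assumptions not stated on the slide: values are in µmol/L, and a value on a shared boundary (900 or 1200) falls in the lower band. The function name and the returned labels are hypothetical.

```python
def classify_phe(level_umol_per_l):
    """Map a Phe level to the bands listed on the slide.

    Assumptions: values are in µmol/L, and a value on a shared boundary
    (900 or 1200) is assigned to the lower band.
    """
    if level_umol_per_l <= 120:
        return "below hyperphenylalaninemia threshold"
    if level_umol_per_l < 360:
        # Above 120 but below the lowest PKU band on the slide.
        return "hyperphenylalaninemia (below PKU bands)"
    if level_umol_per_l <= 900:
        return "mild PKU"
    if level_umol_per_l <= 1200:
        return "moderate PKU"
    return "classic PKU"


# The example value shown on the slide.
print(classify_phe(110))  # below hyperphenylalaninemia threshold
```

Attaching hasUnit to every value is what makes a comparison like this safe across sources; without it, a source reporting in different units would be silently misclassified.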

slide-36
SLIDE 36

Big Picture