When and Why to use a Classifier?


Alan Rector, with acknowledgement to Jeremy Rogers, Pieter Zanstra, & the GALEN Consortium


SLIDE 1

OpenGALEN

When and Why to use a Classifier?

Alan Rector

with acknowledgement to Jeremy Rogers, Pieter Zanstra, & the GALEN Consortium
Nick Drummond, Matthew Horridge, Hai Wang in CO-ODE/HyOntUSE
Information Management Group, Dept of Computer Science, U Manchester
Holger Knublauch, Ray Fergerson, … and the Protégé-Owl Team

rector@cs.man.ac.uk
co-ode-admin@cs.man.ac.uk

www.co-ode.org
protege.stanford.org
www.opengalen.org

SLIDE 2

When to use a classifier

  • 1. At author time: as a compiler

– Ontologies will be delivered as “pre-coordinated” ontologies to be used without a reasoner
– To make extensions and additions quick, easy, and responsive; distribute developments; empower users to make changes
– Part of an ontology life cycle

  • 2. At delivery time: as a service

– Many fixed ontologies are too big and too small

  • Too big to find things; too small to contain what you need

– Create them on the fly
– Part of an ontology service

  • 3. At application time: as a reasoner

– Decision support, query optimisation, schema integration, …
– Part of a reasoning service

SLIDE 3

When to use a classifier 1: Pre-coordinated delivery
Classifier as compiler

  • The life cycle

– Gather requirements, sketch, experiment
– Establish patterns – design a “language”

  • Criteria for success: what a subject domain expert can learn in a few days

– Bulk authoring
– Classification
– Quality assurance
– Commit classifier results to a pre-coordinated ontology & deliver

  • Polyhierarchies (Protégé, DAG-Edit, OWL-Lite, RDF(S), Topic Maps, …)

– Query and use with your favourite tool

Development & evolution

SLIDE 4

Commit Results to a Pre-Coordinated Ontology

Assert (“Commit”) changes inferred by classifier
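The "classifier as compiler" idea can be sketched in a few lines of Python. This is a toy structural classifier, not the GALEN or Protégé-OWL implementation: the concept names, the `locus` and `course` properties, and the rule "same base, strictly fewer criteria means more general" are all assumptions made for illustration. The point is only that the inferred polyhierarchy ends up in a plain table that needs no reasoner to read.

```python
from typing import Dict, FrozenSet, List, Tuple

Criterion = Tuple[str, str]

# Hypothetical authored definitions: name -> (primitive base, defining criteria).
DEFINITIONS: Dict[str, Tuple[str, FrozenSet[Criterion]]] = {
    "Disease": ("Disease", frozenset()),
    "HeartDisease": ("Disease", frozenset({("locus", "Heart")})),
    "ChronicDisease": ("Disease", frozenset({("course", "chronic")})),
    "ChronicHeartDisease": (
        "Disease",
        frozenset({("locus", "Heart"), ("course", "chronic")}),
    ),
}


def subsumes(general: str, specific: str) -> bool:
    """True if `general` has the same base and strictly fewer criteria."""
    g_base, g_criteria = DEFINITIONS[general]
    s_base, s_criteria = DEFINITIONS[specific]
    return g_base == s_base and g_criteria < s_criteria


def classify() -> Dict[str, List[str]]:
    """Author-time step: infer the direct parents of every concept."""
    hierarchy: Dict[str, List[str]] = {}
    for concept in DEFINITIONS:
        ancestors = [g for g in DEFINITIONS if subsumes(g, concept)]
        # Drop any ancestor that lies above another ancestor.
        direct = [
            g for g in ancestors
            if not any(subsumes(g, other) for other in ancestors)
        ]
        hierarchy[concept] = sorted(direct)
    return hierarchy


# "Commit": freeze the inferred links into a plain parent table that any
# taxonomy tool could load without running a classifier.
COMMITTED: Dict[str, List[str]] = classify()
```

Here `COMMITTED["ChronicHeartDisease"]` lists both `ChronicDisease` and `HeartDisease` as parents, although neither link was asserted by hand.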

SLIDE 5

When to use a classifier 2: Post Coordination
Classifier as an inference engine

  • When the ontology is too big – “Lazy classification” on demand

Big on the outside: small kernel on the inside

[Figure labels: kernel model; externally available resource; API]
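A minimal sketch of lazy classification on demand, assuming a toy kernel and a criteria-subset test in place of a real description-logic classifier. The kernel concepts (`Fracture`, `FractureOfFemur`, `OpenFracture`) and the function `classify_on_demand` are hypothetical names chosen for illustration. The composed expression is classified only when a client asks for it, and the answer is cached.

```python
from functools import lru_cache
from typing import Dict, FrozenSet, Tuple

Criteria = FrozenSet[Tuple[str, str]]

# Hypothetical small kernel of named concepts: name -> (base, criteria).
KERNEL: Dict[str, Tuple[str, Criteria]] = {
    "Fracture": ("Fracture", frozenset()),
    "FractureOfFemur": ("Fracture", frozenset({("locus", "Femur")})),
    "OpenFracture": ("Fracture", frozenset({("kind", "open")})),
}


@lru_cache(maxsize=None)
def classify_on_demand(base: str, criteria: Criteria) -> Tuple[str, ...]:
    """Place a composed expression under its most specific kernel concepts."""
    candidates = [
        name for name, (k_base, k_criteria) in KERNEL.items()
        if k_base == base and k_criteria <= criteria
    ]
    # Discard candidates that are strictly more general than another one.
    most_specific = [
        name for name in candidates
        if not any(KERNEL[name][1] < KERNEL[other][1] for other in candidates)
    ]
    return tuple(sorted(most_specific))


# A composed request that nobody enumerated in advance.
REQUEST: Criteria = frozenset(
    {("locus", "Femur"), ("kind", "open"), ("side", "left")}
)
```

Calling `classify_on_demand("Fracture", REQUEST)` places the new expression under both `FractureOfFemur` and `OpenFracture`; a repeated call is served from the cache.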

SLIDE 6

Often combined with other services: Example - the GALEN Server

[Figure labels: Client Applications; Users; Ontology Services Client; Service API; Multilingual Module (Multilingual Lexicons); Ontology Module (Ontologies & Classifier); External Resources & “Coding” Module (Resource References); Run time classifier]

SLIDE 7

…but… Why use a Classifier?

  • To compose concepts

– Allow conceptual lego

  • To manage polyhierarchies

– Adding abstractions (“axes”) as needed
– Normalisation

  • Untangling

– Labelling of “kinds of is-a”

  • To avoid combinatorial explosions

– Keep bicycles from exploding

  • To manage context

– Cross species, cross disciplines, cross studies

  • To check consistency and help users find errors
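The combinatorial-explosion point in the list above can be made concrete with a little arithmetic. The axes and their sizes are invented for illustration only. Enumerating every combination in advance multiplies the axis sizes, while authoring the building blocks and composing on demand only adds them.

```python
from math import prod

# Illustrative axis sizes (assumptions, not figures from the talk).
AXES = {"site": 50, "pathology": 20, "severity": 3, "course": 2}

# Pre-enumerating every combination multiplies the axes together.
enumerated_terms = prod(AXES.values())

# Composing from building blocks only needs each axis value authored once.
authored_terms = sum(AXES.values())
```

With these made-up sizes, full enumeration needs 6000 terms and composition needs 75 building blocks.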
SLIDE 8

Logic-based Ontologies: Conceptual Lego

gene, hand, extremity, body, protein, cell, expression, acute, chronic, infection, inflammation, Lung, bacterial, abnormal, normal, deletion, polymorphism, ischaemic

SLIDE 9

Logic-based Ontologies: Conceptual Lego

“SNPolymorphism of CFTRGene causing Defect in MembraneTransport of ChlorideIon causing Increase in Viscosity of Mucus in CysticFibrosis…”

“Hand which is anatomically normal”

SLIDE 10

Linking taxonomies: Conceptual Lego
Normalisation

Taxonomies: Species, Genes, Protein, Function, Disease

CFTRGene in humans
Protein coded by (CFTRgene & in humans)
Membrane transport mediated by (Protein coded by (CFTRgene in humans))
Disease caused by (abnormality in (Membrane transport mediated by (Protein coded by (CFTR gene & in humans))))

SLIDE 11

Logic Based Ontologies: The basics

Primitives, Descriptions, Definitions, Reasoning, Validating (constraining cross products)

[Figure: worked example. Recoverable fragments:]

Thing
Structure: Heart, MitralValve, Encrustation
Feature: pathological, red

MitralValve * ALWAYS partOf: Heart
Encrustation * ALWAYS feature: pathological

Encrustation + involves: MitralValve
Thing + feature: pathological
Structure + feature: pathological + involves: Heart
+ (feature: pathological)
red + partOf: Heart

SLIDE 12

Example demonstrations: Take a Few Simple Concepts & Properties

SLIDE 13

Combine them in Descriptions, which can be simple…

Sickle cell disease is a disease caused by some sickling haemoglobin

SLIDE 14

…or which can be as complex as you like

Cystic fibrosis is caused by some non-normal ion transport that is the function of a protein coded for by a CFTR gene

SLIDE 15

Add some definitions

“Diseases linked to CFTR Genes”

SLIDE 16

We have built a simple tree, easy to maintain

SLIDE 17

Let the classifier organise it

SLIDE 18

If you want more abstractions, just add new definitions (re-use existing data)

“Diseases linked to abnormal proteins”

SLIDE 19

And let the classifier work again
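The demonstration in the last few slides can be imitated with a toy model. Descriptions are nested `(kind, links)` tuples, and a definition such as “Diseases linked to CFTR Genes” is read as "reaches a filler of that kind through any chain of properties". That reading of "linked to", the kind names, the placement of `SicklingHaemoglobin` under `AbnormalProtein`, and the property names are all assumptions made for this sketch and are not taken from the Protégé-OWL demo.

```python
from typing import Dict, List, Optional, Tuple

# The "simple tree": every primitive kind has one asserted parent.
KIND_PARENT: Dict[str, Optional[str]] = {
    "Thing": None,
    "Disease": "Thing",
    "Gene": "Thing",
    "CFTRGene": "Gene",
    "Protein": "Thing",
    "AbnormalProtein": "Protein",
    "SicklingHaemoglobin": "AbnormalProtein",
    "IonTransport": "Thing",
    "NonNormalIonTransport": "IonTransport",
}

# A description is (kind, [(property, filler description), ...]).
Description = Tuple[str, list]

DISEASES: Dict[str, Description] = {
    "SickleCellDisease": (
        "Disease", [("caused_by", ("SicklingHaemoglobin", []))],
    ),
    "CysticFibrosis": (
        "Disease",
        [("caused_by", ("NonNormalIonTransport",
            [("function_of", ("Protein",
                [("coded_for_by", ("CFTRGene", []))]))]))],
    ),
}


def is_a(kind: Optional[str], ancestor: str) -> bool:
    """Walk up the primitive tree looking for `ancestor`."""
    while kind is not None:
        if kind == ancestor:
            return True
        kind = KIND_PARENT[kind]
    return False


def linked_to(description: Description, target_kind: str) -> bool:
    """True if some filler reachable through any properties is a target_kind."""
    _, links = description
    return any(
        is_a(filler[0], target_kind) or linked_to(filler, target_kind)
        for _prop, filler in links
    )


def classify(definitions: Dict[str, str]) -> Dict[str, List[str]]:
    """For each defined category, list the diseases that fall under it."""
    return {
        name: sorted(d for d, desc in DISEASES.items() if linked_to(desc, target))
        for name, target in definitions.items()
    }
```

Adding a new abstraction is one more entry in the `definitions` argument; the disease descriptions themselves are re-used unchanged.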

SLIDE 20

And again – even for a quite different category

“Diseases linked to genes described in the mouse”

SLIDE 21

And let the classifier check consistency

(My first try wasn’t)
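One kind of error a classifier can surface is a concept that ends up under two categories declared disjoint. The sketch below is a toy check over asserted parents only, with hypothetical concept names; a real classifier would run the same test over inferred subsumptions as well.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical asserted parents; "MisfiledHand" is a deliberate authoring error.
PARENTS: Dict[str, List[str]] = {
    "NormalHand": ["Hand", "AnatomicallyNormal"],
    "DeformedHand": ["Hand", "AnatomicallyAbnormal"],
    "MisfiledHand": ["NormalHand", "DeformedHand"],
}

# Pairs of categories that may not share any member.
DISJOINT: List[Tuple[str, str]] = [
    ("AnatomicallyNormal", "AnatomicallyAbnormal"),
]


def ancestors_or_self(concept: str) -> Set[str]:
    """Collect the concept and everything above it."""
    found = {concept}
    for parent in PARENTS.get(concept, []):
        found |= ancestors_or_self(parent)
    return found


def unsatisfiable() -> List[str]:
    """Concepts that sit under both members of a disjoint pair."""
    bad = []
    for concept in PARENTS:
        lineage = ancestors_or_self(concept)
        if any(a in lineage and b in lineage for a, b in DISJOINT):
            bad.append(concept)
    return sorted(bad)
```

Here `unsatisfiable()` reports only `MisfiledHand`, which points the author at the definition to repair.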

SLIDE 22

Represent context and views by variant properties

[Figure labels: Organ: Heart, Pericardium; OrganPart: CardiacValve; properties is_part_of, is_structurally_part_of, is_clinically_part_of]

Disease of (Heart or part-of-heart)
Disease of Pericardium

SLIDE 23

Integration of Contexts in Protégé-OWL

[Figure labels: Generic part-of; FMA; Functional; Clinical; ‘Internal’]

SLIDE 24

Protégé-OWL alternative views

Disorder of “Clinical heart”: “Disorder of heart or any part of the heart” (including clinical and functional parts)

Disorder of “FMA heart”: “Disorder of heart or any structural part of the heart”
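The two views can be sketched with a small property hierarchy in which `is_structurally_part_of` and `is_clinically_part_of` are both sub-properties of a generic `is_part_of`. The property names come from the earlier slide; the specific facts (cardiac valve as a structural part, pericardium as a clinical part) and the disorder names are assumptions for illustration, and only direct parts are considered.

```python
from typing import Dict, List, Optional, Set, Tuple

# Sub-property -> super-property.
SUPER_PROPERTY: Dict[str, str] = {
    "is_structurally_part_of": "is_part_of",
    "is_clinically_part_of": "is_part_of",
}

# Assumed facts: (part, property, whole).
FACTS: List[Tuple[str, str, str]] = [
    ("CardiacValve", "is_structurally_part_of", "Heart"),
    ("Pericardium", "is_clinically_part_of", "Heart"),
]

# Hypothetical disorders and the site each one affects.
DISORDER_SITE: Dict[str, str] = {
    "DisorderOfCardiacValve": "CardiacValve",
    "DisorderOfPericardium": "Pericardium",
}


def implies(prop: Optional[str], query_prop: str) -> bool:
    """True if `prop` equals `query_prop` or is one of its sub-properties."""
    while prop is not None:
        if prop == query_prop:
            return True
        prop = SUPER_PROPERTY.get(prop)
    return False


def whole_or_parts(whole: str, prop: str) -> Set[str]:
    """The whole plus its direct parts under the chosen view of part-of."""
    return {whole} | {s for s, p, o in FACTS if o == whole and implies(p, prop)}


def disorders_of(whole: str, prop: str) -> List[str]:
    """Disorders whose site is the whole or one of its parts in this view."""
    sites = whole_or_parts(whole, prop)
    return sorted(d for d, site in DISORDER_SITE.items() if site in sites)
```

Querying with the generic `is_part_of` returns both disorders, while `is_structurally_part_of` returns only the valve disorder, so the choice of property selects the view.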

SLIDE 25

Consequences for classification of diseases

Doctors consider it a heart disease, even though developmentally/structurally it is not

Disorders of the things anatomists recognise as parts of the heart
SLIDE 26

Summary: Why Classify?

  • To compose concepts
  • To untangle polyhierarchies – to normalise
  • To avoid combinatorial explosions
  • To manage context
  • To check consistency and help users find errors
SLIDE 27

Summary: When to Classify?

Applications do not need a classifier to benefit from classification

  • Pre-coordination

– If concepts/terms can be predicted
– When classifier is not available at run time
– When we must fit with legacy applications

  • Post-coordination

– When a few concepts are needed from a large potential set
– When a classifier is available

  • and time cost is acceptable

– When applications can be built or adapted to take advantage

SLIDE 28

Remember: Think about the Life Cycle

  • The life cycle for pre-coordinated ontologies

– Gather requirements, sketch, experiment
– Establish patterns – design a “language”

  • What a subject domain expert can learn in a few days

– Bulk authoring
– Classification
– Quality assurance
– Commit classifier results to a pre-coordinated ontology & deliver

  • Taxonomies (Protégé, DAG-Edit, OWL-Lite, RDF(S), Topic Maps, …)

– Query and use with your favourite tool

Development & evolution

www.co-ode.org