Integrated Semantic Search on Structured and Unstructured Data in the ADOnIS System



slide-1
SLIDE 1

Integrated Semantic Search on Structured and Unstructured Data in the ADOnIS System

Friederike Klan, Erik Faessler, Alsayed Algergawy, Birgitta König-Ries, and Udo Hahn

slide-2
SLIDE 2

The AquaDiva Project

Data in the project (three of the sources are labeled CSV on the slide):

  • meteorological data
  • DNA sequencing data
  • mass spectrometry data
  • throughfall and stemflow data
  • model of water flow

slide-3
SLIDE 3

Syntactic Search in BExIS 2

http://bexis2.uni-jena.de/

slide-4
SLIDE 4

Example

Search for data referring to an alkaline milieu

samp_ID  loc  date        time   pH   Fe2+  NO2-   temp_1  temp_2  r_mm
23       S1   10.12.2011  15:00  7,1  0,01  0,006  9,2     8,9     2,4
18       S2   14/10/2011  17:30  7,2  0,03  0,004  9,4     9,1     10,3
4        S3   16/10/2011  11:55  7,3  0,02  0,003  10,1    8,9     1,8
7        S1   19/10/2011  17:25  7,3  0,04  0,004  9,6     9,2     7,1
12       S2   21/10/2011  14:50  7,1  0,08  0,005  9,4     9,4     2,3
35       S3   23/10/2011  11:40  7,1  0,15  0,003  9,6     9,3     8,1
5        S4   25/10/2011  15:20  7,1  0,04  0,005  10,3    9,9     21,2
8        S2   28/10/2011  16:45  7,1  0,02  0,006  10,1    9,8     10,3
9        S3   30/10/2011  15:30  7,1        0,007  9,3     8,7     23,5
17       S4   11.01.2011  13:30  7,2  0,05  0,03   9,3     8,8     1,8

slide-5
SLIDE 5

Example

Search for data referring to an alkaline milieu

samp_ID  loc  date        time   pH   Fe2+  NO2-   temp_1  temp_2  r_mm
23       S1   10.12.2011  15:00  7,1  0,01  0,006  9,2     8,9     2,4
18       S2   14/10/2011  17:30  7,2  0,03  0,004  9,4     9,1     10,3
4        S3   16/10/2011  11:55  7,3  0,02  0,003  10,1    8,9     1,8
7        S1   19/10/2011  17:25  7,3  0,04  0,004  9,6     9,2     7,1
12       S2   21/10/2011  14:50  7,1  0,08  0,005  9,4     9,4     2,3
35       S3   23/10/2011  11:40  7,1  0,15  0,003  9,6     9,3     8,1
5        S4   25/10/2011  15:20  7,1  0,04  0,005  10,3    9,9     21,2
8        S2   28/10/2011  16:45  7,1  0,02  0,006  10,1    9,8     10,3
9        S3   30/10/2011  15:30  7,1        0,007  9,3     8,7     23,5
17       S4   11.01.2011  13:30  7,2  0,05  0,03   9,3     8,8     1,8

alkaline is a Thing which has a pH value > 7
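The rule on this slide can be tried out on the example table. The following is a minimal Python sketch, not the ADOnIS implementation: it assumes each row is available as a dictionary of strings, that pH values use a decimal comma as in the table, and that a missing cell is an empty string. The helper names and the last demo row are hypothetical.

```python
from typing import Dict, List, Optional


def parse_decimal_comma(text: str) -> Optional[float]:
    """Parse a number written with a decimal comma, e.g. '7,1' -> 7.1."""
    text = text.strip()
    if not text:
        return None  # missing cell
    return float(text.replace(",", "."))


def is_alkaline(ph: Optional[float]) -> bool:
    """Rule from the slide: alkaline is a Thing which has a pH value > 7."""
    return ph is not None and ph > 7.0


def alkaline_samples(rows: List[Dict[str, str]]) -> List[str]:
    """Return the samp_ID of every row whose pH satisfies the rule."""
    return [
        row["samp_ID"]
        for row in rows
        if is_alkaline(parse_decimal_comma(row["pH"]))
    ]


rows = [
    {"samp_ID": "23", "loc": "S1", "pH": "7,1"},
    {"samp_ID": "18", "loc": "S2", "pH": "7,2"},
    {"samp_ID": "4", "loc": "S3", "pH": "7,3"},
    # Hypothetical row, not on the slide, to show a non-matching case.
    {"samp_ID": "demo", "loc": "S9", "pH": "6,8"},
]

print(alkaline_samples(rows))
```

A purely syntactic search for the string "alkaline" finds nothing in this table, because the word never occurs in it; the match only appears once the definition is applied to the pH column.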

slide-6
SLIDE 6

Semantic Search

Search keywords

samp_ID  loc  date        time   pH   Fe2+  NO2-   temp_1  temp_2  r_mm
23       H43  12/10/2011  15:00  7,1  0,01  0,006  9,2     8,9     2,4
18       H41  14/10/2011  17:30  7,2  0,03  0,004  9,4     9,1     10,3
4        H51  16/10/2011  11:55  7,3  0,02  0,003  10,1    8,9     1,8
7        H43  19/10/2011  17:25  7,3  0,04  0,004  9,6     9,2     7,1
12       H41  21/10/2011  14:50  7,1  0,08  0,005  9,4     9,4     2,3
35       H51  23/10/2011  11:40  7,1  0,15  0,003  9,6     9,3     8,1
5        H51  25/10/2011  15:20  7,1  0,04  0,005  10,3    9,9     21,2
8        H51  28/10/2011  16:45  7,1  0,02  0,006  10,1    9,8     10,3
9        H43  30/10/2011  15:30  7,1        0,007  9,3     8,7     23,5
17       H41  01/11/2011  13:30  7,2  0,05  0,03   9,3     8,8     1,8

Diagram labels:

  • Data
  • Semantic Annotation
  • Knowledge Base: "S1, S2 located in Bechstedter Grund", "alkaline is ..."
  • reasoner

slide-7
SLIDE 7

User Interface

slide-8
SLIDE 8

System Overview

Components shown in the architecture diagram:

  • Data labels: MD, DS, PD, MD, PD
  • TBox (domain knowledge)
  • Virtual ABox (Ontop)
  • SA
  • Mappings
  • PubMed/Medline
  • PubMed Central
  • Publication Search
  • Publication Annotator (extraction of class mentions, named entities and relations)
  • Query Translation
  • SPARQL Endpoint
  • Quest Reasoner (Ontop)
  • JCoRe
  • GeNO
  • LINNAEUS
  • BioSem
  • BExIS 2
  • Search Interface (including autocompletion)
  • BExIS 2 Module ADOnIS

slide-9
SLIDE 9

Semantic Search on Structured Data

slide-10
SLIDE 10

Knowledge Base (TBox)

Ontology diagram (labels recovered from the figure):

Classes: oboe-core:ObservationCollection, oboe-core:Observation, oboe-core:Measurement, oboe-core:Entity, oboe-core:Characteristic, oboe-core:Standard, oboe-core:Precision, ObservationType, EntityType, MeasurementType, EntityTypeClass, CharacteristicClass

Properties: hasMember, hasMeasurement, ofEntity, ofCharacteristic, usesStandard, hasPrecision, hasValue, hasObservationType, hasEntityType, hasMeasurementType, refersToObservationType, refersToEntityType, refersToMeasurementType, subClassOf

Ontologies and modules: oboe-temporal, oboe-chemistry, ChEBI-light-module, ENVO, OBI, NCIT-module

Other labels: any

Based on the Extensible Observation Ontology (OBOE)

slide-11
SLIDE 11

Semantic Annotation

SF_ml  Tree-No  Species  TreeCircum_m  BHD_m  TreeHeight_m
212    56       Birke    1,56          0,48   26,34
35     28       Eiche    1,19          0,35   24,86
62     34       Eiche    1,55          0,52   28,46
43     96       Buche    1,43          0,43   31,72
334    12       Buche    1,57          0,49   31,87

Highlighted values of the last row: 1,57, 31,87, 334

Annotation graph (node labels):

  • oboe-core:ObservationCollection
  • oboe-core:Observation (2 nodes)
  • oboe-core:Characteristic (3 nodes)
  • oboe-core:Entity (2 nodes)

slide-12
SLIDE 12

Semantic Annotation

SF_ml  Tree-No  Species  TreeCircum_m  BHD_m  TreeHeight_m
212    56       Birke    1,56          0,48   26,34
35     28       Eiche    1,19          0,35   24,86
62     34       Eiche    1,55          0,52   28,46
43     96       Buche    1,43          0,43   31,72
334    12       Buche    1,57          0,49   31,87

Highlighted values of the last row: 1,57, 31,87, 334

Annotation graph (node labels):

  • oboe-core:ObservationCollection
  • oboe-core:Observation (2 nodes)
  • oboe-core:Characteristic (3 nodes)
  • oboe-core:Entity (2 nodes)

ABox
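To make the annotation step concrete, here is a small Python sketch of how one table row could be turned into OBOE-style instances (an observation collection with observations, entities, characteristics and measured values). It is an illustration under stated assumptions, not the ADOnIS annotator: the entity names "Stemflow" and "Tree" and the assignment of columns to the characteristics Volume, Circumference and Height are hypothetical, chosen to match the two Observation, two Entity and three Characteristic nodes on the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Measurement:
    characteristic: str  # e.g. "Circumference"
    value: str           # raw cell value, kept as a string


@dataclass
class Observation:
    entity: str
    measurements: List[Measurement] = field(default_factory=list)


@dataclass
class ObservationCollection:
    members: List[Observation] = field(default_factory=list)


# Hypothetical column annotations: column name -> (entity, characteristic).
ANNOTATIONS: Dict[str, Tuple[str, str]] = {
    "SF_ml": ("Stemflow", "Volume"),
    "TreeCircum_m": ("Tree", "Circumference"),
    "TreeHeight_m": ("Tree", "Height"),
}


def annotate_row(row: Dict[str, str]) -> ObservationCollection:
    """Group the annotated cells of one row into one observation per entity."""
    by_entity: Dict[str, Observation] = {}
    for column, (entity, characteristic) in ANNOTATIONS.items():
        observation = by_entity.setdefault(entity, Observation(entity))
        observation.measurements.append(Measurement(characteristic, row[column]))
    return ObservationCollection(list(by_entity.values()))


# Last row of the table on the slide.
row = {
    "SF_ml": "334", "Tree-No": "12", "Species": "Buche",
    "TreeCircum_m": "1,57", "BHD_m": "0,49", "TreeHeight_m": "31,87",
}
collection = annotate_row(row)
for observation in collection.members:
    print(observation.entity, [(m.characteristic, m.value) for m in observation.measurements])
```

Columns without an annotation (here Tree-No, Species and BHD_m) are simply skipped in this sketch.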

slide-13
SLIDE 13

SF_ml  Tree-No  Species  TreeCircum_m  BHD_m  TreeHeight_m
212    56       Birke    1,56          0,48   26,34
35     28       Eiche    1,19          0,35   24,86
62     34       Eiche    1,55          0,52   28,46
43     96       Buche    1,43          0,43   31,72
334    12       Buche    1,57          0,49   31,87

Highlighted values of the last row: 1,57, 31,87, 334

Virtual ABox using Ontop

MAPPINGS

CHARACTERISTIC-TYPE
  target: :crct_{crct_id} a <{crct}> .
  source: SELECT DISTINCT crct, crct_id FROM annotation

MEASUREMENT_TYPE-and-VALUE
  target: :msmt_{value} a oboe-core:Measurement ;
              oboe-core:hasValue {value}^^xsd:string .
  source: SELECT DISTINCT value FROM measurement_values

SPARQL Endpoint, Quest Reasoner + Annotation (entity, characteristic, standard)

slide-14
SLIDE 14

Semantic Annotation

Virtual ABox

MAPPINGS

CHARACTERISTIC-TYPE
  target: :crct_{crct_id} a <{crct}> .
  source: SELECT DISTINCT crct, crct_id FROM annotation

MEASUREMENT_TYPE-and-VALUE
  target: :msmt_{value} a oboe-core:Measurement ;
              oboe-core:hasValue {value}^^xsd:string .
  source: SELECT DISTINCT value FROM measurement_values

Source tables:

measurement_values (materialized view)

msmt instance id  value
id                literal

annotations

dataset column id  entity      characteristic      standard
id                 IRI entity  IRI characteristic  IRI measurement standard

Resulting triples:

...
:crct_1 a <http://../someontology.owl#Volume> .
:crct_2 a <http://../someontology.owl#Circumference> .
:crct_3 a <http://../someontology.owl#Height> .
...
:msmt_{334} a oboe-core:Measurement ;
    oboe-core:hasValue "334"^^xsd:string .
:msmt_{1_57} a oboe-core:Measurement ;
    oboe-core:hasValue "1,57"^^xsd:string .
:msmt_{31_87} a oboe-core:Measurement ;
    oboe-core:hasValue "31,87"^^xsd:string .
...
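The effect of the two mapping templates can be illustrated by expanding them over a few rows in plain Python. This is only a sketch of the template semantics: Ontop keeps the ABox virtual and answers queries by rewriting them over the database, whereas the code below materializes the triples so the expansion is visible. The function names are hypothetical, the column name crct_id is assumed for both the template and the query, and the rule for turning "1,57" into the suffix "1_57" is inferred from the example triples on the slide.

```python
import re
from typing import Dict, Iterable, List


def local_name(value: str) -> str:
    """Turn a literal such as '1,57' into an IRI-safe suffix such as '1_57'."""
    return re.sub(r"[^A-Za-z0-9]", "_", value)


def characteristic_triples(rows: Iterable[Dict[str, str]]) -> List[str]:
    """Expand the CHARACTERISTIC-TYPE template once per distinct (crct_id, crct) pair."""
    distinct = dict.fromkeys((row["crct_id"], row["crct"]) for row in rows)
    return [f":crct_{crct_id} a <{crct}> ." for crct_id, crct in distinct]


def measurement_triples(values: Iterable[str]) -> List[str]:
    """Expand the MEASUREMENT_TYPE-and-VALUE template once per distinct value."""
    triples = []
    for value in dict.fromkeys(values):  # emulates SELECT DISTINCT, keeps order
        node = f":msmt_{local_name(value)}"
        triples.append(
            f'{node} a oboe-core:Measurement ; '
            f'oboe-core:hasValue "{value}"^^xsd:string .'
        )
    return triples


# Rows as they might come back from the two source queries; the IRIs are
# elided on the slides and are kept in that elided form here.
annotation_rows = [
    {"crct_id": "1", "crct": "http://../someontology.owl#Volume"},
    {"crct_id": "2", "crct": "http://../someontology.owl#Circumference"},
    {"crct_id": "3", "crct": "http://../someontology.owl#Height"},
]
values = ["334", "1,57", "31,87", "334"]

for triple in characteristic_triples(annotation_rows) + measurement_triples(values):
    print(triple)
```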

slide-15
SLIDE 15

From keywords to SPARQL

Search: groundwater concentration_of nitrate

Keyword classification:

groundwater       EntityTypeClass
concentration_of  CharacteristicClass
nitrate           EntityTypeClass

MAPPINGS

SELECT DISTINCT ?dset WHERE {
  ?dset ad:refersToObservationType ?obstype .
  ?obstype ad:refersToEntityType ?enttype .
  ?enttype a <http://../someontology.owl#Nitrate> .
  ?obstype ad:refersToMeasurementType ?meastype .
  ?meastype ad:ofCharacteristic ?crct .
  ?crct a <http://../someontology.owl#concentration_of>
}

SELECT DISTINCT ?dset WHERE {
  ?dset ad:refersToObservationType ?obstype .
  ?obstype ad:refersToEntityType ?enttype .
  ?enttype a <http://../someontology.owl#Groundwater>
}
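The translation from classified keywords to SPARQL can be sketched as a small query builder. This is a hedged illustration, not the ADOnIS query translation component: it assumes the keywords have already been mapped to ontology classes, and that a CharacteristicClass keyword attaches to the next EntityTypeClass keyword, which is how the example pairs concentration_of with nitrate. The function names are hypothetical, the ontology namespace is kept in the elided form used on the slides, and the ad: prefix is not declared because the slides do not give it.

```python
from typing import List, Optional, Tuple

ONTOLOGY_NS = "http://../someontology.owl#"  # elided on the slides, kept as-is


def dataset_query(entity_class: str, characteristic_class: Optional[str] = None) -> str:
    """Build a query for datasets about an entity type, optionally
    restricted to a measured characteristic."""
    patterns = [
        "?dset ad:refersToObservationType ?obstype .",
        "?obstype ad:refersToEntityType ?enttype .",
        f"?enttype a <{ONTOLOGY_NS}{entity_class}> .",
    ]
    if characteristic_class is not None:
        patterns += [
            "?obstype ad:refersToMeasurementType ?meastype .",
            "?meastype ad:ofCharacteristic ?crct .",
            f"?crct a <{ONTOLOGY_NS}{characteristic_class}> .",
        ]
    body = "\n  ".join(patterns)
    return f"SELECT DISTINCT ?dset WHERE {{\n  {body}\n}}"


def translate(classified: List[Tuple[str, str]]) -> List[str]:
    """Turn (ontology class, keyword kind) pairs into one query per entity keyword."""
    queries = []
    pending_characteristic = None
    for class_name, kind in classified:
        if kind == "CharacteristicClass":
            pending_characteristic = class_name
        elif kind == "EntityTypeClass":
            queries.append(dataset_query(class_name, pending_characteristic))
            pending_characteristic = None
    return queries


queries = translate([
    ("Groundwater", "EntityTypeClass"),
    ("concentration_of", "CharacteristicClass"),
    ("Nitrate", "EntityTypeClass"),
])
print("\n\n".join(queries))
```

How the result sets of the individual queries are combined is not stated on the slide, so the sketch stops at generating the query strings.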

slide-16
SLIDE 16

More Supported Queries

  • Show me all datasets that refer to measurements of the nitrate concentration in groundwater.
  • At which locations within the Hainich transect has soil moisture been measured?
  • On which dates has the concentration of nitrate been measured at well H31?
  • Has species XY been observed in aquifer moTK?
  • Which characteristics have been measured at H31?

slide-17
SLIDE 17

Semantic Search on Unstructured Data

slide-18
SLIDE 18

Publication Search with Semedico

slide-19
SLIDE 19

Publication Search with Semedico

slide-20
SLIDE 20

http://www.aquadiva.uni-jena.de/
http://www.semedico.org/
http://bexis2.uni-jena.de/

Contact: friederike.klan@uni-jena.de
http://fusion.cs.uni-jena.de/fusion/members/friederike-klan/

This is joint work with my colleagues Alsayed Algergawy, Erik Faessler, Birgitta König-Ries, Udo Hahn, Roman Gerlach, Javad Chamanara, David Schöne, Sven Thiel, Martin Hohmuth, Thorsten Hindermann, Nafiseh Navabpour, Markus Steinberg, Valentin Wesp