SLIDE 1

Open Data Heterogeneity, Quality and Scale

Presentation of the Open Data Research Group

Zohra Bellahsene, Anne Laurent, François Scharffe, Konstantin Todorov

PhD students: Manel Achichi, Mohamed Ben Ellefi, Abdel Nasser Tigrine

Master's students: Imène Chentli, Mykael Vigo

LIRMM / University of Montpellier

July 2015

Z. Bellahsene, A. Laurent, F. Scharffe, K. Todorov 1 / 29

SLIDE 2

Outline

1. The Big Picture
2. Ontology Matching
3. Data Lifting and Linking
4. Dataset and Vocabulary Recommendation
5. Data Access

SLIDE 4

The Big Picture

The Open Data Research Group

SLIDE 7

Ontology Matching

Borrowed from a tutorial by S. Staab and A. Hotho.

SLIDE 8

Ontology Matching

A Generic Framework for Ontology Matching and Evaluation

Ontologies are created in a decentralized, strongly human-biased manner. Many ontologies describe the same domain of interest => ontology heterogeneity:

  • syntactic
  • terminological
  • conceptual / structural

=> Ontology Matching: detect the semantic correspondences between the elements of two ontologies.

SLIDE 9

Ontology Matching

A Generic Framework for Ontology Matching and Evaluation

[Ngo, Bellahsene, Todorov. ESWC 2013]

SLIDE 10

Ontology Matching

YAM++ (not) Yet Another Matcher

Many matching systems are out there. Here are some of the pluses of YAM++:

  • Automatic configuration: similarity measures selection, tuning, and combination
  • A novel terminological measure based on Tversky’s similarity
  • Able to deal with large ontologies

[Ngo, Bellahsene, EKAW 2012], [http://oaei.ontologymatching.org]
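As a rough illustration of what a Tversky-style terminological measure looks like, the sketch below computes the Tversky index over the word tokens of two labels. It is not YAM++'s actual measure: the `tokens` and `tversky` helpers, the simple tokenisation and the default weights `alpha` and `beta` are assumptions made for illustration only.

```python
from typing import Set


def tokens(label: str) -> Set[str]:
    """Lower-case a label and split it into a set of word tokens."""
    return set(label.lower().replace("_", " ").replace("-", " ").split())


def tversky(a: Set[str], b: Set[str], alpha: float = 0.5, beta: float = 0.5) -> float:
    """Tversky index: |A & B| / (|A & B| + alpha * |A - B| + beta * |B - A|)."""
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    # Two empty token sets share nothing, so return 0.0 instead of dividing by zero.
    return common / denom if denom else 0.0
```

With `alpha = beta = 1` the index reduces to the Jaccard coefficient, and with `alpha = beta = 0.5` to the Dice coefficient. Unequal weights make the measure asymmetric, which can be useful when one label is expected to be more specific than the other.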

SLIDE 11

Ontology Matching

YAM++ (not) Yet Another Matcher

Among the best-performing systems in the current state of the art (cf. the reports of the Ontology Alignment Evaluation Initiative (OAEI), http://oaei.ontologymatching.org)

SLIDE 12

Ontology Matching

A Fuzzy Framework for Ontology Matching

Consider the (inherently) vague nature of concepts and their alignments

  • Provide the missing implicit background knowledge
  • Most matching procedures produce 1:1 mappings: often we are not interested in the best (exact) match, but would like to find related yet not equivalent concepts
  • A fuzzy set representation of the concepts, construction of a fuzzy common ontology
  • Infer (fuzzy) relations between cross-ontology concepts

[Todorov, Hudelot, Popescu, Geibel. IJUFKS 2014]
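To make the idea of fuzzy concept representations more concrete, here is a minimal sketch using standard fuzzy set operations (min for intersection, max for union). It is not the method of the cited paper: the `FuzzySet` type, the two helper functions and the assumption that both concepts are described over a shared set of reference elements are all illustrative.

```python
from typing import Dict

# A concept as a fuzzy set: reference element -> membership degree in [0, 1].
FuzzySet = Dict[str, float]


def fuzzy_jaccard(a: FuzzySet, b: FuzzySet) -> float:
    """Symmetric relatedness: sum of min memberships over sum of max memberships."""
    keys = set(a) | set(b)
    inter = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    union = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return inter / union if union else 0.0


def subsethood(a: FuzzySet, b: FuzzySet) -> float:
    """Degree to which fuzzy set a is contained in fuzzy set b (asymmetric)."""
    total = sum(a.values())
    inter = sum(min(m, b.get(k, 0.0)) for k, m in a.items())
    return inter / total if total else 0.0
```

A high `subsethood(a, b)` combined with a lower `subsethood(b, a)` hints at a "narrower than" relation rather than an exact match, which is the kind of related-but-not-equivalent correspondence the slide refers to.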

SLIDE 13

Ontology Matching

Cross-lingual Ontology Matching

Motivation

  • No one-to-one correspondence between the majority of terms across different languages
  • Machine translation still suffers from low precision
  • Machine learning? No large training corpora with OM data are available

Use of background knowledge

PhD project of Abdel Nasser Tigrine

SLIDE 15

Data Lifting and Linking

The Datalift project

[Scharffe et al. 2012, http://datalift.org]

SLIDE 16

Data Lifting and Linking

A General Data Linking Framework

[Ferrara, Nikolov, Scharffe. IJSW 2011]

SLIDE 17

Data Lifting and Linking

The DOREMUS project

Semantic web technologies for application-oriented use and reuse of musical data

Leading cultural partner institutions: BnF, Radio France, Philharmonie de Paris

Collaboration with Eurecom (Nice).

PhD project of Manel Achichi

SLIDE 18

Data Lifting and Linking

The DOREMUS project

Entity of interest: a musical work

  • physical manifestations (recordings, scores)
  • all the events that define them (creation, publication, performance)
  • relations between works
  • relations between events
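One possible way to picture this entity model is as a small set of linked record types. The sketch below is a hypothetical illustration, not the DOREMUS data model: the class names, field names and the choice to attach events directly to the work are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    kind: str  # e.g. "creation", "publication", "performance"
    related_events: List["Event"] = field(default_factory=list)  # relations between events


@dataclass
class Manifestation:
    kind: str  # e.g. "recording", "score"
    label: str


@dataclass
class MusicalWork:
    title: str
    manifestations: List[Manifestation] = field(default_factory=list)
    events: List[Event] = field(default_factory=list)
    related_works: List["MusicalWork"] = field(default_factory=list)  # relations between works


# Placeholder data, for illustration only.
work = MusicalWork(title="Example Work")
work.events.append(Event(kind="creation"))
work.manifestations.append(Manifestation(kind="score", label="Example score"))
```

In a linked-data setting each of these records would typically be an RDF resource and each field a property, so that works and events from different institutions can be connected.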

SLIDE 19

Data Lifting and Linking

The DOREMUS project

www.doremus.org

Applications: tools that support the selection of musical works, able to suggest original musical programming for specialized radio stations, or to choose works and interpretations that illustrate the biography of a composer, a historical period, a culture or a genre.

SLIDE 20

Data Lifting and Linking

Lifting data from Tweets

TEWS (Twitter Events on the Semantic Web): extraction and modeling of events from the Twitter stream

Use of the Wikitimes ontology for event representation.

Master's project of Mykael Vigo

SLIDE 22

Dataset Recommendation for Linking

...Any candidates?

Towards automatic discovery and recommendation of candidate datasets for linking

PhD project of Mohamed Ben Ellefi

SLIDE 23

Dataset Recommendation for Linking

Given a dataset d, return a (possibly ranked) set of WoD (Web of Data) datasets, ordered by their relevance to d with respect to the linking task.

Towards dataset profiling: definition of a collection of characteristics that make it possible to

  • describe a dataset in the best possible way
  • separate this dataset in the best possible way from other datasets

Many (statistical) characteristics are of interest: scale, coverage, data value ranges, degree of connectedness, attribute entropy, etc.

Collaboration with L3S Hannover.
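The recommendation task above can be sketched as ranking candidate datasets by the similarity of their profiles to the profile of d. The slides do not specify a ranking function, so cosine similarity is used here as a stand-in; the `Profile` type, the feature names and the function names are assumptions. The `attribute_entropy` helper shows how one of the listed characteristics could be computed.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# A dataset profile: characteristic name -> numeric value (assumed already normalised).
Profile = Dict[str, float]


def attribute_entropy(values: Iterable[str]) -> float:
    """Shannon entropy (in bits) of the value distribution of one attribute."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def cosine(p: Profile, q: Profile) -> float:
    """Cosine similarity between two sparse profile vectors."""
    dot = sum(p[k] * q[k] for k in p.keys() & q.keys())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def recommend(source: Profile, candidates: Dict[str, Profile]) -> List[Tuple[str, float]]:
    """Return (dataset name, score) pairs, most relevant candidate first."""
    scored = [(name, cosine(source, profile)) for name, profile in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In practice the characteristics live on very different scales (triple counts versus entropies), so some normalisation step would be needed before comparing profiles this way.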

SLIDE 24

Vocabulary Recommendation with Datavore

The Datalyse project

Data modeling with Datavore, the data vocabulary recommender.

http://www.datalyse.fr

SLIDE 26

Data Access

The AgroLD project (Agronomic Linked Data)

Collaboration with IBC, the Institute of Computational Biology (Montpellier).

Master's project of Imène Chentli

SLIDE 27

The Big Picture

The Open Data Research Group

SLIDE 28

References

[Bizer, Heath, Berners-Lee. IJSW 2009] Christian Bizer, Tom Heath, Tim Berners-Lee: Linked Data - The Story So Far. Int. J. Semantic Web Inf. Syst. 5(3): 1-22 (2009)

[Ferrara, Nikolov, Scharffe. IJSW 2011] Alfio Ferrara, Andriy Nikolov, François Scharffe: Data Linking for the Semantic Web. Int. J. Semantic Web Inf. Syst. 7(3): 46-76 (2011)

[Ngo, Bellahsene, EKAW 2012] DuyHoa Ngo, Zohra Bellahsene: YAM++ : A Multi-strategy Based Approach for Ontology Matching Task. EKAW 2012: 421-425

[Ngo, Bellahsene, Todorov. ESWC 2013] DuyHoa Ngo, Zohra Bellahsene, Konstantin Todorov: Opening the Black Box of Ontology Matching. ESWC 2013: 16-30

[Nikolov et al. JIST 2011] Andriy Nikolov, Mathieu d'Aquin, Enrico Motta: What Should I Link to? Identifying Relevant Sources and Classes for Data Linking. JIST 2011: 284-299

[Scharffe et al. AAAI 2012] François Scharffe, Ghislain Atemezing, Raphaël Troncy, Fabien Gandon, Serena Villata, Bénédicte Bucher, Fayçal Hamdi et al.: Enabling linked-data publication with the Datalift platform. In Proc. AAAI Workshop on Semantic Cities. 2012.

[Todorov, Hudelot, Popescu, Geibel. IJUFKS 2014 (in print)] Konstantin Todorov, Celine Hudelot, Adrian Popescu, Peter Geibel: Fuzzy Ontology Alignment Using Background Knowledge. Intl. Journal on Uncertainty, Fuzziness and Knowledge-Based Systems. 2014.

SLIDE 29

Thank you for listening!
