
On the Use of Linked Open Data for Trusting Web Data - Davide Ceolin



  1. On the Use of Linked Open Data for Trusting Web Data Davide Ceolin and Valentina Maccatrozzo VU University Amsterdam

  2. Outline • Premises • Introduction • A Natural History Case Study • A Cultural Heritage Case Study • Future directions • Recap, bibliography, etc.

  3. Premises • Trust ≈ Reliability. • We make no assumption about the intentions of the data creator. • This presentation reflects on past work (see the bibliography) and outlines future directions.

  4. Introduction • Trust Management: subjective logic (Jøsang, 2001) • Extends Boolean and probabilistic logic. • Reasons on “opinions” about propositions, based on evidence. • Accounts for the source and for uncertainty (inversely proportional to the size of the evidence set).

  5. Subjective logic: basics • $\omega^{\text{source}}_{\text{proposition}} = (b, d, u)$ • $b + d + u = 1$ • $b \approx p(\text{proposition})$ • $u$ is inversely proportional to the size of the evidence set. • Operators: Boolean, discounting, fusion...
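
The basics above can be illustrated with a minimal Python sketch. It is not code from the presentation: it assumes the evidence-to-opinion mapping and the discounting and consensus (fusion) operators as commonly stated for Jøsang (2001), with a prior weight of 2 and a base rate of 0.5. The names `Opinion`, `from_evidence`, `discount` and `fuse` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """Subjective-logic opinion: belief, disbelief, uncertainty (b + d + u = 1)."""
    b: float
    d: float
    u: float

    @staticmethod
    def from_evidence(positive: float, negative: float) -> "Opinion":
        # Assumed mapping: a non-informative prior of weight 2, so that
        # uncertainty shrinks as the evidence set grows.
        total = positive + negative + 2.0
        return Opinion(positive / total, negative / total, 2.0 / total)

    def expected(self, base_rate: float = 0.5) -> float:
        # Expected probability: belief plus a share of the uncertainty.
        return self.b + base_rate * self.u


def discount(trust_in_source: Opinion, source_opinion: Opinion) -> Opinion:
    """Weaken a source's opinion according to our trust in that source."""
    t = trust_in_source
    return Opinion(
        t.b * source_opinion.b,
        t.b * source_opinion.d,
        t.d + t.u + t.b * source_opinion.u,
    )


def fuse(x: Opinion, y: Opinion) -> Opinion:
    """Combine two independent opinions (at least one must have u > 0)."""
    k = x.u + y.u - x.u * y.u
    return Opinion(
        (x.b * y.u + y.b * x.u) / k,
        (x.d * y.u + y.d * x.u) / k,
        (x.u * y.u) / k,
    )
```

Under this mapping, fusing the opinions built from two disjoint evidence sets gives the same result as building one opinion from the pooled evidence, which is why more evidence lowers uncertainty.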

  6. Trusting Web data using subjective logic • We adopt supervised learning algorithms (when a training set is available). • Trust is subjective. • We look for different “opinions” about the data. Subjective logic allows us to handle them. • First we estimate the data trustworthiness, then we select the “best” data (based, e.g., on author reputation).
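
The two-step procedure above (estimate trustworthiness, then select the “best” data) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the training data is made up, the reputation score is the expected value of an evidence-based opinion with prior weight 2 and base rate 0.5, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical training set: (user, was the annotation accepted by the museum?)
training = [
    ("user1", True), ("user1", True), ("user1", True),
    ("user2", True), ("user2", False),
    ("user3", False),
]


def reputations(reviews):
    """Step 1: expected reliability per user from accepted/rejected counts."""
    counts = defaultdict(lambda: [0, 0])  # user -> [accepted, rejected]
    for user, accepted in reviews:
        counts[user][0 if accepted else 1] += 1
    scores = {}
    for user, (pos, neg) in counts.items():
        total = pos + neg + 2  # assumed prior weight of 2
        belief, uncertainty = pos / total, 2 / total
        scores[user] = belief + 0.5 * uncertainty  # assumed base rate of 0.5
    return scores


def select_best(candidates, scores, default=0.5):
    """Step 2: pick the candidate whose author has the highest reputation.

    Authors without any evidence fall back to the base rate.
    """
    return max(candidates, key=lambda c: scores.get(c[0], default))


scores = reputations(training)
candidates = [("user2", "aves xyz"), ("user1", "aves xz1"), ("user4", "aves yy")]
best = select_best(candidates, scores)
print(best)
```

Ranking by the expected value rather than by raw acceptance rate keeps a user with a single accepted annotation from outranking a user with a long, mostly positive record.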

  7. Using LOD to assist evidential reasoning • LOD provides lots of useful data. • More evidence. • The distributions used by subjective logic ≈ the distributions of (at least some) LOD datasets (Ceolin et al., 2011).

  8. Museums... Photo: flickr.com/clumsyjim

  9. Museums... ...have a problem. Photo: flickr.com/clumsyjim Photo: flickr.com/grrrl

  10. So they recruit some help... Photo: flickr.com/anirudhkoul

  11. Trusting Museum Annotations • Museums manage large collections. • Several museums crowdsource annotations. • The quality and accuracy of annotations are crucial for their business. • Can they trust crowdsourced annotations?

  12. A Natural History Case Study specimen1 user1 aves xx ✓ specimen 2 user2 aves xyz ✗ ✓ specimen 3 user1 aves xz1 ✓ specimen 4 user2 aves yy specimen 5 user3 aves zz ✗

  13. A Natural History Case Study specimen1 user1 aves xx ✓ specimen 2 user2 aves xyz ✗ ✓ specimen 3 user1 aves xz1 ✓ specimen 4 user2 aves yy specimen 5 user3 aves zz ✗

  14. A Natural History Case Study specimen1 user1 aves xx ✓ tax author1 specimen 2 user2 aves xyz tax author2 ✗ ✓ specimen 3 user1 aves xz1 tax author1 ✓ specimen 4 user2 aves yy tax author1 specimen 5 user3 aves zz tax author2 ✗

  15. A Natural History Case Study specimen1 user1 aves xx ✓ tax author1 specimen 2 user2 aves xyz tax author2 ✗ ✓ specimen 3 user1 aves xz1 tax author1 ✓ specimen 4 user2 aves yy tax author1 specimen 5 user3 aves zz tax author2 ✗ • Increased accuracy, from 53% to 82%, on a museum dataset (Ceolin et al., 2010).

  16. Another Case Study • Semantic similarity for weighing evidence. • [Diagram: labels "Training set", "Expertise", "New annotation", "Semantic similarity"; example annotations Tulip, Flower, Rose, Red, Pink, Purple] • Up to 84% accuracy, 88% precision and 96% recall on two museum datasets (Ceolin et al., 2013a).
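
One way to read "semantic similarity for weighing evidence" is sketched below: each reviewed annotation in the training set counts as evidence in proportion to its similarity to the new annotation. This is only an illustration. The similarity values are invented, the lookup table stands in for a real semantic similarity measure, the accepted/rejected labels are made up, and the prior weight of 2 is an assumption.

```python
# Hypothetical similarity scores in [0, 1] between past annotations and "Flower".
TOY_SIMILARITY = {
    ("Tulip", "Flower"): 0.9,
    ("Rose", "Flower"): 0.9,
    ("Red", "Flower"): 0.2,
    ("Pink", "Flower"): 0.2,
}


def toy_similarity(a, b):
    """Stand-in for a real semantic similarity measure (assumption)."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get((a, b), TOY_SIMILARITY.get((b, a), 0.0))


def weighted_opinion(evidence, new_annotation, similarity=toy_similarity):
    """Build (belief, disbelief, uncertainty) from similarity-weighted evidence.

    evidence: list of (annotation, accepted) pairs by the same author.
    """
    pos = neg = 0.0
    for annotation, accepted in evidence:
        weight = similarity(annotation, new_annotation)
        if accepted:
            pos += weight
        else:
            neg += weight
    total = pos + neg + 2.0  # assumed prior weight of 2
    return pos / total, neg / total, 2.0 / total


# Made-up review history for one author.
history = [("Tulip", True), ("Rose", True), ("Red", False), ("Pink", False)]
belief, disbelief, uncertainty = weighted_opinion(history, "Flower")
print(belief, disbelief, uncertainty)
```

In this toy history the author was right about flower names and wrong about colours, so the evidence relevant to "Flower" is mostly positive, while the dissimilar colour annotations contribute little either way.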

  17. Future work • We used similar methods (plus other statistical techniques) for analyzing the reliability of UK Police Open Data (Ceolin et al., 2013b). • We plan to extend them with LOD, e.g. for: • geodisambiguation; • crime type hierarchies.

  18. Recap • LOD + evidential reasoning (subjective logic) is a powerful combination for trust (reliability) estimation: • enrichment; • weighing. • The more evidence the better, but: • evidence quality counts; • data needs to be tracked (W3C PROV) and properly managed.

  19. Bibliography • Jøsang, A. A Logic for Uncertain Probabilities. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 9(3), pp. 279-311, 2001. • Ceolin, D., van Hage, W.R., Fokkink, W. A Trust Model to Estimate Quality of Annotations using the Web. In WebSci, Web Science Repository, 2010. • Ceolin, D., van Hage, W.R., Fokkink, W., Schreiber, G. Estimating Uncertainty of Categorical Web Data. In URSW, CEUR-ws.org, 2011. • Ceolin, D., Nottamkandath, A., Fokkink, W. Semi-automated Assessment of Annotations Trustworthiness. In PST Conference, IEEE, 2013a. • Ceolin, D., Moreau, L., O'Hara, K., Schreiber, G., Sackley, A., Fokkink, W., van Hage, W.R., Shadbolt, N. Reliability Analyses of Open Government Data. In URSW, CEUR-ws.org, 2013b. • Ceolin, D., Nottamkandath, A., Fokkink, W. Efficient Semi-automated Assessment of Annotations Trustworthiness. In Journal of Trust Management, Springer (accepted, 2014).

  20. Thank you! Any questions? d.ceolin@vu.nl
