SLIDE 1
Lecture 1: Semantic Web and RDF
Aidan Hogan
aidhog@gmail.com
SLIDE 2
SLIDE 3
The Web is now 26 years old
SLIDE 4
Evolution of the Web
SLIDE 5
The Future of the Web?
SLIDE 6
THE “SEMANTIC WEB”
SLIDE 7
The “Semantic Web”
… what is the “Semantic Web”?
SLIDE 8
Semantic Web?
semantic web
SLIDE 9
Semantic Web?
“The Semantic Web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users.”
─ Berners-Lee et al. (2001) “The Semantic Web”. Sci. American 284(5):34–43.
SLIDE 10
WHAT’S WRONG WITH THE CURRENT WEB?
SLIDE 11
The current Web is document-centric
SLIDE 12
The current Web is document-centric
SLIDE 13
(Most of it) Makes sense to humans
SLIDE 14
Not to machines
SLIDE 15
Not to machines
SLIDE 16
What machines on the Web can do
SLIDE 17
What machines on the Web can do
SLIDE 18
This (with some “tricks”) works really well
SLIDE 19
Can even get “direct answers” now
SLIDE 20
THE WEB IS GREAT … … WHAT’S THE PROBLEM …
SLIDE 21
At its core, Google is still just doing …
(… but really really well)
SLIDE 22
Let’s ask a question …
… what might the output be?
SLIDE 23
A structured question on structured data …
… what might the output be?
SLIDE 24
From a human perspective …
SLIDE 25
(1) Data, (2) Query, (3) Rules/Ontologies
SLIDE 26
THE SEMANTIC WEB: NOT JUST PURELY ACADEMIC
SLIDE 27
Hidden within the Web … let’s have a look
SLIDE 28
The Linked Data Cloud
- Oct. 2007
- Nov. 2007
- Feb. 2008
- Sep. 2008
- Mar. 2009
- July 2009
- Sept. 2010
- Sept. 2011
SLIDE 29
Linked Government Data: data.gov
SLIDE 30
Linked Government Data: data.gov.uk
SLIDE 31
Linked Government Data: datos.gob.cl
SLIDE 32
Life Sciences
SLIDE 33
Life Sciences
SLIDE 34
New York Times Meta-data
http://data.nytimes.com/schools/schools.html
SLIDE 35
schema.org (Bing, Google, Yahoo!, Yandex)
SLIDE 36
Facebook Open Graph Protocol
SLIDE 37
Google’s Knowledge Graph
SLIDE 38
A MORE IN-DEPTH USE-CASE: WIKIDATA
SLIDE 39
What is Wikidata?
SLIDE 40
Problem 1: Different language versions manually edited by users
SLIDE 41
Problem 2: Complex lists of things manually edited by users
SLIDE 42
Solution: Wikidata
- Collaboratively edit structured data in one place, with multi-lingual labels
SLIDE 43
Wikidata facts about Abraham Lincoln
SLIDE 44
STRUCTURING WEB DATA WITH RDF: RESOURCE DESCRIPTION FRAMEWORK
SLIDE 45
(1) Data, (2) Query, (3) Rules/Ontologies
SLIDE 46
RDF: Resource Description Framework
SLIDE 47
Modelling the world with triples
SLIDE 48
Concatenate to “integrate” new data
SLIDE 49
RDF often drawn as a (directed, labelled) graph
SLIDE 50
Set of triples thus called an “RDF Graph”
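A minimal sketch of the three ideas above (triples, "concatenating" data, and the graph view), assuming plain Python tuples stand in for triples and a Python set stands in for an RDF graph. The ex: names and the two example graphs are hypothetical illustrations, not taken from the slides.

```python
# A triple is (subject, predicate, object); an RDF graph is a set of triples.
# All ex: names below are made-up illustrations.
graph_a = {
    ("ex:Dublin", "ex:capitalOf", "ex:Ireland"),
    ("ex:Dublin", "rdf:type", "ex:City"),
}
graph_b = {
    ("ex:Ireland", "rdf:type", "ex:Country"),
    ("ex:Dublin", "rdf:type", "ex:City"),  # same triple as in graph_a, stored once
}

# "Concatenating" data from two sources is a set union (no blank nodes involved here).
combined = graph_a | graph_b

# Graph view: subjects and objects are nodes, predicates label the directed edges.
nodes = {s for s, _, _ in combined} | {o for _, _, o in combined}
edge_labels = {p for _, p, _ in combined}
```

Because a graph is a set, a triple stated by both sources appears only once after the union.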
SLIDE 51
NAMING THINGS IN RDF: IRIS
SLIDE 52
Need unambiguous symbols/identifiers
- Since we’re on the Web … use Web identifiers
- URL: Uniform Resource Locator
– The location of a resource on the Web
– http://ex.org/Dubl%C3%ADn.html
- URI: Uniform Resource Identifier (RDF 1.0)
– Need not be a location, can also be a name
– http://ex.org/Dubl%C3%ADn
- IRI: Internationalised Resource Identifier (RDF 1.1)
– A URI that allows Unicode characters
– http://ex.org/Dublín
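The relationship between the URI and IRI examples above can be sketched with Python's standard urllib.parse module: percent-encoding the non-ASCII character of the IRI as UTF-8 bytes yields the URI form. This only illustrates the encoding step for this one example; it is not a full IRI-to-URI conversion.

```python
from urllib.parse import quote, unquote

iri = "http://ex.org/Dublín"

# Percent-encode non-ASCII characters as UTF-8 bytes; keep ':' and '/' unescaped.
uri = quote(iri, safe=":/")
print(uri)  # http://ex.org/Dubl%C3%ADn

# Decoding the percent-escapes gives back the original IRI.
decoded = unquote(uri)
```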
SLIDE 53
We will use IRIs with prefixes
- http://ex.org/Dublín ↔ ex:Dublín
- “ex:” denotes a prefix for http://ex.org/
- “Dublín” is the local name
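A small sketch of how a prefixed name maps to a full IRI and back, assuming a prefix table held in a Python dictionary. The function names expand and shorten are hypothetical helpers introduced for illustration.

```python
PREFIXES = {"ex": "http://ex.org/"}  # prefix -> namespace IRI

def expand(prefixed_name, prefixes=PREFIXES):
    """Turn 'ex:Dublín' into 'http://ex.org/Dublín'."""
    prefix, local_name = prefixed_name.split(":", 1)
    return prefixes[prefix] + local_name

def shorten(iri, prefixes=PREFIXES):
    """Turn a full IRI into prefix:localName when a known namespace matches."""
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace):
            return prefix + ":" + iri[len(namespace):]
    return iri  # no known prefix: leave the IRI unchanged
```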
SLIDE 54
Frequently used prefixes
SLIDE 55
From strings …
SLIDE 56
… to IRIs …
SLIDE 57
NAMING THINGS IN RDF: LITERALS
SLIDE 58
What about numbers?
Should we assign IRIs to numbers, etc.?
SLIDE 59
RDF allows “literals” in object position
- Literals are for datatype values, like strings, numbers, booleans, dates, times
- Only allowed in object position
SLIDE 60
Datatype literals
- “lexical-value”^^ex:datatype
– “200”^^xsd:int
– “2014-12-13”^^xsd:date
– “true”^^xsd:boolean
– “this is a string”^^xsd:string
- If the datatype is omitted, it’s a string
– “this is a string”
– “200” is a string, not a number!
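To make the difference between “200”^^xsd:int and plain “200” concrete, here is a rough sketch that maps a lexical form plus a datatype IRI to a Python value. The CONVERTERS table and literal_value function are hypothetical, cover only the four datatypes listed above, and do not validate lexical forms.

```python
from datetime import date

XSD = "http://www.w3.org/2001/XMLSchema#"

# Illustrative converters from lexical form to Python value (no validation).
CONVERTERS = {
    XSD + "int": int,
    XSD + "date": date.fromisoformat,
    XSD + "boolean": lambda lexical: lexical in ("true", "1"),
    XSD + "string": str,
}

def literal_value(lexical, datatype=XSD + "string"):
    """Map (lexical form, datatype IRI) to a value; no datatype means a string."""
    return CONVERTERS[datatype](lexical)
```

With these assumptions, literal_value("200", XSD + "int") gives the number 200, while literal_value("200") gives the string "200".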
SLIDE 61
Many datatypes borrowed from XML Schema
SLIDE 62
Boolean datatype
SLIDE 63
Numeric datatypes
SLIDE 64
Temporal datatypes
SLIDE 65
Text/string datatypes
SLIDE 66
Language-Tagged Strings
- Specify that a string is in a given language
- “string”@lang-tag
- No explicit datatype! (In RDF 1.1, language-tagged strings implicitly have the datatype rdf:langString)
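One common use of language tags is choosing a label in the reader's language. The sketch below assumes literals are held as (text, language tag) pairs; the labels and the helper names label_for and to_tagged_string are illustrative only.

```python
# Language-tagged strings as (text, language tag) pairs; the labels are examples.
labels = [("Dublin", "en"), ("Dublín", "es"), ("Baile Átha Cliath", "ga")]

def label_for(labels, lang, fallback="en"):
    """Return the label tagged with `lang`, else the fallback language, else None."""
    by_lang = {tag.lower(): text for text, tag in labels}  # tags are case-insensitive
    return by_lang.get(lang.lower(), by_lang.get(fallback))

def to_tagged_string(text, tag):
    """Write the pair in the "string"@lang-tag form (no escaping of quotes)."""
    return f'"{text}"@{tag}'
```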
SLIDE 67
(NOT) NAMING THINGS IN RDF: BLANK NODES
SLIDE 68
Having to name everything is hard work
SLIDE 69
For this reason, RDF provides blank nodes
- Syntax: _:blankNode
- Represents existence of something
– Often used to avoid giving an IRI (e.g., shortcuts)
- Can only appear in subject or object position
- (More later)
SLIDE 70
RDF TERMS: SUMMARY
SLIDE 71
A Summary of RDF Terms
1. IRIs (Internationalised Resource Identifiers)
– Used to name generic things
2. Literals
– Used to refer to datatype values
– Strings may have a language tag
3. Blank Nodes
– Used to avoid naming things
– A little mysterious right now
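The position rules for the three kinds of term (IRIs anywhere, literals only as object, blank nodes only as subject or object) can be sketched as a small check. This assumes terms are written as strings in a simplified notation: blank nodes start with "_:", literals start with a double quote, and anything else is treated as an IRI. The function names are hypothetical.

```python
def kind(term):
    """Classify a term written in the simplified string notation described above."""
    if term.startswith("_:"):
        return "blank"
    if term.startswith('"'):
        return "literal"
    return "iri"

def is_valid_triple(subject, predicate, obj):
    """IRIs fit anywhere; blank nodes only as subject/object; literals only as object."""
    return (kind(subject) in ("iri", "blank")
            and kind(predicate) == "iri"
            and kind(obj) in ("iri", "blank", "literal"))
```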
SLIDE 72
MODELLING DATA IN RDF
SLIDE 73
Let’s model something in RDF …
Model the following in RDF: “Sharknado is the first movie of the Sharknado series. It first aired on July 11, 2013. The movie stars Tara Reid and Ian Ziering. The movie was followed by ‘Sharknado 2: The Second One’.”
SLIDE 74
RDF Properties
- RDF Terms used as predicate
- rdf:type, ex:firstMovie, ex:stars, …
SLIDE 75
RDF Classes
- Used to conceptually group resources
- The predicate rdf:type is used to relate resources to their classes
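One possible way to model the Sharknado description with properties and classes, sketched as a Python set of triples. Only rdf:type, ex:firstMovie and ex:stars come from the slides; every other name (ex:Movie, ex:MovieSeries, ex:firstAired, ex:followedBy and the resource names) is an assumption made for illustration, and other modellings are equally reasonable.

```python
# Prefixed names and literals are written as plain strings.
sharknado = {
    ("ex:Sharknado", "rdf:type", "ex:Movie"),
    ("ex:SharknadoSeries", "rdf:type", "ex:MovieSeries"),
    ("ex:SharknadoSeries", "ex:firstMovie", "ex:Sharknado"),
    ("ex:Sharknado", "ex:firstAired", '"2013-07-11"^^xsd:date'),
    ("ex:Sharknado", "ex:stars", "ex:TaraReid"),
    ("ex:Sharknado", "ex:stars", "ex:IanZiering"),
    ("ex:Sharknado", "ex:followedBy", "ex:Sharknado2"),
}

# Properties are the terms used as predicates.
properties = {p for _, p, _ in sharknado}

# Classes are the objects of rdf:type triples; instances are their subjects.
classes = {o for _, p, o in sharknado if p == "rdf:type"}
movies = {s for s, p, o in sharknado if p == "rdf:type" and o == "ex:Movie"}
```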
SLIDE 76
Modelling in RDF not always so simple
Model the following in RDF: “Sharknado stars Tara Reid in the role of ‘April Wexler’.”
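A statement relating three things (movie, actor, character) does not fit into a single triple. One common approach, sketched here, is to introduce an intermediate node for the role and hang binary relations off it. The property names ex:hasRole, ex:actor and ex:character are hypothetical, as is the choice of a blank node for the role.

```python
# The blank node _:role1 stands for "Tara Reid's role in Sharknado".
role_triples = {
    ("ex:Sharknado", "ex:hasRole", "_:role1"),
    ("_:role1", "ex:actor", "ex:TaraReid"),
    ("_:role1", "ex:character", '"April Wexler"'),
}

def characters_played(graph, movie, actor):
    """Characters that `actor` plays in `movie`, found via the intermediate role node."""
    roles = {o for s, p, o in graph if s == movie and p == "ex:hasRole"}
    by_actor = {s for s, p, o in graph
                if s in roles and p == "ex:actor" and o == actor}
    return {o for s, p, o in graph if s in by_actor and p == "ex:character"}
```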
SLIDE 77
Modelling in RDF not always so simple
Model the following in RDF: “The first movie in the Sharknado series is ‘Sharknado’. The second movie is ‘Sharknado 2: The Second One’. The third movie is ‘Sharknado 3: Oh Hell No!’.”
SLIDE 78
RDF Collections: Model Ordered Lists
- Standard way to model linked lists in RDF
- Use rdf:rest to link to rest of list
- Use rdf:first to link to current member
- Use rdf:nil to end the list
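The linked-list structure described above can be sketched as follows: each list node is a blank node with one rdf:first triple (the member) and one rdf:rest triple (the next node, or rdf:nil at the end). The blank node labels _:l1, _:l2, ... and the function names are assumptions for illustration.

```python
def make_collection(members):
    """Return (head, triples) encoding `members` as an rdf:first/rdf:rest list."""
    triples = set()
    head = "rdf:nil"  # an empty list is just rdf:nil
    # Build from the tail so each node can point at the rest of the list.
    for index in range(len(members) - 1, -1, -1):
        node = f"_:l{index + 1}"
        triples.add((node, "rdf:first", members[index]))
        triples.add((node, "rdf:rest", head))
        head = node
    return head, triples

def read_collection(head, triples):
    """Walk the list from `head` and return its members in order."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    members = []
    while head != "rdf:nil":
        members.append(first[head])
        head = rest[head]
    return members
```

A list of three movies needs six triples: two per member, with the last rdf:rest pointing at rdf:nil.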
SLIDE 79
RDF Collections: Generic Modelling
- Not just for Sharknado series
SLIDE 80
RDF SYNTAXES: WRITING RDF DOWN
SLIDE 81
N-Triples
- Line delimited format
- No shortcuts
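A rough sketch of writing triples in N-Triples style: one statement per line, full IRIs in angle brackets, no prefixes. It assumes IRIs are given as full strings, blank nodes start with "_:", and datatyped literals are (lexical form, datatype IRI) tuples. Only backslashes and double quotes are escaped, so this is not a complete serialiser.

```python
def term_to_ntriples(term):
    """Serialise one term under the representation assumptions stated above."""
    if isinstance(term, tuple):  # (lexical form, datatype IRI) literal
        lexical, datatype = term
        escaped = lexical.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"^^<{datatype}>'
    if term.startswith("_:"):  # blank node label
        return term
    return f"<{term}>"  # full IRI

def to_ntriples(triples):
    """One 'subject predicate object .' statement per line, sorted for stable output."""
    lines = (" ".join(term_to_ntriples(t) for t in triple) + " ."
             for triple in triples)
    return "\n".join(sorted(lines))
```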
SLIDE 82
RDF/XML
- Legacy format
- Not intuitive
SLIDE 83
RDFa
- Embed RDF into HTML
- Not so intuitive
SLIDE 84
JSON-LD
- Embed RDF into JSON
- Not completely aligned with RDF
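For a feel of what JSON-LD looks like, here is a small document built as a Python dictionary, using the JSON-LD keywords @context, @id and @type. The data and the flat_triples helper are illustrative assumptions; the helper only handles this flat shape (no nesting, no literals) and is in no way a JSON-LD processor.

```python
import json

doc = {
    "@context": {"ex": "http://ex.org/"},
    "@id": "ex:Sharknado",
    "@type": "ex:Movie",
    "ex:stars": [{"@id": "ex:TaraReid"}, {"@id": "ex:IanZiering"}],
}
text = json.dumps(doc, indent=2)  # the JSON-LD document as ordinary JSON text

def flat_triples(node):
    """Read triples from a flat node object like `doc` (toy reader, see caveats above)."""
    subject = node["@id"]
    triples = {(subject, "rdf:type", node["@type"])}
    for key, values in node.items():
        if not key.startswith("@"):  # skip JSON-LD keywords
            triples |= {(subject, key, value["@id"]) for value in values}
    return triples
```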
SLIDE 85
Turtle
- Readable format
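Part of what makes Turtle readable is that prefixes are declared once and triples with the same subject are grouped using ";". The sketch below shows just those two shortcuts, assuming terms are already written as prefixed names (with "a" as Turtle's shorthand for rdf:type). The to_turtle function is a hypothetical helper, not a complete Turtle writer.

```python
from collections import defaultdict

def to_turtle(triples, prefixes):
    """Group triples by subject and emit prefix declarations plus ';'-joined statements."""
    by_subject = defaultdict(list)
    for s, p, o in sorted(triples):
        by_subject[s].append(f"{p} {o}")
    lines = [f"@prefix {name}: <{iri}> ." for name, iri in sorted(prefixes.items())]
    for s, pairs in by_subject.items():
        lines.append(f"{s} " + " ;\n    ".join(pairs) + " .")
    return "\n".join(lines)

example = to_turtle(
    {("ex:Sharknado", "a", "ex:Movie"),
     ("ex:Sharknado", "ex:stars", "ex:TaraReid")},
    {"ex": "http://ex.org/"},
)
```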
SLIDE 86
Turtle: Collections Shortcut
SLIDE 87
BLANK NODES ADD COMPLEXITY
SLIDE 88
Blank node names aren’t important …
(Isomorphic)
SLIDE 89
Blank nodes are local identifiers
How should we combine these two RDF graphs?
SLIDE 90
Need to perform an RDF merge
How should we combine these two RDF graphs?
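An RDF merge can be sketched as: rename the blank nodes of each graph so that no label is shared, then take the union. The suffix-based renaming below is one simple way to keep labels apart; the function names and example graphs are assumptions, and terms are plain strings with blank nodes starting with "_:".

```python
def rename_blank_nodes(graph, suffix):
    """Append `suffix` to every blank node label in `graph`."""
    def rename(term):
        return term + suffix if term.startswith("_:") else term
    return {(rename(s), p, rename(o)) for s, p, o in graph}

def rdf_merge(graph_a, graph_b):
    """Union of the two graphs after keeping their blank nodes apart."""
    return rename_blank_nodes(graph_a, "_a") | rename_blank_nodes(graph_b, "_b")

# Both graphs happen to use the label _:b1 for different things.
g1 = {("_:b1", "ex:name", '"Tara"')}
g2 = {("_:b1", "ex:name", '"Ian"')}

naive = g1 | g2            # wrongly says one thing has both names
merged = rdf_merge(g1, g2)  # two distinct blank nodes, one name each
```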
SLIDE 91
Are two RDF graphs the “same”?
(Isomorphic)
SLIDE 92
Are two RDF graphs the “same”?
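Two RDF graphs are isomorphic if some one-to-one relabelling of blank nodes turns one into the other. The brute-force sketch below tries every possible relabelling, which is fine for tiny examples but grows factorially with the number of blank nodes. It assumes terms are strings with blank nodes starting with "_:"; the function names are hypothetical.

```python
from itertools import permutations

def blank_nodes(graph):
    """Sorted list of blank node labels used as subject or object."""
    return sorted({t for s, _, o in graph for t in (s, o) if t.startswith("_:")})

def isomorphic(graph_a, graph_b):
    """True if a one-to-one blank node relabelling maps graph_a onto graph_b."""
    blanks_a, blanks_b = blank_nodes(graph_a), blank_nodes(graph_b)
    if len(graph_a) != len(graph_b) or len(blanks_a) != len(blanks_b):
        return False
    for candidate in permutations(blanks_b):
        mapping = dict(zip(blanks_a, candidate))
        relabelled = {(mapping.get(s, s), p, mapping.get(o, o))
                      for s, p, o in graph_a}
        if relabelled == graph_b:
            return True
    return False
```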
SLIDE 93
RECAP
SLIDE 94
(1) Data, (2) Query, (3) Rules/Ontologies
SLIDE 95
RDF: Resource Description Framework
SLIDE 96
RDF = Resource Description Framework
- Structure data on the Web!
- RDF based on triples:
– subject, predicate, object
– A set of triples is called an RDF graph
- Three types of RDF terms:
– IRIs (any position)
– Literals (object only; can have datatype or language)
– Blank nodes (subject or object)
SLIDE 97
RDF = Resource Description Framework
- Modelling in RDF:
– Describing resources
– Classes and properties form core of model
– Try to break up higher-arity relations
– Collections: standard way to model order/lists
- Syntaxes:
– N-Triples: simple, line-delimited format
– RDF/XML: legacy format, horrible
– RDFa: embed RDF into HTML pages
– JSON-LD: embed RDF into JSON
– Turtle: designed to be human friendly
SLIDE 98
RDF = Resource Description Framework
- Two operations on RDF graphs:
– Merging: keep blank nodes in source graphs apart
– Isomorphism check: are they the “same” modulo blank node labels?
SLIDE 99