FREME WEBINAR HELD FOR GALA, 28 APRIL 2016
A FRAMEWORK FOR MULTILINGUAL AND SEMANTIC ENRICHMENT OF DIGITAL CONTENT (NEW L10N BUSINESS OPPORTUNITIES)
www.freme-project.eu
Presented by Tatjana Gornostaja (Tilde) and Felix Sasaki (DFKI / W3C Fellow)
Co-funded by the Horizon 2020 Framework Programme of the European Union, Grant Agreement Number 644771
OVERVIEW
Coupling Knowledge and Language via an e-Service Ecosystem
Knowledge Language
Picture: coloringpageswallpaper.com
THE FREME PROJECT
digital content and (linked) data
data value chain
CURRENT STATE OF SOLUTIONS
Machine translation, terminology annotation, ... Linked data creation & processing
GAPS THAT HINDER BUSINESS:
in given and new enterprises”: technology influences job profiles
FREME TO THE RESCUE: ENRICHING DIGITAL CONTENT
Machine translation, terminology annotation, ... Linked data creation & processing
LT and LD as first-class citizens on the Web
A SET OF INTERFACES* - DESIGN DRIVEN BY BUSINESS CASES
LT and LD for various user types: (application) developer, content architect, content author, …
* Graphical interfaces and software interfaces
OVERVIEW
FREME FROM A TECHNICAL PERSPECTIVE
A framework for multilingual and semantic enrichment of digital content that provides access to six e-Services via a set of APIs and GUIs.
related information;
formats; and
in the ePub format.
FREME FROM A TECHNICAL PERSPECTIVE
How to access FREME – several options:
documentation at http://api.freme-project.eu/doc/current/
see the documentation for installation instructions
https://github.com/freme-project/
commercial use
LINGUISTIC LINKED DATA AND OTHER STANDARDS PUT IN ACTION VIA FREME
representing digital content and enrichment information in a format agnostic manner, based on the linked data stack;
e.g. for improving machine translation output;
e.g. to terminology named entities; and
FREME builds on the outcomes of standards-driving FP7 projects in the area of linguistic linked data: LIDER and FALCON
EXAMPLE API CALL
that enriches content with named entities.
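A call of this kind can be sketched in Python. This is a minimal sketch, not the documented FREME request: the base URL, the `e-entity/freme-ner/documents` path, the `language` and `dataset` parameters, and the helper name `build_entity_request` are all assumptions made for illustration and should be checked against the API documentation. The request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: illustrative base URL; consult the FREME API documentation for the real one.
FREME_BASE = "http://api.freme-project.eu/current"


def build_entity_request(text, language="en", dataset="dbpedia"):
    """Build (but do not send) a POST request asking an entity service to annotate plain text.

    The endpoint path and the query parameter names are assumptions.
    """
    query = urlencode({"language": language, "dataset": dataset})
    url = f"{FREME_BASE}/e-entity/freme-ner/documents?{query}"
    return Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain", "Accept": "text/turtle"},
        method="POST",
    )


request = build_entity_request("Welcome to the city of Prague")
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen(request)`) needs network access and a running service, so it is left out of the sketch.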
EXAMPLE OUTPUT: USING NIF TO STORE CONTENT …
(1) <http://freme-project.eu/#char=0,29>
(2)     a nif:String , nif:Context , nif:RFC5147String ;
(3)     nif:beginIndex "0"^^xsd:int ;
(4)     nif:endIndex "29"^^xsd:int ;
(5)     nif:isString "Welcome to the city of Prague"^^xsd:string .

1) Identifying the content via a URI
2) Adding certain types from NIF*
3) Identifying the start offset of the content
4) Identifying the end offset of the content
5) Providing the string content itself
* For more on NIF, see the dedicated tutorial: http://de.slideshare.net/m1ci/nif-tutorial
… AND ENRICHMENT INFORMATION
(1) <http://freme-project.eu/#char=23,29> …
(2)     nif:anchorOf "Prague"^^xsd:string ;
(3)     nif:beginIndex "23"^^xsd:int ;
(4)     nif:endIndex "29"^^xsd:int ;
(5)     nif:referenceContext <http://freme-project.eu/#char=0,29> ;
(6)     itsrdf:taClassRef <http://dbpedia.org/ontology/City> .

1) Identifying the annotation via a URI
2) Providing the string content of the annotation
3) Identifying the start offset of the content
4) Identifying the end offset of the content
5) Relating the content to annotations
6) Enrichment with ITS 2.0 class information (“Prague” = a city)
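The offsets and `#char=begin,end` URIs in the two NIF examples can be reproduced with a few lines of Python. This is a minimal sketch: the helper names `char_uri` and `annotate` are hypothetical, and the dictionary keys merely mirror the NIF property names shown on the slides rather than producing real RDF.

```python
BASE_URI = "http://freme-project.eu/"  # document URI as used in the slide example


def char_uri(begin, end, base=BASE_URI):
    """Return a fragment URI for the character range from begin (inclusive) to end (exclusive)."""
    return f"{base}#char={begin},{end}"


def annotate(context, mention):
    """Locate the first occurrence of `mention` and describe it relative to the whole string."""
    begin = context.find(mention)
    if begin < 0:
        raise ValueError(f"{mention!r} not found in context")
    end = begin + len(mention)
    return {
        "uri": char_uri(begin, end),
        "anchorOf": mention,
        "beginIndex": begin,
        "endIndex": end,
        "referenceContext": char_uri(0, len(context)),
    }


annotation = annotate("Welcome to the city of Prague", "Prague")
print(annotation["uri"])               # http://freme-project.eu/#char=23,29
print(annotation["referenceContext"])  # http://freme-project.eu/#char=0,29
```

The annotation URI and the context URI differ only in their offsets, which is what lets `nif:referenceContext` tie an annotation back to the full string.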
SIMPLIFIED OUTPUT HELPS API DEVELOPERS TO CONSUME LINKED DATA
the output
http://dbpedia.org/resource/Prague,50.0878367932108,14.4241322001241
For more information on filtering, see http://api.freme-project.eu/doc/current/knowledge-base/filtering.html
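Simplified output of this kind is easy to consume with standard tooling. The sketch below parses the comma-separated line with Python's `csv` module. It assumes three columns per row with no header row, and it assumes the two numbers are latitude and longitude (which fits Prague's location but is not stated on the slide); `parse_filtered_rows` is a hypothetical helper name.

```python
import csv
import io


def parse_filtered_rows(csv_text):
    """Parse rows of the form resource,lat,long into dictionaries.

    Assumption: exactly three columns and no header row; other rows are skipped.
    """
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        if len(record) != 3:
            continue
        resource, lat, lon = record
        rows.append({"resource": resource, "lat": float(lat), "long": float(lon)})
    return rows


sample = "http://dbpedia.org/resource/Prague,50.0878367932108,14.4241322001241\n"
rows = parse_filtered_rows(sample)
print(rows[0]["resource"], rows[0]["lat"], rows[0]["long"])
```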
FORMAT COVERAGE
formats, …
services
supported) output
USING E-TERMINOLOGY WITH HTML OUTPUT
Input:
<!DOCTYPE html> … <body> <p>Welcome to the city of Prague.</p> </body> … </html>
Call of e-Terminology
Output:
<!DOCTYPE html> … <p>Welcome to the <span its-term="yes">city</span> of Prague. …</html>
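Because the term annotation comes back as ordinary HTML markup (the ITS 2.0 `its-term="yes"` attribute), a consumer can read it with any HTML parser. The following is a minimal sketch using Python's standard `html.parser`; the class name `TermCollector` is hypothetical, and the input is a shortened version of the output shown above.

```python
from html.parser import HTMLParser


class TermCollector(HTMLParser):
    """Collect the text of every <span its-term="yes"> element."""

    def __init__(self):
        super().__init__()
        self.terms = []
        self._open = []  # one flag per open <span>: True if it marks a term
        self._text = []  # text gathered for the outermost open term span

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            is_term = dict(attrs).get("its-term") == "yes"
            if is_term and not any(self._open):
                self._text = []
            self._open.append(is_term)

    def handle_data(self, data):
        if any(self._open):
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._open:
            was_term = self._open.pop()
            if was_term and not any(self._open):
                self.terms.append("".join(self._text).strip())


collector = TermCollector()
collector.feed('<p>Welcome to the <span its-term="yes">city</span> of Prague.</p>')
print(collector.terms)
```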
TRANSLATING XLIFF CONTENT WITH E-TRANSLATION
Input (XLIFF):
...<trans-unit> <source>This is a car</source> </trans-unit> ...
Call of e-Translation
Output (NIF):
<http://freme-project.eu/#char=0,13> nif:isString "This is a car"@en ; itsrdf:target "Dies ist ein Auto"@de .
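Before a translation call, the source segments have to be pulled out of the XLIFF file. The sketch below shows one way to do that with Python's standard `xml.etree.ElementTree`. It assumes XLIFF 1.2 with its standard namespace; the sample document and the helper name `extract_sources` are made up for illustration, and the sketch does not call any FREME service.

```python
import xml.etree.ElementTree as ET

# XLIFF 1.2 namespace; the slide fragment omits it, so its use here is an assumption.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical minimal document wrapped around the trans-unit from the slide.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo.txt" source-language="en" target-language="de" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>This is a car</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def extract_sources(xliff_text):
    """Return (trans-unit id, source text) pairs for every trans-unit in the document."""
    root = ET.fromstring(xliff_text)
    pairs = []
    for unit in root.iterfind(".//x:trans-unit", XLIFF_NS):
        source = unit.find("x:source", XLIFF_NS)
        if source is not None and source.text:
            pairs.append((unit.get("id"), source.text))
    return pairs


segments = extract_sources(SAMPLE)
print(segments)
```

Each extracted string has the same length as the `#char=0,13` range in the NIF output, so the segment and its NIF context line up one to one.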
IMPROVING E-TRANSLATION OUTPUT VIA E-TERMINOLOGY
“The EU in brief. The EU is a unique economic and political partnership between 28 European countries that together cover much of the continent.”
continent, partnership, briefing, economics, covering
Call of e-Terminology: detection of translation suggestions
De voorschriften in DE EU. De EU is een uniek partnerschap tussen politiek en economie in de Europese landen, die gezamenlijk 28 verpakking van het continent.
Call of e-Translation: improved output!
OVERVIEW
MOTIVATION
with entity recognition and term disambiguation
datasets
discoverable
new channels
TRANSLATOR SUPPORT
translation suggestions
look-up
Recognition
and visual contextual properties: descriptions, images, links to other resources…
CUSTOMER VALUE-ADD
formed between new content and existing knowledge resources
private Multilingual Linked Data Cloud
DBpedia
Proprietary dataset Translated Content
BUSINESS BENEFITS
employing the most appropriate terminology
sociable and discoverable content
can be validated by a human and saved with the content
looking for service differentiators and value add
CHALLENGE AND OPPORTUNITY: BIG DATA IS GROWING ACROSS LANGUAGES, SECTORS AND DOMAINS
Agriculture metadata, user content, news content, …
WHAT LIES AHEAD FOR SEVERAL INDUSTRIES? SEE THE FREME BUSINESS CASES
EN ES JA, ZH, ... AR
DIGITAL PUBLISHING
With a simple click you can fetch extra information from a dataset and use it to annotate content.
AGRICULTURE AND FOOD DATA
Domain experts can automatically extract terms from title, description, abstracts and full text.
PERSONALISATION OF WEB CONTENT
Businesses can identify the topics their customers are engaging with, focusing their global content strategy.
CONTACTS
E-mail: info@freme-project.eu felix.sasaki@dfki.de tatjana.gornostaja@tilde.com
CONSORTIUM