Linking Localisation and Language Resources Dominic Jones, David - - PowerPoint PPT Presentation



SLIDE 1

Linking Localisation and Language Resources

Dominic Jones, David Lewis, Alexander O’Connor, Leroy Finn

Post-doctoral Researcher, Knowledge & Data Engineering Group, Centre for Next Generation Localisation (CNGL), Trinity College Dublin

SLIDE 2

Creators, consumers & the Web

Addressing the challenges of localising dynamic, user-generated content.

SLIDE 3

Linked Localisation - Interoperability

  • Translation Memory (TM)
  • Access Control (LD vs. LOD)
  • Term Bases (TB)
  • Provenance
  • Machine Translation (MT)
  • Training + retraining
  • Content
  • Language Service Provider (LSP)
  • Computer-Aided Translation (CAT)

SLIDE 4

Linked Localisation – Architecture

  • Machine Translation
  • Translation Memory & Term Bases
  • Post Editing
  • Human Translation & Segmentation
  • Provenance
  • Visualizer
  • CNGL Service Model
  • CNGL Content Model
  • XLIFF

Translation between Creator, LSP and Translator …

SLIDE 5

Motivation:

Vs.

CMS-LION emphasises “user-generated” content, providing:

  • The history of a piece of translation.
  • Revision and version control on the translation process.
  • Quality Assurance (QA) data on translated content.
  • Consensus on the translation process.
  • Synchronisation between all agents involved in the translation process.

http://www.tcd.ie/Library/bookofkells/ http://www.apple.com/ipad/

SLIDE 6

Worked Example:

Job1: “This is sentence 1. This is sentence 2. This is sentence 3.”

Diagram labels:

  • CNGL ContentModel (classOf)
  • hasJobID
  • hasContentType: plainText
  • hasCategory: Tweet
  • hasSourceLang: English
  • hasTargetLang: German
  • Job1_hasSentence1, Job1_hasSentence2, Job1_hasSentence3
  • CNGL ServiceModel (classOf)
  • Human Translation
  • TranslatedUsing: Machine Translation
  • TranslatedBy: “Dominic”
  • Translations shown: “Dies ist Satz 1.” and “Das ist Satz 1.”
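
The labels in the worked example can be read as subject/property/value triples. The following is a minimal sketch in plain Python, not the CNGL implementation: the property and value names are taken from the slide, but which subject each property attaches to, and which German sentence came from which service, are assumptions made for illustration. The helpers `build_index` and `objects` are hypothetical.

```python
from collections import defaultdict

# Labels come from the slide; the subject each property attaches to is
# an assumption made for illustration.
TRIPLES = [
    ("Job1", "hasContentType", "plainText"),
    ("Job1", "hasCategory", "Tweet"),
    ("Job1", "hasSourceLang", "English"),
    ("Job1", "hasTargetLang", "German"),
    ("Job1", "Job1_hasSentence1", "This is sentence 1."),
    ("Job1", "Job1_hasSentence2", "This is sentence 2."),
    ("Job1", "Job1_hasSentence3", "This is sentence 3."),
    # Assumed pairing of each translation with the service or person behind it.
    ("Das ist Satz 1.", "TranslatedUsing", "Machine Translation"),
    ("Dies ist Satz 1.", "TranslatedBy", "Dominic"),
]


def build_index(triples):
    """Group object values by (subject, property) for direct lookup."""
    index = defaultdict(list)
    for subject, prop, value in triples:
        index[(subject, prop)].append(value)
    return index


def objects(index, subject, prop):
    """Return all values recorded for a subject/property pair (may be empty)."""
    return index.get((subject, prop), [])


INDEX = build_index(TRIPLES)

if __name__ == "__main__":
    print(objects(INDEX, "Job1", "hasTargetLang"))
    print(objects(INDEX, "Dies ist Satz 1.", "TranslatedBy"))
```

Keeping content-model facts (language, category, sentences) and service-model facts (who or what translated) in one triple list mirrors the slide's idea of linking the two models through shared resources.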

SLIDE 7

Provenance… The history of an artifact.

Open Provenance Vocabulary

  • Lightweight version of Open Provenance Model
  • http://open-biomed.sourceforge.net/opmv/ns.html

CNGL Content Model CNGL Service Model
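
To make "the history of an artifact" concrete, here is a small sketch that walks a derivation chain. It assumes OPMV-style property names (`opmv:wasDerivedFrom`, `opmv:wasGeneratedBy`, `opmv:wasPerformedBy`); every `ex:` identifier and the `history` helper are hypothetical and do not come from the slides.

```python
# Hypothetical provenance records in an OPMV-like vocabulary.
# All ex: identifiers are invented for illustration.
PROVENANCE = [
    ("ex:target_v1", "opmv:wasDerivedFrom", "ex:source"),
    ("ex:target_v1", "opmv:wasGeneratedBy", "ex:mt_run"),
    ("ex:target_v2", "opmv:wasDerivedFrom", "ex:target_v1"),
    ("ex:target_v2", "opmv:wasGeneratedBy", "ex:post_edit"),
    ("ex:post_edit", "opmv:wasPerformedBy", "ex:translator"),
]


def history(artifact, triples):
    """Follow wasDerivedFrom links back to the original artifact.

    Assumes each artifact is derived from at most one earlier artifact.
    """
    derived_from = {s: o for s, p, o in triples if p == "opmv:wasDerivedFrom"}
    chain = [artifact]
    seen = {artifact}
    while chain[-1] in derived_from:
        previous = derived_from[chain[-1]]
        if previous in seen:  # guard against accidental cycles
            break
        chain.append(previous)
        seen.add(previous)
    return chain


if __name__ == "__main__":
    print(history("ex:target_v2", PROVENANCE))
```

Under these assumptions, the post-edited translation traces back through the machine-translated version to the source text, which is the kind of revision history the Motivation slide asks for.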

SLIDE 8

Provenance: (1)

CNGL Service Model CNGL Content Model

SLIDE 9

Provenance: (2)

CNGL Service Model CNGL Content Model

SLIDE 10

Future Plans…

  • ITS 2.0 Test-bed
  • Workflow Orchestration
  • Post-Editing
  • MT Feedback
  • LD @ the LSP

http://www.panacea-lr.eu/
http://www.vistatec.com/
http://www.localisation.ie/solas/
http://www.w3.org/International/multilingualweb/lt/

SLIDE 11

Further Reading…

  • "A Semantic Model for Integrated Content Management, Localisation and Language Technology Processing", presented at the 2nd Workshop on the Multilingual Semantic Web, Bonn, Germany, 23rd October 2011.
  • "Linking Localisation and Language Resources", presented at the Linked Data in Linguistics Workshop, Frankfurt/Main, Germany, March 7-9, 2012.
  • "Linked Data for Language Resource Sharing in the Long Tail of the Localisation Market", presented at Language Resources and Evaluation (LREC), May 2012.

Diagram labels: ML (Semantic) Web, Lang Resources, L10n, MLW LOD

Dave.Lewis@scss.tcd.ie Dominic.Jones@scss.tcd.ie Leroy.Finn@scss.tcd.ie