Multilingual mark-up of text-audio synchronization at a word-by-word level, how HTML5 may assist in-browser solutions



SLIDE 1

MLW Rome 2013 G.Brelstaff & F.Chessa 1

Multilingual mark-up of text-audio synchronization at a word-by-word level, how HTML5 may assist in-browser solutions

Gavin Brelstaff (gjb@crs4.it), CRS4, Sardinia, Italy
Francesca Chessa, University of Sassari, Italy
Multilingual Web Workshop, Rome, March 2013

SLIDE 2

First, to the movies

SLIDE 3

Movie subtitles

SLIDE 4

HTML5 video

<video src="Cyrano.ogv">
  <track kind="subtitles" label="English" src="Cyrano_en.vtt" srclang="en" default/>
</video>

SLIDE 5

HTML5 audio

<audio>
  <source src="file-RU.ogg" type="audio/ogg"/>
  <source src="file-RU.mp3" type="audio/mpeg"/>
  <track kind="subtitles" label="English" src="file_en.srt" srclang="en" default/>
  Your browser does not support the audio element.
</audio>

Simply supply the VTT or SRT timed-text file and the browser does the rest for you, line by line.
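The HTML <track> element is specified around the WebVTT format, so an SRT file may need converting before a given browser will load it. The following is a minimal sketch of such a conversion, not the authors' tooling: the function name srtToVtt and the sample cue text are assumptions made for illustration, and only the timestamp separator and the file header are handled.

```javascript
// Convert SubRip (SRT) subtitle text to WebVTT text.
// Hypothetical helper: the name srtToVtt does not come from the slides.
function srtToVtt(srt) {
  const body = srt
    .replace(/\r\n?/g, "\n") // normalise line endings
    .trim()
    // SRT writes milliseconds after a comma; WebVTT uses a full stop.
    .replace(
      /^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})/gm,
      "$1.$2 --> $3.$4"
    );
  // A WebVTT file starts with the "WEBVTT" header and a blank line.
  return "WEBVTT\n\n" + body + "\n";
}

// Illustrative two-cue SRT input (sample text, not a real file).
const srt =
  "1\n00:00:22,900 --> 00:00:24,800\nAstonished was\n\n" +
  "2\n00:00:24,800 --> 00:00:25,500\nI:";
console.log(srtToVtt(srt));
```

The numeric SRT counters are left in place, since WebVTT accepts them as cue identifiers.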

SLIDE 6

Timed-text audio on the web:

http://commons.wikimedia.org/wiki/TimedText:GraziaDeledda.ogg.en.srt

SLIDE 7

Timed-text audio srt:

SLIDE 8

Speech to Text

digital spectrogram

Speech analysis credit: Carlo Schirru, Univ. Sassari

Aspetti fonetico-fonologici introduttivi all’analisi strumentale sull’intonazione del sardo (2006)

SLIDE 9

Demo + timed-text

SLIDE 10

Multilingual markup - recap

Web-based alignment and presentation of semantic equivalence [XHTML + CSS + jQuery]

A human marks up the equivalences between bilingual texts at three different levels: word, phrase, idea.

Colour-coded equivalence

SLIDE 11

HTML under the hood

<audio> ... </audio>
<span class="poem" lang="en"> ...
  <br unit="stanza" class="milestone"/>
  <span class="s">
    <span class="phr">
      <span class="w" n="ru:И_удивило" type="parap" start="22.9s" end="24.8s">Astonished was</span>
      <span class="w" n="ru:меня" start="24.8s" end="25.5s">I: </span>
    </span>
    <span class="phr">
      <br class="lb"/>
      <span class="w" n="ru:как" type="parap" start="25.5s" end="25.9s">by </span>
      <span class="w" n="ru:спокойны" start="25.9s" end="26.7s">the hush over </span>
      <span class="w" n="ru:воды" start="26.7s" end="27.9s">water </span>
    </span>
  </span>
  ...
</span>

Timed Text Markup Language (TTML) 1.0, W3C WD, 31 Jan 2013
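The start and end attributes on each word span are enough to decide which word should be highlighted at a given playback time. A minimal sketch, assuming the attributes have already been read out of the page into plain objects; the helper names parseSeconds and activeWord are illustrative and are not taken from the authors' code.

```javascript
// Parse a clock value written as seconds with an "s" suffix, e.g. "22.9s".
function parseSeconds(value) {
  const match = /^(\d+(?:\.\d+)?)s$/.exec(value);
  if (!match) throw new Error("Unsupported time value: " + value);
  return parseFloat(match[1]);
}

// Return the word whose [start, end) interval contains time t, or null.
function activeWord(words, t) {
  for (const w of words) {
    if (t >= parseSeconds(w.start) && t < parseSeconds(w.end)) return w;
  }
  return null;
}

// Attribute values copied from the word spans in the mark-up.
const words = [
  { n: "ru:И_удивило", start: "22.9s", end: "24.8s", text: "Astonished was" },
  { n: "ru:меня", start: "24.8s", end: "25.5s", text: "I: " },
  { n: "ru:как", start: "25.5s", end: "25.9s", text: "by " },
];

// At 25.0 seconds the second word is the active one.
console.log(activeWord(words, 25.0).text);
```

Using half-open intervals means a time that falls exactly on a shared boundary, such as 24.8 s, selects only the word that starts there.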

SLIDE 12

Archive format: XML TEI

<text><body>
<div type="poem" xml:lang="en"><p>…
  <s><phr>
    <milestone unit="stanza"/>
    <lb/>
    <milestone unit="cue" n="22.9s"/>
    <w n="ru:И_удивило" type="parap">Astonished was</w>
    <milestone unit="cue" n="24.8s"/>
    <w n="ru:меня">I:</w>
  </phr><phr>
    <lb/>
    <milestone unit="cue" n="25.5s"/>
    <w n="ru:как" type="parap">by</w>
    <milestone unit="cue" n="25.9s"/>
    <w n="ru:спокойны">the hush over</w>
    <milestone unit="cue" n="26.7s"/>
    <w n="ru:воды">water</w>
  </phr>
  <lb/>
  ...

Add one TEI milestone “anchor” per audio cue-point

Text Encoding Initiative P5, 2012
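One way to get from the archive format to the start/end attributes used in the browser is to let each cue milestone close the previous word and open the next. The sketch below assumes the TEI has already been parsed into an ordered list of plain objects, with one word following each cue; the function name milestonesToSpans and the object shapes are hypothetical.

```javascript
// Turn an ordered list of cue milestones ({ cue: seconds }) and words
// ({ w: text }) into word spans with start and end times in seconds.
// Assumes exactly one word follows each cue milestone.
function milestonesToSpans(items) {
  const spans = [];
  let lastCue = null;
  for (const item of items) {
    if (item.cue !== undefined) {
      // A cue closes the word before it (if still open) and opens the next.
      const prev = spans[spans.length - 1];
      if (prev && prev.end === null) prev.end = item.cue;
      lastCue = item.cue;
    } else {
      spans.push({ text: item.w, start: lastCue, end: null });
    }
  }
  return spans; // the final word keeps end === null until a later cue arrives
}

// Cue times and words taken from the TEI fragment.
const items = [
  { cue: 22.9 }, { w: "Astonished was" },
  { cue: 24.8 }, { w: "I:" },
  { cue: 25.5 }, { w: "by" },
];
console.log(milestonesToSpans(items));
```

The last word's end time is left open because it is only known once the following cue milestone has been read.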

SLIDE 13

HTML5 audio tag

HTML code

<!-- omitting the controls attribute keeps the player UI hidden -->
<audio id="audio">
  <source src="01-RU.ogg" type="audio/ogg">
  <source src="01-RU.mp3" type="audio/mpeg">
  Your browser does not support the audio element.
</audio>

JavaScript audio play

var myAudio = $('#audio');             // jQuery selector
myAudio.get(0).currentTime = 15.5;     // secs
myAudio.get(0).play();                 // start HTML5 audio

JavaScript text sync (scary stuff instead). No <track> subtitles here.

setTimeout(function () { switch_on(... ); }, start_ms);   // times in
setTimeout(function () { switch_off(... ); }, end_ms);    // milliseconds

See also: westonruter-html5-audio-read-along on GitHub
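The setTimeout-based switching can be sketched as a small scheduler that converts each word's start and end times into delays relative to the current playback position. This is an illustration under stated assumptions rather than the authors' implementation: scheduleHighlights, switchOn, switchOff and the injectable schedule parameter are all hypothetical names, and times are taken to be in seconds.

```javascript
// Schedule highlight on/off callbacks for each word, relative to the
// current playback position (seconds). `schedule` defaults to setTimeout
// but can be swapped out, e.g. to record delays without waiting.
function scheduleHighlights(words, currentTime, switchOn, switchOff, schedule = setTimeout) {
  const handles = [];
  for (const w of words) {
    if (w.end <= currentTime) continue; // word already spoken
    // setTimeout expects milliseconds; clamp so a word in progress starts now.
    const startMs = Math.max(0, Math.round((w.start - currentTime) * 1000));
    const endMs = Math.round((w.end - currentTime) * 1000);
    handles.push(schedule(() => switchOn(w), startMs));
    handles.push(schedule(() => switchOff(w), endMs));
  }
  return handles; // timer handles, so pending timers can be cancelled
}

// Record the delays instead of starting real timers.
const log = [];
const fakeSchedule = (fn, ms) => { log.push(ms); return log.length; };
scheduleHighlights([{ start: 16, end: 17.5 }], 15.5, () => {}, () => {}, fakeSchedule);
console.log(log);
```

In a page, the call would be made with the audio element's currentTime just after play() and the default setTimeout; passing each returned handle to clearTimeout on pause or seek avoids stale highlights.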

SLIDE 14

Cue-point mark-up tools?

www.nikse.dk/subtitleedit

SLIDE 15

Cue-point mark-up tools?

www.fon.hum.uva.nl/praat

SLIDE 16

Cue-point mark-up (visual interface)

Insert & nudge cue-points directly on the web-page while listening
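Inserting and nudging can be modelled as two small operations on a sorted list of cue times. The sketch below is an assumption about how such an editor might keep cue-points in order, not a description of the authors' interface; nudgeCue and insertCue are hypothetical names and times are in seconds.

```javascript
// Move cue point `index` by `delta` seconds, clamped so that it stays
// between its neighbours (and never below 0). Returns a new array.
function nudgeCue(cues, index, delta) {
  const lower = index > 0 ? cues[index - 1] : 0;
  const upper = index < cues.length - 1 ? cues[index + 1] : Infinity;
  const moved = Math.min(upper, Math.max(lower, cues[index] + delta));
  return cues.map((c, i) => (i === index ? moved : c));
}

// Insert a new cue point at time t, keeping the list sorted.
function insertCue(cues, t) {
  return [...cues, t].sort((a, b) => a - b);
}

const cues = [1, 2, 3];
console.log(nudgeCue(cues, 1, -0.5)); // middle cue moved half a second earlier
console.log(insertCue(cues, 2.5));    // new cue placed between 2 and 3
```

Clamping to the neighbouring cue-points keeps word intervals from overlapping, at the cost of not letting a cue jump past another one in a single nudge.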

SLIDE 17

Nel mezzo del cammin di nostra vita
mi ritrovai per una selva oscura
ché la diritta via era smarrita

Our aim: to activate poetic memory

Involve ear, tongue and eyes to reinforce memory and appreciation across the language divide.

SLIDE 18

"We preferred poems that make a powerful impact when they are heard aloud - not because they are theatrical, but because they dramatise experiences that surprise us into a new apprehension of ourselves and our capacity for imagining, thinking and marvelling."

Mr Gove said the project would ensure that more children would be captivated by great poetry and it would help "pass our cultural legacy on to the next generation".

SLIDE 19

Caesar’s Europe: poetic memory

http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3Atext%3A1999.02.0001%3Abook%3D6%3Achapter%3D14

The Druids … learn by heart a great number of verses; …. Nor do they regard it lawful to commit these to writing …

SLIDE 20

Poetic memory: internal, external

Internal: immediately available to society (in cache)
External: available on demand (in digital archive)

Appreciation, comprehension across the language divide?

Juliane Stiller, Marlies Olensky, MLW Dublin 2012

SLIDE 21

  • Information is not knowledge
  • knowledge is not wisdom
  • wisdom is not truth …

F.Zappa 1979

SLIDE 22

Poetic memory informs society

1562 Arthur Brooke: prose plot (information)
1597 William Shakespeare: poetic language (information plus)
Able to inform society
Lost on us

SLIDE 23

Back to the movies – an extreme social network

Learning by rote or learning by heart?

SLIDE 24

http://www.youtube.com/watch?v=ZriW3CPU9G4&list=PLGGjdQw3TIx9Dk0CYaHS9R6LyI_HmtG9i

SLIDE 25

That’s all folks:

Gavin Brelstaff (gjb@crs4.it), CRS4, 09010 Pula (CA), Sardinia, Italy
Francesca Chessa, University of Sassari, Italy