SLIDE 1

The ICSI corpus; Browsing meetings

NLSSD – natural language and speech system design

Steve Renals

s.renals@ed.ac.uk

NLSSD/Corpus and Browsing – p.1/14

SLIDE 2

Overview

  • The ICSI meetings corpus
  • Browsing meetings
  • Useful reading: Janin et al (2003; 2004), Kazman et al (1996), Tucker and Whittaker (2004)

Web: http://www.inf.ed.ac.uk/teaching/courses/nlssd/

SLIDE 3

The ICSI Meetings Corpus

  • Collected at the International Computer Science Institute, Berkeley, from 2000 to 2002
  • Natural (rather than scenario-driven) meetings, mainly of ICSI research groups
  • Audio only, recorded using close-talking head-mounted mics and 4 desktop PZM mics
  • Audio stored as 16 kHz, 16-bit linear
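As a back-of-the-envelope check on what this storage format implies, the sketch below computes the uncompressed size of the audio. The one-hour duration and the ten-channel example are illustrative assumptions, not figures taken from the corpus documentation.

```python
# Uncompressed size of 16 kHz, 16-bit linear PCM audio.
SAMPLE_RATE_HZ = 16_000   # sampling rate given on the slide
BYTES_PER_SAMPLE = 2      # 16-bit linear samples


def pcm_bytes(duration_s: float, channels: int = 1) -> int:
    """Return the raw size in bytes of uncompressed PCM audio."""
    return int(duration_s * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * channels)


# One hour of a single channel.
one_channel_hour = pcm_bytes(3600)

# Assumed example: 6 head-mounted mics plus the 4 desktop mics.
ten_channel_hour = pcm_bytes(3600, channels=10)

print(one_channel_hour / 1e6, "MB per channel-hour")
print(ten_channel_hour / 1e9, "GB for ten channels")
```

Under these assumptions a single channel needs about 115 MB per hour, so a multi-channel meeting quickly reaches the gigabyte range.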

SLIDE 4

Meeting types

  • Meetings of 3–12 people
  • 75 meetings total, including three main recurring series:
      • Bmr - the meeting recorder group - 29 meetings
      • Bro - robust ASR group - 23 meetings
      • Bed - the EDU group (NLP) - 15 meetings
  • Meetings were typically about 1 hour long
  • All meetings in English

SLIDE 5

Meeting participants

  • 53 unique speakers:
      • 40 male, 13 female
      • Most under 40
      • 28 native English speakers, 12 German, 5 Spanish, 8 others
  • Some ethical issues to consider in this research; the corpus is somewhat anonymised

SLIDE 6

Speech transcription

  • The entire corpus has been transcribed at the word level
  • Also includes word fragments, restarts, filled pauses, back-channels, non-lexical events (cough, laugh, etc.)
  • Overlap information is available through time stamps on each utterance

  • Each utterance is marked with the participant ID
  • Speech recognition transcriptions (29% word error rate) are also available, produced “fairly” by training on 3/4 of the corpus, testing on the remaining 1/4, and rotating 4 times
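The "train on 3/4, test on 1/4, rotate 4 times" procedure can be sketched as a four-way rotation in which every meeting is recognised by a system that never saw it in training. This is a minimal illustration only: the meeting IDs are hypothetical, and the round-robin assignment to folds is an assumption, since the slides do not say how the corpus was actually partitioned.

```python
def rotate_folds(meeting_ids, n_folds=4):
    """Yield (train, test) splits so each meeting is tested exactly once."""
    # Assumed partitioning: deal meetings round-robin into n_folds groups.
    folds = [meeting_ids[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [m for j, fold in enumerate(folds) if j != k for m in fold]
        yield train, test


# Hypothetical meeting identifiers, for illustration only.
meetings = [f"mtg{i:02d}" for i in range(8)]
splits = list(rotate_folds(meetings))

for train, test in splits:
    print("train on", len(train), "meetings; test on", test)
```

Concatenating the four test outputs gives a recognition transcript for the whole corpus without any meeting having been used to train the system that transcribed it.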

SLIDE 7

Dialogue Act (MRDA) Annotations

  • Dialogue act (DA) annotations: 11 general tags (eg “statement”) and 39 further modifiers (eg “joke”, “disagreement”)
  • Includes automatic time alignment of each word and segmentation into DA units (obtained using forced alignment with a switch

  • Adjacency pairs also marked (eg question-answer pairs)
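One way to picture these annotations is as a DA unit carrying one general tag plus zero or more modifiers, with adjacency pairs stored as links between units. The sketch below is a hypothetical in-memory representation: the class names, field names, and the "question" label are illustrative assumptions and do not reflect the actual MRDA file format or tag inventory.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DialogueAct:
    da_id: str
    speaker: str            # participant ID
    start_s: float          # start time from the time alignment
    end_s: float            # end time from the time alignment
    general_tag: str        # exactly one general tag, eg "statement"
    modifiers: List[str] = field(default_factory=list)  # eg "joke"
    text: str = ""


@dataclass
class AdjacencyPair:
    first: str    # da_id of the first part (eg a question)
    second: str   # da_id of the second part (eg its answer)


# Hypothetical example data.
das = {
    "da1": DialogueAct("da1", "spk_A", 12.0, 13.4, "question", [], "are we recording"),
    "da2": DialogueAct("da2", "spk_B", 13.6, 14.1, "statement", ["joke"], "i hope not"),
}
pairs = [AdjacencyPair("da1", "da2")]


def second_parts(da_id):
    """Return the DA units linked as second parts to the given unit."""
    return [das[p.second] for p in pairs if p.first == da_id]


print([d.text for d in second_parts("da1")])
```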

SLIDE 8

Other annotations

  • Topic segmentation (together with brief descriptions of each topic)
  • Summarization (human-written abstracts, together with links to the meeting extracts that support the abstract)

  • Annotations of “hot spots” and involvement

SLIDE 9

Browsing meetings

  • Tucker and Whittaker categorize meeting browsers as:
      • Audio browsers (with and without visual feedback)
      • Video browsers (also include audio, but video is the focus)
      • “Artefact” browsers (browsing based on other material, eg notes, slides, whiteboard)
      • Discourse browsers - based on derived elements - mainly transcripts, also speaker activity, involvement
  • Typically these browsers index into the audio or video, based on discourse or artefact information

SLIDE 10

Approaches to indexing

Kazman et al identified four indexing approaches:

  • Indexing by intended content (eg the agenda)
  • Indexing by actual content (eg ASR transcript)
  • Indexing by temporal structure (eg speaker turns)
  • Indexing by application record (eg artefacts such as notes, slides, PC interaction)
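As an illustration of indexing by temporal structure, the sketch below builds a per-speaker index of turn times and finds overlapping speech from the time stamps. The turn data and the function names are hypothetical; a real browser would read the turns from the corpus annotations.

```python
from collections import defaultdict

# Hypothetical speaker turns: (speaker, start_s, end_s).
turns = [
    ("spk_A", 0.0, 4.2),
    ("spk_B", 4.0, 9.5),
    ("spk_A", 9.5, 12.0),
]


def index_by_speaker(turns):
    """Map each speaker to the list of (start, end) spans they spoke in."""
    index = defaultdict(list)
    for speaker, start, end in turns:
        index[speaker].append((start, end))
    return dict(index)


def overlaps(turns):
    """Return pairs of turns by different speakers whose spans intersect."""
    found = []
    for i, (s1, a1, b1) in enumerate(turns):
        for s2, a2, b2 in turns[i + 1:]:
            # Two spans intersect when each starts before the other ends.
            if s1 != s2 and a1 < b2 and a2 < b1:
                found.append(((s1, a1, b1), (s2, a2, b2)))
    return found


print(index_by_speaker(turns))
print(overlaps(turns))
```

A browser could use the first index to jump the audio to a given participant's contributions, and the second to highlight stretches of overlapping speech.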

SLIDE 11

Browser focus

  • Indexes based on agenda, slide changes, participant activity

  • Search and retrieval based on ASR transcript
  • Structure based on topic segmentation and tracking
  • Preview based on summarization or keyword spotting
  • Archive filtering
  • Browsers for limited resources eg phones, PDAs
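Search and retrieval over a transcript can be sketched as an inverted index from words to the times at which they were spoken, so that a query returns positions to jump to in the recording. The word list below is hypothetical data; in practice it would come from the time-aligned ASR or manual transcript, and a real system would add stemming, ranking, and handling of recognition errors.

```python
from collections import defaultdict

# Hypothetical time-aligned words: (word, start_s).
words = [
    ("the", 1.0),
    ("agenda", 1.3),
    ("is", 1.8),
    ("long", 2.0),
    ("agenda", 65.2),
]


def build_index(words):
    """Map each lower-cased word to the start times of its occurrences."""
    index = defaultdict(list)
    for word, start in words:
        index[word.lower()].append(start)
    return index


def search(index, query):
    """Return the sorted start times at which the query word occurs."""
    return sorted(index.get(query.lower(), []))


index = build_index(words)
hits = search(index, "Agenda")
print(hits)
```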

SLIDE 12

Example browser: Ferret

SLIDE 13

Browser evaluation

  • How can browsers and browsing techniques be compared objectively?
  • Most browsers have no real evaluation; how do you know if what you did was useful?
  • Browser Evaluation Test (BET) - finding the maximum number of observations of interest in the minimum time
  • Observers make the observations of interest (expressed as two contrasting statements, one of which is true, eg: Jo thought that there were too many items on the agenda; Jo thought the agenda was about the right length)
  • Subjects browse the meeting to decide which observations are true
  • Note that this only applies to browsing a single meeting (in its current form)
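A BET-style session could be scored roughly as follows: each observation is a pair of contrasting statements, the subject picks one per pair, and the result combines how many picks were correct with how long the subject took. The class, the function, and the specific measures (accuracy and correct answers per minute) are assumptions made for illustration; the slides do not define the exact BET metrics.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    true_statement: str
    false_statement: str


def score(observations, chosen, minutes):
    """Score a subject's picks against the true statement of each pair."""
    correct = sum(
        1 for obs, pick in zip(observations, chosen)
        if pick == obs.true_statement
    )
    return {
        "correct": correct,
        "accuracy": correct / len(chosen) if chosen else 0.0,
        "correct_per_minute": correct / minutes,
    }


# Hypothetical session: two observations answered in four minutes.
observations = [
    Observation("Jo thought that there were too many items on the agenda",
                "Jo thought the agenda was about the right length"),
    Observation("The meeting ended early", "The meeting overran"),
]
chosen = [
    "Jo thought that there were too many items on the agenda",  # correct
    "The meeting overran",                                       # incorrect
]
result = score(observations, chosen, minutes=4.0)
print(result)
```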

SLIDE 14

Next two sessions

  • Tuesday 18 January: Component technologies and software
    Useful reading: Galley et al (2003), Zechner (2002), Wrede and Shriberg (2003)
  • Friday 21 January: NITE XML toolkit (Jean Carletta), and division into groups
    Useful reading: Carletta and Kilgour (2004)

All readings downloadable from http://www.inf.ed.ac.uk/teaching/courses/nlssd/readings.html
