Summarization and Annotation of Meetings
Wessel Kraaij, Martijn Spitters, Hap Kolb


SLIDE 1

Summarization and Annotation of Meetings

Wessel Kraaij, Martijn Spitters, Hap Kolb

SLIDE 2

Objectives

  • Choose plausible interaction types of a user with the archive generated by M4
  • Exploratory effort to define an annotation scheme for meetings data (e.g. M4 or parliament)

  • Evaluation of annotation tools
SLIDE 3

User access to a meeting archive

  • View summary
    – Summary of a missed meeting
      • Indexed segments, high-level annotation, textual summary?
  • Browse
    – Exploratory search, zooming in/out
      • Navigation structure
  • Search
    – Select meeting segments
      • Indexed segments
SLIDE 4

Issues

  • Segmentation of recordings
    – At which granularity?
    – Overlapping?
    – Segment clustering (nested?)
    – Channel based?
  • Feature detection
    – Which low-level features?
    – Higher-level features (spanning multiple segments)
  • Indexing: use a relational DB or an XML DB?
SLIDE 5

* Summarizing different media types *

SLIDE 6

Goal of summarization

  • Preserve the “most important information” in a document
  • Make use of redundancy in text
  • Maximize information density

For a document D and its summary S, where i(·) measures information content:

Compression Ratio = |S| / |D|
Retention Ratio = i(S) / i(D)

Goal: i(S) / i(D) > |S| / |D|
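The ratios above can be made concrete with a toy sketch. Approximating i(·) by the number of distinct content words is an assumption made here purely for illustration:

```python
# Toy illustration of the compression and retention ratios from the slide.
# i(.) is approximated as the number of distinct content words; any real
# information measure could be substituted (this choice is an assumption).

STOPWORDS = {"the", "a", "an", "of", "in", "is", "and", "to"}

def content_words(text):
    """Distinct non-stopword tokens, a crude proxy for information content."""
    return {w for w in text.lower().split() if w not in STOPWORDS}

def compression_ratio(summary, document):
    return len(summary) / len(document)          # |S| / |D| (character lengths)

def retention_ratio(summary, document):
    return len(content_words(summary)) / len(content_words(document))  # i(S) / i(D)

document = "the meeting opened with a vote and closed with a decision to adjourn"
summary = "vote and decision to adjourn"

# A good summary keeps retention above compression: i(S)/i(D) > |S|/|D|
print(retention_ratio(summary, document) > compression_ratio(summary, document))  # True
```
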

SLIDE 7

Summarization architecture

  • What do human summarizers do?
    – A: Start from scratch: analyze, transform, synthesize (top-down)
    – B: Select material and revise: “cut and paste” summarization, bottom-up (Jing & McKeown, 1999)
  • Automatic systems:
    – Extraction: selection of material
    – Revision, in increasing order of complexity: reduction, combination, syntactic transformation, paraphrasing, generalization, sentence reordering

SLIDE 8

Document Summarization: Extracts vs Abstracts

  • Sentence extracts: robust but poor coherence
    – Determine salience based on sentence position, cue phrases, number of content terms, sentence length, etc.
  • Abstracts:
    – Polished extracts
    – Or domain-dependent generation from templates
    – Can apply generalisation
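The salience features listed above can be combined in a simple extractive scorer. The weights and the cue-phrase list below are illustrative assumptions, not values from the talk:

```python
# Sketch of extractive sentence scoring using the surface features named on
# the slide: sentence position, cue phrases, content-term count, length.
# Weights, stopwords, and cue phrases are illustrative assumptions.

CUE_PHRASES = ("in conclusion", "to summarize", "we decided", "the main point")
STOPWORDS = {"the", "a", "an", "of", "in", "is", "and", "to", "we"}

def salience(sentence, position, n_sentences):
    words = sentence.lower().split()
    position_score = 1.0 - position / n_sentences        # earlier sentences score higher
    cue_score = any(c in sentence.lower() for c in CUE_PHRASES)
    content_terms = sum(1 for w in words if w not in STOPWORDS)
    length_penalty = 0.0 if 5 <= len(words) <= 30 else 0.5
    return position_score + 2.0 * cue_score + 0.1 * content_terms - length_penalty

def extract(sentences, k=1):
    """Pick the k most salient sentences, preserving original order."""
    scored = sorted(
        ((salience(s, i, len(sentences)), i) for i, s in enumerate(sentences)),
        reverse=True,
    )
    chosen = sorted(i for _, i in scored[:k])
    return [sentences[i] for i in chosen]

sents = [
    "The chair opened the meeting at nine.",
    "There was a long tangent about the coffee machine.",
    "In conclusion, we decided to ship the annotation tool next month.",
]
print(extract(sents, k=1))
```
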

SLIDE 9

Required knowledge

[Flattened diagram: knowledge levels required per task. Levels: lexical (bag of words, bigrams/trigrams), local context, discourse (referential links, structure), global world knowledge. Tasks, roughly in order of increasing knowledge needed: ad hoc IR, QA, sentence selection, sentence reduction, sentence combination, syntactic transformation, lexical paraphrase, generalization/specification, sentence reordering (dependency structure).]

SLIDE 10

Dialogue Summarization

(CMU, Klaus Zechner)

  • Speech disfluency removal
  • Identification and insertion of sentence boundaries
  • Identification and linking of question-answer regions
  • Topical segmentation (TextTiling; Hearst)
  • Information condensation
  • Genres: Call-me, group meetings (CMU), dialogue-oriented television shows (e.g. Crossfire)
  • Transcript based, no prosody

Much more difficult for multi-person dialogue?
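Topical segmentation in the TextTiling spirit can be sketched as placing a boundary wherever the lexical cosine similarity between adjacent windows of utterances dips; the window size and threshold below are illustrative assumptions:

```python
# TextTiling-style topical segmentation sketch (after Hearst): compare word
# overlap of adjacent windows of utterances and place a boundary where the
# cosine similarity drops below a threshold. Window and threshold are
# illustrative, not the parameters of the original algorithm.
from collections import Counter
from math import sqrt

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def boundaries(utterances, window=2, threshold=0.1):
    """Return indices i such that a topic boundary falls before utterance i."""
    bounds = []
    for i in range(window, len(utterances) - window + 1):
        left = Counter(w for u in utterances[i - window:i] for w in u.lower().split())
        right = Counter(w for u in utterances[i:i + window] for w in u.lower().split())
        if cosine(left, right) < threshold:
            bounds.append(i)
    return bounds

talk = [
    "the budget needs approval",
    "approval of the budget was postponed",
    "lunch will be pizza today",
    "pizza orders close at noon",
]
print(boundaries(talk))  # [2]: topic shifts from budget to lunch
```
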

SLIDE 11

Video Summarization

  • Video
    – Sequence of logical story units (scenes)
      • Sequence of shots
        – Sequence of frames
  • Storyboard summary
    – Presentation of one frame per shot (static shots)
    – More frames per shot (lots of movement)
  • Summary frames can be grouped into logical story units, e.g. by TextTiling based on transcripts
  • Video (or audio) summaries are extracts!
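The storyboard rule above (one frame for static shots, several for shots with lots of movement) can be sketched as follows; the motion score and its threshold are illustrative assumptions:

```python
# Storyboard keyframe selection sketch: the middle frame for static shots,
# several evenly spaced frames when there is lots of movement. Shots are
# (start_frame, end_frame, motion) tuples; the motion score in [0, 1] and
# its threshold are invented for illustration.

def keyframes(shots, motion_threshold=0.5, busy_count=3):
    frames = []
    for start, end, motion in shots:
        if motion < motion_threshold:
            frames.append((start + end) // 2)            # static: middle frame
        else:
            step = max(1, (end - start) // busy_count)   # busy: evenly spaced frames
            frames.extend(range(start, end, step)[:busy_count])
    return frames

shots = [(0, 100, 0.1), (100, 400, 0.9)]
print(keyframes(shots))  # [50, 100, 200, 300]
```
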
SLIDE 12

Evaluation

  • The big problem of evaluating generic summaries is that there is no single gold-standard summary.
  • Possible way out (DUC 2003): use scenarios
    – Task-based summary, e.g. triggered by a certain question
    – Viewpoint summary: e.g. a financial perspective on a cluster of docs about an earthquake

SLIDE 13

Meeting Summaries

  • Goal: a short textual summary, plus thumbnail images linking to key fragments of the meeting
    – Textual summary:
      • Sentence extraction is probably not suitable.
      • Alternative: generation from domain-specific templates, in combination with topic spotting and information extraction
    – Storyboard summary:
      • Navigation points: opening, summary, decision, vote, etc.
      • Lively discussion, jokes, etc.
SLIDE 14

Example viewpoints

  • Result oriented: e.g. decisions, actions assigned, (votes)
  • Focus on issues that triggered a lot of discussion
  • Focus on a certain participant
  • Etc.
SLIDE 15

Generation of an AV viewpoint summary

  • (Automatically) annotate a recorded meeting: XML file
  • Apply a viewpoint-based XSLT transformation
  • Generate a SMIL file
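A minimal sketch of this pipeline, with invented element and attribute names; the viewpoint selection is done here with Python's ElementTree rather than a real XSLT stylesheet, to keep the example stdlib-only:

```python
# The slide's pipeline applies an XSLT transformation to an annotated-meeting
# XML file to produce SMIL. As a stdlib-only sketch, the viewpoint selection
# step is done with ElementTree instead of XSLT; all element and attribute
# names below are invented for illustration, not an M4 schema.
import xml.etree.ElementTree as ET

ANNOTATION = """
<meeting>
  <segment type="opening" src="meeting.mpg" begin="0s" end="30s"/>
  <segment type="discussion" src="meeting.mpg" begin="30s" end="600s"/>
  <segment type="decision" src="meeting.mpg" begin="600s" end="640s"/>
</meeting>
"""

def viewpoint_smil(annotation_xml, viewpoint_types):
    """Keep only segments matching the viewpoint and emit a minimal SMIL body."""
    meeting = ET.fromstring(annotation_xml)
    smil = ET.Element("smil")
    body = ET.SubElement(smil, "body")
    seq = ET.SubElement(body, "seq")
    for seg in meeting.findall("segment"):
        if seg.get("type") in viewpoint_types:
            ET.SubElement(seq, "video", src=seg.get("src"),
                          clipBegin=seg.get("begin"), clipEnd=seg.get("end"))
    return ET.tostring(smil, encoding="unicode")

# A "result oriented" viewpoint: play only the decision segments.
print(viewpoint_smil(ANNOTATION, {"decision"}))
```
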
SLIDE 16

Dummy browse interface

SLIDE 17

* Annotating meetings *

SLIDE 18

Function of annotation

  • Features and higher-level elements help to segment data at various granularity levels
  • Annotated elements can be indexed
    – Supports focused summaries
    – Supports retrieval
  • Hierarchical relations between elements enable browsing

SLIDE 19

Annotation structure

  • There are many possible levels of annotation
  • Constrained by
    – Which features are realistically detectable
    – What kind of annotation is necessary and interesting from the application perspective (summary, browsing, searching)

SLIDE 20

Levels of annotation

  • Features: directly observed from the visual or audio signal
    – Speech segments, prosody, laughter, silence, movement (nodding, taking notes, pointing …), shot change
  • Interpretation layers: inferred bottom-up (e.g. by chunking using local or global context)
    – Low-level elements: speech transcripts, utterances, dialogue acts, speaker turns, sentences
    – High-level elements: meeting structure (topics and agenda management, dynamics, mood), interaction types (e.g. monologue, dialogue, discussion, etc.)
    – (Hierarchical)
  • Relations between different elements, and possibly between different levels

SLIDE 21

MEETING: top-down view

  • A MEETING HAS_A set of STRUCTURE BLOCKs; each block HAS_A ID and HAS_A TYPE
  • A STRUCTURE BLOCK HAS_A set of INTERACTIONs; each interaction HAS_A ID and HAS_A TYPE
  • Interaction types:
    – Monologue
    – Dialogue
    – Supervised (chaired)
    – Unsupervised

SLIDE 22

INTERACTION TYPE RECOGNITION

  1. MONOLOGUE: Chairman, Speaker, Interrupter, Audience/others
  2. (CHAIRED) DISCUSSION: Chairman, Speaker 1, Speaker 2, Audience/others

A TURN HAS_A ID, HAS_A SPEAKER NAME/ID, and HAS_A set of TURN ELEMENTs; each TURN ELEMENT HAS_A ID, HAS_A FUNCTION, HAS_A SUBFUNCTION, and HAS_A SUBJECT.
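The meeting-structure and turn diagrams above suggest an annotation structure that could be serialized roughly as follows; all element names, attribute names, and values are illustrative assumptions, not a fixed M4 schema:

```xml
<meeting id="m1">
  <structure-block id="b1" type="agenda-item">
    <interaction id="i1" type="monologue">
      <turn id="t1" speaker="chairman">
        <turn-element id="e1" function="open" subfunction="greeting" subject="agenda"/>
      </turn>
    </interaction>
    <interaction id="i2" type="chaired-discussion">
      <turn id="t2" speaker="speaker-1">
        <turn-element id="e2" function="propose" subfunction="motion" subject="budget"/>
      </turn>
    </interaction>
  </structure-block>
</meeting>
```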

SLIDE 23

Annotation standard for M4

  • Reuse and extend existing efforts as much as possible, e.g.
    – DAMSL for dialogue-act markup
    – The gesture markup scheme from ANVIL
    – ICSI MR for speech transcripts?
  • Separate tracks for different speakers
  • A global track for features like structure, mood, etc.

SLIDE 24

Annotation tools (1): IBM VideoAnnex

  • Shot based (contains a shot-segmentation module)
  • MPEG-1 input, MPEG-7 annotation
  • The annotation scheme can be changed while editing
  • No relations between elements
  • Image oriented
  • No visual display of annotations
SLIDE 25

Annotation Tools (2): ANVIL (DFKI)

  • No automatic shot segmentation
  • The timeline gives a visual overview of annotations
  • Extendible
    – No on-line editing of the annotation DTD
    – Screen-refresh problems (Windows 2000)
    – No complex relations between elements
    – JMF based: limited number of codecs supported

SLIDE 26

Example viewpoint summary

  • ANVIL at work
  • Query: who made the best joke?
    – Darren?
    – Ian?
    – Daniel?
    – Steve?
    – Pierre?