SLIDE 1

Thesis presentation
Event Extraction from Text and Translation to Event Calculus
Geert Heyman (KULeuven)
June 2014

SLIDE 2

Outline

1. Problem Definition & Goals
2. Approach
   - Extracting event information
   - Translating event information
   - Specifying background knowledge
   - Making inferences about the story
3. Results
4. Conclusions & Future Work

SLIDES 4-8

Problem Definition

Understanding a story by:

1. Extracting event-related information.
2. Translating event-related information to logic, using a fixed set of primitives (predicates, types).
3. Specifying the meaning of the primitives in a logical theory (the background knowledge specification), using event calculus.
4. Making inferences about the story by combining the obtained translation with the background knowledge specification.

SLIDES 9-12

Goals

- Make the extraction and translation steps more general.
- Minimize the amount of story-specific knowledge:
  - Specify the story setting (cities, train lines, ...)
  - But don't specify the characters, their initial locations, ...
- Make useful inferences:
  - information implied by the text
  - implicatures in the text (what is suggested, but not implied)

SLIDES 14-18

1. Extracting event information

- By assigning labels to words or phrases.
  E.g., John enters[enter.01] the train.
- Combining two state-of-the-art extraction tools: TERENCE and a Semantic Role Labeller.
- Does not rely on a stereotypical order of events → generally applicable.

SLIDES 19-24

1. Extracting event information: TERENCE

Recognizes basic story elements and the relations between them:

- Story entities and references to story entities.
  E.g., John[entity mention → John] forgot his[entity mention → John] wallet.
- Story events and their temporal relations.
  E.g., John forgot[event1] his wallet, but an honest finder returned[event2] it. (event2 AFTER event1)

SLIDES 25-31

1. Extracting event information: Semantic Role Labeller

Annotates words/constituents with syntactic and semantic information:

- Syntactic information: a syntactic dependency tree, word lemmas.
- Semantic information:
  - Annotates predicates with their meaning.
    E.g., John enters[enter.01] the train.
  - Annotates the relationship that words/constituents have with a predicate, i.e. their semantic role.
    E.g., John[A0 of enter.01] enters the train[A1 of enter.01].
- The tool outputs distributions of possible labels for a token, instead of a single one.
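A toy illustration of that last point: the labeller's per-token output can be viewed as a probability distribution over labels. The data and the argmax consumer below are my own construction, not the tool's actual interface; keeping the full distributions (rather than the argmax) is what later allows generating ranked annotation hypotheses.

```python
# Toy sketch (illustrative data, not the actual SRL interface): for
# "John enters the train" and predicate enter.01, each token gets a
# probability distribution over semantic-role labels.
srl_output = {
    "John":  {"A0": 0.90, "A1": 0.10},
    "train": {"A1": 0.80, "A2": 0.20},
}

def best_labels(distributions):
    # A naive consumer keeps only the most probable label per token;
    # retaining the full distributions instead allows generating several
    # ranked annotation hypotheses downstream.
    return {token: max(probs, key=probs.get)
            for token, probs in distributions.items()}

print(best_labels(srl_output))  # {'John': 'A0', 'train': 'A1'}
```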

SLIDE 32

1. Extracting event information: TERENCE + Semantic Role Labeller

SLIDES 33-38

2. Translating event information

For every event annotated by TERENCE:

1. Translate the temporal relations associated with the event.
2. Generate annotation hypotheses.
3. Translate annotation hypotheses to translation hypotheses (logic sentences).
4. Filter translation hypotheses that are inconsistent with previously translated events (using logical inference).
5. Select one of the consistent hypotheses using a heuristic (if there is more than one translation hypothesis).
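The per-event loop can be sketched as below. All helpers are simplified stand-ins of my own (the real steps draw on the SRL distributions and call a logical-inference engine), and step 1, the temporal relations, is elided.

```python
# Hedged sketch of the per-event translation loop (steps 2-5 above);
# every helper here is a toy stand-in, not the thesis's actual component.

def generate_hypotheses(event):
    # Step 2: candidate annotations, assumed ordered by SRL probability.
    return event["candidates"]

def to_logic(pred, args):
    # Step 3: annotation hypothesis -> logic sentence (here a plain atom).
    return f"{pred}({', '.join(args)})"

def is_consistent(sentences):
    # Step 4: stand-in for the logical-inference check; in this toy a
    # sentence clashes only with its literal negation "Not_<sentence>".
    return not any(("Not_" + s) in sentences for s in sentences)

def translate_event(event, story_so_far):
    for pred, args in generate_hypotheses(event):
        candidate = to_logic(pred, args)
        if is_consistent(story_so_far + [candidate]):
            # Step 5: the "heuristic" here is simply first-consistent-wins.
            return candidate
    return None  # no consistent translation found

story = ["Not_Enter(John, car)"]
event = {"candidates": [("Enter", ["John", "car"]),
                        ("Enter", ["John", "train"])]}
print(translate_event(event, story))  # Enter(John, train)
```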

SLIDES 39-43

2. Translating event information: annotation hypotheses → translation hypotheses

Given: an annotation hypothesis enter.01[A0 = John, A1 = compartment]
Output: a logic sentence expressed in the vocabulary of the background knowledge specification

Procedure:

1. Map the semantic roles to FO(·)IDP types.
2. Instantiate each type to a constant using the entity name (using coreference relations, he → John).
3. Map the semantic predicate names to an FO(·)IDP predicate.
4. Instantiate the FO(·)IDP predicate with the constants from step 2.

The procedure uses the WordNet ontology (synonyms, hypernyms) to map semantic roles to types and semantic predicates to FO(·)IDP predicates.
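A minimal sketch of the four steps, with hypothetical lookup tables standing in for the WordNet-driven mappings. All names below (PRED_MAP, ROLE_TYPES, COREF, the chosen types) are assumptions for illustration, not the thesis's actual resources.

```python
# Step 3 resource: PropBank-style predicate -> FO(.)IDP predicate.
PRED_MAP = {"enter.01": "Enter"}
# Step 1 resource: semantic roles of a predicate -> FO(.)IDP types.
ROLE_TYPES = {"enter.01": {"A0": "Person", "A1": "Location"}}
# Step 2 resource: coreference relations from TERENCE (he -> John).
COREF = {"he": "John"}

def translate(predicate, roles):
    """Turn an annotation hypothesis such as
    enter.01[A0=John, A1=compartment] into an FO(.)IDP atom."""
    fo_predicate = PRED_MAP[predicate]                 # step 3
    constants = []
    for role in sorted(roles):                         # fixed argument order
        entity = COREF.get(roles[role], roles[role])   # step 2
        fo_type = ROLE_TYPES[predicate][role]          # step 1 (unused here,
        constants.append(entity)                       # but would type-check)
    return f"{fo_predicate}({', '.join(constants)})"   # step 4

print(translate("enter.01", {"A0": "he", "A1": "compartment"}))
# -> Enter(John, compartment)
```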

SLIDES 44-48

3. Specifying background knowledge

- Expressed in the FO(·)IDP language.
- Uses event calculus.
- Consists of: a vocabulary, a theory, and part of a structure.
- Not all events in the story are modelled in the background theory.
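The event calculus itself is not restated on the slide. For reference, a standard (Shanahan-style) formulation of its core inertia axioms, which the FO(·)IDP theory presumably encodes in some form, reads:

```latex
\begin{align*}
\mathit{HoldsAt}(f, t_2) \leftarrow{} & \mathit{Happens}(e, t_1) \land \mathit{Initiates}(e, f, t_1) \\
                                      & \land\; t_1 < t_2 \land \lnot \mathit{Clipped}(t_1, f, t_2) \\
\mathit{Clipped}(t_1, f, t_2) \leftrightarrow{} & \exists e, t \,[\, \mathit{Happens}(e, t) \land t_1 \le t < t_2 \\
                                      & \land\; \mathit{Terminates}(e, f, t) \,]
\end{align*}
```

That is, a fluent initiated by an event keeps holding until some later event terminates ("clips") it.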

SLIDES 49-54

4. Making inferences about the story

How? The translation module:

- writes the translation to a well-known file
- calls the IDP system to perform inference

Kinds of inferences:

- Generating story hypotheses: a time-line of the events mentioned in the story + events inferred by the system (implied, or implicatures).
- Evaluating FO(·)IDP statements.

SLIDE 55

4. Making inferences about the story: Story Hypotheses

SLIDE 57

Results

- TERENCE annotations
- Event annotation hypotheses
- Translation accuracy
- Story understanding

SLIDES 58-62

Translation Accuracy

- Only 7 out of 17 events corresponded to a primitive in the background knowledge specification.
- 6 out of 17 events were translated to logic; the one expressible event that was not translated had a wrong coreference relation.
- 5 out of 6 translations were correct; 1 translation was wrong due to a wrong coreference relation.

SLIDE 63

Story Understanding

Evaluated 40 statements provided by 8 different test subjects:

  Logical truth (true, false, maybe):   9.00%
  Common sense truth (true, false):    52.5%

Some statements couldn't be expressed in the background knowledge specification; common sense truth accuracy = 70.0% when these statements are not taken into account.
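The two common-sense figures are mutually consistent; the intermediate counts below are inferred from the percentages, not stated on the slide.

```python
# 52.5% of the 40 statements gives 21 correct; keeping those 21 correct
# while reaching 70.0% implies 30 evaluable statements, i.e. 10 of the
# 40 could not be expressed in the background knowledge specification.
total = 40
correct = round(0.525 * total)         # 21 statements judged correctly
evaluable = round(correct / 0.700)     # 30 statements were expressible
excluded = total - evaluable           # 10 statements were excluded
print(correct, evaluable, excluded)    # 21 30 10
```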

SLIDES 65-72

Conclusions & Future Work

Conclusions:

- The Information Extraction and Translation approach used is more general than a script-based approach.
- It can make useful inferences with reasonable accuracy.
- Story-specific knowledge is limited to the story setting.

Future work:

- Smarter generation of annotation hypotheses from the distributions, using the logical specification.
- Obtaining a more complete translation. E.g., "In Lier, he stands up..." → At(John, Lier, Tx)
- How well does it scale? Extending the background knowledge; larger stories.