Improving String Processing for Temporal Relations


slide-1
SLIDE 1

Improving String Processing for Temporal Relations

David Woods Tim Fernando

ADAPT Centre
Computational Linguistics Group
Trinity Centre for Computing and Language Studies
School of Computer Science and Statistics
Trinity College Dublin, Ireland

ISA-14, August 25th 2018

slide-4
SLIDE 4

Improving String Processing for Temporal Relations Introduction

Introduction

  • Q. What’s this talk about?
  • A. Representing temporal information compactly for reasoning and processing.

Example: “John slept through the fire alarm last Tuesday.”

{lt}{lt, js}{lt, js, fa}{lt, js}{lt}

slide-7
SLIDE 7

Improving String Processing for Temporal Relations Introduction

Motivation

  • Ordering events and times is crucial in natural language understanding.
  • Providing a useful way to both visualise a document’s temporal structure and perform inference within the same framework.
  • Assisted annotation for TimeML or similar schemas.

slide-10
SLIDE 10

Improving String Processing for Temporal Relations Introduction TimeML

ISO-TimeML

What is ISO-TimeML?

An ISO standard markup language for annotating temporal information in texts.

slide-11
SLIDE 11

Improving String Processing for Temporal Relations Introduction TimeML

Why TimeML?

  • Widely known, and an ISO standard.
  • The TLINK elements map well to Allen’s interval relations.
  • The TimeBank corpus (183 documents manually annotated with TimeML).

slide-12
SLIDE 12

Improving String Processing for Temporal Relations Introduction TimeML

TimeML Example

Example John taught 20 minutes every Monday.

Example from http://www.timeml.org/publications/timeMLdocs/timeml_1.2.1.html

slide-15
SLIDE 15

Improving String Processing for Temporal Relations Introduction TimeML

TLINKs Example

Example: John taught 20 minutes every Monday.

<TLINK timeID="t1" relatedToTime="t2" relType="IS_INCLUDED"/>
<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="DURING"/>

There are several ways to represent this information.

slide-16
SLIDE 16

Improving String Processing for Temporal Relations Introduction TimeML

TLINKs in a Directed Graph

Example John taught 20 minutes every Monday.

slide-17
SLIDE 17

Improving String Processing for Temporal Relations Introduction TimeML

TLINKs using T-BOX

Example John taught 20 minutes every Monday.

slide-18
SLIDE 18

Improving String Processing for Temporal Relations Introduction TimeML

TLINKs as Strings

Example: John taught 20 minutes every Monday.

{t2}{t1, t2}{ei1, t1, t2}{t1, t2}{t2}

slide-21
SLIDE 21

Improving String Processing for Temporal Relations Introduction Strings for Temporal Data

Sequences of Sets-as-Symbols

Strings of n sets α1α2 · · · αn are used to represent n moments of time.

Each set αi contains exactly those fluents (temporal propositions, treated as intervals) which are true at moment i, e.g. {a}{a, b}{b}.

Sets are drawn as boxes, so strings can be read like comic strips, snapshots of film, or timelines, e.g. | a | a, b | b |.
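A string of sets can be held in code as a tuple of frozensets, one per box. The sketch below is illustrative only; the helper names `make_string` and `show` are assumptions made for this example and do not come from the talk.

```python
def make_string(*boxes):
    """Build a string of sets; each argument lists one box's fluents, space-separated.

    An empty argument ("") yields an empty box.
    """
    return tuple(frozenset(box.split()) for box in boxes)


def show(s):
    """Render a string in the {a}{a, b}{b} notation."""
    return "".join("{" + ", ".join(sorted(box)) + "}" for box in s)


s = make_string("a", "a b", "b")
print(show(s))  # {a}{a, b}{b}
```

Frozensets are hashable, so whole strings can be stored in Python sets, which is convenient once operations start returning languages (sets of strings).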

slide-22
SLIDE 22

Improving String Processing for Temporal Relations Introduction Strings for Temporal Data

Comic Strips

Image from The National Archives UK

slide-25
SLIDE 25

Improving String Processing for Temporal Relations Introduction Strings for Temporal Data

An Inertial World

Note! A fluent occurring in multiple boxes does not imply a longer duration. For example, {a} is equivalent in interpretation to {a}{a}{a}, though the latter string is said to feature stutter.

We can remove stutter through a block-compression operation:

$$
\mathrm{bc}(s) := \begin{cases}
s & \text{if } \mathrm{length}(s) \le 1 \\
\mathrm{bc}(\alpha s') & \text{if } s = \alpha\alpha s' \\
\alpha\,\mathrm{bc}(\alpha' s') & \text{if } s = \alpha\alpha' s' \text{ with } \alpha \ne \alpha'
\end{cases}
$$
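Block compression can be sketched in a few lines. This is an iterative version of the recursive definition, assuming strings are tuples of frozensets; the helper `S` is a hypothetical shorthand introduced only for this example.

```python
def S(*boxes):
    """Hypothetical shorthand: S("a", "a b") is the string {a}{a, b}."""
    return tuple(frozenset(box.split()) for box in boxes)


def bc(s):
    """Block compression: collapse each run of identical adjacent boxes to one box."""
    out = []
    for box in s:
        if not out or out[-1] != box:
            out.append(box)
    return tuple(out)


print(bc(S("a", "a", "a")) == S("a"))  # True
```

Only adjacent repeats are removed: {a}{b}{a} is left unchanged, because the two {a} boxes are separated by a different box.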

slide-26
SLIDE 26

Improving String Processing for Temporal Relations Introduction Allen’s Interval Relations

Allen Relations

Allen treats intervals as primitive for events, not start/end points. Allen’s Interval Relations form the basis of TimeML’s TLINK relation types. These are easily transformable to strings.

slide-27
SLIDE 27

Improving String Processing for Temporal Relations Introduction Allen’s Interval Relations

Allen Relations as Strings

Allen’s interval relations as strings ({} is an empty box):

R    a R a′           S_R(a, a′)             R⁻¹   a R⁻¹ a′             S_R⁻¹(a, a′)
<    a before a′      {}{a}{}{a′}{}          >     a after a′           {}{a′}{}{a}{}
m    a meets a′       {}{a}{a′}{}            mi    a met by a′          {}{a′}{a}{}
o    a overlaps a′    {}{a}{a, a′}{a′}{}     oi    a overlapped by a′   {}{a′}{a, a′}{a}{}
d    a during a′      {}{a′}{a, a′}{a′}{}    di    a contains a′        {}{a}{a, a′}{a}{}
s    a starts a′      {}{a, a′}{a′}{}        si    a started by a′      {}{a, a′}{a}{}
f    a finishes a′    {}{a′}{a, a′}{}        fi    a finished by a′     {}{a}{a, a′}{}
=    a equals a′      {}{a, a′}{}

slide-28
SLIDE 28

Improving String Processing for Temporal Relations Introduction Allen’s Interval Relations

Extraction of ARs as Strings from TimeML

https://www.scss.tcd.ie/~dwoods/isa14/

slide-31
SLIDE 31

Improving String Processing for Temporal Relations Introduction Superposition

Combining Strings

Superposition allows us to condense the information from multiple strings into a more compact form.

The simplest version of the operation is just componentwise union of two strings of equal length:

$$
\alpha_1\alpha_2\cdots\alpha_n \mathbin{\&} \alpha'_1\alpha'_2\cdots\alpha'_n := (\alpha_1 \cup \alpha'_1)(\alpha_2 \cup \alpha'_2)\cdots(\alpha_n \cup \alpha'_n)
$$

Example (Basic Superposition): {a}{b}{c} & {a}{a}{d} = {a}{a, b}{c, d}
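Basic superposition is a one-line componentwise union. The sketch below assumes strings are tuples of frozensets; `S` and `superpose` are names chosen for this example only.

```python
def S(*boxes):
    """Hypothetical shorthand: S("a", "a b") is the string {a}{a, b}."""
    return tuple(frozenset(box.split()) for box in boxes)


def superpose(s, t):
    """Componentwise union of two strings of equal length."""
    if len(s) != len(t):
        raise ValueError("basic superposition needs strings of equal length")
    return tuple(a | b for a, b in zip(s, t))


result = superpose(S("a", "b", "c"), S("a", "a", "d"))
print(result == S("a", "a b", "c d"))  # True
```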

slide-33
SLIDE 33

Improving String Processing for Temporal Relations Introduction Superposition

Asynchronous Superposition

We handle strings of unequal length by introducing stutter using an inverse block compression, such that the operands are padded to the point that basic superposition is possible.

Noting that every string in bc⁻¹bc(s) is equivalent to s under compression, we define asynchronous superposition:

$$
s \mathbin{\&_{*}} s' := \{\, \mathrm{bc}(s'') \mid s'' \in \mathrm{bc}^{-1}\mathrm{bc}(s) \mathbin{\&} \mathrm{bc}^{-1}\mathrm{bc}(s') \,\}
$$

slide-34
SLIDE 34

Improving String Processing for Temporal Relations Introduction Superposition

Asynchronous Superposition

Example (Asynchronous Superposition):

{x}{z} &∗ {x}{y}{z} = {
  {x}{x, y}{z},
  {x}{x, y}{x, z}{z},
  {x}{y, z}{z},
  {x}{x, y}{y, z}{z},
  {x}{x, z}{y, z}{z}
}
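Asynchronous superposition can be sketched without enumerating paddings explicitly. Every pair of equal-length paddings corresponds to a monotone walk over the grid of box pairs (advance in one string, the other, or both), and repeating a pair only adds stutter that block compression removes. This is a minimal illustration under that observation, assuming non-empty strings; it is not the authors' implementation, and the names `S`, `bc`, and `async_superpose` are chosen for this example.

```python
from functools import lru_cache


def S(*boxes):
    """Hypothetical shorthand: S("a", "a b") is the string {a}{a, b}."""
    return tuple(frozenset(box.split()) for box in boxes)


def bc(s):
    """Block compression: drop adjacent duplicate boxes."""
    out = []
    for box in s:
        if not out or out[-1] != box:
            out.append(box)
    return tuple(out)


def async_superpose(s, t):
    """Return the set of block-compressed superpositions of all paddings of s and t."""
    s, t = bc(s), bc(t)

    @lru_cache(maxsize=None)
    def tails(i, j):
        # All compressed strings obtainable from box pair (i, j) to the end.
        box = s[i] | t[j]
        if i == len(s) - 1 and j == len(t) - 1:
            return frozenset({(box,)})
        found = set()
        for di, dj in ((1, 0), (0, 1), (1, 1)):
            ni, nj = i + di, j + dj
            if ni < len(s) and nj < len(t):
                found.update(bc((box,) + tail) for tail in tails(ni, nj))
        return frozenset(found)

    return set(tails(0, 0))


print(len(async_superpose(S("x", "z"), S("x", "y", "z"))))  # 5
```

The five strings returned are the ones listed in the example above.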

slide-37
SLIDE 37

Improving String Processing for Temporal Relations Introduction Superposition

Issues

  • Superposition doesn’t account for coreferent fluents.
  • This causes overgeneration of strings.
  • Each string produced must be examined and thrown out if invalid.

slide-40
SLIDE 40

Improving String Processing for Temporal Relations Introduction Superposition

Issues

For example, “x before y” ({}{x}{}{y}{}) superposed with “y before z” ({}{y}{}{z}{}) generates 270 strings, only one of which is correct: {}{x}{}{y}{}{z}{}. The remaining 269 do not retain the original information that both x is before y and y is before z.

This is costly! We can do better.
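The overgeneration can be reproduced with a small sketch. It assumes the two input strings are bounded by empty boxes, {}{x}{}{y}{} and {}{y}{}{z}{}, and uses a grid-walk formulation of asynchronous superposition; the function names are illustrative, not the authors' code.

```python
from functools import lru_cache


def S(*boxes):
    """Hypothetical shorthand: S("", "x") is the string {}{x}."""
    return tuple(frozenset(box.split()) for box in boxes)


def bc(s):
    """Block compression: drop adjacent duplicate boxes."""
    out = []
    for box in s:
        if not out or out[-1] != box:
            out.append(box)
    return tuple(out)


def async_superpose(s, t):
    """Set of block-compressed superpositions of all paddings of s and t."""
    s, t = bc(s), bc(t)

    @lru_cache(maxsize=None)
    def tails(i, j):
        box = s[i] | t[j]
        if i == len(s) - 1 and j == len(t) - 1:
            return frozenset({(box,)})
        found = set()
        for di, dj in ((1, 0), (0, 1), (1, 1)):
            ni, nj = i + di, j + dj
            if ni < len(s) and nj < len(t):
                found.update(bc((box,) + tail) for tail in tails(ni, nj))
        return frozenset(found)

    return set(tails(0, 0))


x_before_y = S("", "x", "", "y", "")
y_before_z = S("", "y", "", "z", "")
language = async_superpose(x_before_y, y_before_z)
print(len(language))                                   # 270
print(S("", "x", "", "y", "", "z", "") in language)    # True
```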

slide-41
SLIDE 41

Improving String Processing for Temporal Relations Improvements Notation

Notation

Some quick extra notation!

slide-42
SLIDE 42

Improving String Processing for Temporal Relations Improvements Notation

Vocabulary

The vocabulary of a string s = α1α2 · · · αn is the union of its components:

$$
\mathrm{voc}(s) := \bigcup_{i=1}^{n} \alpha_i
$$

Example (“John slept through the fire alarm last Tuesday”):

voc({lt}{lt, js}{lt, js, fa}{lt, js}{lt}) = {lt, js, fa}

lt = Last Tuesday
js = John sleeps
fa = Fire alarm sounds
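The vocabulary is a plain set union over the boxes. A minimal sketch, assuming strings are tuples of frozensets; `S` and `voc` are names chosen for this example.

```python
def S(*boxes):
    """Hypothetical shorthand: S("lt", "lt js") is the string {lt}{lt, js}."""
    return tuple(frozenset(box.split()) for box in boxes)


def voc(s):
    """Union of all boxes in the string."""
    return frozenset().union(*s)


s = S("lt", "lt js", "lt js fa", "lt js", "lt")
print(sorted(voc(s)))  # ['fa', 'js', 'lt']
```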

slide-43
SLIDE 43

Improving String Processing for Temporal Relations Improvements Notation

Reduct

For any set A, the A-reduct of a string s is defined as the componentwise intersection of s with A:

$$
\rho_A(\alpha_1\alpha_2\cdots\alpha_n) := (\alpha_1 \cap A)(\alpha_2 \cap A)\cdots(\alpha_n \cap A)
$$

Example, with A = {lt, fa}:

ρA({lt}{lt, js}{lt, js, fa}{lt, js}{lt}) = {lt}{lt}{lt, fa}{lt}{lt}
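The reduct is a componentwise intersection. A minimal sketch under the same tuple-of-frozensets assumption; `S` and `reduct` are illustrative names.

```python
def S(*boxes):
    """Hypothetical shorthand: S("lt", "lt js") is the string {lt}{lt, js}."""
    return tuple(frozenset(box.split()) for box in boxes)


def reduct(s, A):
    """A-reduct: intersect every box of s with the set A."""
    A = frozenset(A)
    return tuple(box & A for box in s)


s = S("lt", "lt js", "lt js fa", "lt js", "lt")
print(reduct(s, {"lt", "fa"}) == S("lt", "lt", "lt fa", "lt", "lt"))  # True
```

The reduct keeps the length of the string, so it usually introduces stutter; that is why projection combines it with block compression.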

slide-44
SLIDE 44

Improving String Processing for Temporal Relations Improvements Projection

Projection

We’ll say that a string s projects to another string s′ if the voc(s′)-reduct of s block compresses to s′:

$$
\mathrm{bc}(\rho_{\mathrm{voc}(s')}(s)) = s'
$$

slide-46
SLIDE 46

Improving String Processing for Temporal Relations Improvements Projection

Reduction to Desired Information

We can use this idea of compression along with a specified vocabulary to determine the Allen relation between any two fluents in an arbitrary string.

Example: {lt}{lt, js}{lt, js, fa}{lt, js}{lt} projects to the string {lt}{fa, lt}{lt}, which corresponds to “fa during lt”.
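Reading off an Allen relation can be sketched as: reduce the string to the two fluents, block compress, then look the result up among the thirteen Allen strings. The lookup table and the stripping of bounding empty boxes are assumptions made for this illustration; all function names are hypothetical.

```python
def S(*boxes):
    """Hypothetical shorthand: S("lt", "lt js") is the string {lt}{lt, js}."""
    return tuple(frozenset(box.split()) for box in boxes)


def bc(s):
    """Block compression: drop adjacent duplicate boxes."""
    out = []
    for box in s:
        if not out or out[-1] != box:
            out.append(box)
    return tuple(out)


def project(s, vocabulary):
    """Reduct of s to the given vocabulary, followed by block compression."""
    vocabulary = frozenset(vocabulary)
    return bc(tuple(box & vocabulary for box in s))


def allen_patterns(a, b):
    """Allen strings for the pair (a, b), written without bounding empty boxes."""
    ab = f"{a} {b}"
    return {
        S(a, "", b): "before",   S(b, "", a): "after",
        S(a, b): "meets",        S(b, a): "met by",
        S(a, ab, b): "overlaps", S(b, ab, a): "overlapped by",
        S(b, ab, b): "during",   S(a, ab, a): "contains",
        S(ab, b): "starts",      S(ab, a): "started by",
        S(b, ab): "finishes",    S(a, ab): "finished by",
        S(ab): "equals",
    }


def allen_relation(s, a, b):
    """Name of the Allen relation 'a R b' in s, or None if no pattern matches."""
    core = project(s, {a, b})
    # Assumption: leading/trailing empty boxes only mark surrounding time.
    while core and not core[0]:
        core = core[1:]
    while core and not core[-1]:
        core = core[:-1]
    return allen_patterns(a, b).get(core)


s = S("lt", "lt js", "lt js fa", "lt js", "lt")
print(allen_relation(s, "fa", "lt"))  # during
```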

slide-47
SLIDE 47

Improving String Processing for Temporal Relations Improvements Projection

Projection-based Constraints

Additionally, we can say that the resulting string(s) of a superposition should project back to both of their input strings in order to retain the initial information.

Example: {}{x}{}{y}{} superposed with {}{y}{}{z}{} produces 270 strings, but of these, only {}{x}{}{y}{}{z}{} projects back to both of its “parents”.
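The generate-then-test approach can be sketched as a filter over the asynchronous superposition. The code assumes input strings bounded by empty boxes and a grid-walk formulation of asynchronous superposition; it is an illustration with hypothetical function names, not the authors' code.

```python
from functools import lru_cache


def S(*boxes):
    """Hypothetical shorthand: S("", "x") is the string {}{x}."""
    return tuple(frozenset(box.split()) for box in boxes)


def bc(s):
    """Block compression: drop adjacent duplicate boxes."""
    out = []
    for box in s:
        if not out or out[-1] != box:
            out.append(box)
    return tuple(out)


def voc(s):
    """Union of all boxes in the string."""
    return frozenset().union(*s)


def projects_to(s, t):
    """True if the voc(t)-reduct of s block compresses to t."""
    vt = voc(t)
    return bc(tuple(box & vt for box in s)) == t


def async_superpose(s, t):
    """Set of block-compressed superpositions of all paddings of s and t."""
    s, t = bc(s), bc(t)

    @lru_cache(maxsize=None)
    def tails(i, j):
        box = s[i] | t[j]
        if i == len(s) - 1 and j == len(t) - 1:
            return frozenset({(box,)})
        found = set()
        for di, dj in ((1, 0), (0, 1), (1, 1)):
            ni, nj = i + di, j + dj
            if ni < len(s) and nj < len(t):
                found.update(bc((box,) + tail) for tail in tails(ni, nj))
        return frozenset(found)

    return set(tails(0, 0))


xy = S("", "x", "", "y", "")
yz = S("", "y", "", "z", "")
candidates = async_superpose(xy, yz)
kept = {c for c in candidates if projects_to(c, xy) and projects_to(c, yz)}
print(len(candidates), len(kept))  # 270 1
```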

slide-48
SLIDE 48

Improving String Processing for Temporal Relations Improvements Projection

Improving Superposition with Projection

By building the notion of projection into superposition, we can prevent invalid strings from being generated at all.

There is no more need to test for inconsistency or loss of information after generation: testing is built in.

This vastly increases execution speed of the operation, and paves the way for handling more complex situations than we have seen so far.
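One possible way to build the test into the operation is to refuse to pair two boxes unless they agree on every fluent that belongs to the other string's vocabulary. The sketch below follows that idea; it is an assumption about how such a constraint could look, not a transcription of the algorithm in the paper, and `superpose_vc` is a name chosen for this example.

```python
from functools import lru_cache


def S(*boxes):
    """Hypothetical shorthand: S("", "x") is the string {}{x}."""
    return tuple(frozenset(box.split()) for box in boxes)


def bc(s):
    """Block compression: drop adjacent duplicate boxes."""
    out = []
    for box in s:
        if not out or out[-1] != box:
            out.append(box)
    return tuple(out)


def voc(s):
    """Union of all boxes in the string."""
    return frozenset().union(*s)


def superpose_vc(s, t):
    """Asynchronous superposition that only pairs vocabulary-compatible boxes."""
    s, t = bc(s), bc(t)
    vs, vt = voc(s), voc(t)

    def compatible(i, j):
        # A fluent known to both strings must be present in both boxes or neither.
        return (s[i] & vt) <= t[j] and (t[j] & vs) <= s[i]

    @lru_cache(maxsize=None)
    def tails(i, j):
        if not compatible(i, j):
            return frozenset()  # prune: no string may pass through this pair
        box = s[i] | t[j]
        if i == len(s) - 1 and j == len(t) - 1:
            return frozenset({(box,)})
        found = set()
        for di, dj in ((1, 0), (0, 1), (1, 1)):
            ni, nj = i + di, j + dj
            if ni < len(s) and nj < len(t):
                found.update(bc((box,) + tail) for tail in tails(ni, nj))
        return frozenset(found)

    return set(tails(0, 0))


result = superpose_vc(S("", "x", "", "y", ""), S("", "y", "", "z", ""))
print(result == {S("", "x", "", "y", "", "z", "")})  # True
```

Because every paired box restricts back to the original box of each input, each generated string projects to both inputs by construction, so no filtering pass is needed afterwards.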

slide-51
SLIDE 51

Improving String Processing for Temporal Relations Improvements Projection

Improving Superposition with Projection

Algorithm details are in the paper! Suffice to say that on average (over all pairs of strings which each contain two fluents), we saw a 58% speed-up when going from the old generate-then-test approach to the new superposition which incorporates projection. As high as 94% in certain cases!

Example:

  • {}{x}{}{y}{} &∗ {}{y}{}{z}{} takes 0.3207ms to calculate.
  • {}{x}{}{y}{} &vc {}{y}{}{z}{} takes 0.0659ms to calculate.
  • Decrease in time: 79.45%.

slide-55
SLIDE 55

Improving String Processing for Temporal Relations Improvements Semi-Intervals

Languages and Strings

Something to highlight here is that we have a certain amount of flexibility between strings and languages.

Our preferred model is a string, as it is simpler, but superposition generates languages (sets of strings).

If the language contains just one string, we may conflate it with that string.

However...

slide-56
SLIDE 56

Improving String Processing for Temporal Relations Improvements Semi-Intervals

Composition of Allen Relations

From Allen (1983, p836, Fig. 4)

slide-60
SLIDE 60

Improving String Processing for Temporal Relations Improvements Semi-Intervals

Incomplete Information

72 out of 144 pairings of Allen relations result in a language with more than one string.

Lack of total information is also frequent, both in the TimeBank corpus and in natural discourse. Consider this example:

Example: “The girl stopped singing when the music on the radio ended.”

The singing and music end at the same time, but which began first?

slide-61
SLIDE 61

Improving String Processing for Temporal Relations Improvements Semi-Intervals

Non-deterministic Superposition

As another example, take the superposition of “a meets b” ({}{a}{b}{}) and “a before c” ({}{a}{}{c}{}).

A language of five strings results, each one featuring a different Allen relation between b and c: finished by, contains, meets, before, and overlaps.

How should we handle this?

slide-62
SLIDE 62

Improving String Processing for Temporal Relations Improvements Semi-Intervals

Freksa Relations

One possible solution lies in moving from intervals to semi-intervals.

  • Freksa treats beginnings and endings of intervals as primitives which are themselves events (and so might also have beginnings and endings).
  • Eighteen new relations are described, based on constraints between beginnings and endings.
  • These correspond to disjunctions and conjunctions of Allen relations.

In fact, the example on the previous slide (finished by, contains, meets, before, and overlaps) corresponds to the Freksa relation “b is older than c”.

slide-64
SLIDE 64

Improving String Processing for Temporal Relations Improvements Semi-Intervals

Bounding with pre and post

We introduce pre and post to our strings: pre(a) occurs exclusively to the left of a, and post(a) occurs exclusively to the right of a: {pre(a)}{a}{post(a)}.

This allows us to describe strings featuring incomplete knowledge.

Example (“The girl stopped singing when the music on the radio ended.”): {}{post(gs), post(rm)}

gs = Girl singing
rm = Radio plays music

slide-67
SLIDE 67

Improving String Processing for Temporal Relations Improvements Semi-Intervals

Projecting to Freksa Relations

This also allows us to project from languages which feature a set of Allen relations corresponding to a Freksa relation, to a single string which is characteristic of that relation.

For example, the five relations corresponding to “a older than b” all project to the string {pre(a), pre(b)}{pre(b)}{}.

This is analogous to saying that a string corresponds to “a before b” if it projects to {}{a}{}{b}{}.

slide-68
SLIDE 68

Improving String Processing for Temporal Relations Improvements Semi-Intervals

Freksa Strings

Below are a few of the Freksa relations represented as single strings ({} is an empty box).

Freksa relation                       Projected string
a ol “older” b                        {pre(a), pre(b)}{pre(b)}{}
a yo “younger” b                      {pre(a), pre(b)}{pre(a)}{}
a sb “survived by” b                  {}{post(a)}{post(a), post(b)}
a sv “survives” b                     {}{post(b)}{post(a), post(b)}
a hh “head to head with” b            {pre(a), pre(b)}{}
a tt “tail to tail with” b            {}{post(a), post(b)}
a bd “born before death of” b         {pre(a)}{}{post(b)}
a db “died after birth of” b          {pre(b)}{}{post(a)}

slide-69
SLIDE 69

Improving String Processing for Temporal Relations Improvements Semi-Intervals

Not a Perfect Solution

The remaining Freksa relations are not so simple to represent: they correspond to conjunctions and disjunctions of constraints between endpoints.

Further, the ? relation has no constraints, so how should we handle this?

Consider {}{a}{b}{}{d}{} superposed with {}{a}{}{c}{}: this produces a language of 25 strings. We can infer that b is older than c without too much difficulty, but there is no deterministic way to ascertain the relation between c and d.

slide-70
SLIDE 70

Improving String Processing for Temporal Relations Conclusion

Summary

  • Introduced (for some!) a framework which uses strings as finite models to represent the sequences of times and events which exist in TimeML.
  • Described improvements to the superposition operation for collecting temporal information into a compact representation which maintains consistent inter-relations by default.
  • Proposed a method of treating incomplete data using the concept of semi-intervals as primitive (or co-primitive).

slide-71
SLIDE 71

Improving String Processing for Temporal Relations Conclusion

Further work

  • It’s still difficult to reduce to the “preferred” single-string model in some scenarios.
  • Still useful to have the information in a string format.
  • Consistency comes “for free” from superposition.

slide-72
SLIDE 72

Improving String Processing for Temporal Relations Conclusion

Further work

Ways of handling cases of non-singleton languages:

  • Set of maximal substrings (don’t superpose if there’s no gain).
  • Probabilistic approach: in the case when more than 13 strings are generated, examine which Allen/Freksa relation appears most frequently among them.
  • Potentially offer an annotator a confidence score in a particular string.

slide-73
SLIDE 73

Improving String Processing for Temporal Relations Conclusion

Further work

  • Create further tooling for annotation software that may complement existing aids.
  • Explore other applications (e.g. multi-document auto-summarisation using timelines).

slide-74
SLIDE 74

Improving String Processing for Temporal Relations Conclusion

Superposition Sandbox

https://www.scss.tcd.ie/~dwoods/star/

slide-75
SLIDE 75

Improving String Processing for Temporal Relations Conclusion

Acknowledgements

This research is supported by Science Foundation Ireland (SFI) through the CNGL Programme (Grant 12/CE/I2267) in the ADAPT Centre (https://www.adaptcentre.ie) at Trinity College Dublin. The ADAPT Centre for Digital Content Technology is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.

Thank you for listening! Please send questions to dwoods@tcd.ie