Improving String Processing for Temporal Relations
Tim Fernando, David Woods
ADAPT Centre, Computational Linguistics Group, Trinity Centre for Computing and Language Studies, School of Computer
Introduction
- Q. What’s this talk about?
- A. Representing temporal information compactly for reasoning and processing.
Example: “John slept through the fire alarm last Tuesday.”
| lt | lt, js | lt, js, fa | lt, js | lt |
Motivation
- Ordering events and times is crucial in natural language understanding.
- A single framework that can both visualise a document’s temporal structure and perform inference over it.
- Assisted annotation for TimeML or similar schemas.
ISO-TimeML
What is ISO-TimeML? An ISO standard markup language for annotating temporal information in texts.
Why TimeML?
- Widely known, and an ISO standard.
- The TLINK elements map well to Allen’s interval relations.
- The TimeBank corpus (183 documents manually annotated with TimeML).
TimeML Example
Example John taught 20 minutes every Monday.
Example from http://www.timeml.org/publications/timeMLdocs/timeml_1.2.1.html
TLINKs Example
Example: John taught 20 minutes every Monday.
<TLINK timeID="t1" relatedToTime="t2" relType="IS_INCLUDED"/>
<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="DURING"/>
There are several ways to represent this information.
TLINKs in a Directed Graph
Example: John taught 20 minutes every Monday.
TLINKs using T-BOX
Example: John taught 20 minutes every Monday.
TLINKs as Strings
Example: John taught 20 minutes every Monday.
| t2 | t1, t2 | ei1, t1, t2 | t1, t2 | t2 |
Sequences of Sets-as-Symbols
- Strings of n sets α1α2 · · · αn are used to represent n moments of time.
- Each set αi contains exactly those fluents (temporal propositions, treated as intervals) which are true at moment i, e.g. {a}{a, b}{b}.
- Sets are drawn as boxes, so strings can be read like comic strips, snapshots of film, or timelines, e.g. | a | a, b | b |.
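One way to hold such strings in code is as a tuple of sets, one per moment. The sketch below is only an illustration: the helper names `make_string` and `show` are hypothetical, and the boxed rendering is just a plain-text stand-in for the drawn boxes.

```python
def make_string(*boxes):
    """Build a string: one frozenset of fluent names per moment."""
    return tuple(frozenset(box) for box in boxes)


def show(s):
    """Render a string as a row of boxes, with an empty box shown as '| |'."""
    cells = [" " + ", ".join(sorted(box)) + " " if box else " " for box in s]
    return "|" + "|".join(cells) + "|"


s = make_string(["a"], ["a", "b"], ["b"])
print(show(s))  # | a | a, b | b |
```

Using immutable `frozenset` boxes inside a tuple makes whole strings hashable, so a language (a set of strings) can be stored as an ordinary Python `set`.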
Comic Strips
Image from The National Archives UK
An Inertial World
Note! A fluent occurring in multiple boxes does not imply a longer duration.
For example, | a | is equivalent in interpretation to | a | a | a |, though the latter string is said to feature stutter.
We can remove stutter through a block-compression operation:
\[
\mathit{bc}(s) :=
\begin{cases}
s & \text{if } \mathrm{length}(s) \le 1 \\
\mathit{bc}(\alpha s') & \text{if } s = \alpha\alpha s' \\
\alpha\,\mathit{bc}(\alpha' s') & \text{if } s = \alpha\alpha' s' \text{ with } \alpha \ne \alpha'
\end{cases}
\]
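The recursive definition of block compression can be sketched iteratively. This is a minimal illustration, assuming strings are tuples of `frozenset` boxes; the function name `bc` simply mirrors the notation on the slide.

```python
def bc(s):
    """Block compression: keep one box from each run of identical adjacent boxes."""
    out = []
    for box in s:
        # Append a box only if it differs from the previous one kept.
        if not out or out[-1] != box:
            out.append(box)
    return tuple(out)


a, b = frozenset({"a"}), frozenset({"b"})
assert bc((a, a, a)) == (a,)            # | a | a | a | compresses to | a |
assert bc((a, a, b, b, a)) == (a, b, a)  # non-adjacent repeats are kept
```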
Allen Relations
- Allen treats intervals, rather than start and end points, as the primitives for events.
- Allen’s interval relations form the basis of TimeML’s TLINK relation types.
- These are easily transformable to strings.
Allen Relations as Strings
R    a R a′          S_R(a, a′)              R⁻¹   a R⁻¹ a′              S_R⁻¹(a, a′)
<    a before a′     | a | | a′ |            >     a after a′            | a′ | | a |
m    a meets a′      | a | a′ |              mi    a met by a′           | a′ | a |
o    a overlaps a′   | a | a, a′ | a′ |      oi    a overlapped by a′    | a′ | a′, a | a |
d    a during a′     | a′ | a, a′ | a′ |     di    a contains a′         | a | a′, a | a |
s    a starts a′     | a, a′ | a′ |          si    a started by a′       | a′, a | a |
f    a finishes a′   | a′ | a, a′ |          fi    a finished by a′      | a | a′, a |
=    a equals a′     | a, a′ |
Allen’s interval relations as strings.
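To see where these strings come from, the sketch below builds one from explicit endpoints. This is an illustration only, not the extraction method used for TimeML: the function `intervals_to_string` and the half-open numeric intervals are assumptions made for the example.

```python
def intervals_to_string(intervals):
    """Turn {name: (start, end)} into a string of boxes.

    Each interval is treated as half-open [start, end) with start < end.
    One box is produced per stretch between consecutive endpoints, holding
    the fluents whose interval covers that stretch.
    """
    points = sorted({p for span in intervals.values() for p in span})
    boxes = []
    for lo, hi in zip(points, points[1:]):
        boxes.append(frozenset(
            name for name, (start, end) in intervals.items()
            if start <= lo and hi <= end
        ))
    return tuple(boxes)


A, B, AB, E = frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"}), frozenset()
assert intervals_to_string({"a": (0, 2), "b": (3, 5)}) == (A, E, B)   # a before b
assert intervals_to_string({"a": (0, 2), "b": (2, 4)}) == (A, B)      # a meets b
assert intervals_to_string({"a": (0, 3), "b": (2, 5)}) == (A, AB, B)  # a overlaps b
assert intervals_to_string({"a": (1, 2), "b": (0, 3)}) == (B, AB, B)  # a during b
```

The gap between the two intervals in the "before" case is what produces the empty box that separates it from "meets".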
Extraction of Allen Relations (ARs) as Strings from TimeML
https://www.scss.tcd.ie/~dwoods/isa14/
Combining Strings
Superposition allows us to condense the information from multiple strings into a more compact form.
The simplest version of the operation is just componentwise union of two strings of equal length:
\[
\alpha_1 \alpha_2 \cdots \alpha_n \mathbin{\&} \alpha'_1 \alpha'_2 \cdots \alpha'_n := (\alpha_1 \cup \alpha'_1)(\alpha_2 \cup \alpha'_2) \cdots (\alpha_n \cup \alpha'_n)
\]
Example (Basic Superposition): | a | b | c | & | a | a | d | = | a | a, b | c, d |
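Componentwise union is short enough to write directly. A minimal sketch, assuming strings are tuples of `frozenset` boxes; the name `superpose` is hypothetical.

```python
def superpose(s, t):
    """Basic superposition: boxwise union of two strings of equal length."""
    if len(s) != len(t):
        raise ValueError("basic superposition needs strings of equal length")
    return tuple(box_s | box_t for box_s, box_t in zip(s, t))


def box(*fluents):
    return frozenset(fluents)


left = (box("a"), box("b"), box("c"))
right = (box("a"), box("a"), box("d"))
# | a | b | c | & | a | a | d | = | a | a, b | c, d |
assert superpose(left, right) == (box("a"), box("a", "b"), box("c", "d"))
```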
Asynchronous Superposition
We handle strings of unequal length by introducing stutter using an inverse block compression, such that the operands are padded to the point that basic superposition is possible.
Noting that every string in bc⁻¹bc(s) block compresses to bc(s), we define asynchronous superposition:
\[
s \mathbin{\&_{*}} s' := \{\, \mathit{bc}(s'') \mid s'' \in \mathit{bc}^{-1}\mathit{bc}(s) \mathbin{\&} \mathit{bc}^{-1}\mathit{bc}(s') \,\}
\]
Example (Asynchronous Superposition):
| x | z | &∗ | x | y | z | = { | x | x, y | z |, | x | x, y | x, z | z |, | x | y, z | z |, | x | x, y | y, z | z |, | x | x, z | y, z | z | }
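The five strings in this example can be reproduced with a small generate-everything sketch. It is not the implementation from the paper: `async_superpose` is a hypothetical name, and instead of building the infinite set of stuttered strings it walks every way of advancing through the two compressed operands (one, the other, or both at each step), which yields the same results after block compression.

```python
def bc(s):
    """Block compression: drop adjacent duplicate boxes."""
    out = []
    for box in s:
        if not out or out[-1] != box:
            out.append(box)
    return tuple(out)


def async_superpose(s, t):
    """Asynchronous superposition of two non-empty strings, as a set of strings."""
    s, t = bc(s), bc(t)
    results = set()

    def walk(i, j, acc):
        acc = acc + (s[i] | t[j],)
        last_i, last_j = i == len(s) - 1, j == len(t) - 1
        if last_i and last_j:
            results.add(bc(acc))
            return
        if not last_i:
            walk(i + 1, j, acc)      # t stutters
        if not last_j:
            walk(i, j + 1, acc)      # s stutters
        if not last_i and not last_j:
            walk(i + 1, j + 1, acc)  # both advance

    if s and t:
        walk(0, 0, ())
    return results


X, Y, Z = frozenset({"x"}), frozenset({"y"}), frozenset({"z"})
result = async_superpose((X, Z), (X, Y, Z))
expected = {
    (X, X | Y, Z),
    (X, X | Y, X | Z, Z),
    (X, Y | Z, Z),
    (X, X | Y, Y | Z, Z),
    (X, X | Z, Y | Z, Z),
}
assert result == expected
```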
Issues
- Superposition doesn’t account for coreferent fluents.
- As a result, it overgenerates strings.
- Each string produced must be examined and thrown out if invalid.
For example, “x before y” (| x | | y |) superposed with “y before z” (| y | | z |) generates 270 strings, only one of which is correct (| x | | y | | z |).
The remaining 269 do not retain the original information that both x is before y, and y is before z.
This is costly! We can do better.
Notation
Some quick extra notation!
Vocabulary
The vocabulary of a string s = α1α2 · · · αn is the union of its components:
\[
\mathit{voc}(s) := \bigcup_{i=1}^{n} \alpha_i
\]
Example (“John slept through the fire alarm last Tuesday”):
voc(| lt | lt, js | lt, js, fa | lt, js | lt |) = {lt, js, fa}
lt = Last Tuesday
js = John sleeps
fa = Fire alarm sounds
Reduct
For any set A, the A-reduct of a string s is defined as the componentwise intersection of s with A:
\[
\rho_A(\alpha_1 \alpha_2 \cdots \alpha_n) := (\alpha_1 \cap A)(\alpha_2 \cap A) \cdots (\alpha_n \cap A)
\]
Example: ρ{lt,fa}(| lt | lt, js | lt, js, fa | lt, js | lt |) = | lt | lt | lt, fa | lt | lt |
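Both definitions translate almost word for word into set operations. A minimal sketch, assuming strings are tuples of `frozenset` boxes; `voc` and `reduct` are named after the notation, and the variable `john` is just the running example.

```python
def voc(s):
    """Vocabulary: the union of all boxes in the string."""
    return frozenset().union(*s)


def reduct(s, fluents):
    """A-reduct: intersect every box with the given set of fluents."""
    keep = frozenset(fluents)
    return tuple(box & keep for box in s)


def box(*fluents):
    return frozenset(fluents)


# "John slept through the fire alarm last Tuesday"
john = (box("lt"), box("lt", "js"), box("lt", "js", "fa"), box("lt", "js"), box("lt"))

assert voc(john) == {"lt", "js", "fa"}
assert reduct(john, {"lt", "fa"}) == (
    box("lt"), box("lt"), box("lt", "fa"), box("lt"), box("lt"),
)
```

Note that the reduct keeps the length of the string, so it usually introduces stutter; block compression is applied afterwards when projecting.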
Projection
We’ll say that a string s projects to another string s′ if the voc(s′)-reduct of s block compresses to s′:
\[
\mathit{bc}(\rho_{\mathit{voc}(s')}(s)) = s'
\]
Reduction to Desired Information
We can use this idea of compression, together with a specified vocabulary, to determine the Allen relation between any two fluents in an arbitrary string.
Example: | lt | lt, js | lt, js, fa | lt, js | lt | projects to the string | lt | fa, lt | lt |, which corresponds to “fa during lt”.
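Reading off an Allen relation this way can be sketched as a projection followed by a table lookup. The helper names (`project`, `strip_empty`, `allen_relation`) are hypothetical, and trimming empty boxes at either end before the lookup is an assumption of this sketch, made so that fluents which do not reach the edges of the string can still be matched.

```python
def bc(s):
    out = []
    for box in s:
        if not out or out[-1] != box:
            out.append(box)
    return tuple(out)


def project(s, fluents):
    """Reduct to the given fluents, then block compression."""
    keep = frozenset(fluents)
    return bc(tuple(box & keep for box in s))


def strip_empty(s):
    """Drop empty boxes at both ends (assumption of this sketch)."""
    start, end = 0, len(s)
    while start < end and not s[start]:
        start += 1
    while end > start and not s[end - 1]:
        end -= 1
    return s[start:end]


def allen_relation(s, a, b):
    """Name of the Allen relation 'a R b' in s, or None if none matches."""
    A, B, AB, E = frozenset({a}), frozenset({b}), frozenset({a, b}), frozenset()
    table = {
        (A, E, B): "before", (B, E, A): "after",
        (A, B): "meets", (B, A): "met by",
        (A, AB, B): "overlaps", (B, AB, A): "overlapped by",
        (B, AB, B): "during", (A, AB, A): "contains",
        (AB, B): "starts", (AB, A): "started by",
        (B, AB): "finishes", (A, AB): "finished by",
        (AB,): "equals",
    }
    return table.get(strip_empty(project(s, {a, b})))


def box(*fluents):
    return frozenset(fluents)


john = (box("lt"), box("lt", "js"), box("lt", "js", "fa"), box("lt", "js"), box("lt"))
assert project(john, {"lt", "fa"}) == (box("lt"), box("fa", "lt"), box("lt"))
assert allen_relation(john, "fa", "lt") == "during"
assert allen_relation(john, "js", "fa") == "contains"
```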
Projection-based Constraints
Additionally, we can require that every string resulting from a superposition projects back to both of its input strings, so that the initial information is retained.
Example: | x | | y | superposed with | y | | z | produces 270 strings, but of these, only | x | | y | | z | projects back to both of its “parents”.
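The generate-then-test use of this constraint can be sketched as a filter over the output of asynchronous superposition. Two things here are assumptions rather than details from the slides: the helper names, and the choice to pad each operand with an empty box at both ends, without which the first boxes of the two operands would always be merged. The sketch does not try to reproduce the count of 270; it only checks that exactly one candidate survives the filter.

```python
def bc(s):
    out = []
    for box in s:
        if not out or out[-1] != box:
            out.append(box)
    return tuple(out)


def async_superpose(s, t):
    """All block-compressed superpositions of stuttered versions of s and t."""
    s, t = bc(s), bc(t)
    results = set()

    def walk(i, j, acc):
        acc = acc + (s[i] | t[j],)
        last_i, last_j = i == len(s) - 1, j == len(t) - 1
        if last_i and last_j:
            results.add(bc(acc))
            return
        if not last_i:
            walk(i + 1, j, acc)
        if not last_j:
            walk(i, j + 1, acc)
        if not last_i and not last_j:
            walk(i + 1, j + 1, acc)

    if s and t:
        walk(0, 0, ())
    return results


def projects_to(s, target):
    """True if the voc(target)-reduct of s block compresses to target."""
    vocabulary = frozenset().union(*target)
    return bc(tuple(box & vocabulary for box in s)) == target


E, X, Y, Z = frozenset(), frozenset({"x"}), frozenset({"y"}), frozenset({"z"})
x_before_y = (E, X, E, Y, E)  # padded with empty boxes (assumption)
y_before_z = (E, Y, E, Z, E)

candidates = async_superpose(x_before_y, y_before_z)
valid = {s for s in candidates
         if projects_to(s, x_before_y) and projects_to(s, y_before_z)}

assert len(candidates) > 1
assert valid == {(E, X, E, Y, E, Z, E)}
```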
Improving Superposition with Projection
- By building the notion of projection into superposition, we can prevent invalid strings from being generated at all.
- There is no more need to test for inconsistency or loss of information after generation: testing is built in.
- This vastly increases the execution speed of the operation, and paves the way for handling more complex situations than we have seen so far.
Algorithm details are in the paper!
Suffice to say that on average (over all pairs of strings which each contain two fluents), we saw a 58% speed-up when going from the old generate-then-test approach to the new superposition which incorporates projection. As high as 94% in certain cases!
Example:
| x | | y | &∗ | y | | z | takes 0.3207 ms to calculate.
| x | | y | &vc | y | | z | takes 0.0659 ms to calculate.
Decrease in time: 79.45%.
Languages and Strings
- Something to highlight here is that we have a certain amount of flexibility between strings and languages.
- Our preferred model is a string, as it is simpler, but superposition generates languages (sets of strings).
- If the language contains just one string, we may conflate it with that string.
- However...
Composition of Allen Relations
From Allen (1983, p836, Fig. 4)
Incomplete Information
- 72 out of 144 pairings of Allen relations result in a language with more than one string.
- Lack of total information is also frequent, both in the TimeBank corpus and in natural discourse.
Consider this example: “The girl stopped singing when the music on the radio ended.”
The singing and the music end at the same time, but which began first?
Non-deterministic Superposition
As another example, take the superposition of a b and a c. A language of five strings results, each one featuring a different Allen relation between b and c: finished by, contains, meets, before, and overlaps.
How should we handle this?
Freksa Relations
- One possible solution lies in moving from intervals to semi-intervals.
- Freksa treats beginnings and endings of intervals as primitives which are themselves events (and so might also have beginnings and endings).
- Eighteen new relations are described, based on constraints between beginnings and endings.
- These correspond to disjunctions and conjunctions of Allen relations.
- In fact, the example on the previous slide (finished by, contains, meets, before, and overlaps) corresponds to the Freksa relation “b is older than c”.
Bounding with pre and post
We introduce pre and post to our strings: pre(a) occurs exclusively to the left of a, and post(a) occurs exclusively to the right of a:
| pre(a) | a | post(a) |
This allows us to describe strings featuring incomplete knowledge:
Example (“The girl stopped singing when the music on the radio ended.”): | post(gs), post(rm) |
gs = Girl singing
rm = Radio plays music
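Adding pre and post to an existing string can be sketched as follows. This is an illustration under assumptions: the function name `bound` is hypothetical, fluents are plain strings, and each input string is padded with an empty box at both ends so that there is somewhere for pre and post to go. With that padding, the projection for the singing example carries a leading empty box for the moments before either fluent has ended.

```python
def bc(s):
    out = []
    for box in s:
        if not out or out[-1] != box:
            out.append(box)
    return tuple(out)


def bound(s):
    """Add pre(f) before the first box containing f and post(f) after the last."""
    fluents = frozenset().union(*s)
    spans = {f: [i for i, box in enumerate(s) if f in box] for f in fluents}
    out = []
    for k, box in enumerate(s):
        extra = set()
        for f, positions in spans.items():
            if k < positions[0]:
                extra.add(f"pre({f})")
            elif k > positions[-1]:
                extra.add(f"post({f})")
        out.append(box | extra)
    return tuple(out)


def project(s, fluents):
    keep = frozenset(fluents)
    return bc(tuple(box & keep for box in s))


E, A = frozenset(), frozenset({"a"})
assert bound((E, A, E)) == (frozenset({"pre(a)"}), A, frozenset({"post(a)"}))

# Three ways the singing (gs) and the music (rm) can end together:
GS, RM, BOTH = frozenset({"gs"}), frozenset({"rm"}), frozenset({"gs", "rm"})
endings = {"post(gs)", "post(rm)"}
for s in [(E, RM, BOTH, E), (E, GS, BOTH, E), (E, BOTH, E)]:
    assert project(bound(s), endings) == (E, frozenset(endings))
```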
Projecting to Freksa Relations
This also allows us to project from languages which feature a set of Allen relations corresponding to a Freksa relation, to a single string which is characteristic of that relation.
For example, the five relations corresponding to “a older than b” all project to the string | pre(a), pre(b) | pre(b) |.
This is similar to saying that a string corresponds to “a before b” if it projects to | a | | b |.
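The claim about “older” can be checked mechanically. The sketch below is a hedged illustration: the helper names are hypothetical and each Allen string is padded with an empty box at both ends (an assumption), so the shared projection ends in an empty box for the moments after b has begun.

```python
def bc(s):
    out = []
    for box in s:
        if not out or out[-1] != box:
            out.append(box)
    return tuple(out)


def bound(s):
    """Add pre(f) left of f's first box and post(f) right of its last box."""
    fluents = frozenset().union(*s)
    spans = {f: [i for i, box in enumerate(s) if f in box] for f in fluents}
    return tuple(
        box
        | {f"pre({f})" for f, pos in spans.items() if k < pos[0]}
        | {f"post({f})" for f, pos in spans.items() if k > pos[-1]}
        for k, box in enumerate(s)
    )


def project(s, fluents):
    keep = frozenset(fluents)
    return bc(tuple(box & keep for box in s))


E, A, B, AB = frozenset(), frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"})
older = {
    "before": (E, A, E, B, E),
    "meets": (E, A, B, E),
    "overlaps": (E, A, AB, B, E),
    "finished by": (E, A, AB, E),
    "contains": (E, A, AB, A, E),
}
beginnings = {"pre(a)", "pre(b)"}
older_string = (frozenset(beginnings), frozenset({"pre(b)"}), E)

projections = {name: project(bound(s), beginnings) for name, s in older.items()}
assert all(p == older_string for p in projections.values())

# "a starts b" is head to head rather than older, so it projects elsewhere.
assert project(bound((E, AB, B, E)), beginnings) == (frozenset(beginnings), E)
```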
Freksa Strings
Below are a few of the Freksa relations represented as single strings.
Freksa relation                     Projected string
a ol “older” b                      | pre(a), pre(b) | pre(b) |
a yo “younger” b                    | pre(a), pre(b) | pre(a) |
a sb “survived by” b                | post(a) | post(a), post(b) |
a sv “survives” b                   | post(b) | post(a), post(b) |
a hh “head to head with” b          | pre(a), pre(b) |
a tt “tail to tail with” b          | post(a), post(b) |
a bd “born before death of” b       | pre(a) | | post(b) |
a db “died after birth of” b        | pre(b) | | post(a) |
Not a Perfect Solution
- The remaining Freksa relations are not so simple to represent: they correspond to conjunctions and disjunctions of constraints between endpoints.
- Further, the ? relation has no constraints, so how should we handle this?
- Consider a b d superposed with a c: this produces a language of 25 strings. We can infer that b is older than c without too much difficulty, but there is no deterministic way to ascertain the relation between c and d.
Summary
- Introduced (for some!) a framework which uses strings as finite models to represent the sequences of times and events which exist in TimeML.
- Described improvements to the superposition operation for collecting temporal information into a compact representation which maintains consistent inter-relations by default.
- Proposed a method of treating incomplete data using the concept of semi-intervals as primitive (or co-primitive).
Further work
- It’s still difficult to reduce to the “preferred” single-string model in some scenarios.
- It is still useful to have the information in a string format.
- Consistency comes “for free” from superposition.
Ways of handling cases of non-singleton languages:
- Set of maximal substrings (don’t superpose if there’s no gain).
- Probabilistic approach: when more than 13 strings are generated, examine which Allen/Freksa relation appears most frequently among them.
- Potentially offer an annotator a confidence score in a particular string.
- Create further tooling for annotation software that may complement existing aids.
- Explore other applications (e.g. multi-document auto-summarisation using timelines).
Superposition Sandbox
https://www.scss.tcd.ie/~dwoods/star/