Temporal and Event Analysis of Natural Language Texts - Siim Orasmaa



SLIDE 1

Temporal and Event Analysis of Natural Language Texts

Siim Orasmaa

SLIDE 2

Data

  • Estonian Reference Corpus of the University of Tartu

– Variety of text genres (news, popular science, legal texts, parliamentary transcripts)

– Automatically processed:

  • Sentence and clause boundaries detected
  • Morphological analysis provided
  • Robust annotation of temporal expressions

– Based on the TimeML annotation language
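
To make the TimeML-based annotation concrete, here is a minimal sketch of reading temporal expressions from a sentence marked up with TimeML's TIMEX3 tag. The sample sentence, the wrapping <s> element and the function name extract_timexes are assumptions made for illustration; the actual storage format of the corpus is not shown in these slides.

```python
import xml.etree.ElementTree as ET

# Invented sample sentence; TIMEX3 and its tid/type/value attributes come from TimeML.
SAMPLE = (
    '<s>The meeting was held on '
    '<TIMEX3 tid="t1" type="DATE" value="1999-03-15">15 March 1999</TIMEX3>.</s>'
)

def extract_timexes(xml_sentence):
    """Return one dict per TIMEX3 element found in the annotated sentence."""
    root = ET.fromstring(xml_sentence)
    return [
        {
            "tid": t.get("tid"),
            "type": t.get("type"),
            "value": t.get("value"),      # normalized, machine-readable value
            "text": "".join(t.itertext()),  # surface form as written in the text
        }
        for t in root.iter("TIMEX3")
    ]

if __name__ == "__main__":
    for timex in extract_timexes(SAMPLE):
        print(timex)
```

The normalized value attribute is what makes documents comparable in time, since different surface forms of the same date map to the same value.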

SLIDE 3

An example of annotations

http://www.keeleveeb.ee

SLIDE 4
I. Comparing documents by temporal similarity

  • Given a newspaper article, find temporally similar newspaper articles: articles that refer to overlapping or similar time periods.
  • Task:

– Preprocess and index the document collection
– Implement a temporal similarity measure, e.g. Temporal Analysis of Document Collections: Framework and Applications, Alonso et al., 2010.
– Add a text similarity measure, e.g. Exploiting Temporal References in Text Retrieval, Arikan, 2009.
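
One possible starting point for the task above is sketched below. It assumes each document has already been reduced to a list of (start, end) date intervals taken from its normalized temporal expressions, and it scores two documents by the Jaccard overlap of the calendar days they cover. This is a deliberately simple stand-in, not the measure of Alonso et al. or Arikan; the function names and the weight alpha are hypothetical.

```python
from datetime import date, timedelta

def expand(start, end):
    """Return every calendar day in the closed interval [start, end]."""
    return {start + timedelta(days=i) for i in range((end - start).days + 1)}

def temporal_profile(intervals):
    """Union of all days covered by a document's temporal expressions."""
    days = set()
    for start, end in intervals:
        days |= expand(start, end)
    return days

def temporal_similarity(intervals_a, intervals_b):
    """Jaccard overlap of the days two documents refer to (0.0 to 1.0)."""
    a, b = temporal_profile(intervals_a), temporal_profile(intervals_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def combined_similarity(temporal, textual, alpha=0.5):
    """Linear mix of a temporal and a text similarity score (alpha is an assumed weight)."""
    return alpha * temporal + (1 - alpha) * textual

if __name__ == "__main__":
    doc_a = [(date(1999, 3, 1), date(1999, 3, 10))]
    doc_b = [(date(1999, 3, 6), date(1999, 3, 15))]
    print(temporal_similarity(doc_a, doc_b))
```

Day-level sets are easy to reason about but grow large for long intervals; a coarser granularity (weeks or months) would be a natural variation to try.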

SLIDE 5
I. Comparing documents by temporal similarity

  • Evaluation:

– Use a roughly temporally parallel corpus (newspaper articles from Eesti Päevaleht 1999 and Postimees 1999)
– Prepare some test data

  • How well can you detect documents discussing the same events?
  • How much do the results depend on the newspaper article's category (News, Opinions, Sports, Economy, etc.)?
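
The two evaluation questions could be approached with a ranking metric computed per article category. The sketch below uses precision at k, which is one reasonable choice and not something the slides prescribe; the function names and the shape of the test data (category, ranked result ids, ids of articles judged to discuss the same events) are assumptions.

```python
from collections import defaultdict

def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved articles that are judged relevant."""
    top = ranked_ids[:k]
    if not top:
        return 0.0
    return sum(1 for doc_id in top if doc_id in relevant_ids) / len(top)

def precision_by_category(queries, k=5):
    """Average precision at k per category.

    queries: iterable of (category, ranked_ids, relevant_ids) tuples,
    one per query article.
    """
    scores = defaultdict(list)
    for category, ranked, relevant in queries:
        scores[category].append(precision_at_k(ranked, relevant, k))
    return {cat: sum(vals) / len(vals) for cat, vals in scores.items()}
```

Comparing the per-category averages gives a first indication of whether, say, sports articles are easier to match on time than opinion pieces.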

SLIDE 6
II. Clustering temporal expression contexts

  • More fine-grained approach: an event mention should be located somewhere near the temporal expression (e.g. a verb, a noun or some phrase).
  • Task:

– Use an unsupervised algorithm to cluster temporal expression contexts, as in Word Sense Induction, e.g. Unsupervised corpus-based methods for WSD, Pedersen, 2006.
– Can you detect some broad event classes?
– Test the algorithm on different text genres.
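
As a rough illustration of the clustering task, the sketch below takes the words in a small window around each temporal expression, treats them as a bag of words, and merges the most similar contexts greedily by cosine similarity. It is a toy agglomerative procedure chosen for brevity, not the method from Pedersen (2006); the window width, the threshold and all function names are assumptions, and real input would use lemmas from the morphological analysis instead of raw tokens.

```python
from collections import Counter
from math import sqrt

def context_window(tokens, start, end, width=3):
    """Bag of words around a temporal expression occupying tokens[start:end]."""
    left = tokens[max(0, start - width):start]
    right = tokens[end:end + width]
    return Counter(w.lower() for w in left + right)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster_contexts(contexts, threshold=0.3):
    """Repeatedly merge the two most similar clusters until none exceed threshold.

    Each cluster is represented by the sum of its members' count vectors.
    Returns a list of clusters, each a sorted list of context indices.
    """
    clusters = [([i], Counter(c)) for i, c in enumerate(contexts)]
    while len(clusters) > 1:
        best, pair = threshold, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = cosine(clusters[i][1], clusters[j][1])
                if sim > best:
                    best, pair = sim, (i, j)
        if pair is None:
            break
        i, j = pair
        merged = (clusters[i][0] + clusters[j][0], clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(members) for members, _ in clusters]
```

Inspecting the most frequent words in each resulting cluster is one way to start looking for broad event classes.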

SLIDE 7
II. Clustering temporal expression contexts

  • Discussion:

– Can you propose a meaningful labeling for the clusters found?
– Can you draw parallels between the clusters found and proposed event classifications (e.g. the one in TimeML)?
– Does the clustering help to organize temporal expressions for information retrieval?