Current Topics in Applied Computer Science
Semantic Technologies
(M-TANI)
Christian Chiarcos, Applied Computational Linguistics, chiarcos@informatik.uni-frankfurt.de
18 July 2013
Global coherence: Discourse Motivation & Theory
Rohit Kate (2010)
Sound waves → Phonetics → Words → Syntactic processing → Parses → Semantic processing → Meaning → Discourse processing → Meaning in context
Computers may process multiple stages simultaneously.
Rohit Kate (2010)
* NLP folks
* Contrast with the implicit assumption that John pushed Max intentionally.
Joshi et al. (2006)
rising 62% to $278 million. Operating profit dropped 35%, however, to $3.8 million.
Joshi et al. (2006)
company had substantially lower extraordinary charges to account for a restructuring program. (… 9 sentences …) Sales, however, were little changed at 2.46 billion guilders, compared with 2.42 billion guilders.
Joshi et al. (2006)
Taboada & Stede (2009), Rohit Kate (2010)
[Figure: coherence relations over the passage below, labelled Occasion, Explanation, Parallel, Explanation]
John went to the bank to deposit his paycheck. He then took a train to Bill’s car dealership. He needed to buy a car. The company he works for now isn’t near any public transportation. He also wanted to talk to Bill about their softball league.
Rohit Kate (2010)
Information Sciences Institute (www.isi.edu)
– Mann, William C. and Sandra A. Thompson. (1988). Rhetorical Structure Theory: Toward a functional theory of text organization. Text, 8 (3), 243-281.
– Taboada, Maite and William C. Mann. (2006). Rhetorical Structure Theory: Looking back and moving ahead. Discourse Studies, 8 (3), 423-459.
RST web site
– http://www.sfu.ca/rst/
– http://www.sfu.ca/rst/05bibliographies/
Taboada & Stede (2009)
Taboada & Stede (2009)
bowl 4. and sprinkle with rum and coconut. 5. Chill until ready to serve.
Taboada & Stede (2009)
relation
– Nucleus (spans 2-3) made up of two spans in an Antithesis relation
Taboada & Stede (2009)
Taboada & Stede (2009)
Joshi et al. (2006)
– Constraints on N: The reader may not believe N to a degree satisfactory to the writer
– Constraints on S: The reader believes S or will find it credible
– Constraints on N+S: The reader’s comprehending S increases their belief of N
– Effect: The reader’s belief of N is increased
to spoken language discussed later
web site (www.sfu.ca/rst)
Taboada & Stede (2009)
– Condition, Cause, Result
– Summary, Elaboration
– Concession (although, however); Condition (if, in case)
– Background, Restatement, Interpretation
– Elaboration – usually first the nucleus (material being elaborated on) and then satellite (extra information)
– Concession – usually the satellite (the although-type clause or span) before the nucleus
Taboada & Stede (2009)
– Circumstance
– Solutionhood
– Elaboration
– Background
– Enablement and Motivation: Enablement, Motivation
– Evidence and Justify: Evidence, Justify
– Relations of Cause: Volitional Cause, Non-Volitional Cause, Volitional Result, Non-Volitional Result, Purpose
– Antithesis and Concession: Antithesis, Concession
– Condition and Otherwise: Condition, Otherwise
– Interpretation and Evaluation: Interpretation, Evaluation
– Restatement and Summary: Restatement, Summary
– Other Relations: Sequence, Contrast
Other classifications are possible, and longer and shorter lists have been proposed.
Taboada & Stede (2009)
span of text (possibly made up of further spans)
nucleus or nuclei
relation, and the direction
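The ingredients listed above (a span that may contain further spans, its nucleus or nuclei, and a relation with a direction) map naturally onto a small recursive data structure. The following is a minimal Python sketch under stated assumptions: the class name `Span`, its fields, and the example sentences are illustrative and not part of RST or of any RST tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Span:
    """A span of text, possibly made up of further spans."""
    text: Optional[str] = None       # set only for elementary units
    relation: Optional[str] = None   # relation holding among the children
    # (role, child) pairs in text order; role is "N" (nucleus) or "S" (satellite).
    # The roles encode the direction of the relation.
    children: List[Tuple[str, "Span"]] = field(default_factory=list)

    def nuclei(self) -> List["Span"]:
        return [child for role, child in self.children if role == "N"]

    def satellites(self) -> List["Span"]:
        return [child for role, child in self.children if role == "S"]

    def leaves(self) -> List[str]:
        """Texts of all elementary units under this span, in text order."""
        if self.text is not None:
            return [self.text]
        return [t for _, child in self.children for t in child.leaves()]


# Hypothetical example: the satellite (the although-clause) precedes the nucleus.
concession = Span(
    relation="Concession",
    children=[
        ("S", Span(text="Although it rained,")),
        ("N", Span(text="we went for a walk.")),
    ],
)

# A multinuclear relation has several nuclei and no satellite.
sequence = Span(
    relation="Sequence",
    children=[("N", concession), ("N", Span(text="Then we went home."))],
)

print(sequence.leaves())
```

Storing the children in text order, with the role attached to each child, keeps the surface order of the text recoverable while still making nucleus and satellite easy to query.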
Taboada & Stede (2009)
then mark that relation r else u might be at the boundary of a higher-level relation. Look at relations holding between larger units (spans)
then update $T \to (T \setminus \{v_1, v_2\}) \cup \{v_{1 \circ 2}\}$, with the unit $v_{1 \circ 2}$ as the concatenation of $v_1$ and $v_2$
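The update step above can be sketched as a greedy bottom-up loop. This is only an illustration under assumptions: the surviving slide text does not show the full condition, so the function name `merge_bottom_up`, the restart-after-merge strategy, and the toy oracle `toy_relation` (with its two cue words and example sentences) are hypothetical and stand in for the analyst's judgement.

```python
from typing import Callable, List, Optional, Tuple

Unit = str
Marked = Tuple[str, Unit, Unit]  # (relation, v1, v2)


def merge_bottom_up(
    units: List[Unit],
    find_relation: Callable[[Unit, Unit], Optional[str]],
) -> Tuple[List[Unit], List[Marked]]:
    """Repeatedly merge adjacent units that stand in some relation."""
    T = list(units)
    marked: List[Marked] = []
    changed = True
    while changed and len(T) > 1:
        changed = False
        for i in range(len(T) - 1):
            v1, v2 = T[i], T[i + 1]
            r = find_relation(v1, v2)
            if r is not None:
                marked.append((r, v1, v2))      # mark that relation r
                T[i:i + 2] = [v1 + " " + v2]    # replace v1, v2 by their concatenation
                changed = True
                break                           # rescan with the larger unit
            # otherwise v1 may sit at the boundary of a higher-level relation
    return T, marked


def toy_relation(v1: Unit, v2: Unit) -> Optional[str]:
    """Hypothetical oracle based on two surface cues only."""
    if v2.startswith("Then"):
        return "Sequence"
    if v1.startswith("Although"):
        return "Concession"
    return None


units = ["Although it rained,", "we went for a walk.", "Then we went home."]
final_units, relations = merge_bottom_up(units, toy_relation)
print(final_units)
print(relations)
```

Restarting the scan after every merge is one simple way to let relations between larger spans be found once their parts have been combined.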
Taboada & Stede (2009)
Rohit Kate (2010)
Cohesion is not to be confused with coherence!
But coherence may be indicated by cohesion
Rohit Kate (2010)
[Figure: similarity score plotted per gap; a valley at gap b lies between the neighbouring points a and c]
Rohit Kate (2010)
Rohit Kate (2010)
From (Hearst, 1994) Rohit Kate (2010)
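Assuming the gap/similarity plot refers to the depth-score idea of Hearst (1994), the computation can be sketched as follows. This is a simplified illustration, not Hearst's exact procedure: it compares bags of words of whole sentences, and the function names, the block size `k`, and the example similarity values are assumptions made for this sketch.

```python
import math
from collections import Counter
from typing import List


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def gap_similarities(sentences: List[str], k: int = 2) -> List[float]:
    """Similarity of the k sentences before and after each gap."""
    bags = [Counter(s.lower().split()) for s in sentences]
    sims = []
    for gap in range(1, len(bags)):
        left = sum(bags[max(0, gap - k):gap], Counter())
        right = sum(bags[gap:gap + k], Counter())
        sims.append(cosine(left, right))
    return sims


def depth_scores(sims: List[float]) -> List[float]:
    """Depth of each gap b relative to the nearest peaks a (left) and c (right)."""
    depths = []
    for b, s in enumerate(sims):
        a = b
        while a > 0 and sims[a - 1] >= sims[a]:
            a -= 1                      # climb to the left peak
        c = b
        while c < len(sims) - 1 and sims[c + 1] >= sims[c]:
            c += 1                      # climb to the right peak
        depths.append((sims[a] - s) + (sims[c] - s))
    return depths


# Hypothetical similarity values with one clear valley at index 2.
sims = [0.8, 0.6, 0.2, 0.5, 0.9]
depths = depth_scores(sims)
deepest = max(range(len(depths)), key=depths.__getitem__)
print(deepest, depths)
```

Gaps with a large depth score are candidate segment boundaries; how the cutoff is chosen is left open here.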
(Asher 1993, Asher & Lascarides 2003)
(Discourse Representation Theory, Kamp 1982)
Hobbs (1978), Mann & Thompson (1987)
Polanyi (1985), Webber (1988)
[x, y, e1, n | Max(x), John(y), e1: push(x,y), e1 < n]
– 1: discourse segment (utterance)
– x: variable (discourse referent) for Max
– y: variable (discourse referent) for John
– e1: variable (event) described by the utterance
– n: reference time (present)
– Max(x), John(y): unary predicates that represent noun attributes
– e1: push(x,y): binary predicate that reflects the semantics of the verb
– e1 < n: the event precedes the present time
[x, y, e1, n | Max(x), John(y), e1: push(x,y), e1 < n]
[z, e2, n | e2: fall(z), e2 < n]
[x, y, e1, n | Max(x), John(y), e1: push(x,y), e1 < n]
[z, e2, n | e2: fall(z), e2 < n, z = y]
[x, y, e1, n | Max(x), John(y), e1: push(x,y), e1 < n]
anaphor resolution
inferred discourse relations: Result(1, 2), Narration(1, 2)
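The two segment representations and the anaphor resolution step z = y can be sketched in a few lines of Python. This is a minimal illustration, assuming a flat list of referents and string-valued conditions; the class `DRS`, the helper `resolve`, and the way relations are stored are hypothetical, and the choice of y as antecedent is supplied by hand rather than computed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DRS:
    referents: List[str]
    conditions: List[str]

    def __str__(self) -> str:
        return "[" + ", ".join(self.referents) + " | " + ", ".join(self.conditions) + "]"


def resolve(segment: DRS, context: DRS, anaphor: str, antecedent: str) -> DRS:
    """Return a copy of `segment` in which `anaphor` is equated with `antecedent`."""
    if anaphor not in segment.referents:
        raise ValueError(f"{anaphor} is not a referent of the segment")
    if antecedent not in context.referents:
        raise ValueError(f"{antecedent} is not available in the context")
    return DRS(list(segment.referents), segment.conditions + [f"{anaphor} = {antecedent}"])


seg1 = DRS(["x", "y", "e1", "n"], ["Max(x)", "John(y)", "e1: push(x,y)", "e1 < n"])
seg2 = DRS(["z", "e2", "n"], ["e2: fall(z)", "e2 < n"])

# Anaphor resolution: the referent z of segment 2 is identified with y (John).
seg2_resolved = resolve(seg2, seg1, "z", "y")

# Inferred discourse relations between segment 1 and segment 2.
relations: List[Tuple[str, int, int]] = [("Result", 1, 2)]

print(seg1)
print(seg2_resolved)
print(relations)
```

Keeping the segments separate and linking them by labelled relations follows the segmented picture shown above, rather than merging everything into one box.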
>: defeasible inference; →: monotonic inference (e.g., if a discourse connective signals the relation unambiguously)
segment β can be attached to segment α in context τ
the event described in α is a pushing event with arguments x and y
the event described in β is a falling event of argument y
the discourse relation between α and β is a Result
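The interplay of defeasible and monotonic inference can be illustrated with a toy rule set. This is a sketch under assumptions, not an implementation of any actual glue logic: the function `infer_relation`, the two-entry connective table, and the use of Narration as a fallback default are all hypothetical choices made for the example.

```python
from typing import Optional, Tuple

# Hypothetical table of connectives treated as unambiguous signals.
MONOTONIC_CUES = {
    "as a result": "Result",
    "because": "Explanation",
}


def infer_relation(
    first_event: str,
    second_event: str,
    connective: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (relation, kind of inference) for two attached segments."""
    # Monotonic rule: an unambiguous connective settles the relation.
    if connective is not None and connective.lower() in MONOTONIC_CUES:
        return MONOTONIC_CUES[connective.lower()], "monotonic"
    # Defeasible rule: a pushing followed by a falling normally yields Result.
    if first_event == "push" and second_event == "fall":
        return "Result", "defeasible"
    # Assumed fallback default for this sketch.
    return "Narration", "defeasible"


print(infer_relation("push", "fall"))             # default applies
print(infer_relation("push", "fall", "because"))  # default is overridden
```

The second call shows the point of defeasibility: the same pair of events receives a different relation once stronger, explicit evidence is present.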
http://wit.istc.cnr.it/stlab-tools/fred/
Rohit Kate (2010)
(Halliday & Hasan 1976)
in spite of that, in that case, etc.
=> represented as a string of text in the preceding text
Joshi et al. (2006)
Explicit connectives are the lexical items that trigger discourse relations.
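A naive way to spot candidate explicit connectives is a lexicon lookup. The sketch below is only an illustration: the list `CONNECTIVES` is a small hand-picked sample, the function `find_connectives` is hypothetical, and real connectives are often ambiguous between discourse and non-discourse uses, which a lookup like this cannot decide.

```python
import re
from typing import List, Tuple

# Small illustrative sample, not a complete inventory.
CONNECTIVES = [
    "as a result",
    "in spite of that",
    "in that case",
    "nevertheless",
    "although",
    "because",
    "however",
]


def find_connectives(text: str) -> List[Tuple[int, str]]:
    """Return (character offset, connective) pairs, sorted by offset."""
    lowered = text.lower()
    hits = []
    for conn in CONNECTIVES:
        pattern = r"\b" + re.escape(conn) + r"\b"
        for match in re.finditer(pattern, lowered):
            hits.append((match.start(), conn))
    return sorted(hits)


example = "Operating profit dropped 35%, however, to $3.8 million."
print(find_connectives(example))
```

Each hit is only a candidate: deciding whether it actually triggers a discourse relation, and which spans are its arguments, needs further analysis.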
Congress hasn't lifted the ceiling on government debt.
viewers will be given a 900 number to call.
size of … industrial concerns to conserve resources and restrict the profits businessmen could make. As a result, industry operated out of small, expensive, highly inefficient industrial units.
Joshi et al. (2006)
because Congress hasn't lifted the ceiling on government debt.
budgets for this year, forecast revenue of $15 for each barrel of crude produced.
that buy huge amounts of land "not for their corporate use, but for resale at huge profit." … The Ministry of Finance, as a result, has proposed a series of measures that would restrict business investment in real estate even more tightly than restrictions aimed at individuals.
Joshi et al. (2006)
significance far exceeding what is involved in the particular case. They speak volumes about the state of our society at a given moment. It has always been so. [Implicit = for example (exemplification)] In the 1920s, a young schoolteacher, John T. Scopes, volunteered to be a guinea pig in a test case sponsored by the American Civil Liberties Union to challenge a ban on the teaching of evolution imposed by the Tennessee Legislature. The result was a world-famous trial exposing profound cultural conflicts in American life between the "smart set," whose spokesman was H.L. Mencken, and the religious fundamentalists, whom Mencken derided as benighted primitives. Few now recall the actual outcome: Scopes was convicted and fined $100, and his conviction was reversed on appeal because the fine was excessive under Tennessee law.
Joshi et al. (2006)
Congress hasn't lifted the ceiling on government debt.
Because real-estate purchases and leases are such major long-term commitments that most companies and individuals make these decisions
engineered male steriles doesn't automatically mean it would be simple to create hybrids in all crops. That's because pollination, while easy in corn because the carrier is wind, is more complex and involves insects as carriers in crops such as
pollinate the plant," he said. Nevertheless, he said, he is negotiating with Plant Genetic to acquire the technology to try breeding hybrid cotton.
Joshi et al. (2006)
visualized with Protégé 4.1, http://sourceforge.net/p/olia/code/45/tree/trunk/owl/experimental/discourse/PDTB.owl
– AltLex, EntRel, NoRel
reputation in the non-horticultural art world, often took gardens as its nominal subject. [AltLex = (consequence)] Mayhap this metaphorical connection made the BPC Fine Arts Committee think she had a literal green thumb.
Joshi et al. (2006)
Entertainment Inc., was named president of Capitol Records Inc., a unit of this entertainment concern. [EntRel] Mr. Milgrim succeeds David Berman, who resigned last month.
Total capital investment at the site could be as much as $400 million, according to Intel.
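The relation types seen in the examples (Explicit, Implicit, AltLex, EntRel, NoRel) can be captured in a small record type. This is a hedged sketch and not the PDTB file format: the class `PDTBRelation`, its field names, and the validation rules are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

RELATION_TYPES = {"Explicit", "Implicit", "AltLex", "EntRel", "NoRel"}


@dataclass
class PDTBRelation:
    rel_type: str
    arg1: str
    arg2: str
    connective: Optional[str] = None  # explicit, or inserted for Implicit
    sense: Optional[str] = None

    def __post_init__(self) -> None:
        if self.rel_type not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {self.rel_type}")
        if self.rel_type == "Explicit" and not self.connective:
            raise ValueError("an Explicit relation needs a connective")
        if self.rel_type in {"EntRel", "NoRel"} and (self.connective or self.sense):
            raise ValueError(f"{self.rel_type} carries no connective or sense")


# EntRel example; the first argument is truncated in the slide text and is kept as-is.
ent_rel = PDTBRelation(
    rel_type="EntRel",
    arg1="Entertainment Inc., was named president of Capitol Records Inc., "
         "a unit of this entertainment concern.",
    arg2="Mr. Milgrim succeeds David Berman, who resigned last month.",
)
print(ent_rel.rel_type, "|", ent_rel.arg2)
```

Making the constraints explicit in `__post_init__` is one design choice; a looser record that stores whatever the annotation contains would work equally well for reading data.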
Joshi et al. (2006)
segment β can be attached to segment α in context τ
the event described in α is a pushing event with arguments x and y
the event described in β is a falling event of argument y
the discourse relation between α and β is a Result
[Figure: verbs want, buy, look, forget, start, visit; corpus: PukWaC]
* http://ec.europa.eu/enterprise/regulation/goods/mutrec_en.htm
** http://ec.europa.eu/enterprise/regulation/goods/mutrec_de.htm
Scenario-Specific Contingency Relationships with No Supervision, In Proc. 2010 IEEE Fourth International Conference on Semantic Computing, Pittsburgh, Pennsylvania.
Narrative Schemas and their Participants, In Proc ACL-IJCNLP 2009, Singapore, p. 602-610
Discourse Relations, In Proc. ACL 2012, Jeju, Korea, p. 213-217