Improving Human Text Comprehension through Semi-Markov CRF-based - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Improving Human Text Comprehension through Semi-Markov CRF-based Neural Section Title Generation

Sebastian Gehrmann, Steven Layne, Franck Dernoncourt

@SebGehr gehrmann@seas.harvard.edu

slide-2
SLIDE 2

Long texts are hard to comprehend

When another old cave is discovered in the south of France, it is not usually news. Rather, it is an ordinary event. Such discoveries are so frequent these days that hardly anybody pays heed to them. However, when the Lascaux cave complex was discovered in 1940, the world was amazed. Painted directly on its walls were hundreds of scenes showing how people lived thousands of years ago. The scenes show people hunting animals, such as bison or wild cats. Other images depict birds and, most noticeably, horses, which appear in more than 300 wall images, by far outnumbering all other animals. Early artists drawing these animals accomplished a monumental and difficult task. They did not limit themselves to the easily accessible walls but carried their painting materials to spaces that required climbing steep walls or crawling into narrow passages in the Lascaux complex. Unfortunately, the paintings have been exposed to the destructive action of water and temperature changes, which easily wear the images away. Because the Lascaux caves have many entrances, air movement has also damaged the images inside. Although they are not out in the open air, where natural light would have destroyed them long ago, many of the images have deteriorated and are barely recognizable. To prevent further damage, the site was closed to tourists in 1963, 23 years after it was discovered.

slide-3
SLIDE 3

Can Summaries Help?

When another old cave is discovered in the south of France, it is not usually news. Rather, it is an ordinary event. Such discoveries are so frequent these days that hardly anybody pays heed to them. However, when the Lascaux cave complex was discovered in 1940, the world was amazed. Painted directly on its walls were hundreds of scenes showing how people lived thousands of years ago. The scenes show people hunting animals, such as bison or wild cats. Other images depict birds and, most noticeably, horses, which appear in more than 300 wall images, by far outnumbering all other animals. Early artists drawing these animals accomplished a monumental and difficult task. They did not limit themselves to the easily accessible walls but carried their painting materials to spaces that required climbing steep walls or crawling into narrow passages in the Lascaux complex. Unfortunately, the paintings have been exposed to the destructive action of water and temperature changes, which easily wear the images away. Because the Lascaux caves have many entrances, air movement has also damaged the images inside. Although they are not out in the open air, where natural light would have destroyed them long ago, many of the images have deteriorated and are barely recognizable. To prevent further damage, the site was closed to tourists in 1963, 23 years after it was discovered.

  • Lascaux cave complex discovered
  • Paintings exposed to destructive action
  • Site closed to tourists

slide-4
SLIDE 4

We hypothesize that summaries, presented as section titles, can improve fact retention, fact retrieval, and text comprehension.

Dooling et al. (1971), Kintsch et al. (1978), Smith et al. (1992)

Goals
(1) A low-resource approach to generating section titles
(2) An evaluation framework for comprehension tasks

slide-5
SLIDE 5

Section Titles in Two Steps

(1) Select the most important sentence:

$$\arg\max_{\text{sent} \in \text{para}} \ \text{saliency}(\text{sent})$$

(2) Compress the sentence:

$$x_i \ \ \forall\, (x_1, y_1), \ldots, (x_n, y_n) \ \text{ iff } \ y_i = 1$$

Paragraph: When another old cave is discovered in the south of France, it is not usually news. Rather, it is an ordinary event. Such discoveries are so frequent these days that hardly anybody pays heed to them. However, when the Lascaux cave complex was discovered in 1940, the world was amazed. Painted directly on its walls were hundreds of scenes showing how people lived thousands of years ago. The scenes show people hunting animals, such as bison or wild cats. Other images depict birds and, most noticeably, horses, which appear in more than 300 wall images, by far outnumbering all other animals.

Selected sentence: However, when the Lascaux cave complex was discovered in 1940, the world was amazed.

Compressed title: Lascaux cave complex discovered

slide-8
SLIDE 8

Compressive Deletion

x: However, when the Lascaux cave complex was discovered in 1940, the world was amazed.

y: 1 1 1 1 (one label per kept word: Lascaux, cave, complex, discovered)
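The deletion step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes whitespace tokenization, treats every unmarked word as label 0, and the names `compress`, `tokens`, and `labels` are hypothetical.

```python
def compress(tokens, labels):
    """Keep token x_i iff its label y_i equals 1 (deletion-based compression)."""
    if len(tokens) != len(labels):
        raise ValueError("need exactly one label per token")
    return [tok for tok, keep in zip(tokens, labels) if keep == 1]


sentence = ("However, when the Lascaux cave complex was discovered "
            "in 1940, the world was amazed.")
tokens = sentence.split()  # assumption: simple whitespace tokenization

# Assumed labeling: 1 for the four kept words, 0 for everything else.
labels = [0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

title = " ".join(compress(tokens, labels))
print(title)
```

Because the title is built only by deleting words, its quality depends on the remaining words still reading as a grammatical phrase.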

slide-9
SLIDE 9

Data-Efficiency through SCRFs

However, when the Lascaux cave complex was discovered in 1940, the world was amazed.

slide-14
SLIDE 14

Data-Efficiency through SCRFs

However, when the Lascaux cave complex was discovered in 1940, the world was amazed.

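$$\phi_{\text{Emission}}(x, \langle y, \text{start}, \text{end} \rangle) = \sum_{i=\text{start}}^{\text{end}} W_E^{\top} \left[ h_i,\ h_{\text{end}} - h_{\text{start}},\ \text{emb}_{\text{end}-\text{start}} \right]$$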

slide-15
SLIDE 15

Data-Efficiency through SCRFs

However, when the Lascaux cave complex was discovered in 1940, the world was amazed.

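$$\phi_{\text{Emission}}(x, \langle y, \text{start}, \text{end} \rangle) = \sum_{i=\text{start}}^{\text{end}} W_E^{\top} \left[ h_i,\ h_{\text{end}} - h_{\text{start}},\ \text{emb}_{\text{end}-\text{start}} \right]$$

$$\phi_{\text{Emission}}(x, \langle y, 3, 5 \rangle) = \sum_{i=3}^{5} W_E^{\top} \left[ h_i,\ h_{\text{complex}} - h_{\text{Lascaux}},\ \text{emb}_{2} \right]$$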
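To make the span score concrete, here is a small pure-Python sketch of the emission potential for one segment. It is an illustration under several assumptions: the expression is read as a sum over the positions in the span, W_E is treated as a single weight vector for one label, and the hidden states, length embeddings, and weights are made-up toy numbers. The names `emission_potential`, `h`, `length_emb`, and `w_e` are hypothetical.

```python
def emission_potential(h, w_e, length_emb, start, end):
    """Score the span [start, end] (inclusive) under one label's weights w_e."""
    # h_end - h_start: change in the hidden state across the span.
    boundary_diff = [a - b for a, b in zip(h[end], h[start])]
    # emb_{end-start}: an embedding looked up by the span length offset.
    span_emb = length_emb[end - start]
    total = 0.0
    for i in range(start, end + 1):
        # Concatenate [h_i, h_end - h_start, emb_{end-start}].
        features = h[i] + boundary_diff + span_emb
        total += sum(w * f for w, f in zip(w_e, features))
    return total


# Toy inputs (invented for illustration): 6 tokens, hidden size 2,
# length-embedding size 1, so the weight vector has 2 + 2 + 1 entries.
h = [[float(i), 1.0] for i in range(6)]
length_emb = {k: [float(k)] for k in range(6)}
w_e = [1.0, 0.0, 0.5, 0.0, 2.0]

# Span 3..5 corresponds to "Lascaux cave complex" with 0-based token indices.
score = emission_potential(h, w_e, length_emb, start=3, end=5)
print(score)
```

The boundary difference and length embedding are shared by every position in the span, so only the h_i term changes inside the loop.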

slide-16
SLIDE 16

Comparison to S2S

On 200,000 data points

SP after Filippova et al. (2015)

slide-17
SLIDE 17

Comparison to S2S

[Bar chart, axis from 60 to 90: Sequential Pointer w/ features; SCRF; + Features (POS, NER…); + LM reranking]

On 200,000 data points

SP after Filippova et al. (2015)

slide-20
SLIDE 20

Comparison to S2S

On limited data SCRF gains

slide-21
SLIDE 21

But does it improve comprehension?

National Geographic interactive reading practice: 33 texts, 4-7 paragraphs, two reading difficulties, various topics

We compare no titles, human-written titles, and generated titles.

144 participants completed six tasks, each 2-3 minutes long.

slide-22
SLIDE 22

Retention


Ask people to answer questions after reading

slide-23
SLIDE 23

Retrieval


Ask people to find information

slide-24
SLIDE 24

Comprehension

Ask people to summarize the text

slide-25
SLIDE 25

Retrieval and Retention

Condition               Accuracy    Time Taken
Baseline (no titles)
Human-written           0.01        2.2 secs
Generated               0.01        27.1 secs (p < 0.01)

slide-26
SLIDE 26

Comprehension

Condition               Readability    Relevance      Length         Time Taken
Baseline (no titles)    4.66 ± 0.65    4.11 ± 0.86
Human-written           4.55 ± 0.76    4.09 ± 0.95    +8.6 words*    20.9 secs*
Generated               4.52 ± 0.72    4.12 ± 1.02    +5.3 words*    2.6 secs*

slide-27
SLIDE 27

Section titles help with text comprehension

The type of title influences what is remembered about a text:

  • Extractive (generated) titles: fact retention is easier
  • Human-written titles: the overall story is easier to understand

Schallert (1975), Kozminsky (1977), Lorch Jr (2011)

slide-28
SLIDE 28

Conclusion

  • We introduced a data-efficient title-generation pipeline
  • We found that the SCRF-based compression outperforms S2S models in low-data settings
  • We developed an evaluation framework and confirmed the positive effect of titles on text comprehension

But…

  • The deletion-based compression only works for languages that retain grammaticality; even English has problems at times
  • The efficacy of the low-resource model in an interface is still unknown

slide-29
SLIDE 29

Generated Section Titles Improve Text Comprehension

Sebastian Gehrmann, Steven Layne, Franck Dernoncourt

@SebGehr gehrmann@seas.harvard.edu

slide-30
SLIDE 30

Sentence Selection through Word-Level Extractive Summarization

Gehrmann et al. (2018)

Source document: $x = x_1, \ldots, x_n$

Summary: $y = y_1, \ldots, y_m$

Extraction function: $t = t_1, \ldots, t_n$

Objective: learn

$$\log p(t \mid x) = \sum_{i=1}^{n} \log p(t_i \mid x)$$

Selection: pick

$$\text{saliency}(\text{sent}) = \frac{1}{|\text{sent}|} \sum_{i=1}^{|\text{sent}|} p(t_i \mid \text{sent})$$
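The selection rule can be sketched as follows: saliency is the mean per-word extraction probability, and the selected sentence is the one with the highest saliency in the paragraph. This is a minimal sketch that assumes the per-word probabilities p(t_i | sent) are already available from some trained tagger. The numbers below are invented, and the names `saliency`, `select_sentence`, and `paragraph_probs` are hypothetical.

```python
def saliency(word_probs):
    """Mean of the per-word extraction probabilities p(t_i | sent)."""
    if not word_probs:
        raise ValueError("a sentence needs at least one word")
    return sum(word_probs) / len(word_probs)


def select_sentence(paragraph_probs):
    """Return the index of the sentence with the highest saliency."""
    return max(range(len(paragraph_probs)),
               key=lambda k: saliency(paragraph_probs[k]))


# Invented probabilities for a three-sentence paragraph (one list per sentence).
paragraph_probs = [
    [0.1, 0.2],
    [0.8, 0.6, 0.7],
    [0.5],
]

best = select_sentence(paragraph_probs)
print(best, round(saliency(paragraph_probs[best]), 2))
```

Averaging over sentence length keeps long sentences from winning simply because they contain more words.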