SemEval 2014 Task-3: Cross-Level Semantic Similarity (MultiJEDI ERC 259234)

slide-1
SLIDE 1

Cross-Level Semantic Similarity

MultiJEDI ERC 259234

SemEval 2014 Task-3

slide-2
SLIDE 2

Semantic Similarity

slide-3
SLIDE 3

Semantic Similarity

Mostly focused on similar types of lexical items

slide-4
SLIDE 4

Semantic Similarity

What if we have different types of inputs?

slide-5
SLIDE 5

CLSS: Cross-Level Semantic Similarity

A new type of similarity task

slide-10
SLIDE 10

CLSS: Comparison Types

  • Paragraph to Sentence
  • Sentence to Phrase
  • Phrase to Word
  • Word to Sense

slide-11
SLIDE 11

Task Data

Training set Test set

4000 pairs in total

slide-12
SLIDE 12

Task Data

A wide range of domains and text styles

slide-13
SLIDE 13

word-to-sense pairs

Word to Sense

slide-17
SLIDE 17

Rating Scale

slide-18
SLIDE 18

Crafting an idealized similarity distribution

slide-23
SLIDE 23

Crafting an idealized similarity distribution

[Diagram: numeric labels 2, 4, 1, 3; side labels "larger side" and "smaller side"]

slide-28
SLIDE 28

Test and Training data IAA

[Figure: Krippendorff’s α per comparison type (Paragraph-Sentence, Sentence-Phrase, Phrase-Word, Word-Sense) for four series: Training (all), Training (unadjudicated), Test (all), Test (unadjudicated)]
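
The agreement slide reports Krippendorff's α, but the transcript preserves neither the values nor the distance function used. As a rough illustration only, the sketch below computes α with a squared-difference (interval) distance, which is an assumption. The function name and the input layout (one list of ratings per annotated pair) are hypothetical.

```python
from itertools import permutations


def krippendorff_alpha_interval(units):
    """Krippendorff's alpha with an interval (squared-difference) distance.

    `units` is a list of lists: the numeric ratings given to each item.
    Items with fewer than two ratings cannot be paired and are skipped.
    """
    pairable = [u for u in units if len(u) >= 2]
    values = [v for u in pairable for v in u]
    n = len(values)
    if n < 2:
        raise ValueError("need at least one item with two or more ratings")

    def delta(a, b):
        # Interval distance; an assumption, not taken from the slides.
        return (a - b) ** 2

    # Observed disagreement: ordered pairs of ratings within each item,
    # weighted by 1 / (number of ratings for that item - 1).
    d_o = sum(
        sum(delta(a, b) for a, b in permutations(u, 2)) / (len(u) - 1)
        for u in pairable
    ) / n

    # Expected disagreement: ordered pairs over all pooled ratings.
    d_e = sum(delta(a, b) for a, b in permutations(values, 2)) / (n * (n - 1))
    if d_e == 0:
        raise ValueError("all ratings are identical; alpha is undefined")

    return 1.0 - d_o / d_e
```

With this definition, perfect agreement within every item gives α = 1, and values at or below 0 indicate agreement no better than chance.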

slide-29
SLIDE 29

The annotation procedure produces a balanced rating distribution

slide-30
SLIDE 30

Experimental Setup

  • The quick brown fox

The brown fox was quick
The quick brown fox
The brown foxes were quick

Baselines:

slide-31
SLIDE 31

Experimental Setup

  • The quick brown fox

The brown fox was quick
The quick brown fox
The brown foxes were quick

Baselines:

Evaluation Measure:
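
The results slides name two baselines, LCS and GST, without describing them. As a minimal sketch, assuming LCS stands for longest common substring, the code below scores a pair by the longest run of shared whitespace tokens divided by the length of the longer item. The token-level matching, the lowercasing, the normalization, and the function names are all assumptions made for illustration, not details from the slides.

```python
def longest_common_run(a, b):
    """Length of the longest contiguous token sequence shared by lists a and b."""
    best = 0
    # prev[j] holds the length of the common run ending at the previous
    # token of `a` and token j of `b` (1-based).
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0] * (len(b) + 1)
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def lcs_similarity(text_a, text_b):
    """Longest shared token run, normalized by the longer text (assumed choice)."""
    a = text_a.lower().split()
    b = text_b.lower().split()
    if not a or not b:
        return 0.0
    return longest_common_run(a, b) / max(len(a), len(b))


if __name__ == "__main__":
    target = "The quick brown fox"
    for candidate in (
        "The brown fox was quick",
        "The quick brown fox",
        "The brown foxes were quick",
    ):
        print(candidate, lcs_similarity(target, candidate))
```

On the slide's own example, "The quick brown fox" and "The brown fox was quick" share the two-token run "brown fox", so this sketch scores them 2/5 = 0.4, while the identical sentence scores 1.0.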

slide-32
SLIDE 32

Number of participants

[Chart: number of participants per comparison type: Paragraph-Sentence, Sentence-Phrase, Phrase-Word, Word-Sense]

slide-33
SLIDE 33

Top 5 Systems and Baselines

[Chart: Meerkat Mafia pw*, SimCompass run1, ECNU run1, UNAL-NLP run2, SemantiKLUE run1, GST Baseline, LCS Baseline, and Gold compared across paragraph-sentence, sentence-phrase, phrase-word, and word-sense; axis ticks 1 to 4]

slide-35
SLIDE 35

Where do the baselines stand?

[Chart: Meerkat Mafia pw*, SimCompass run1, ECNU run1, UNAL-NLP run2, SemantiKLUE run1, GST Baseline, and LCS Baseline compared across paragraph-sentence, sentence-phrase, phrase-word, and word-sense; axis ticks 0.75 to 3]

slide-38
SLIDE 38

Correlation per genre

paragraph-to-sentence
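
The slides break correlation down by genre but do not say which coefficient is plotted. A minimal sketch, assuming Pearson correlation between gold ratings and system scores, grouped by a genre label. The function names and the (genre, gold, predicted) row layout are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlation_per_genre(rows):
    """Group (genre, gold, predicted) rows by genre and correlate each group."""
    by_genre = {}
    for genre, gold, predicted in rows:
        golds, preds = by_genre.setdefault(genre, ([], []))
        golds.append(gold)
        preds.append(predicted)
    return {genre: pearson(g, p) for genre, (g, p) in by_genre.items()}
```

Computing the coefficient separately for each genre, rather than once over the pooled pairs, is what lets a per-genre chart show where a system tracks the gold ratings well and where it does not.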

slide-41
SLIDE 41

Correlation per genre

phrase-to-word

slide-43
SLIDE 43

What makes the task difficult?

slide-44
SLIDE 44

Handling OOV words and novel usages

slide-45
SLIDE 45

Dealing with social media text

slide-46
SLIDE 46

CLSS: Cross-Level Semantic Similarity

  • Similarity of different types of lexical items
  • High-quality dataset: 4000 pairs for four comparison types
  • 38 systems from 19 teams

slide-47
SLIDE 47

Thank you!

David Jurgens, Mohammad Taher Pilehvar, Roberto Navigli

MultiJEDI ERC 259234