Lexical sense alignment using weighted bipartite b-matching



slide-1
SLIDE 1

Lexical sense alignment using weighted bipartite b-matching

Sina Ahmadi (ULD) Supervisors: John McCrae, Mihael Arcan

16th Ph.D. day on April 8, 2018

slide-2
SLIDE 2

Lexical resources


slide-3
SLIDE 3

Lexical resources

  • Expert-made
  • Collaboratively-curated

slide-4
SLIDE 4

Why combine resources?

  • To improve word and concept coverage
      • e.g., named entities, new senses
  • To improve domain coverage
  • To improve multilinguality
      • Creating resources for new language pairs
  • To combine expert-made semantic relations
      • e.g., hypernymy, meronymy, etc.

slide-5
SLIDE 5

A few applications

  • Semantic parsing
  • Word-sense disambiguation and entity linking
      • e.g., bat: [image] or [image]
  • Semantic role labeling

slide-6
SLIDE 6

Difficulty of resource alignment

Spring (n): 18 senses vs. 6 senses

slide-7
SLIDE 7

How does our resource alignment work?

WordNet:spring vs. Wiktionary:spring

  • traditionally the first of the four seasons of the year in temperate regions
  • the season of growth
  • a leap; a bound; a jump
  • the elasticity of something that can be stretched and returns to its original length
  • a place where water or oil emerges from the ground
  • a natural flow of ground water
  • a light, self-propelled movement upwards or forwards
  • elastic power or force.
  • the property of a body of springing to its original form after being compressed, stretched, etc.
  • the time of growth and progress; early portion; first stage.

slide-8
SLIDE 8

Step 1. Alignment problem as a graph

[Figure: the sense definitions of "spring" from slide 7, arranged as two node sets U and V]

U and V are disjoint and independent (no edges inside U or inside V) → bipartite
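As a minimal sketch of this step, the two sense inventories can be held as two separate node lists, with candidate edges drawn only across them. The shortened sense lists and the choice of which definition sits on which side are assumptions made for illustration, not the layout of the original figure.

```python
from itertools import product

# Hypothetical, shortened sense lists; U and V stand for the two resources.
U = ["the season of growth", "a natural flow of ground water"]
V = [
    "a leap; a bound; a jump",
    "a place where water or oil emerges from the ground",
    "elastic power or force",
]

# Nodes are (side, index) pairs, so the two sides can never share a node.
u_nodes = [("U", i) for i in range(len(U))]
v_nodes = [("V", j) for j in range(len(V))]

# Candidate edges always join one U node to one V node, never two nodes
# on the same side, which is what makes the graph bipartite.
edges = list(product(u_nodes, v_nodes))
```

Every sense pair across the two sides is a candidate link here; the later steps decide which candidates to keep.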

slide-9
SLIDE 9

Step 2. Extract similarity scores

Determine how similar two senses are by training a model using textual and definitional similarity features* such as:

  • Word length ratio
  • Longest common subsequence
  • Jaccard measure
  • Word embeddings
  • Forward precision

* McCrae, John P., and Paul Buitelaar. "Linking Datasets Using Semantic Textual Similarity." Cybernetics and Information Technologies 18.1 (2018): 109-123.
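Three of the listed features can be sketched in a few lines of plain Python. This is an illustrative word-level reading of them, assuming whitespace tokenisation; the exact definitions in the cited paper may differ. Word embeddings and forward precision are left out because they need an external model or a definition the slide does not give. All function names are hypothetical.

```python
def tokens(text):
    """Lower-case a definition and split it on whitespace (an assumption)."""
    return text.lower().split()


def word_length_ratio(a, b):
    """Word count of the shorter definition divided by that of the longer."""
    la, lb = len(tokens(a)), len(tokens(b))
    return min(la, lb) / max(la, lb) if max(la, lb) else 0.0


def jaccard(a, b):
    """Size of the shared vocabulary divided by the size of the joint one."""
    sa, sb = set(tokens(a)), set(tokens(b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def lcs_length(a, b):
    """Length of the longest common subsequence of the two word sequences."""
    x, y = tokens(a), tokens(b)
    prev = [0] * (len(y) + 1)
    for wx in x:
        curr = [0]
        for j, wy in enumerate(y, start=1):
            if wx == wy:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


a = "the season of growth"
b = "the time of growth and progress"
features = [word_length_ratio(a, b), jaccard(a, b), lcs_length(a, b)]
```

A feature vector like `features` would be one training input for the similarity model; the model itself is not sketched here.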

slide-10
SLIDE 10

Step 2. Extract similarity scores

Weighted bipartite graph

[Figure: the bipartite graph between U and V from slide 8, with each edge labelled by its similarity score: 0.00187, 0.98951, 0.84391, 0.01021, 0.38672, 0.00021]

slide-11
SLIDE 11

Step 3. Graph matching

slide-12
SLIDE 12

Exhaustive matching

→ Use a threshold

  • High precision, low recall
  • Difficult to find an optimal threshold
  • Uniform matching

[Figure: the weighted bipartite graph between U and V with edge scores 0.00187, 0.98951, 0.84391, 0.01021, 0.38672 and 0.00021; the edge scored 0.98951 is highlighted]
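The threshold rule can be sketched as a one-line filter over the edge scores. The scores below reuse the numbers shown on the slide, but which pair of nodes each score belongs to is an assumption, as are the function name and the two example thresholds.

```python
# Keys are (u, v) index pairs; the placement of each score is hypothetical.
weights = {
    (0, 0): 0.98951, (0, 1): 0.00187, (0, 2): 0.01021,
    (1, 0): 0.00021, (1, 1): 0.84391, (1, 2): 0.38672,
}


def threshold_matching(weights, threshold):
    """Keep every edge whose similarity score is at least `threshold`."""
    return sorted(edge for edge, w in weights.items() if w >= threshold)


strict = threshold_matching(weights, 0.9)  # few links, likely precise
loose = threshold_matching(weights, 0.3)   # more links, likely noisier
```

The same cut-off is applied to every sense pair, so moving it trades precision against recall for the whole graph at once.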

slide-13
SLIDE 13

Exact matching

→ 1-to-1 links maximising overall weight

  • High precision, but recall is not as high
  • Restricted to a bijective (one-to-one) mapping

[Figure: the weighted bipartite graph between U and V with edge scores 0.00187, 0.98951, 0.84391, 0.01021, 0.38672, 0.00021, 0.00091, 0.00101, 0.00363, 0.89191, 0.00362 and 0.02012; the edges scored 0.98951 and 0.89191 are highlighted]
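A minimal sketch of 1-to-1 matching that maximises the overall weight, done by exhaustive search over all assignments. This only suits toy graphs and is not the solver used in the talk; the weights, their placement on edges, and the function name are assumptions.

```python
from itertools import permutations

# Keys are (u, v) index pairs; the placement of each score is hypothetical.
weights = {
    (0, 0): 0.98951, (0, 1): 0.00187, (0, 2): 0.01021,
    (1, 0): 0.00021, (1, 1): 0.84391, (1, 2): 0.38672,
}


def best_one_to_one(weights, n_u, n_v):
    """Return the 1-to-1 link set with maximal total weight.

    Assumes n_u <= n_v: every U node gets exactly one V node and no
    V node is used twice.
    """
    best_links, best_total = [], float("-inf")
    for assigned in permutations(range(n_v), n_u):
        links = list(enumerate(assigned))  # [(u, v), ...]
        total = sum(weights[link] for link in links)
        if total > best_total:
            best_links, best_total = links, total
    return best_links, best_total


links, total = best_one_to_one(weights, n_u=2, n_v=3)
```

In this toy graph the pair (1, 2), scored 0.38672, cannot be linked because U node 1 is already used, which illustrates the bijective restriction. For larger graphs an assignment solver such as SciPy's `linear_sum_assignment` with `maximize=True` would be the usual choice.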

slide-14
SLIDE 14

Weighted bipartite b-matching (WBbM)

→ Maximise overall weight subject to a node capacity [l, b]: each node takes at least l and at most b links

  • High precision, high recall
  • Efficient in linking polysemous items
  • Still difficult to tune the parameters

[Figure: the weighted bipartite graph between U and V with edge scores 0.00187, 0.98951, 0.84391, 0.01021, 0.38672, 0.00021, 0.00091, 0.00101, 0.00363, 0.89191, 0.00362 and 0.02012; node capacities l = 0, b = 1 on two nodes and l = 1, b = 2 on two nodes; the edges scored 0.84391, 0.98951 and 0.89191 are highlighted]
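The capacity constraint can be illustrated with an exhaustive search over all edge subsets. It is exact but exponential, so it only suits toy graphs and is not the solver used in the talk. The weights, their placement on edges, the capacities and the function name are assumptions for illustration.

```python
from itertools import combinations

# Keys are (u, v) index pairs; the placement of each score is hypothetical.
weights = {
    (0, 0): 0.98951, (0, 1): 0.00187, (0, 2): 0.01021,
    (1, 0): 0.00021, (1, 1): 0.84391, (1, 2): 0.38672,
}


def wbbm_exhaustive(weights, cap_u, cap_v):
    """Return the maximum-weight edge set in which every node's number
    of links lies within its own [l, b] capacity.

    cap_u[i] and cap_v[j] are (l, b) pairs for U node i and V node j.
    """
    edges = list(weights)
    best_links, best_total = None, float("-inf")
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            deg_u = [0] * len(cap_u)
            deg_v = [0] * len(cap_v)
            for u, v in subset:
                deg_u[u] += 1
                deg_v[v] += 1
            feasible = all(
                l <= d <= b for d, (l, b) in zip(deg_u, cap_u)
            ) and all(l <= d <= b for d, (l, b) in zip(deg_v, cap_v))
            total = sum(weights[e] for e in subset)
            if feasible and total > best_total:
                best_links, best_total = sorted(subset), total
    return best_links, best_total


# Assumed capacities: U nodes take 1 to 2 links, V nodes take 0 to 1.
links, total = wbbm_exhaustive(
    weights, cap_u=[(1, 2), (1, 2)], cap_v=[(0, 1), (0, 1), (0, 1)]
)
```

With these capacities U node 1 keeps both (1, 1) and (1, 2), a link that 1-to-1 matching has to drop. Setting every capacity to (0, 1) limits each node to at most one link, which recovers a 1-to-1 matching.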

slide-15
SLIDE 15


Our resource alignment mechanism: schema

slide-16
SLIDE 16

Alignment Experiments: Datasets

WordNet synsets manually mapped to their corresponding concepts *

* Matuschek, Michael, and Iryna Gurevych. "Dijkstra-WSA: A Graph-Based Approach to Word Sense Alignment." Transactions of the Association for Computational Linguistics 1 (2013): 151-164.

slide-17
SLIDE 17

Experiments: WordNet-Wiktionary alignment

[Figure: results of previous work compared with our current method]

slide-18
SLIDE 18

Conclusion

  • Graph matching algorithms can be efficiently applied to lexical alignment problems
  • WBbM
      • includes more possible linking combinations by defining capacity
      • is efficient in lexical alignment, providing high precision and recall
      • makes it difficult to find optimal parameters
      • is highly dependent on the textual and definitional similarities

slide-19
SLIDE 19

Future directions

  • Exploring link prediction methods for lexical alignment
  • Extending our research to multilingual resources
  • Using graph neural networks (GNNs)