Supporting Collaborative Modeling via Natural Language Processing



SLIDE 1

Supporting Collaborative Modeling via Natural Language Processing

Fatma Başak Aydemir (1), Fabiano Dalpiaz (2)

(1) Boğaziçi University, (2) Utrecht University

39th International Conference on Conceptual Modeling

SLIDE 2

Collaborative Modeling

Various

  • Experts
  • Modeling Languages
  • Locations
  • Time Zones

F.B. Aydemir Supporting Collaborative Modeling via Natural Language Processing ER 2020 2 19

SLIDE 3

Concept Suggestion Service

Participants: Modeler 1, Modeler 2, Concept suggester

  • O1. commitModel(m1)
  • O2. analyzeCommit(m1)
  • O1. commitModel(m2)
  • O2. analyzeCommit(m2)
  • O3. requestSuggestions(m1)
  • O4. analyzeModels(m1, {m2})
  • O5. suggest(s1, m1)
  • O6. feedback(f1, s1)
  • O3. requestSuggestions(m2)
  • O4. analyzeModels(m2, {m1})
  • O5. suggest(s2, m2)
  • O6. feedback(f2, s2)

A web service integrated into the modeling environments

  • Ignores the meta-models
  • Analyzes the labels
  • Keeps track of the modeled concepts
  • Suggests missing concepts to modelers
  • Hides model details
  • Completeness
  • Common vocabulary

SLIDE 4

Service Operations

Operations (from the process diagram)

  • O1. Model 1 committed
  • O2. Extract noun phrases (terms)
  • O3. Suggestions requested
  • O4a. Identify missing terms
  • O4b. Match terms with concepts

Data objects

  • Model 1
  • Model 1 terms {t1, ..., tn}
  • Project terms {pt1, ..., ptq}
  • Missing terms {..., tj, ...}
  • Domain model {c1, ..., cn}
  • Candidate concepts

Legend: Standard, Activity, Catching, End, Data object, Messages, Control flow, Data flow
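The operations on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `extract_terms` normalizes whole element labels instead of running real noun-phrase extraction, missing terms are read as "project terms absent from the model", and only exact and sub-string matching are used for O4b. The function names and the sample labels are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def extract_terms(labels: Iterable[str]) -> Set[str]:
    """O2 (simplified): treat each normalized label as one term.

    Assumption: a real service would extract noun phrases with an NLP toolkit.
    """
    return {" ".join(label.lower().split()) for label in labels if label.strip()}


def missing_terms(model_terms: Set[str], project_terms: Set[str]) -> Set[str]:
    """O4a (assumed reading): terms known in the project but absent from this model."""
    return project_terms - model_terms


def match_concepts(terms: Set[str], domain_model: Iterable[str]) -> Dict[str, List[str]]:
    """O4b (simplified): match terms to concepts by exact or sub-string match."""
    concepts = list(domain_model)
    matches: Dict[str, List[str]] = {}
    for term in sorted(terms):
        hits = [c for c in concepts if term in c.lower()]
        if hits:
            matches[term] = hits
    return matches


# Hypothetical labels from two committed models.
m1_terms = extract_terms(["Student", "Thesis"])
m2_terms = extract_terms(["Thesis", "Supervisor"])
project_terms = m1_terms | m2_terms

# Candidate concepts for Modeler 1, taken from a hypothetical domain model.
candidates = match_concepts(
    missing_terms(m1_terms, project_terms),
    ["First supervisor", "Thesis project"],
)
```

Here Modeler 1 has not modeled "supervisor", so the only candidate concept returned for that term is "First supervisor".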

SLIDE 5

Matching Heuristics

  • Exact match
  • Sub-string match
  • Similarity
  • Relatedness

Possible matches for “thesis”

  • Thesis
  • Thesis Project
  • Graduation project
  • Coordinator
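A hedged sketch of how the four heuristics could be chained for the term “thesis”. Exact and sub-string matching are implemented directly; similarity and relatedness are looked up in two small hand-written tables (`SIMILAR`, `RELATED`) that stand in for a real lexical resource. The function name `classify_match`, both tables, and the pairing of each example concept with one heuristic (taken from the order of the two lists) are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical stand-ins for lexical resources (e.g. embeddings or a thesaurus).
SIMILAR = {("thesis", "graduation project")}
RELATED = {("thesis", "coordinator")}


def classify_match(term: str, concept: str) -> Optional[str]:
    """Return the first heuristic under which `concept` matches `term`, if any."""
    t, c = term.lower().strip(), concept.lower().strip()
    if not t or not c:
        return None
    if t == c:
        return "exact"
    if t in c or c in t:
        return "sub-string"
    if (t, c) in SIMILAR:
        return "similarity"
    if (t, c) in RELATED:
        return "relatedness"
    return None


concepts = ["Thesis", "Thesis Project", "Graduation project", "Coordinator", "Calendar"]
matches = {c: classify_match("thesis", c) for c in concepts}
```

The heuristics are tried from strictest to loosest, so a concept is reported under the strongest heuristic that applies.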

SLIDE 10

Suggestion Heuristics

  • Parent
  • Child
  • Sibling
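One possible reading of these three heuristics, assuming the domain model is a hierarchy stored as a child-to-parent map: once a term is matched to a concept, that concept's parent, children, and siblings become suggestion candidates. The `PARENT_OF` map, its concepts, and the function names are invented for illustration and do not come from the slides.

```python
from typing import Dict, List, Optional

# Hypothetical domain hierarchy: child concept -> parent concept.
PARENT_OF: Dict[str, str] = {
    "Thesis": "Document",
    "Report": "Document",
    "Master thesis": "Thesis",
}


def parent(concept: str) -> Optional[str]:
    """Parent heuristic: the concept directly above, if any."""
    return PARENT_OF.get(concept)


def children(concept: str) -> List[str]:
    """Child heuristic: concepts directly below."""
    return sorted(c for c, p in PARENT_OF.items() if p == concept)


def siblings(concept: str) -> List[str]:
    """Sibling heuristic: other concepts sharing the same parent."""
    p = parent(concept)
    if p is None:
        return []
    return [c for c in children(p) if c != concept]


def suggest(concept: str) -> Dict[str, List[str]]:
    """Collect the candidates produced by each heuristic for a matched concept."""
    p = parent(concept)
    return {
        "parent": [p] if p is not None else [],
        "child": children(concept),
        "sibling": siblings(concept),
    }


suggestions = suggest("Thesis")
```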

SLIDE 11

Suggestion Heuristics

SLIDE 12

Filtering Heuristics

  • Fixed number
  • User feedback
  • Frequency
  • Limiting matches per missing item
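The four filters could be combined roughly as follows. This is a sketch under assumptions: the slide does not say how each filter is defined, so "frequency" is read here as "concepts suggested for more missing terms rank first", and the parameter names `per_term` and `total` as well as the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple

Suggestion = Tuple[str, str]  # (missing term, suggested concept)


def filter_suggestions(
    candidates: List[Suggestion],
    rejected: Set[str],
    per_term: int = 2,
    total: int = 3,
) -> List[Suggestion]:
    # User feedback: drop concepts the modeler rejected earlier.
    kept = [s for s in candidates if s[1] not in rejected]
    # Frequency (assumed reading): concepts suggested more often come first.
    freq = Counter(concept for _, concept in kept)
    kept.sort(key=lambda s: -freq[s[1]])  # stable sort keeps input order on ties
    # Limiting matches per missing item.
    seen: Dict[str, int] = defaultdict(int)
    limited: List[Suggestion] = []
    for term, concept in kept:
        if seen[term] < per_term:
            seen[term] += 1
            limited.append((term, concept))
    # Fixed number: cap the overall list.
    return limited[:total]


candidates = [
    ("thesis", "Thesis Project"),
    ("thesis", "Graduation project"),
    ("thesis", "Coordinator"),
    ("supervisor", "Coordinator"),
    ("supervisor", "First supervisor"),
]
result = filter_suggestions(candidates, rejected={"Graduation project"})
```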

SLIDE 13

Similarity of Compound Nouns

  • Compound nouns for many concepts
  • Used in heuristics
  • Add detail to the models
  • Similarity check for common terminology
  • Explore the domain model
  • Two algorithms to calculate the similarity of a pair of compound nouns
  • WordNet and Word2Vec based
  • Domain model based

SLIDE 15

WordNet and Word2Vec based Similarity

  • Get the Word2Vec similarity score for each pair of nouns taken from the two compounds
  • If a score is higher than a threshold, check whether the two nouns are synonyms using WordNet
  • Set the score to 1 for synonyms; leave it as is otherwise
  • The similarity of the compounds is a weighted average of the pair scores

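Example for the pair Thesis Project, Graduation Project:

$$\mathrm{sim}(\text{Thesis Project}, \text{Graduation Project}) = \gamma \cdot \mathrm{sim}(\text{thesis}, \text{graduation}) + \delta \cdot \mathrm{sim}(\text{project}, \text{project}) + \epsilon \cdot \mathrm{sim}(\text{thesis}, \text{project}) + \kappa \cdot \mathrm{sim}(\text{project}, \text{graduation}) \tag{1}$$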
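The weighted average above can be sketched in Python as follows. To keep the example self-contained, the Word2Vec scores and the WordNet synonym check are replaced by two small hand-written tables (`TOY_SCORES`, `TOY_SYNONYMS`); the numbers in them, the 0.5 threshold, and the equal weights are all assumptions, since the slide gives no values for γ, δ, ϵ, κ or for the threshold.

```python
from typing import Tuple

# Toy stand-ins (assumptions): a real setup would query a Word2Vec model and WordNet.
TOY_SCORES = {
    frozenset({"thesis", "graduation"}): 0.4,
    frozenset({"thesis", "project"}): 0.6,
    frozenset({"project", "graduation"}): 0.3,
    frozenset({"thesis", "dissertation"}): 0.8,
}
TOY_SYNONYMS = {frozenset({"thesis", "dissertation"})}


def word_sim(a: str, b: str, threshold: float = 0.5) -> float:
    """Embedding score, raised to 1.0 when the words are also synonyms."""
    if a == b:
        return 1.0
    key = frozenset({a, b})
    score = TOY_SCORES.get(key, 0.0)
    if score > threshold and key in TOY_SYNONYMS:
        return 1.0
    return score


def compound_sim(
    first: str,
    second: str,
    weights: Tuple[float, float, float, float] = (0.25, 0.25, 0.25, 0.25),
) -> float:
    """Weighted average over the four noun pairs of two 2-word compounds."""
    a1, a2 = first.lower().split()
    b1, b2 = second.lower().split()
    gamma, delta, epsilon, kappa = weights
    return (
        gamma * word_sim(a1, b1)      # thesis - graduation
        + delta * word_sim(a2, b2)    # project - project
        + epsilon * word_sim(a1, b2)  # thesis - project
        + kappa * word_sim(a2, b1)    # project - graduation
    )


score = compound_sim("Thesis Project", "Graduation Project")
```

With these toy values the four pair scores are 0.4, 1.0, 0.6, and 0.3, so equal weights give a compound similarity of about 0.575.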

SLIDE 16

Domain Model based Similarity

  • Based on how well the compounds are matched in the domain model
  • Individual matching score for each compound is calculated
  • The similarity of the two terms is the average of their scores
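A rough sketch of the averaging step. The slide does not define how the individual matching score is computed, so the scoring below is an assumption chosen only to make the example concrete: 1.0 when the compound is itself a domain concept, 0.5 when it shares a word with some concept, 0.0 otherwise. The domain model contents are hypothetical.

```python
from typing import Iterable


def matching_score(compound: str, domain_model: Iterable[str]) -> float:
    """Assumed scoring of how well one compound is matched in the domain model."""
    c = compound.lower()
    concepts = {concept.lower() for concept in domain_model}
    if c in concepts:
        return 1.0  # exact match with a concept
    words = c.split()
    if any(w in concept.split() for w in words for concept in concepts):
        return 0.5  # partial match: shares a word with some concept
    return 0.0


def domain_similarity(first: str, second: str, domain_model: Iterable[str]) -> float:
    """Similarity of two compounds as the average of their matching scores."""
    concepts = list(domain_model)
    return (matching_score(first, concepts) + matching_score(second, concepts)) / 2


domain = ["Thesis project", "Supervisor"]
sim = domain_similarity("thesis project", "first supervisor", domain)
```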

SLIDE 17

Experimental Setup to Detect Similarity

  • Used 20 pairs of 2-word compound nouns
  • Gold standard: surveyed people to assess the similarity on a 5-point Likert-type scale
  • Compared the results with Bert-Web and spaCy similarity

SLIDE 18

Pairs Used

Pair ID   First compound        Second compound          Cat.
P1        thesis project        project facilitator      1
P2        MBI student           student administration   1
P3        MBI thesis            thesis topic             1
P4        graduation project    project idea             1
P5        literature review     relevant literature      1
P6        first phase           first supervisor         2
P7        second phase          second presentation      2
P8        graduation ceremony   graduation supervisor    2
P9        MBI thesis            MBI colloquium           2
P10       thesis topic          thesis report            2
P11       company supervisor    second supervisor        3
P12       first supervisor      second supervisor        3
P13       computing science     information science      3
P14       project proposal      short proposal           3
P15       official ceremony     graduation ceremony      3
P16       scientific paper      official ceremony        4
P17       Google calendar       MBI colloquium           4
P18       company supervisor    project facilitator      4
P19       department member     participation token      4
P20       research question     literature review        4

SLIDE 19

Most and Least Similar Pairs

Word embeddings + WordNet
  first supervisor - second supervisor          .71
  thesis topic - thesis report                  .60
  second phase - second presentation            .59
  MBI thesis - thesis topic                     .27
  department member - participation token       .26
  MBI student - student administration          .22

spaCy
  first supervisor - second supervisor          .95
  official ceremony - graduation ceremony       .79
  project proposal - short proposal             .78
  MBI thesis - thesis topic                     .53
  company supervisor - project facilitator      .51
  department member - participation token       .41

Domain model
  computing science - information science       .39
  MBI thesis - thesis topic                     .38
  MBI thesis - MBI colloquium                   .38
  first phase - first supervisor                .00
  second phase - second presentation            .00
  graduation ceremony - graduation supervisor   .00

Gold standard
  official ceremony - graduation ceremony       .69
  project proposal - short proposal             .62
  company supervisor - project facilitator      .56
  thesis project - project facilitator          .12
  first phase - first supervisor                .12
  MBI student - student administration          .09

SLIDE 20

Results i

[Figure: similarity values (axis "Value", 0.0 to 1.0) by Category / Heuristic (Cross, First, Second, None) for w2v+wnet, spaCy, domain, Bert-Web, and gold.]

SLIDE 21

Results ii

  • spaCy has the highest Pearson correlation with the gold standard
  • The domain model based approach has the smallest Euclidean distance to the gold standard
  • Specific heuristics resemble the gold standard the most
  • BERT also shows promising results
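The two measures named on this slide can disagree, which is why one method can lead on correlation while another leads on distance. The sketch below computes both with plain Python. The score lists are made-up illustrative numbers, not data from the study, and the names `method_a` and `method_b` are hypothetical.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance between two equally long score lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Illustrative values only (not from the study).
gold = [0.2, 0.4, 0.6, 0.8]
method_a = [0.6, 0.7, 0.8, 0.9]  # same trend as gold, but shifted upwards
method_b = [0.2, 0.5, 0.5, 0.8]  # values close to gold, trend less exact

r_a, r_b = pearson(gold, method_a), pearson(gold, method_b)
d_a, d_b = euclidean(gold, method_a), euclidean(gold, method_b)
```

In this toy case `method_a` correlates better with the gold standard, while `method_b` lies closer to it in absolute terms.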

SLIDE 22

Discussion

  • NLP can be used to support collaborative modeling activities
  • The conceptual modeling domain has its own challenges in terms of NLP
  • We need more data and empirical validation for both the service and the similarity algorithms

SLIDE 23

Conclusions

  • A lightweight NLP-powered service to facilitate collaborative modeling
  • Heuristics to detect the concepts that are modeled, and to provide suggestions from a domain model for the concepts that are not modeled, in a multi-model and multi-modeler collaborative modeling setting
  • This work focuses on the challenge of identifying similar compound nouns
  • Our domain model based algorithm performs well, but we need further validation to be conclusive

SLIDE 24

Contact us!

  • Fatma Başak Aydemir: basak.aydemir@boun.edu.tr, @aydemirfb
  • Fabiano Dalpiaz: f.dalpiaz@uu.nl, @FabianoDalpiaz

Thank you for your attention!
