model[NL]generation: Natural Language Model Extraction, 27.10.2013


SLIDE 1

Chair of Applied Computer Science IV (Lehrstuhl für Angewandte Informatik IV)

  • Prof. Dr.-Ing. Stefan Jablonski

Lars Ackermann (M. Sc.) E-Mail: lars.ackermann@uni-bayreuth.de

model[NL]generation: Natural Language Model Extraction

27.10.2013 University of Bayreuth

The 13th Workshop on Domain-Specific Modeling

SLIDE 2

Why domain-specific modeling is useful

04.12.2013 2013 (C) Databases and Information Systems Group | Lars Ackermann | 2

George: Condi! Nice to see you. What's happening?
Condi: Sir, I have the report here about the new leader of China. […] Hu is the new leader of China.
George: That's what I want to know.
Condi: That's what I'm telling you.
George: That's what I'm asking you. Who is the new leader of China?
Condi: Yes. […]
George: Will you or will you not tell me the name of the new leader of China?
Condi: Yes, sir.
George: Yassir? Yassir Arafat is in China? I thought he was in the Middle East. […]

Source: James Sherman

SLIDE 3

Larger Scope

  • Levels: M0, M1, M2
  • Input: informal, textual description
  • Techniques: Information Extraction, Natural Language Processing
  • Model reconstruction: formal description of a model instance (formal model)
  • Language reconstruction: modeling language
  • Information about source and target language and speaker
  • Information about modeling patterns

Ideally, this is achieved the same way as reconstructing a model. There is no jump involved; the levels are reconstructed from bottom to top.

SLIDE 4

Not only in IT (but notably there) …

Image #1:

  • What the programmer implemented
  • What the customer explained
  • What the project manager understood
  • How the analyst designed it
  • What the customer really needed

Up to now:

  • Customer
  • Docu, Docu, Docu
  • Subjectively created target model

model[NL]generation:

  • Customer
  • Docu
  • Systematically generated target model

Getting flexibly, intuitively and interactively towards the target model using natural language.

SLIDE 5

model[NL]generation: Natural Language Model Extraction

1 Flexible: Choosing the (domain-specific) meta model

  • Model Type Library: ER, UML, Processes

2 Intuitive: Training a system for processing natural language using natural language

  • Sample Sentences: “The manager notifies the customer.” “After that, employees advise their colleagues.”

3 Interactive: Extracting target model

  • Input
  • Feedback

SLIDE 6

Many users, even more formulations – one Activity: We need flexible detection rules

Activity
name : contacts

  • “The manager notifies the customer.”
  • “The customer is contacted by the manager.”
  • “The manager is responsible for the contact to the customer.”

SLIDE 7

First attempt of detection rule definition: Plain Syntax

  • Example: “The manager notifies the customer.”

Syntax tree: S with the children NP, VP, NP

  • NP (Subject): Determiner Noun
  • VP (Predicate): Verb
  • NP (Object): Determiner Noun

Noun Design Strategy [Rumbaugh1991]: notifies, ?, ?

  • Great Basis: Such grammatical parsers already exist!
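
A plain-syntax detection rule can be sketched as a tiny hand-written parser. This is only an illustration: the mini-lexicon, the function names, and the grammar S → NP VP, VP → VBZ NP, NP → DT NN are assumptions made for this example, not the parser used in the talk.

```python
# Hypothetical mini-lexicon: word -> part-of-speech tag (Penn Treebank style).
LEXICON = {
    "the": "DT",
    "manager": "NN",
    "customer": "NN",
    "notifies": "VBZ",
}


def tag(sentence):
    """Lower-case the sentence, strip the final period, look up each word's tag."""
    words = sentence.rstrip(".").lower().split()
    return [(w, LEXICON[w]) for w in words]


def parse_np(tokens, i):
    """NP -> DT NN. Returns (subtree, next_index) or None."""
    if i + 1 < len(tokens) and tokens[i][1] == "DT" and tokens[i + 1][1] == "NN":
        return ("NP", tokens[i], tokens[i + 1]), i + 2
    return None


def parse_vp(tokens, i):
    """VP -> VBZ NP. Returns (subtree, next_index) or None."""
    if i < len(tokens) and tokens[i][1] == "VBZ":
        np = parse_np(tokens, i + 1)
        if np:
            return ("VP", tokens[i], np[0]), np[1]
    return None


def parse_s(tokens):
    """S -> NP VP. The whole input must be consumed."""
    np = parse_np(tokens, 0)
    if not np:
        return None
    vp = parse_vp(tokens, np[1])
    if vp and vp[1] == len(tokens):
        return ("S", np[0], vp[0])
    return None


def roles(tree):
    """Read subject, predicate and object off the syntax tree."""
    _, np, vp = tree
    return {"subject": np[2][0], "predicate": vp[1][0], "object": vp[2][2][0]}


tree = parse_s(tag("The manager notifies the customer."))
print(roles(tree))
```

The rule is deliberately rigid: any sentence that does not follow the determiner-noun-verb-determiner-noun shape is rejected.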

SLIDE 8

  • But …

“The managers notify the customer.”
“The manager notifies the customers.”
“The managers notify many customers.”
“The MN notifies the customer.”
“Managers notify the customer.”
“Max notifies the customer.”
“She notifies many customers.”
“The alarm notifies the customer.”

Everything is fine so far …

Initial production rules:

S → NP VP
VP → notifies NP
NP → DT NN

Extended production rules:

S → NP VP
VP → notifies NP
VP → notify NP
VP → …
NP → DT NN
NP → DT NNS
NP → …
… → …

“The alarm notifies the customer.” Actor = Person?!
“The MN notifies the customer.” MN ??

… but only so far. Syntax is ambiguous!

  • Example 1: “The manager notifies the customer.” (POS tag: notifies/VBZ)
  • Example 2: “The manager is the executive.” (POS tag: is/VBZ)
  • Adjust detection rules to “notifies”

→ Lexicalization of production rules
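
The ambiguity can be made concrete with a small sketch. It assumes hand-tagged input and a rule format invented for this example (a list of tag/word constraints). The plain rule accepts both example sentences because "notifies" and "is" share the tag VBZ, while the lexicalized rule accepts only the first.

```python
# Hand-tagged example sentences (Penn Treebank tags); the tags are written out
# by hand here instead of coming from a real tagger.
EXAMPLE_1 = [("The", "DT"), ("manager", "NN"), ("notifies", "VBZ"),
             ("the", "DT"), ("customer", "NN")]
EXAMPLE_2 = [("The", "DT"), ("manager", "NN"), ("is", "VBZ"),
             ("the", "DT"), ("executive", "NN")]

# Hypothetical rule format: one (tag, word) constraint per token.
# A word of None means "any word with this tag".
PLAIN_RULE = [("DT", None), ("NN", None), ("VBZ", None), ("DT", None), ("NN", None)]
LEXICALIZED_RULE = [("DT", None), ("NN", None), ("VBZ", "notifies"),
                    ("DT", None), ("NN", None)]


def matches(rule, tagged):
    """True if every token satisfies the tag (and optional word) constraint."""
    if len(rule) != len(tagged):
        return False
    return all(
        tag == want_tag and (want_word is None or word.lower() == want_word)
        for (want_tag, want_word), (word, tag) in zip(rule, tagged)
    )


print(matches(PLAIN_RULE, EXAMPLE_1), matches(PLAIN_RULE, EXAMPLE_2))
print(matches(LEXICALIZED_RULE, EXAMPLE_1), matches(LEXICALIZED_RULE, EXAMPLE_2))
```

Lexicalization removes the false positive, but a rule bound to one word no longer covers "notify", "meets" or "informs", which is the detection problem again.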

SLIDE 9

Creating detection rules – 1

  • Requirement 1: Detection ability
  • Requirement 2: Disambiguation ability

Samples:

  • “The manager notifies the customer”: Process, <interact>
  • “Users meet an administrator”: Process
  • “The manager is the executive”: Actor, <be>

Process detection rule: ? <PERSON> <interact> ? <PERSON>

  • Subject: first <PERSON>
  • Object: second <PERSON>

SLIDE 10

Creating detection rules – 2

Automatic feature extraction and abstraction

1 Syntactic basis

  • Syntax tree for given samples
  • Reduce them to be more general

Sample: “The manager notifies the customer”

S
NP VP NP
DT NN VBZ DT NN

Remaining issues:

  • Disambiguation
  • Specialization (Domain, User, …)

SLIDE 11

Creating detection rules – 3

2 Object of interest

Select the sentence parts which are the most relevant [Collins1999]

Sample: “The manager notifies the customer”

S
NP (manager) VP (notifies) NP (customer)

  • NP: “The manager”, “All managers”, “The account manager”, “All assembly managers”
  • VP: “notifies the customer”, “meets an administrator”, “informs the CEO”, “teach all pupils”
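
Selecting the most relevant word of a phrase can be sketched with a strongly simplified head rule. The table below is an assumption for illustration (rightmost noun for NP, leftmost verb for VP). The head tables in [Collins1999] are considerably more detailed, and the POS tags are written by hand.

```python
# Simplified, hypothetical head rules: search direction and the tags that
# qualify as head of the phrase.
HEAD_RULES = {
    "NP": ("right", {"NN", "NNS", "NNP"}),
    "VP": ("left", {"VB", "VBP", "VBZ", "VBD"}),
}


def head(label, tagged):
    """Return the head word of a tagged phrase, or None if no candidate exists."""
    direction, head_tags = HEAD_RULES[label]
    candidates = tagged if direction == "left" else list(reversed(tagged))
    for word, tag in candidates:
        if tag in head_tags:
            return word
    return None


print(head("VP", [("notifies", "VBZ"), ("the", "DT"), ("customer", "NN")]))
print(head("NP", [("The", "DT"), ("account", "NN"), ("manager", "NN")]))
print(head("NP", [("All", "DT"), ("assembly", "NN"), ("managers", "NNS")]))
```

Reducing each phrase to its head is what lets phrases as different as “The manager” and “The account manager” fall under the same rule slot.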

SLIDE 12

Creating detection rules – 4

3 Annotation

Automatically add disambiguating and specializing information

Samples: “The manager notifies the customer.” “Users meet an administrator.” “The manager is the executive.”

S with the children NP, VP, NP and the heads (manager), (is), (executive):

  • (manager): WNCat <manager>; TypDep nsubj_dep, det_gov the
  • (is): WNCat <be>; TypDep cop_dep
  • (executive): WNCat <executive>; TypDep det_gov the, nsubj_gov, cop_gov

S with the children NP, VP, NP and the heads (manager), (notifies), (customer):

  • (manager): WNCat <person>; TypDep nsubj_dep
  • (notifies): WNCat <interact>; TypDep nsubj_gov, dobj_gov
  • (customer): WNCat <person>; TypDep dobj_dep
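
How such annotations disambiguate can be sketched as follows. The class name, the rule layout, and the hand-written annotations of the two sentences are assumptions for this example. In a real system the WNCat values would come from a lexical resource such as WordNet and the TypDep values from a dependency parser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical container: WordNet category plus typed-dependency roles."""
    wn_cat: str
    typ_dep: frozenset


# Process rule in the order subject head, verb head, object head.
PROCESS_RULE = (
    Annotation("<person>", frozenset({"nsubj_dep"})),
    Annotation("<interact>", frozenset({"nsubj_gov", "dobj_gov"})),
    Annotation("<person>", frozenset({"dobj_dep"})),
)

# Hand-annotated heads of "The manager notifies the customer."
NOTIFIES = (
    Annotation("<person>", frozenset({"nsubj_dep", "det_gov"})),
    Annotation("<interact>", frozenset({"nsubj_gov", "dobj_gov"})),
    Annotation("<person>", frozenset({"dobj_dep", "det_gov"})),
)

# Hand-annotated heads of "The manager is the executive."
IS_EXECUTIVE = (
    Annotation("<person>", frozenset({"nsubj_dep", "det_gov"})),
    Annotation("<be>", frozenset({"cop_dep"})),
    Annotation("<person>", frozenset({"det_gov", "nsubj_gov", "cop_gov"})),
)


def rule_matches(rule, heads):
    """A slot matches if the category is equal and all required roles are present."""
    return len(rule) == len(heads) and all(
        slot.wn_cat == h.wn_cat and slot.typ_dep <= h.typ_dep
        for slot, h in zip(rule, heads)
    )


print(rule_matches(PROCESS_RULE, NOTIFIES))      # process sentence
print(rule_matches(PROCESS_RULE, IS_EXECUTIVE))  # copula sentence
```

Both sentences share the same POS pattern, so the decision rests entirely on the added category and dependency information.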

SLIDE 13

Mapping: Feature key paths

1 Feature key path: Concat(Lemma(S_VP), S_NP)

  • Head words: advises, colleague

2 Resolved key path: Concat(Lemma(advises), employee)

3 MappingElement: “advise employee”

  • Activity (<<instance of>> Process meta model element)
  • name = advise employee

Schema:

  • Entity name = car
  • Attribute name = color
  • Attribute name = speed

Resulting pairs:

  • Entity name = car, Attribute name = color
  • Entity name = car, Attribute name = speed
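
The evaluation of a feature key path such as Concat(Lemma(S_VP), S_NP) can be sketched in a few lines. The lemma table, the function names, and the dictionary that stands in for the extracted features are assumptions for illustration. A real system would use a proper lemmatizer and the features produced by the detection rule.

```python
# Hypothetical lemma table standing in for a real lemmatizer.
LEMMAS = {"advises": "advise", "notifies": "notify"}


def lemma(word):
    """Return the base form of a word, or the word itself if it is unknown."""
    return LEMMAS.get(word, word)


def concat(*parts):
    """Join the resolved parts with a single space."""
    return " ".join(parts)


def activity_name(features):
    """Evaluate Concat(Lemma(S_VP), S_NP) against the extracted head words."""
    return concat(lemma(features["S_VP"]), features["S_NP"])


# Head words as they appear in the resolved key path on the slide.
features = {"S_VP": "advises", "S_NP": "employee"}

# Instantiate the target model element with the computed name.
activity = {"type": "Activity", "name": activity_name(features)}
print(activity)
```

Keeping the key path separate from the detection rule means the same rule can feed different target meta models by swapping only the mapping.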

SLIDE 14

Vision

  • Source
  • Linguistic fingerprint
  • Domain-specific phrases
  • Using linguistic knowledge
  • Flexible linguistic knowledge
  • Formalized Knowledge: ER, UML, Processes
  • Dynamic Domains
  • Bi-directional transformation: Generation, Extraction

SLIDE 15

References

Main Literature:

  • [Collins1999] Michael Collins. “Head-Driven Statistical Models for Natural Language Parsing”. Diss. University of Pennsylvania, 1999.
  • [JurMart2008] Dan Jurafsky and James H. Martin. Speech and Language Processing. 2nd edition. Prentice Hall International, Apr. 2008. ISBN: 9780135041963.
  • [Rumbaugh1991] James Rumbaugh et al. Object-Oriented Modeling and Design. Prentice Hall, 1990. ISBN 0-13-629841-9.

Further Literature:

  • ilastic.com. An open-source framework to query, integrate and manipulate any type of data in English. Checked 20.10.2012.
  • Fabian Friedrich. Automated Generation of Business Process Models from Natural Language Input. Master Thesis. Berlin: The School of Business and Economics of the Humboldt-Universität zu Berlin, 2010.
  • Avik Sinha, Amit Paradkar, Palani Kumanan et al. An Analysis Engine for Dependable Elicitation of Natural Language Use Case Description and Its Application to Industrial Use Cases. Tech. Rep. RC24712 (W0812-106). IBM Research Report. IBM Research Division, 18 Dec. 2008.

Images:

  • #1: http://stoertebeker.kkb-clan.de/pics/softwareentwicklung.jpg, 20.10.2013
  • #2: http://invincibles.in/tag/marvin/, 20.10.2013

SLIDE 16

Thank you for your attention

#2