Integrating software and lingware in the OMNIA and AXiMAG projects, following a KISS approach




slide-1
SLIDE 1

Integrating software and lingware in the OMNIA and AXiMAG projects, following a KISS approach

Achille Falaise

Laboratoire d'Informatique de Grenoble (Grenoble Computer Science Laboratory)

1/20

slide-2
SLIDE 2

Motivations behind software integration for NLP

  • For high-level projects involving...
  • ...many existing linguistic resources and linguistic software (lingware), such as lexical databases, tokenisers, syntactic parsers, MT systems, etc.
  • ...backend software (databases, web servers, etc.)
  • ...end-user software (offline or online GUI)
  • Many of them were designed by different teams, following heterogeneous approaches, for various purposes...
  • What do we integrate, and how, in the LIG-GETALP team?
  • Two examples: the OMNIA project and the AXiMAG project

2/20

slide-3
SLIDE 3

Outline

  • The KISS approach and REST implementation
  • OMNIA project (completed)
  • AXiMAG project (reimplementation in progress)

3/20

slide-4
SLIDE 4

The KISS principle

  • Stands for: Keep It Simple, Stupid
  • Acronym coined by military aircraft manufacturers
  • A general design principle, applied here to IT

4/20

slide-5
SLIDE 5

KISS for integration: REST architecture

  • Stands for: REpresentational State Transfer
  • Integration of services through the HTTP protocol
  • Client-server (independent of any GUI)
  • Stateless (services have nothing to remember between requests)
  • Cacheable (for better scalability)
  • Layered (redirections permitted without restrictions)
  • Services can be called via:
    – Any web browser
    – Any programming language supporting HTTP requests
    – The command line, with popular utilities like cURL
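As an illustration of the points above, here is a minimal sketch of a stateless service called over plain HTTP, using only the Python standard library. The `/lemmatise` path, the `text` parameter and the tiny lemma table are hypothetical and are not taken from the OMNIA or AXiMAG services.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, quote, urlparse

# Toy form -> lemma table (hypothetical data, for illustration only).
LEMMAS = {"women": "woman", "sitting": "sit"}


class LemmaHandler(BaseHTTPRequestHandler):
    """Stateless handler: every request carries all the input it needs."""

    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        words = query.get("text", [""])[0].split()
        lemmas = [LEMMAS.get(w.lower(), w.lower()) for w in words]
        body = json.dumps(lemmas).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        # Identical requests give identical answers, so replies may be cached.
        self.send_header("Cache-Control", "max-age=3600")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def call_lemmatiser(text):
    """Start a throwaway local server, call it once over HTTP, then stop it."""
    server = HTTPServer(("127.0.0.1", 0), LemmaHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/lemmatise?text=" + quote(text))
        payload = conn.getresponse().read()
        conn.close()
        return json.loads(payload.decode("utf-8"))
    finally:
        server.shutdown()
        server.server_close()
```

Because the service keeps no state between requests, the same URL could just as well be opened in a browser or fetched with cURL while the server is running.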

5/20

slide-6
SLIDE 6

Lingware integration in the OMNIA project

6/20

slide-7
SLIDE 7

OMNIA project (2008-2010): multilingual information retrieval

  • From companion texts (e.g. picture captions) in several languages, to interlingual concepts
  • Ontology-based indexing

[Diagram labels: press, Picasa, etc.; multimodal analysis; indexing; ontology; requests; relevant images]

7/20

slide-8
SLIDE 8

OMNIA modular architecture: from text to concepts

4 modules & 4 resources

  • Inputs: companion texts, NL requests
  • Modules: lemmatisation, interlingual annotation, ambig annotation, concept annotation
  • Resources: form-lemma dictionary, lemma-UW dictionary, UW-concept map, ontology
  • Output: concepts

8/20

slide-9
SLIDE 9

Example companion text

AWA05 - 20020924 - BAGHDAD, IRAQ : Iraqi women sit under a portrait of Iraqi President Saddam Hussein in a waiting room in Baghdad's al-Mansur hospital 24 September 2002. Saddam Hussein is doggedly pursuing the development of weapons of mass destruction and will do his best to hide them from UN inspectors, the British government claimed in a 55-page dossier made public just hours before a special House of Commons debate on Iraq. Iraqi Culture Minister Hamad Yussef Hammadi called the British allegations "baseless." EPA PHOTO AFPI AWAD AWAD

9/20

slide-10
SLIDE 10

Lemmatisation process

Input: Iraqi women sitting in a waiting room of Bagdad hospital

Graph structure with ambiguities (excerpt, nodes 40 to 43):

  • a (DET), a (NOUN)
  • waiting room (NOUN)
  • wait (VERB), waiting (NOUN)
  • room (NOUN), room (VERB)

Output: (Iraqi, NOUN) (woman, NOUN) (sit, VERB) (in, PREP) (a, DET) (waiting room, NOUN) (of, PREP) (Bagdad, NOUN) (hospital, NOUN)

10/20
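A graph with ambiguities of this kind can be held as a small lattice of labelled edges. The sketch below is only an illustration: the node ids 40 to 43 and the (lemma, POS) labels come from the slide, but which edge joins which pair of nodes is an assumption, since the figure layout is not available.

```python
from collections import defaultdict

# Lattice edges: (start node, end node, lemma, part of speech).
# The attachment of each edge to a node pair is assumed, not from the slide.
EDGES = [
    (40, 41, "a", "DET"),
    (40, 41, "a", "NOUN"),
    (41, 42, "wait", "VERB"),
    (41, 42, "waiting", "NOUN"),
    (42, 43, "room", "NOUN"),
    (42, 43, "room", "VERB"),
    (41, 43, "waiting room", "NOUN"),  # multiword lemma spanning two tokens
]


def readings(edges, start, end):
    """Return every path from start to end as a list of (lemma, pos) pairs."""
    outgoing = defaultdict(list)
    for src, dst, lemma, pos in edges:
        outgoing[src].append((dst, lemma, pos))

    def walk(node):
        if node == end:
            return [[]]
        paths = []
        for dst, lemma, pos in outgoing[node]:
            for tail in walk(dst):
                paths.append([(lemma, pos)] + tail)
        return paths

    return walk(start)


ALL_READINGS = readings(EDGES, 40, 43)
```

Each path through the lattice is one possible reading of "a waiting room"; later stages narrow these readings down.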

slide-11
SLIDE 11

Interlingual lexicon : Universal Words (UW)

  • Universal Words (UW)
    – represent word senses (acceptions) without ambiguity
    – headword with meaning restrictions
  • Examples:
    – book(icl>thing)
    – book(icl>do, agt>human, obj>thing)
    – ikebana(icl>flower arrangement)
  • 200,000 UW++ built from WordNet synsets
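To make the "headword with meaning restrictions" structure concrete, here is a small sketch that splits a UW string into its two parts. The function name `parse_uw` is hypothetical, and the restrictions are kept as opaque strings; this is not a full UNL parser.

```python
def parse_uw(uw):
    """Split a UW string into (headword, list of restriction strings)."""
    headword, sep, rest = uw.partition("(")
    if not sep:
        # No parenthesis at all: a bare headword without restrictions.
        return headword.strip(), []
    if not rest.endswith(")"):
        raise ValueError(f"unbalanced parentheses in {uw!r}")
    restrictions = [r.strip() for r in rest[:-1].split(",") if r.strip()]
    return headword.strip(), restrictions
```

For instance, `parse_uw("book(icl>do, agt>human, obj>thing)")` separates the headword `book` from its three restrictions, which is what distinguishes it from `book(icl>thing)`.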

11/20

slide-12
SLIDE 12

Interlingual annotation process

[Figure: the lemma graph (nodes 40 to 43), with candidate UWs attached to its edges]

Edges: a (DET), a (NOUN), waiting (NOUN), waiting (VERB), room (VERB), room (VERB), waiting room (NOUN)

Candidate UWs, in figure order:

  • waiting_room(icl>room>thing,equ>lounge)
  • room(icl>dwell>do)
  • wait(icl>act>occur,obj>thing)
  • waiting(icl>inactivity>thing)
  • room(icl>opportunity>thing)
  • wait(icl>work>do,obj>thing)
  • room(icl>opportunity>thing)
  • room(icl>position>thing)
  • room(icl>area>thing)
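The annotation step can be pictured as a dictionary lookup from lemmas to candidate UWs. In the sketch below the UW strings are the ones shown on the slide, but their pairing with (lemma, POS) keys is an assumption based on the headwords, and `annotate` is a hypothetical helper, not the OMNIA module.

```python
# Hypothetical lemma -> UW dictionary; the key/value pairing is assumed.
LEMMA_UW = {
    ("waiting room", "NOUN"): ["waiting_room(icl>room>thing,equ>lounge)"],
    ("wait", "VERB"): [
        "wait(icl>act>occur,obj>thing)",
        "wait(icl>work>do,obj>thing)",
    ],
    ("waiting", "NOUN"): ["waiting(icl>inactivity>thing)"],
    ("room", "NOUN"): [
        "room(icl>opportunity>thing)",
        "room(icl>position>thing)",
        "room(icl>area>thing)",
    ],
    ("room", "VERB"): ["room(icl>dwell>do)"],
}


def annotate(edges, dictionary):
    """Attach candidate UWs to each (lemma, pos) edge; unknown lemmas get []."""
    return [
        (lemma, pos, list(dictionary.get((lemma, pos), [])))
        for lemma, pos in edges
    ]
```

Edges with several candidates, such as room (NOUN), stay ambiguous after this step; function words like the determiner simply receive no UW in this toy dictionary.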

12/20

slide-13
SLIDE 13

Automatic disambiguation process

[Figure: the lemma graph (nodes 40 to 43), with a score attached to each candidate UW]

Edges: a (DET), a (NOUN), waiting (NOUN), waiting (VERB), room (VERB), room (VERB), waiting room (NOUN)

Candidate UWs, in figure order:

  • waiting_room(icl>room>thing,equ>lounge)
  • room(icl>dwell>do)
  • wait(icl>act>occur,obj>thing)
  • waiting(icl>inactivity>thing)
  • room(icl>opportunity>thing)
  • wait(icl>work>do,obj>thing)
  • room(icl>opportunity>thing)
  • room(icl>position>thing)
  • room(icl>area>thing)

Scores, in figure order: 0.0020 0.0002 0.0001 0.0001 0.0001 0.0002 0.0001 0.0001 0.0001

Ant algorithm (Schwab & Lafourcade 2007)
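The ant algorithm itself is not reproduced here. The sketch below only illustrates a possible final step once every candidate UW carries a score: keep the best-scoring candidate per edge. The scores are illustrative placeholders, not the values from the slide, and `pick_best` is a hypothetical helper.

```python
def pick_best(candidates):
    """Return the highest-scoring UW, or None when there are no candidates."""
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[1])[0]


# (lemma, pos) -> list of (UW, score); scores are made up for the example.
SCORED = {
    ("waiting room", "NOUN"): [("waiting_room(icl>room>thing,equ>lounge)", 0.9)],
    ("room", "NOUN"): [
        ("room(icl>opportunity>thing)", 0.1),
        ("room(icl>position>thing)", 0.2),
        ("room(icl>area>thing)", 0.7),
    ],
    ("a", "DET"): [],
}

BEST = {edge: pick_best(cands) for edge, cands in SCORED.items()}
```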

13/20

slide-14
SLIDE 14

Ontology for conceptual annotation

  • Ontology
    – Domain-dependent
    – Concept hierarchy
    – Interlingual
  • Alignment with the interlingual lexicon
    – Manual or automatic (Rouquet & Nguyen, 2009)

Ontology (excerpt): WOMEN, PRESIDENT, MINISTER, HOSPITAL, HOUSE, PEOPLE, POLITICS, BUILDING, RESIDENTIAL BUILDING

14/20

slide-15
SLIDE 15

Extracted concepts

  • Output for whole text

Concept               Score
RESIDENTIAL BUILDING  0.0002
PRESIDENT             0.0004
BUILDING              0.0004
HOUSE                 0.0002
HOSPITAL              0.0002
POLITICS              0.0044
MINISTER              0.0040
WOMAN                 0.0002
PEOPLE                0.0146
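One way a client of the service might use this output is to rank the concepts by score and keep the top few as index terms. The sketch below uses the scores from the table above; the `top_concepts` helper and the cut-off of three are assumptions, not part of OMNIA.

```python
# Concept scores copied from the slide's output table.
SCORES = {
    "RESIDENTIAL BUILDING": 0.0002,
    "PRESIDENT": 0.0004,
    "BUILDING": 0.0004,
    "HOUSE": 0.0002,
    "HOSPITAL": 0.0002,
    "POLITICS": 0.0044,
    "MINISTER": 0.0040,
    "WOMAN": 0.0002,
    "PEOPLE": 0.0146,
}


def top_concepts(scores, n=3):
    """Return the n best concepts, highest score first (ties broken by name)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [concept for concept, _ in ranked[:n]]


TOP3 = top_concepts(SCORES)
```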

15/20


slide-16
SLIDE 16

OMNIA demo

16/20

slide-17
SLIDE 17

Lingware & software integration in the AXiMAG project

17/20

slide-18
SLIDE 18

AXiMAG version 2 project (2011): collaborative website translation

  • Sectraw: translation memory manager
  • AXiMAG: collaborative website translation widget
  • Tradoh: multi-MT system
  • MT systems: Systran, Google, Reverso, Sistec

18/20

slide-19
SLIDE 19

AXiMAG demo

19/20

slide-20
SLIDE 20

Design issue in service-oriented architecture

  • Request building & request processing
    – May lead to denial of service (DoS)
  • Means to avoid this issue
    – Scope of the service
    – Internal cache
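The slide only names an internal cache as a remedy. As a minimal sketch of the idea, assuming requests can be keyed by their input text, repeated requests can be answered from memory instead of being processed again. The function names and the stand-in processing step are hypothetical.

```python
from functools import lru_cache

CALLS = {"count": 0}  # counts real (uncached) processing runs


def expensive_analysis(text):
    """Stand-in for a costly lingware call (hypothetical)."""
    CALLS["count"] += 1
    return tuple(word.lower() for word in text.split())


@lru_cache(maxsize=1024)  # bounded, so the cache itself cannot grow forever
def cached_analysis(text):
    """Serve repeated requests from memory; process new ones only once."""
    return expensive_analysis(text)
```

The bound on the cache size matters here: an unbounded cache would trade one resource exhaustion risk for another.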

20/20