SLIDE 1 Integrating software and lingware in the OMNIA and AXiMAG projects, following a KISS approach
Achille Falaise
Laboratoire d'Informatique de Grenoble (Grenoble Computer Science Laboratory)
SLIDE 2 Motivations behind software integration for NLP
- For high-level projects involving...
  - ...many existing linguistic resources and linguistic software (lingware), such as lexical databases, tokenisers, syntactic parsers, MT systems, etc.
  - ...backend software (databases, web servers, etc.)
  - ...end-user software (offline or online GUI)
- Many of them designed by different teams, following heterogeneous approaches, for various purposes...
- What do we integrate at the LIG-GETALP team, and how?
- 2 examples: the OMNIA project and the AXiMAG project
SLIDE 3 Outline
- The KISS approach and REST implementation
- OMNIA project (completed)
- AXiMAG project (reimplementation in progress)
SLIDE 4
The KISS principle
- Stands for: Keep It Simple, Stupid
- Acronym coined by military aircraft manufacturers
- A general design principle, applied to IT
SLIDE 5 KISS for integration: REST architecture
- Stands for: REpresentational State Transfer
- Integration of services through the HTTP protocol
- Client-server (independent of any GUI)
- Stateless (services have nothing to remember)
- Cacheable (for better scalability)
- Layered (redirections permitted without restrictions)
- Service call via:
  - Any web browser
  - Any programming language supporting HTTP requests
  - The command line, with popular utilities like cURL
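As an illustration of calling a stateless service from "any programming language supporting HTTP requests", here is a minimal Python sketch. The `TokeniserStub` service, its `/tokenise` path and its `text` parameter are hypothetical stand-ins, not the actual OMNIA or AXiMAG services; the stub runs locally so that the example is self-contained.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, quote, urlparse


class TokeniserStub(BaseHTTPRequestHandler):
    """Hypothetical stateless service: the request carries everything it needs."""

    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        text = query.get("text", [""])[0]
        body = json.dumps({"tokens": text.split()}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        # Lets intermediate HTTP caches reuse the answer (the "cacheable" constraint).
        self.send_header("Cache-Control", "max-age=3600")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def call_service(port, text):
    """Client side: one plain HTTP GET, no session and no GUI involved."""
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/tokenise?text=" + quote(text))
        return json.loads(conn.getresponse().read().decode("utf-8"))
    finally:
        conn.close()


# Start the stub on a free local port, call it once, then stop it.
server = HTTPServer(("127.0.0.1", 0), TokeniserStub)
threading.Thread(target=server.serve_forever, daemon=True).start()
result = call_service(server.server_address[1], "Iraqi women sit")
server.shutdown()
server.server_close()
print(result)
```

The same request could be issued from a browser address bar or with cURL, since the service only relies on a URL and an HTTP GET.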
SLIDE 6
Lingware integration in the OMNIA project
SLIDE 7 OMNIA project (2008-2010): multilingual information retrieval
- From companion texts (e.g. picture captions) in several languages to interlingual concepts
[Diagram labels: Press, Picasa, etc.; Multimodal analysis; Indexation; Ontology; Requests; Relevant images]
SLIDE 8 OMNIA modular architecture: from text to concepts
4 modules & 4 resources
[Diagram]
- Inputs: Companion texts, NL Requests
- Modules: Lemmatisation, Interlingual annotation, Ambig annotation, Concept annotation
- Resources: Form-lemma dictionary, Lemma-UW dictionary, UW-Concept Map, Ontology
- Output: Concepts
SLIDE 9 Example companion text
AWA05 - 20020924 - BAGHDAD, IRAQ : Iraqi women sit under a portrait of Iraqi President Saddam Hussein in a waiting room in Baghdad's al-Mansur hospital 24 September 2002. Saddam Hussein is doggedly pursuing the development of weapons of mass destruction and will do his best to hide them from UN inspectors, the British government claimed in a 55-page dossier made public just hours before a special House of Commons debate on Iraq. Iraqi Culture Minister Hamad Yussef Hammadi called the British allegations "baseless." EPA PHOTO AFPI AWAD AWAD
SLIDE 10
Lemmatisation process
Text: Iraqi women sitting in a waiting room of Bagdad hospital
Lemmas: (Iraqi, NOUN) (woman, NOUN) (sit, VERB) (in, PREP) (a, DET) (waiting room, NOUN) (of, PREP) (Bagdad, NOUN) (hospital, NOUN)
Graph structure with ambiguities:
[Graph: nodes 40, 41, 42, 43; edge labels a (DET), a (NOUN), wait (VERB), waiting (NOUN), room (NOUN), room (VERB), waiting room (NOUN)]
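A graph with ambiguities of this kind can be sketched as a small lattice in which every path between two nodes is one possible reading. The node numbers and labels below come from the slide; which nodes each edge connects is an assumption, since only the labels are given here.

```python
from collections import defaultdict

# Edges as (start node, end node, lemma, part of speech).
# ASSUMPTION: the topology (which nodes each edge links) is guessed.
EDGES = [
    (40, 41, "a", "DET"),
    (40, 41, "a", "NOUN"),
    (41, 42, "wait", "VERB"),
    (41, 42, "waiting", "NOUN"),
    (42, 43, "room", "NOUN"),
    (42, 43, "room", "VERB"),
    (41, 43, "waiting room", "NOUN"),  # multiword lemma spanning two tokens
]


def readings(edges, start, end):
    """Enumerate every path from start to end as a list of (lemma, pos)."""
    outgoing = defaultdict(list)
    for src, dst, lemma, pos in edges:
        outgoing[src].append((dst, lemma, pos))

    def walk(node):
        if node == end:
            yield []
            return
        for dst, lemma, pos in outgoing.get(node, []):
            for rest in walk(dst):
                yield [(lemma, pos)] + rest

    return list(walk(start))


paths = readings(EDGES, 40, 43)
for path in paths:
    print(path)
```

Keeping all readings in one structure lets later modules (interlingual annotation, disambiguation) decide between them instead of forcing an early choice.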
SLIDE 11 Interlingual lexicon: Universal Words (UW)
- Universal Words (UW)
  - represent word senses (acceptions) without ambiguity
  - headword with meaning restrictions
- Examples:
  – book(icl>thing)
  – book(icl>do, agt>human, obj>thing)
  – ikebana(icl>flower arrangement)
- 200,000 UW++ built from WordNet synsets
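To make the "headword with meaning restrictions" structure concrete, the following sketch splits a UW string into its headword and its list of restrictions. The `parse_uw` helper is hypothetical and only covers the surface syntax visible in the examples above; it is not a full parser for the UW notation.

```python
import re

# headword(restriction, restriction, ...)
UW_PATTERN = re.compile(r"^(?P<head>[^()]+)\((?P<restrictions>.*)\)$")


def parse_uw(uw):
    """Return (headword, [(relation, [value, ...]), ...]) for a UW string."""
    match = UW_PATTERN.match(uw.strip())
    if match is None:
        # No parentheses: treat the whole string as an unrestricted headword.
        return uw.strip(), []
    restrictions = []
    for part in match.group("restrictions").split(","):
        # "icl>room>thing" -> relation "icl", chain ["room", "thing"]
        relation, *chain = [piece.strip() for piece in part.split(">")]
        restrictions.append((relation, chain))
    return match.group("head").strip(), restrictions


print(parse_uw("book(icl>thing)"))
print(parse_uw("book(icl>do, agt>human, obj>thing)"))
print(parse_uw("ikebana(icl>flower arrangement)"))
```

The two `book` entries share a headword and differ only in their restrictions, which is what separates the noun sense from the verb sense.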
SLIDE 12 Interlingual annotation process
[Graph: nodes 40, 41, 42, 43; edge labels a (DET), a (NOUN), waiting (NOUN), waiting (VERB), room (VERB), waiting room (NOUN)]
UWs shown on the graph:
- waiting_room(icl>room>thing,equ>lounge)
- wait(icl>act>occur,obj>thing)
- wait(icl>work>do,obj>thing)
- waiting(icl>inactivity>thing)
- room(icl>dwell>do)
- room(icl>opportunity>thing)
- room(icl>position>thing)
- room(icl>area>thing)
SLIDE 13 Automatic disambiguation process
[Graph: nodes 40, 41, 42, 43; edge labels a (DET), a (NOUN), waiting (NOUN), waiting (VERB), room (VERB), waiting room (NOUN); each UW carries a score]
UW labels: waiting_room(icl>room>thing,equ>lounge), room(icl>dwell>do), wait(icl>act>occur,obj>thing), waiting(icl>inactivity>thing), room(icl>opportunity>thing), wait(icl>work>do,obj>thing), room(icl>opportunity>thing), room(icl>position>thing), room(icl>area>thing)
Scores: 0.0020, 0.0002, 0.0001, 0.0001, 0.0001, 0.0002, 0.0001, 0.0001, 0.0001
Ant algorithm (Schwab & Lafourcade 2007)
SLIDE 14 Ontology for conceptual annotation
- Ontology
  - Domain-dependent
  - Concept hierarchy
  - Interlingual
- Alignment with interlingual lexicon
  - Manual or automatic (Rouquet & Nguyen, 2009)
[Figure: Ontology (excerpt), with concepts WOMEN, PRESIDENT, MINISTER, HOSPITAL, HOUSE, PEOPLE, POLITICS, BUILDING, RESIDENTIAL BUILDING]
SLIDE 15 Extracted concepts
Concept               Score
RESIDENTIAL BUILDING  0.0002
PRESIDENT             0.0004
BUILDING              0.0004
HOUSE                 0.0002
HOSPITAL              0.0002
POLITICS              0.0044
MINISTER              0.0040
WOMAN                 0.0002
PEOPLE                0.0146
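One way such scored concepts could be consumed downstream is by ranking them, for example to keep only the strongest ones as index terms. This is a sketch under that assumption: the scores are the ones listed above, while the `top_concepts` helper and the cut-off of three are hypothetical and not taken from OMNIA.

```python
# Scores copied from the extracted-concepts table.
CONCEPT_SCORES = {
    "RESIDENTIAL BUILDING": 0.0002,
    "PRESIDENT": 0.0004,
    "BUILDING": 0.0004,
    "HOUSE": 0.0002,
    "HOSPITAL": 0.0002,
    "POLITICS": 0.0044,
    "MINISTER": 0.0040,
    "WOMAN": 0.0002,
    "PEOPLE": 0.0146,
}


def top_concepts(scores, n):
    """Return the n highest-scored concepts, best first (hypothetical helper)."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [concept for concept, _ in ranked[:n]]


best = top_concepts(CONCEPT_SCORES, 3)
print(best)
```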
SLIDE 16
OMNIA demo
SLIDE 17
Lingware & software integration in the AXiMAG project
SLIDE 18
AXiMAG version 2 project (2011): collaborative website translation
- AXiMAG: collaborative website translation widget
- Sectraw: translation memory manager
- Tradoh: MultiMT system (Systran, Google, Reverso, Sistec)
SLIDE 19
AXiMAG demo
SLIDE 20 Design issue in service-oriented architecture
- Request building & request processing
  - May lead to denial of service (DoS)
- Means to avoid this issue
  - Scope of the service
  - Internal cache
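The two safeguards listed above can be sketched as follows, presumably on the reasoning that a request is cheap to build but may be expensive to process. The size limit, the `handle_request` entry point and the stand-in processing function are all hypothetical; the cache uses Python's standard `functools.lru_cache`.

```python
from functools import lru_cache

MAX_CHARS = 2000  # hypothetical limit on the scope of the service
calls = {"count": 0}  # counts how often the expensive step really runs


@lru_cache(maxsize=1024)
def _process(text):
    """Stand-in for an expensive lingware call (e.g. analysis or MT)."""
    calls["count"] += 1
    return tuple(text.lower().split())


def handle_request(text):
    """Reject out-of-scope requests, then answer from the internal cache."""
    if len(text) > MAX_CHARS:
        raise ValueError("request exceeds the scope of this service")
    return _process(text)


first = handle_request("Iraqi women sit in a waiting room")
second = handle_request("Iraqi women sit in a waiting room")
print(first == second, calls["count"])
```

Restricting the scope bounds the cost of any single request, while the cache keeps repeated identical requests from triggering the expensive processing again.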