Papillon Project Mathieu Mangeot & David Thevenin Work done at - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Online Generic Editing of Heterogeneous Dictionary Entries in Papillon Project

Mathieu Mangeot & David Thevenin
Work done at NII, Tokyo, Japan
Now looking for a position...

slide-2
SLIDE 2

My Motivation

  • Dictionaries are a Key Element of almost every NLP System
  • But Construction Costs are Heavy
  • => Lowering the Costs by Facilitating the Construction & Maintenance:
  • Building Dedicated Environments
  • Pooling Resources by Reusing Existing Ones
  • Development by Voluntary Contributors
  • Resulting Data Publicly Available
slide-3
SLIDE 3

Outline

  • The Situation: Manipulation of XML Dictionaries with Heterogeneous Entry Structures

  • The Problem: How to Edit them Online?
  • Our Solution: Using an HMI Tool
  • 2 Examples: Papillon & GDEF Dicts
  • Conclusion and Future Work
slide-4
SLIDE 4

Papillon Platform

http://www.papillon-dictionary.org

[Diagram: dictionaries imported into the online server (GDEF, WaDoKu, FeM, JMDict, SAIKAM, DiCo, Cedict, VietDict, Ding) alongside the Papillon dictionary; the user browses the server.]

slide-5
SLIDE 5

Papillon Platform

[Diagram: dictionaries imported into the online server (GDEF, WaDoKu, FeM, JMDict, SAIKAM, DiCo, Cedict, VietDict, Ding) alongside the Papillon dictionary; the contributor edits and the specialist checks.]

slide-6
SLIDE 6

Outline

  • The Situation: Manipulation of XML Dictionaries with Heterogeneous Entry Structures

  • The Problem: How to Edit them Online?
  • Our Solution: Using an HMI Tool
  • 2 Examples: Papillon & GDEF Dicts
  • Conclusion and Future Work
slide-7
SLIDE 7

Requirements for Editing

  • Editor Available Online
  • Heterogeneous Entry Structures
  • Adaptive Interfaces
  • To the User (Novice, Specialist)
  • To the Platform (PDA, Workstation)
slide-8
SLIDE 8

The Best: Ad Hoc Editor

slide-9
SLIDE 9

Drawbacks

  • Ad Hoc for a Particular Structure
  • Must be Reimplemented if the Entry Structure Changes
  • Local and Platform Dependent
  • Users Cannot Contribute Online
slide-10
SLIDE 10

Distributed & Democratic

[Diagram: Database; RTF Files; Conversion with a LISP Program; Distribution to the Lexicographers]

slide-11
SLIDE 11

With Word™!

slide-12
SLIDE 12

Drawbacks

  • Not Usable for Complex Structures
  • One Type of Information Per Line
  • No Complete Syntax Checking
  • Real-Time Editing Not Possible
  • Delay Necessary for Conversion & Transport

slide-13
SLIDE 13

Online: with HTML

slide-14
SLIDE 14

Drawbacks

  • Not Dynamically Adaptable
  • Need to Write One Interface for Each Entry Structure
  • Lack of Interactors
  • Only Buttons, Text Boxes, Check Boxes & Pop-up Menus

slide-15
SLIDE 15

Outline

  • The Situation: Manipulation of XML Dictionaries with Heterogeneous Entry Structures

  • The Problem: How to Edit them Online?
  • Our Solution: Using an HMI Tool
  • 2 Examples: Papillon & GDEF Dicts
  • Conclusion and Future Work
slide-16
SLIDE 16

Our Solution

  • None of the Previous Solutions Satisfies our Requirements
  • An Idea
  • Using HMI Techniques & Tools for Automatically Generating Interfaces
  • Generation Based on the Data Structure and the User Profile

slide-17
SLIDE 17

ArtStudio: a Multitarget Generation Framework

  • Author: David Thevenin

[Diagram: Task, Concept and Instance models; Abstract UI, Concrete UI and Final UI, each Concrete UI and Final UI tied to a Platform, User and Environment; stages labelled initial description, transit description and final description.]

slide-18
SLIDE 18

Our Implementation

Necessary files:

  • Concept Model: XML Schema
  • Instances Model
  • CUI Model

Automatic Generator

Generated UIs:

  • Web/HTML
  • Mobile/WML

slide-19
SLIDE 19

A Simple Entry

entry
    headword: scientifique
    pos: adj
    example: journées scientifiques
    example: journal scientifique

Legend: XML element; link to a child element; link to the element value; textual content

slide-20
SLIDE 20

Concepts Model: an XML Schema

[Diagram: concepts C_entry, C_headword, C_pos, C_list examples and C_example; instances I_entry, I_headword, I_pos and I_list examples (example1, example2, example3); interactors TextBox, PopUp Menu and List.]

Legend: concept; instance; link to the interactor used by the concept; link to a child concept; link to the instance
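As a rough illustration of how a concepts model can tie each concept to an interactor, here is a minimal Python sketch. The class name Concept, the "Group" label and the exact pairing (headword with TextBox, pos with PopUp Menu, examples with List) are assumptions read off the diagram, not the format used by the actual generator.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    """One node of a concepts model: an element name and the interactor that edits it."""
    name: str
    interactor: str  # hypothetical labels: "TextBox", "PopUpMenu", "List", "Group"
    children: list = field(default_factory=list)

# Assumed concepts model for the simple entry (pairings are guesses from the slide).
C_ENTRY = Concept("entry", "Group", [
    Concept("headword", "TextBox"),
    Concept("pos", "PopUpMenu"),
    Concept("examples", "List", [Concept("example", "TextBox")]),
])

def interactors(concept, depth=0):
    """Yield (depth, name, interactor) for a concept and its descendants, depth-first."""
    yield depth, concept.name, concept.interactor
    for child in concept.children:
        yield from interactors(child, depth + 1)

for depth, name, widget in interactors(C_ENTRY):
    print("  " * depth + f"{name}: {widget}")
```

Walking the tree this way gives a generator everything it needs to pick a widget per element without knowing the entry structure in advance.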

slide-21
SLIDE 21

Instances Model

I_entry
    I_headword: scientifique
    I_pos: adj
    I_examples list
        I_example: journées scientifiques
        I_example: journal scientifique

<entry>
  <hv>scientifique</hv>
  <pos>adj</pos>
  <ex>journées scientifiques</ex>
  <ex>journal scientifique</ex>
</entry>

Fragment held by the I_examples list: <ex>journées scientifiques</ex> <ex>journal scientifique</ex>

Legend: instance; link to the instance value; link to a child instance
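To make the instances model concrete, the entry shown on this slide can be read into plain values with Python's standard xml.etree.ElementTree module. This is only a sketch: the function name read_instances and the dictionary keys are illustrative, and only the element names hv, pos and ex come from the slide.

```python
import xml.etree.ElementTree as ET

# The entry from the slide, written as one XML string.
ENTRY_XML = (
    "<entry><hv>scientifique</hv><pos>adj</pos>"
    "<ex>journées scientifiques</ex><ex>journal scientifique</ex></entry>"
)

def read_instances(xml_text):
    """Return the instance values of one entry as a plain dict."""
    entry = ET.fromstring(xml_text)
    return {
        "headword": entry.findtext("hv"),
        "pos": entry.findtext("pos"),
        # Repeated <ex> elements become one list instance.
        "examples": [ex.text for ex in entry.findall("ex")],
    }

instances = read_instances(ENTRY_XML)
print(instances)
```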

slide-22
SLIDE 22

CUI Model

  • XML Document
  • Describes the Graphical User Interface
  • Interactors and their Position
  • Target-Dependent
  • One Model for each Target:
  • Editing, Visualization, Mobile Phone
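The idea of one CUI model per target can be sketched as follows. The XML vocabulary (cui, interactor, concept, type, options), the "Label" interactor and the part-of-speech options are all hypothetical; the slides do not show the real CUI format. The sketch only illustrates that the same entry values yield different markup depending on which model is selected.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical CUI models: one small XML document per target.
CUI_MODELS = {
    "edit": '<cui><interactor concept="headword" type="TextBox"/>'
            '<interactor concept="pos" type="PopUpMenu" options="adj n v"/></cui>',
    "mobile": '<cui><interactor concept="headword" type="Label"/></cui>',
}

def render(target, values):
    """Render the values with the interactors listed in the target's CUI model."""
    parts = []
    for it in ET.fromstring(CUI_MODELS[target]):
        name, kind = it.get("concept"), it.get("type")
        value = escape(values.get(name, ""))
        if kind == "TextBox":
            parts.append(f'<input type="text" name="{name}" value="{value}">')
        elif kind == "PopUpMenu":
            opts = "".join(
                f'<option{" selected" if o == values.get(name) else ""}>{escape(o)}</option>'
                for o in it.get("options", "").split()
            )
            parts.append(f'<select name="{name}">{opts}</select>')
        else:  # "Label": read-only text, e.g. a paragraph inside a WML card
            parts.append(f"<p>{value}</p>")
    return "\n".join(parts)

entry = {"headword": "scientifique", "pos": "adj"}
print(render("edit", entry))
print(render("mobile", entry))
```

Adding a target then means writing a new model document, with no change to the rendering code for interactors that already exist.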

slide-23
SLIDE 23

Outline

  • The Situation: Manipulation of XML Dictionaries with Heterogeneous Entry Structures

  • The Problem: How to Edit them Online?
  • Our Solution: Using an HMI Tool
  • 2 Examples: Papillon & GDEF Dicts
  • Conclusion and Future Work
slide-24
SLIDE 24

Papillon Dictionary

  • Multilingual Dictionary with a Pivot Structure
  • Monolingual Entries linked to a Pivot Volume
  • Microstructure based on the Meaning-Text Theory
  • Very Complex: semantic formula, government pattern, lexical functions, etc.
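A pivot structure can be pictured as a table of links, each grouping monolingual entries that share a meaning. The sketch below uses invented identifiers and toy data; it does not reflect the real Papillon schema, only the lookup pattern that a pivot volume makes possible.

```python
# Hypothetical toy data: each pivot link groups monolingual entry ids by language.
PIVOT = {
    "p1": {
        "fra": ["fra.scientifique.1"],
        "eng": ["eng.scientific.1"],
        "deu": ["deu.wissenschaftlich.1"],
    },
}

def translations(entry_id, source, target):
    """Return target-language entries that share a pivot link with entry_id."""
    found = []
    for link in PIVOT.values():
        if entry_id in link.get(source, []):
            found.extend(link.get(target, []))
    return found

print(translations("fra.scientifique.1", "fra", "eng"))
```

With this layout, adding a language means linking its entries to the pivot once, rather than building a separate bilingual table for every language pair.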

slide-25
SLIDE 25

Editing Interface

slide-26
SLIDE 26

Other Views

Consultation: Mobile Phone:

slide-27
SLIDE 27

GDEF Dictionary

  • Bilingual Estonian-French Dictionary
  • Project Leader: Antoine Chalvin, INALCO, Paris
  • Microstructure based on the Lemma
slide-28
SLIDE 28

Editing Interface

slide-29
SLIDE 29

Outline

  • The Situation: Manipulation of XML Dictionaries with Heterogeneous Entry Structures

  • The Problem: How to Edit them Online?
  • Our Solution: Using an HMI Tool
  • 2 Examples: Papillon & GDEF Dicts
  • Conclusion and Future Work
slide-30
SLIDE 30

Conclusion

  • Innovative Solution
  • Generic: Multi-Dictionaries
  • Efficient: already 152 entries for GDEF Dict (2 people, 2 months)
  • Multitarget: Editing, Consultation, Mobile Phone
  • Multipurpose: can be adapted for other types of data
slide-31
SLIDE 31

Future Work

  • To Find a Position!
  • Implementing more Features of the XML Schemata:
  • Basic Types: boolean, date, etc.
  • Complex Structures: choice, etc.
  • Automating the Process:
  • Generation of the Interface Model from the XML Schema
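One way the last item could work is to walk the XML Schema and pick an interactor from each element's declared type. The sketch below is an assumption about how such a generator might look, not the authors' design: the widget names, the mapping table and the sample schema (including the hypothetical "checked" element) are invented for illustration, and it assumes the schema uses the "xs" prefix in its type attributes.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema for a small entry; "checked" is an invented boolean field.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="entry">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="hv" type="xs:string"/>
        <xs:element name="checked" type="xs:boolean"/>
        <xs:element name="ex" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Assumed mapping from basic schema types to interactor names.
WIDGETS = {"xs:string": "TextBox", "xs:boolean": "CheckBox", "xs:date": "DateField"}

def interface_model(schema_text):
    """Return (element name, interactor) pairs derived from typed schema elements."""
    root = ET.fromstring(schema_text)
    model = []
    for el in root.iter(XS + "element"):
        xsd_type = el.get("type")
        if xsd_type is None:
            continue  # container element: no interactor of its own
        widget = WIDGETS.get(xsd_type, "TextBox")  # fall back to free text
        if el.get("maxOccurs") == "unbounded":
            widget = f"List of {widget}"  # repeated elements get a list interactor
        model.append((el.get("name"), widget))
    return model

print(interface_model(SCHEMA))
```

Complex structures such as choice would need extra rules on top of this type lookup.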