SLIDE 1

A Pattern-Based Machine Translation System — Yakushite Net MT Engine

Miki Sasaki and Toshiki Murata

Oki Electric Industry Co., Ltd. 2-5-7 Hommachi, Chuo-ku, Osaka 541-0053, JAPAN {sasaki234, murata656}@oki.com

SLIDE 2

Machine Translation by OKI inc.

• Rule-based MT -> Pattern-based MT
  • Rule-based MT (PENSEE, 1980s ~ 1990s)
  • Pattern-based MT (implemented in Java, 1997 ~ )
  • Collaborative translation environment (Yakushite Net, 2001 ~ )
• Pattern-based MT method
  • All the knowledge needed for translation is treated as translation patterns
  • Grammars and word dictionaries can be registered in our system in the same way, because both are treated as translation patterns

SLIDE 3

Yakushite Net

• Pattern-based MT
• Collaborative translation environment
  • Users collaborate to improve the translation accuracy
• To improve the translation accuracy:
  • Our system has various communities
  • Each community has a dictionary
  • Users register dictionary data in the dictionaries of relevant communities

SLIDE 4

Structure of Communities

[Figure: communities arranged in a tree structure under Root. Communities shown: technology, science, hobby, computer, electronics, hardware, software, programming, java, perl, and others. Root has the General dictionary; each community has its own dictionary (science dictionary, hobby dictionary, technology dictionary, computer dictionary, electronics dictionary, hardware dictionary, software dictionary, programming dictionary, java dictionary, perl dictionary).]

SLIDE 7

Technologies in Yakushite Net

• Automatic dictionary acquisition
• Determination of dictionaries, texts and communities
• Multilingual processing

SLIDE 8

Architecture of Our System

[Figure: translation engine architecture. Input: source sentences. Output: translated sentences. Processing components: morphological analyzer, parser/generator, post generator, morphological synthesizer. Dictionaries: system dictionary, user dictionary, general dictionary, failure recovery dictionary.]

The sentence is parsed using the translation patterns in the dictionaries

SLIDE 9

Translation Patterns

• Rules of Context-free Grammar (CFG) are paired
  • CFG is a formal grammar in which every production rule is of the form “V -> w”
• Examples of CFG rules
    Japanese : S -> SIntr
    English  : S -> SIntr ?
• Examples of translation patterns
    [ja:S [1:SIntr:*] ] [en:S [1:SIntr:*] ?:pos=punc];
  • The mandatory numerical index allows elements between source and target patterns to be related
  • Source language patterns are used for analysis.
    (In Japanese-English translation, “ja” is the source language and “en” is the target language)

[Figure: paired trees for the pattern above, ja:S over SIntr and en:S over SIntr “?”]
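A translation pattern can be pictured as a pair of CFG right-hand sides whose non-terminals are linked by their numerical index. The following is a minimal sketch of that idea only: the names `Slot`, `Pattern`, `linked_slots` and `generate` are illustrative assumptions, not part of the Yakushite Net engine, and features such as `pos=punc` and the `*` marker are left out.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Slot:
    """An indexed non-terminal such as [1:SIntr:*] (features omitted)."""
    index: int
    category: str


Item = Union[Slot, str]  # a plain str stands for a literal word


@dataclass(frozen=True)
class Pattern:
    """A source rule and a target rule registered together."""
    source_cat: str
    source: Tuple[Item, ...]
    target_cat: str
    target: Tuple[Item, ...]


def linked_slots(pattern):
    """Pair each target slot with the source slot that shares its index."""
    by_index = {s.index: s for s in pattern.source if isinstance(s, Slot)}
    return [(by_index[t.index], t) for t in pattern.target if isinstance(t, Slot)]


def generate(pattern, translated):
    """Build target words; `translated` maps a slot index to target words."""
    words = []
    for item in pattern.target:
        if isinstance(item, Slot):
            words.extend(translated[item.index])
        else:
            words.append(item)
    return words


# [ja:S [1:SIntr:*] ] [en:S [1:SIntr:*] ?:pos=punc];
s_pattern = Pattern("S", (Slot(1, "SIntr"),), "S", (Slot(1, "SIntr"), "?"))
print(linked_slots(s_pattern))
print(generate(s_pattern, {1: ["where", "do", "he", "go"]}))
```

Keeping the source and target sides in one record is what lets grammar rules and word entries be registered in the same way: a word entry is simply a pattern whose sides contain only literal words.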

SLIDE 10

Parsing and generating method

[Figure: source and target parse trees, each rooted at S]

SLIDE 20

Parsing and generating method

• Word sequences are reduced to the root of a parse tree (“S”) by applying patterns
• When the word sequences reach “S”, the source parse tree is completed
  • Each node is converted using the corresponding target language pattern
• Generation of the target parse tree is carried out immediately after the source parse tree is completed
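The two phases described above (reduce the words to “S”, then convert each node with its target pattern) can be sketched as follows. This is an illustration under stated assumptions, not the engine's actual algorithm: the patterns are simplified forms of the ones given for 「彼はどこに行くか」 in the Example of Translation slides, all features are dropped, and the backtracking search as well as the names `PATTERNS`, `Node`, `parse` and `generate` are choices made for this sketch.

```python
from typing import NamedTuple

# (source category, source RHS, target RHS); an RHS item is a literal
# word (str) or an indexed slot (index, category).
PATTERNS = [
    ("NP",      ["彼"],               ["he"]),
    ("NPJoshi", [(2, "NP"), "は"],    [(2, "NP")]),
    ("FsIntr",  ["どこに"],            ["where"]),
    ("VP",      ["行く"],              ["go"]),
    ("VP",      [(1, "VP"), "か"],    [(1, "VP")]),
    ("SIntr",   [(2, "NPJoshi"), (1, "FsIntr"), (3, "VP")],
                [(1, "AdvIntr"), "do", (2, "NP"), (3, "VP")]),
    ("S",       [(1, "SIntr")],       [(1, "SIntr"), "?"]),
]


class Node(NamedTuple):
    category: str
    pattern: int      # index into PATTERNS
    children: tuple   # ((slot index, Node), ...)


def matches(item, node):
    if isinstance(item, str):                 # literal word
        return node == item
    return isinstance(node, Node) and node.category == item[1]


def parse(tokens):
    """Reduce the token sequence to a single S node, backtracking on dead ends."""
    seen = set()

    def search(seq):
        if len(seq) == 1 and isinstance(seq[0], Node) and seq[0].category == "S":
            return seq[0]
        if seq in seen:
            return None
        seen.add(seq)
        for p, (cat, src, _) in enumerate(PATTERNS):
            for i in range(len(seq) - len(src) + 1):
                window = seq[i:i + len(src)]
                if all(matches(it, nd) for it, nd in zip(src, window)):
                    kids = tuple((it[0], nd) for it, nd in zip(src, window)
                                 if not isinstance(it, str))
                    reduced = seq[:i] + (Node(cat, p, kids),) + seq[i + len(src):]
                    found = search(reduced)
                    if found is not None:
                        return found
        return None

    return search(tuple(tokens))


def generate(node):
    """Convert each source node using the target side of its pattern."""
    target = PATTERNS[node.pattern][2]
    kids = dict(node.children)
    words = []
    for item in target:
        if isinstance(item, str):
            words.append(item)
        else:
            words.extend(generate(kids[item[0]]))
    return words


tree = parse(["彼", "は", "どこに", "行く", "か"])
print(generate(tree))
```

The target side is produced in base form (“do” rather than “does”); inflection is assumed to be left to a later stage.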

SLIDE 21

Priority Control of Translation

• A parse tree
  • prioritized by a combination of criteria (e.g. the number of selected patterns)
• A translation pattern
  • prioritized with a priority control mark
• Failure Recovery Dictionary
  • becomes active only when the normal parsing process has failed
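The interaction of these three mechanisms might look like the sketch below. Everything in it is an assumption made for illustration: the slides do not say whether fewer or more patterns is preferred, how marked patterns are weighed, or how the recovery dictionary is invoked. Here fewer patterns wins, marked patterns break ties, and the recovery parser is called only when the normal one returns nothing.

```python
def choose_parse(candidates):
    """Pick one candidate parse tree.

    Assumed criteria: fewer applied patterns first, then more patterns
    carrying a priority control mark.
    """
    return min(candidates, key=lambda c: (c["patterns"], -c["marked"]))


def translate(sentence, normal_parser, recovery_parser):
    """Use the failure recovery dictionary only if normal parsing fails."""
    candidates = normal_parser(sentence)
    if not candidates:
        candidates = recovery_parser(sentence)
    return choose_parse(candidates) if candidates else None


# Toy stand-ins for the two parsers (hypothetical data).
def normal_parser(sentence):
    if sentence == "known":
        return [
            {"tree": "A", "patterns": 5, "marked": 0},
            {"tree": "B", "patterns": 3, "marked": 1},
            {"tree": "C", "patterns": 3, "marked": 0},
        ]
    return []


def recovery_parser(sentence):
    return [{"tree": "fallback", "patterns": 1, "marked": 0}]


print(translate("known", normal_parser, recovery_parser)["tree"])
print(translate("unknown", normal_parser, recovery_parser)["tree"])
```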

SLIDE 22

The Results for IWSLT2005

• Description of the planned training methods
• Results
  • Performance for training data
  • Result for test data
• Examples of registered translation patterns and translation results

SLIDE 23

Description of the Planned Training Methods

• The existing patterns did not cover many of the expressions seen in BTEC
• We manually made translation patterns that are highly generalized
  1. We manually extracted frequently used expressions in the IWSLT05 training corpus
  2. We patternized those expressions and gave them appropriate translations
  3. We made corrections to the existing patterns
  4. We registered the new patterns in our system

SLIDE 24

Performance for Training Data (IWSLT04 Test Set)

(1) Before registering new patterns
(2) After registering them
(3) After we extracted the parallel texts with one Japanese sentence from the IWSLT05 training corpus and the IWSLT04 test corpus, and registered them

       BLEU     NIST      WER      PER
(1)    0.1918   6.2283    0.6470   0.5640
(2)    0.2179   6.7882    0.5989   0.5183
(3)    0.7616   12.5216   0.2216   0.1894

SLIDE 25

Result for Test Data (IWSLT05 Test Set)

(1) Before we registered the new patterns
(2) After we registered the new patterns
(3) After we extracted the parallel texts with one Japanese sentence from the IWSLT05 training corpus and the IWSLT04 test corpus, and registered them

       BLEU     NIST      WER      PER
(1)    0.1918   6.3279    0.6749   0.5624
(2)    0.2222   6.8913    0.6314   0.5258
(3)    0.2639   7.3585    0.6066   0.5065
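The size of each step on the IWSLT05 test set is easier to judge as a relative change against condition (1). The sketch below only redoes arithmetic on the BLEU figures reported above; `relative_gain` is a hypothetical helper name.

```python
# BLEU on the IWSLT05 test set, copied from the results above.
bleu_test = {"(1)": 0.1918, "(2)": 0.2222, "(3)": 0.2639}


def relative_gain(before, after):
    """Relative change of `after` with respect to `before`."""
    return (after - before) / before


for label in ("(2)", "(3)"):
    gain = relative_gain(bleu_test["(1)"], bleu_test[label])
    print(f"{label} vs (1): {gain:+.1%}")
```

The much larger jump for condition (3) on the IWSLT04 test set is consistent with the fact that sentences from that same corpus were registered as patterns, so that figure is probably better read as a coverage check than as a measure of generalization.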

SLIDE 26

Examples of Registered Translation Pattern and Translation Results (1/2)

IWSLT05_JE_training:

Japanese : ボール (booru) を (wo) よく (yoku) 見 (mi) て (te) 。
Translation result (1) : You see a ball well and.
English : Watch your ball carefully.

Japanese : つかまえ (tsukamae) て (te) 。
Translation result (1) : It catches it and.
English : Catch him.

Extracted expression:
  • The te form of a verb (the conjugated form that leads declinable words) followed by the particle "te (て)" or "de (で)" forms an imperative.

SLIDE 27

Examples of Registered Translation Pattern and Translation Results (2/2)

Registered translation pattern:

![ja:SImp [1:VP:*:inf=ry:pos=ds] て :pos=sj] [en:SImp [1:VP:*:conjug=bare] ];

IWSLT05_JE_TESTSET:

Japanese : 警察 (keisatsu) を (wo) 呼ん (yon) で (de) 。
Translation result (1) : It calls police and.
Translation result (3) : Call police.

Japanese : 芝生 (shibahu) に (ni) 入ら (haira) ない (nai) で (de) 。
Translation result (1) : It does not enter a lawn and.
Translation result (3) : Do not enter the lawn.
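The registered pattern restricts its VP slot with `key=value` features (`inf=ry`, `pos=ds`), so it applies only to verb phrases in the right conjugated form. A minimal sketch of such a feature check is given below. The functions `parse_element` and `satisfies` and the node layout are assumptions for illustration; the feature values are treated as opaque strings, the `*` marker is ignored, and `inf=sh` is a made-up value used only to show a non-matching case.

```python
def parse_element(text):
    """Split '1:VP:*:inf=ry:pos=ds' into (index, category, features)."""
    index, category, *rest = text.split(":")
    features = dict(part.split("=", 1) for part in rest if "=" in part)
    return int(index), category, features


def satisfies(node, category, features):
    """True if the node has the category and every required feature value."""
    return node["cat"] == category and all(
        node["feats"].get(key) == value for key, value in features.items()
    )


index, category, features = parse_element("1:VP:*:inf=ry:pos=ds")

te_form_vp = {"cat": "VP", "feats": {"inf": "ry", "pos": "ds"}}
other_vp = {"cat": "VP", "feats": {"inf": "sh", "pos": "ds"}}  # hypothetical value

print(satisfies(te_form_vp, category, features))
print(satisfies(other_vp, category, features))
```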

SLIDE 28

Conclusion

• We presented our pattern-based MT method
  • It enables easier registration of phrasal expressions and grammatical knowledge
• We described how we dealt with the task
  • We dealt with the task mainly manually
• Future study
  • Adoption of an automatic dictionary acquisition technology

SLIDE 29

Example of Translation (1/3)

Japanese : 「彼はどこに行くか」
English : “Where does he go?”

[ja:VP 行く :*:pos=ds] [en:VP go:*:pos=v];
[ja:VP:jSentenceType=interrogative [1:VP:*] か :pos=ej] [en:VP [1:VP:*]];

[Figure: source parse tree fragment with nodes VP, か, VP, 行く]

SLIDE 30

Example of Translation (2/3)

Japanese : 「彼はどこに行くか」
English : “Where does he go?”

[ja:NP:personNum=3sg 彼 :*:pos=ms] [en:NP he:*:pos=prn];
[ja:NPJoshi:case=subj [2:NP:*] は :pos=fj] [en:NP [2:NP:*:case=subj] ];
[ja:FsIntr どこに :*:pos=fs] [en:AdvIntr where:*:pos=adv];

[Figure: source parse tree fragments with nodes NPJoshi, NP, は, 彼 and FsIntr, どこに]

SLIDE 31

Example of Translation (3/3)

Japanese : 「彼はどこに行くか」
English : “Where does he go?”

[ja:SIntr [2:NPJoshi:case=subj:personNum=3sg] [1:FsIntr] [3:VP:* :jSentenceType=interrogative] ] [en:SIntr [1:AdvIntr] do:pos=v:personNum=3sg [2:NP] [3:VP:*] ];
[ja:S [1:SIntr:*] ] [en:S [1:SIntr:*] ?:pos=punc];

[Figure: completed source tree (S over SIntr over NPJoshi, FsIntr, VP) and target tree (S over SIntr and “?”, SIntr over AdvIntr, “do”, NP, VP), with target leaves “where” “do” “he” “go”]
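The target tree ends in the base-form leaves “where” “do” “he” “go”, while the expected English is “Where does he go?”. The slides do not describe how the surface form is produced; the sketch below shows one plausible final step, in which the `personNum=3sg` feature on “do” drives inflection. The table `IRREGULAR_3SG` and the functions `inflect` and `surface` are assumptions for illustration and do not describe the engine's morphological synthesizer.

```python
# A few irregular third-person-singular forms; regular verbs get "s" here,
# which is a deliberate simplification.
IRREGULAR_3SG = {"do": "does", "have": "has", "be": "is"}


def inflect(word, feats):
    """Inflect a verb leaf that carries personNum=3sg; leave others as-is."""
    if feats.get("pos") == "v" and feats.get("personNum") == "3sg":
        return IRREGULAR_3SG.get(word, word + "s")
    return word


def surface(leaves):
    """Join inflected leaves, attach punctuation, capitalize the first letter."""
    text = ""
    for word, feats in leaves:
        form = inflect(word, feats)
        if not text or feats.get("pos") == "punc":
            text += form
        else:
            text += " " + form
    return text[:1].upper() + text[1:]


# Leaves of the target tree with the features given in the patterns.
leaves = [
    ("where", {"pos": "adv"}),
    ("do", {"pos": "v", "personNum": "3sg"}),
    ("he", {"pos": "prn"}),
    ("go", {"pos": "v"}),
    ("?", {"pos": "punc"}),
]
print(surface(leaves))
```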