Multi Language Support for Virtual Assistants



SLIDE 1

가상 어시스턴트를 위한 다국어 지원

April 2020

Soporte multilenguaje para asistentes virtuales

对虚拟助手的多语言支持

Supporto multilingue per assistenti virtuali

پشتیبانی چند زبانه برای دستیاران مجازی

Prise en charge multilingue pour les assistants virtuels

Suporte em vários idiomas para assistentes virtuais

Multi Language Support for Virtual Assistants

वर्चुअल असिस्टेंट के लिए मल्टी लैंग्वेज सपोर्ट

仮想アシスタントの多言語サポート

SLIDE 4

Overview

Goals:

  • Extending the current capabilities of Almond to other languages in a cost- and time-efficient manner
  • Avoiding template development for each new language

Solution:

Data collection strategy:

  • Using neural machine translation models to produce translated sentences
  • Improving translation quality using domain-dependent rules

Training strategies:

  • Joint and sequential training
  • Enforcing low variance on encoded outputs of the same sentence from different languages

SLIDE 14

Data Collection method

English Dataset:

  Sentence: display all review descriptions authored by Jennifer .
  Program: now => [description] of (@restaurant.review, author == " Jennifer ") => notify

Pre-Processing

Pre-processing rules:

  • Detokenize punctuation
  • Replace NUMBER with actual values
  • Lower case all parameter values

Neural Machine Translation Model (e.g. Google Translate):

  muestra todas las descripciones de las reseñas creadas por " Jennifer ".

Post-Processing (Parameter Matching)

Post-processing rules:

  • Replace verbs with their imperative form
  • Insert missing prepositions
  • Replace translated parameter values with real values from target language …

Feedback Collection & Rule Generation

Dataset in target language:

  Sentence: muestra todas las descripciones de las reseñas escritas por juan .
  Program: now => [description] of (@restaurant.review, author == " juan ") => notify
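
The pre- and post-processing steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the choice to wrap the parameter in quotes so it can be found again after translation, and the single rewrite rule ("creadas por" to "escritas por", taken from the difference between the two Spanish sentences on the slide) are all assumptions. The translation call itself is left out.

```python
import re

PUNCT = r"[.,!?;:]"

def detokenize_punctuation(sentence):
    """Remove the space the dataset tokenizer leaves before punctuation."""
    return re.sub(r"\s+(" + PUNCT + ")", r"\1", sentence)

def preprocess(sentence, param):
    """Prepare an English sentence for the translation model.

    Assumption: the parameter value is wrapped in spaced quotes so that
    post-processing can locate it in the translated sentence.
    """
    quoted = sentence.replace(param, '" ' + param + ' "')
    return detokenize_punctuation(quoted)

def postprocess(translated, program, param, target_value, rules):
    """Apply rewrite rules, then match and replace the parameter value.

    rules is a list of (regex pattern, replacement) pairs; in this sketch
    they stand in for rules produced by feedback collection.
    """
    sentence = translated
    for pattern, replacement in rules:
        sentence = re.sub(pattern, replacement, sentence)
    # Parameter matching: swap the quoted English value for a
    # value that is natural in the target language.
    quoted = '" ' + param + ' "'
    sentence = sentence.replace(quoted, target_value)
    # Re-tokenize punctuation so the sentence matches the dataset format.
    sentence = re.sub(r"\s*(" + PUNCT + ")", r" \1", sentence)
    # Keep the program consistent with the new parameter value.
    program = program.replace(quoted, '" ' + target_value + ' "')
    return sentence, program

# Hypothetical rule inferred from the slide's example.
EXAMPLE_RULES = [(r"\bcreadas por\b", "escritas por")]
```

Calling `preprocess` on the English sentence yields the detokenized, quoted input for the translation model; calling `postprocess` on the translated sentence with `EXAMPLE_RULES` and the value `juan` yields the sentence and program pair shown as the target-language dataset entry.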

SLIDE 22

Naive Training

Input sentences:

  • display all review descriptions authored by Jennifer .
  • muestra todas las descripciones de las reseñas creadas por Jennifer .
  • 显示Jennifer撰写的所有评论描述。

Flow: sentence → Encoder → Decoder → program, trained with the Decoder Loss

Program: now => [description] of (@restaurant.review, author == " Jennifer ") => notify

We are not using the “knowledge” that these sentences are semantically equivalent
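
As a rough sketch of what "naive" means here: each (sentence, program) pair contributes its own decoder loss, and nothing ties the three translations together. The code below assumes the decoder loss is a mean negative log-likelihood over the gold program tokens, which is a common choice but is not stated on the slide; the function names are hypothetical.

```python
import math

def decoder_loss(target_token_probs):
    """Mean negative log-likelihood of the gold program tokens.

    target_token_probs holds the probability the decoder assigned to
    each correct program token (assumed loss form, not from the slide).
    """
    return -sum(math.log(p) for p in target_token_probs) / len(target_token_probs)

def naive_loss(examples):
    """Average decoder loss over examples, each treated independently.

    The English, Spanish and Chinese versions of a sentence are just
    three unrelated entries in this list.
    """
    return sum(decoder_loss(probs) for probs in examples) / len(examples)
```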

SLIDE 30

Training with sentence batching

Input sentences:

  • display all review descriptions authored by Jennifer .
  • muestra todas las descripciones de las reseñas creadas por Jennifer .
  • 显示Jennifer撰写的所有评论描述。

Flow: sentences → Batching → Encoder → Decoder → program

Losses: Encoder Loss, Decoder Loss

Program: now => [description] of (@restaurant.review, author == " Jennifer ") => notify

We now use both losses to guide the training
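
One way to read the encoder loss is as a variance penalty over the encodings of the same sentence in different languages, added to the decoder loss. The sketch below makes that concrete under stated assumptions: the per-dimension population variance, the averaging over dimensions, and the `weight` parameter are illustrative choices, not details given in the slides.

```python
def encoder_variance_loss(encodings):
    """Mean per-dimension variance across the encodings of one sentence.

    encodings is a list of equal-length vectors, one per language, all
    encoding the same underlying sentence. The loss is zero when every
    language maps to the same vector.
    """
    n = len(encodings)
    dims = len(encodings[0])
    total = 0.0
    for d in range(dims):
        values = [vec[d] for vec in encodings]
        mean = sum(values) / n
        total += sum((v - mean) ** 2 for v in values) / n
    return total / dims

def combined_loss(decoder_loss_value, encodings, weight=1.0):
    """Decoder loss plus a weighted encoder loss (weight is hypothetical)."""
    return decoder_loss_value + weight * encoder_variance_loss(encodings)
```

Batching the translations of a sentence together is what makes this penalty computable: the encoder outputs for all languages are available in the same training step.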

SLIDE 31

Experiment results (Farsi)

[Bar chart: Exact Match Accuracy (axis from 0% to 90%) for Translated, Verified, New Params, and Test]
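
Exact match accuracy counts a prediction as correct only if the whole generated program equals the gold program. A minimal sketch, assuming programs are compared as whitespace-separated token sequences (the normalization is an assumption; the function name is hypothetical):

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predicted programs identical to the gold program.

    Programs are compared token by token after splitting on whitespace,
    so differences in spacing alone do not count as errors.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(p.split() == r.split() for p, r in zip(predictions, references))
    return matches / len(references)
```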

SLIDE 34

Challenges

  • Google Translate is not perfect
  • Identifying language-specific traits (singular/plural, missing prepositions, ...)
  • Closing the gap between evaluation accuracy and test (real data) accuracy
  • Automating and improving collection of natural parameter values for each language
  • ...

Bonus:

  • Starter code is available free of charge!
  • 18/6 project technical support
  • Optional happy hours to celebrate our results
  • Will be featured as a contributor in our EMNLP paper