Building Ubiquitous and Robust Speech and Natural Language I nterfaces I
Gary Geunbae Lee, Ph.D., Professor
- Dept. CSE, POSTECH
Building Ubiquitous and Robust Speech and Natural Language I - - PowerPoint PPT Presentation
Building Ubiquitous and Robust Speech and Natural Language I nterfaces I Gary Geunbae Lee, Ph.D., Professor Dept. CSE, POSTECH Contents PART-I: Statistical Speech/Language Processing (60min) Natural Language Processing short intro
2 IUI 20 0 7 tutoria l
– Automatic Speech Recognition – (Spoken) Language Understanding
– Spoken Dialog Systems – Dialog Management – Dialog Studio – Information Access Dialog – Emotional & Context-sensitive Chatbot – Multi-modal Dialog – Conversational Text-to-Speech
– Statistical Machine Translation – Phrase-based SMT – Speech Translation
3 IUI 20 0 7 tutoria l
4 IUI 20 0 7 tutoria l
5 IUI 20 0 7 tutoria l
6 IUI 20 0 7 tutoria l
made of donut?
did?
7 IUI 20 0 7 tutoria l
8 IUI 20 0 7 tutoria l
9 IUI 20 0 7 tutoria l
10 IUI 20 0 7 tutoria l
– Natural Language Processing – short intro – Automatic Speech Recognition – (Spoken) Language Understanding
– Spoken Dialog Systems – Dialog Management – Dialog Studio – Information Access Dialog – Emotional & Context-sensitive Chatbot – Multi-modal Dialog – Conversational Text-to-Speech
– Statistical Machine Translation – Phrase-based SMT – Speech Translation
11 IUI 20 0 7 tutoria l
12 IUI 20 0 7 tutoria l
L W∈
L W∈
L W∈
13 IUI 20 0 7 tutoria l
Where is the bus stop ? Speech Signals Word Sequence Wher is the bus stop ?
HMM Estimation G2P LM Estimation
L W∈
14 IUI 20 0 7 tutoria l
I L S A M 일 이 삼 사 I L I S A M S A 삼 사 일 이
Acoustic Model Pronunciation Model Language Model
I I L S A M Word transition P(일|x) P(사|x) P(삼|x) P(이|x) LM is applied S A start end 이 일 사 삼 Between-word transition Intra-word transition
Search Network
15 IUI 20 0 7 tutoria l
16 IUI 20 0 7 tutoria l
17 IUI 20 0 7 tutoria l
– Natural Language Processing – short intro – Automatic Speech Recognition – (Spoken) Language Understanding
– Spoken Dialog Systems – Dialog Management – Dialog Studio – Information Access Dialog – Emotional & Context-sensitive Chatbot – Multi-modal Dialog – Conversational Text-to-Speech
– Statistical Machine Translation – Phrase-based SMT – Speech Translation
18 IUI 20 0 7 tutoria l
ASR Speech SLU SQL Generate Database Text Semantic Frame SQL Response A typical ATIS system (from [Wang et al., 2005])
19 IUI 20 0 7 tutoria l
ShowFlight Subject Flight FLIGHT Departure_City Arrival_City SEA BOS <frame name=‘ShowFlight’ type=‘void’> <slot type=‘Subject’> FLIGHT</slot> <slot type=‘Flight’/> <slot type=‘DCity’>SEA</slot> <slot type=‘ACity’>BOS</slot> </slot> </frame>
Semantic representation on ATIS task; XML format (left) and hierarchical representation (right) [Wang et al., 2005]
20 IUI 20 0 7 tutoria l
21 IUI 20 0 7 tutoria l
2004; Eun et al., 2005; Jeong and Lee, 2006]
22 IUI 20 0 7 tutoria l
Raw data Small Labeled data Model Predict & Estimate Confidence < threshold Active Learning Filter Labeled samples > threshold
yes no
Augmented data
23 IUI 20 0 7 tutoria l
Dialog Act Identification Dialog Act Identification Frame-Slot Extraction Frame-Slot Extraction Relation Extraction Relation Extraction Unification Unification Feature Extraction / Selection Feature Extraction / Selection Info. Source Info. Source + + + + + + + + + +
Overall architecture for semantic analyzer
I like DisneyWorld. Domain: Chat Dialog Act: Statement Main Action: Like Object.Location=DisneyWorld
Examples of semantic frame structure
How to get to DisneyWorld? Domain: Navigation Dialog Act: WH-question Main Action: Search Object.Location.Destination=DisneyWorld
24 IUI 20 0 7 tutoria l
y yt
t-
1
y yt
t
y yt+1
t+1
x xt
t-
1
x xt
t
x xt+1
t+1
25 IUI 20 0 7 tutoria l
… … … … fly fly from from denver denver to to chicago chicago
dec dec. . 10th 10th 1999 1999 DEPART.MONTH … …
… …
return return from from denver denver to to chicago chicago
dec dec. . 10th 10th 1999 1999 RETURN.MONTH
26 IUI 20 0 7 tutoria l
27 IUI 20 0 7 tutoria l
28 IUI 20 0 7 tutoria l
– Natural Language Processing – short intro – Automatic Speech Recognition – (Spoken) Language Understanding
– Spoken Dialog Systems – Dialog Management – Dialog Studio – Information Access Dialog – Emotional & Context-sensitive Chatbot – Multi-modal Dialog – Conversational Text-to-Speech
– Statistical Machine Translation – Phrase-based SMT – Speech Translation
29 IUI 20 0 7 tutoria l
30 IUI 20 0 7 tutoria l
Semantic Meaning
ORIGIN_CITY: WASHINGTON DESTINATION_CITY: DENVER FLIGHT_TYPE: ROUNDTRIP
Dialog Management
System Action
GET DEPARTURE_DATE
Response Generation
System Speech
Which date do you want to fly from Washington to Denver? Automatic Speech Recognition
User Speech
“I need a flight from Washington DC to Denver roundtrip”
Recognized Sentence
Spoken Language Understanding
31 IUI 20 0 7 tutoria l
32 IUI 20 0 7 tutoria l
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"> <form id="login"> <field name="phone_number" type="phone"> <prompt> Please say your complete phone number </prompt> </field> <field name="pin_code" type="digits"> <prompt> Please say your PIN code </prompt> </field> <block> <submit next=“http://www.example.com/servlet/login” namelist=phone_number pin_code"/> </block> </form> </vxml> Browser : Please say your complete phone number User : 800-555-1212 Browser : Please say your PIN code User : 1 2 3 4
33 IUI 20 0 7 tutoria l
Example 1)
travel?
Example 2)
in the morning.
connection …
34 IUI 20 0 7 tutoria l
– Complex communication using unrestricted natural language – Mixed-Initiative – Co-operative problem solving – Theorem proving, planning, distributed architectures – Conversational agents
User : I’m looking for a job in the Calais area. Are there any servers? System : No, there aren’t any employment servers for Calais. However, there is an employment server for Pasde-Calais and an employment server for Lille. Are you interested in one of these?
35 IUI 20 0 7 tutoria l
Message-passing protocol Hub and Clients architecture
36 IUI 20 0 7 tutoria l
37 IUI 20 0 7 tutoria l
38 IUI 20 0 7 tutoria l
– Natural Language Processing – short intro – Automatic Speech Recognition – (Spoken) Language Understanding
– Spoken Dialog Systems – Dialog Management – Dialog Studio – Information Access Dialog – Emotional & Context-sensitive Chatbot – Multi-modal Dialog – Conversational Text-to-Speech
– Statistical Machine Translation – Phrase-based SMT – Speech Translation
39 IUI 20 0 7 tutoria l
40 IUI 20 0 7 tutoria l
– Including aspects of common context – (e.g., participants, common ground, linguistic and intentional structure,
– How to model the information components – (e.g., as lists, sets, typed feature structures, records, etc.)
41 IUI 20 0 7 tutoria l
– Trigger the update of the information state – Be correlated with externally performed actions
– Govern the updating of the information state
– For deciding which rules to apply at a given point from the set of applicable
42 IUI 20 0 7 tutoria l
43 IUI 20 0 7 tutoria l
44 IUI 20 0 7 tutoria l
Information + Origin Information + Destination Information + Origin + Dest. Information + Date Information + Origin + Date Information + Origin + Dest +Date Information + Dest + Date Flight # Flight # + Date Flight # + Information Flight # + Reservation
45 IUI 20 0 7 tutoria l
– Achieve an application goal to minimize a cost function (=objective function) – In General
– To minimize the turn of user-system and the DB access until filling all slots
– Designing a dialog system that gets a correct date (month and day) from a user through the shortest possible interaction – Objective Function
f e i D
46 IUI 20 0 7 tutoria l
Dialog Action
(Prompts, Queries, etc.)
(User, External DB or other Servers)
Dialog State Cost
(Turn, Error, DB Access, etc.)
47 IUI 20 0 7 tutoria l
Strategy 1 is optimal if wi + P2* we - wf > 0 Recognition error rate is too high
1
1
* * C
f i
ω ω + = P * 2 * 3
2 3
* * C
f e i
ω ω ω + + =
Strategy 3 is optimal if 2*(P1-P2)* we - wi > 0 P1 is much more high than P2 against a cost of longer interaction
P * 2 * 2
1 2
* * C
f e i
ω ω ω + + =
48 IUI 20 0 7 tutoria l
– Instead, the system makes observations about the outside world which give incomplete information about the true current state.
– Belief State : A distribution over MDP states in the absence of knowing its state exactly .
) , | ( ) ( ) , | ' ( ) , ' | ( ) , , | ' ( ) ' ( b a
s b s a s p a s
b a
p s b
S s
∈
= =
∈
S s
Current State Reward Function Next State
49 IUI 20 0 7 tutoria l
– Example-based techniques using dialog example database (DEDB). – This model is simple and domain portable.
– Query key : user intention, semantic frames, discourse history.
– Utterance similarity Measure
– Lexico-Semantic Similarity : Normalized edit distance – Discourse History Similarity : Cosine similarity
50 IUI 20 0 7 tutoria l
– Lexical-based example database needs much more examples. – The SLU results is the most important index key.
Utterance 그럼 SBS 드라마는 언제 하지? (when is the SBS drama showing?) Dialog Act Wh-question Main Action Search_start_time Component Slots [channel = SBS, genre =drama] Discourse History [1,0,1,0,0,0,0,0,0] System Action Inform(date, start_time, program)
51 IUI 20 0 7 tutoria l
User Utterance 그럼 SBS 드라마는 언제 하지? (when is the SBS drama showing?) Component Slots [channel = SBS, genre = 드라마(dramas)] Lexico-Semantic Representation 그럼 [channel] [genre] 는 언제 하 지
그럼 [channel] [genre] 는 언제 하 지 Slot-Filling Vector : [1,0,1,0,0,0,0,0,0] [date] [genre] 는 몇 시에 하 니 Slot-Filling Vector : [1,0,0,1,0,0,0,0,0] Current User Utterance Retrieved Examples
Lexico-Semantic Similarity Discourse History Similarity
52 IUI 20 0 7 tutoria l
Dialogue Corpus Dialogue Example DB Domain Expert
Automatic Indexing Retrieval Discourse History
Dialogue Examples
Tie-breaking
Lexico-semantic Similarity Discourse history Similarity
Utterance Similarity
Semantic Frame
Best Dialogue Example
User Intention
Dialogue Corpus Dialogue Corpus Dialogue Example DB Domain Expert
Automatic Indexing Retrieval Discourse History
Dialogue Examples
Tie-breaking
Lexico-semantic Similarity Discourse history Similarity
Utterance Similarity
Lexico-semantic Similarity Discourse history Similarity
Utterance Similarity
Semantic Frame
Best Dialogue Example
User Intention
53 IUI 20 0 7 tutoria l EPG DEDB EPG Dialog Corpus
EPG Expert
Discourse History Stack
semantic frame
Frame-Slot Extraction (EPG) Dialog Act Identification Discourse Inference USER : What is on TV now? USER : What is on TV now?
Retrieved Dialog Examples
similarity
System Response
EPG Meta-Rule
XML Rule Parser
retrieved, meta-rules are used.
Domain Spotter Agent Spotter
SYSTEM : “XXX” is on SBS, ….. SYSTEM : “XXX” is on SBS, …..
Web Contents
Database Manager
TV Schedule Database
54 IUI 20 0 7 tutoria l
Dialogue Move Engine Toolkit”, Natural Language Engineering, vol. 6, no. 3-4, pp. 323- 340, 2000
for learning dialog strategies”, IEEE Transactions on Speech and Audio Processing, vol. 8, no. 1, pp. 11-23, 2000.
in-domain confidence and discourse coherence measures, IEICE Transactions on Information and Systems, 89(3):931-938.
Management using Dialogue Examples. ICASSP.
Together: A Unified Example-based Architecture for Multi-Domain Dialog Management, Proceedings of the IEEE/ACL 2006 workshop on spoken language technology (SLT), Aruba.
ICUM, pp55-64.
spoken dialog management in Java. Speech Communication, 54(1):99-124.
framework for evaluating spoken dialogue agents. ACL/EACL, pp271-280.
55 IUI 20 0 7 tutoria l
– Natural Language Processing – short intro – Automatic Speech Recognition – (Spoken) Language Understanding
– Spoken Dialog Systems – Dialog Management – Dialog Studio – Information Access Dialog – Emotional & Context-sensitive Chatbot – Multi-modal Dialog – Conversational Text-to-Speech
– Statistical Machine Translation – Phrase-based SMT – Speech Translation
56 IUI 20 0 7 tutoria l
– The biggest problem to use dialog systems in a practical field is
– Practical Dialog Systems need:
– Easy and Fast Dialog Modeling to handle new patterns of dialog – Easy to build up new information sources – TV-Guide domain needs new TV-Schedule everyday – Reduce human efforts for maintaining – All dialog components should be synchronized! – Easy to tutor the system – Semi-automatic learning ability is necessary. – Human can’t teach everything.
– Rapid application development; CSLU Toolkit [CSLU Toolkit] – Scheme design & management; SGStudio [Wang and Acero, 2005] – Help non-experts in developing a user interface; SUEDE [Anoop et al., 2001]
57 IUI 20 0 7 tutoria l
– Tutor the dialog system by adding & editing dialog examples – Synchronize all dialog components
– ASR + SLU + DM + Information Accessing
– Providing semi-automatic learning ability – Reducing human-efforts for building up or maintaining dialog systems.
– Generate Possible Dialog Candidates from Corpus – Predicting the possible dialog tagging information using a current model – Human approving or disapproving.
58 IUI 20 0 7 tutoria l
– Can be supported by the System using old models.
– DUP automatically generates the instances.
– Administrator can audit DUP and modify the instances.
– ASR, SLU models are automatically trained
Dialog Example Editing
(Automatically generated example candidates)
New Corpus Generation Example-DB Indexing Recommendation Generation Audit & Modify
59 IUI 20 0 7 tutoria l
6 0 IUI 20 0 7 tutoria l
6 1 IUI 20 0 7 tutoria l
6 2 IUI 20 0 7 tutoria l
6 3 IUI 20 0 7 tutoria l
– Natural Language Processing – short intro – Automatic Speech Recognition – (Spoken) Language Understanding
– Spoken Dialog Systems – Dialog Management – Dialog Studio – Information Access Dialog – Emotional & Context-sensitive Chatbot – Multi-modal Dialog – Conversational Text-to-Speech
– Statistical Machine Translation – Phrase-based SMT – Speech Translation
64 IUI 20 0 7 tutoria l
6 5 IUI 20 0 7 tutoria l
66 IUI 20 0 7 tutoria l
6 7 IUI 20 0 7 tutoria l
6 8 IUI 20 0 7 tutoria l
69 IUI 20 0 7 tutoria l
70 IUI 20 0 7 tutoria l
71 IUI 20 0 7 tutoria l
72 IUI 20 0 7 tutoria l
– Natural Language Processing – short intro – Automatic Speech Recognition – (Spoken) Language Understanding
– Spoken Dialog Systems – Dialog Management – Dialog Studio – Information Access Dialog – Emotional & Context-sensitive Chatbot – Multi-modal Dialog – Conversational Text-to-Speech
– Statistical Machine Translation – Phrase-based SMT – Speech Translation
73 IUI 20 0 7 tutoria l
74 IUI 20 0 7 tutoria l
– Emotion is a part of User Context.
– It has been recognized as one of the most significant factor of people to communicate with each other. [T. Polzin, 2000]
– Application : Affective HCI (Human-Computer Interface)
– Home Networking, Intelligent Robot, ChatBot, … “ “I feel blue I feel blue today. today.” ” “ “Do you need a Do you need a cheer cheer-
up music? " “ “what up? what up?” ”
75 IUI 20 0 7 tutoria l
USER : I am very happy. USER : I am very happy. Facial Expression Analysis Speech Analysis Linguistic Analysis
Emotion Hypothesis
Facial Expression Speech Text
76 IUI 20 0 7 tutoria l
Emotional Speech DB
Call Center
Tutor System
Chat Messenger
Neutral, Happy, Sad, Surprise, Afraid, Disgusted, Bored, …
77 IUI 20 0 7 tutoria l
– Such as pitch, energy, and speech rate of the utterance,
Acoustic-Prosodic Fundamental Frequency(f0) – max, min, mean, standard deviation Energy – max, min, mean, standard deviation Speaking Rate – voice frame/total frame Pitch Contour ToBI Contour, nuclear pitch accent, phrase+boundary tones Voice Quality Spectral tilt
78 IUI 20 0 7 tutoria l
– People tend to use specific words to express their emotions in spoken dialogs.
– Because they have learned how some words are related to the corresponding emotions.
– They identified emotional keyword in spoken language. – It is highly domain dependent.
Lexical N-gram (Unigram, Bigram, Trigram) Non-speech human noise : laughter and sighs Filled Pause : “Oh” Emotional Keyword Pragmatic Dialog Act, User Identifier Context Past observances in the previous user turns
79 IUI 20 0 7 tutoria l
– Most of emotion detection system uses the same basic algorithm used in the text categorization or topic detection.
– Using mutual information
– Using statistical classification methodologies
– Linear discriminant classifier (LDC) – K-nearest neighborhood (k-NN) – Support Vector Machine (SVM)
× × =
t i i j i i j i j i
n N tf n N tf w
1 2 , , ,
)] / log( [ ) / log(
=
= = =
k j j n n j n n
) e , w ( i ) w | e ( P ) w W ; E ( I ) w ( sal
1
8 0 IUI 20 0 7 tutoria l
USER : Did you watch a Korean team’s game yesterday? SYSTEM : What game?. USER : How come you didn’t know that? SYSTEM : Well, I don’t know.
8 1 IUI 20 0 7 tutoria l
USER : I had a date last week Dialog Act = statement-non-opinion Main Goal = perform Agent = chat Domain = friend date = last week event = date SYSTEM : Did you like her? USER : Well, I think so, but. Dialog Act = statement-opinion Main Goal = judge Agent = chat Domain = friend SYSTEM : but, what? USER : I don’t know her real feeling. Dialog Act = statement-opinion Main Goal = know Agent = chat Domain = friend theme = real feeling SYSTEM : Why? What happen?
Utterance 1 Utterance 2 Utterance 3
DA = statement-non-opinion MA = perform DOMAIN = friend Date = 1 Theme = 1 DA = statement-opinion MA = judge DOMAIN = friend DA = statement-opinion MA = know DOMAIN = friend Theme = 1
Context-Free Index Key
Discourse Coherence
θ > − − − = rform)
non ent dge|statem
t P(statemen ) ,MA |DA ,MA P(DA
t- t- t t 1 1
Context-Sensitive Index Key
Previous Semantics = “statement-non-opinion,perform” Previous Keyword = “date” Scenario Session = “2” DA = statement-opinion MA=judge DOMAIN=friend Previous Semantics = “statement-opinion,judge” Previous Keyword = “NULL” Scenario Session = “2” DA = statement-opinion MA=know DOMAIN=friend Previous Semantics = “<s>,<s>” Previous Keyword = “date” DA = statement-non-opinion MA = perform DOMAIN = friend Date = 1 Theme = 1
Abstraction of previous user turn
8 2 IUI 20 0 7 tutoria l
8 3 IUI 20 0 7 tutoria l
– Natural Language Processing – short intro – Automatic Speech Recognition – (Spoken) Language Understanding
– Spoken Dialog Systems – Dialog Management – Dialog Studio – Information Access Dialog – Emotional & Context-sensitive Chatbot – Multi-modal Dialog – Conversational Text-to-Speech
– Statistical Machine Translation – Phrase-based SMT – Speech Translation
8 4 IUI 20 0 7 tutoria l
8 5 IUI 20 0 7 tutoria l
What is a decent Japanese restaurant near here?.
8 6 IUI 20 0 7 tutoria l
Speech Gesture
Face Expression Uni-modal Understanding Multi-modal Understanding & reference analysis Discourse Understanding
Uni-modal interpretation frame Uni-modal interpretation frame Multi-modal interpretation frame
8 7 IUI 20 0 7 tutoria l
8 8 IUI 20 0 7 tutoria l
8 9 IUI 20 0 7 tutoria l
– Natural Language Processing – short intro – Automatic Speech Recognition – (Spoken) Language Understanding
– Spoken Dialog Systems – Dialog Management – Dialog Studio – Information Access Dialog – Emotional & Context-sensitive Chatbot – Multi-modal Dialog – Conversational Text-to-Speech
– Statistical Machine Translation – Phrase-based SMT – Speech Translation
9 0 IUI 20 0 7 tutoria l
9 1 IUI 20 0 7 tutoria l
– Text normalization: take raw text and convert things like numbers and abbreviations into their written-out word equivalents. – Linguistic analysis: POS-tagging, grapheme-to-phoneme conversion – Prosody generation: pitch, duration, intensity, pause
– Unit selection: select the most similar units in speech DB to make actual sound
(Symbolic linguistic representation)
9 2 IUI 20 0 7 tutoria l
_ | _ e _ _ yo gg g a h
Phonemes:
| | | | | | | | ㅔ ㅇ _ ㅛ ㄱ ㄱ ㅏ ㅎ
Graphemes:
<Rule Generation> Alignment Rule extraction Rule pruning Rule association Dictionary <G2P Conversion> Text normalizer Input text Canonical form of graphemes Phonemes
9 3 IUI 20 0 7 tutoria l
Probabilistic break index prediction C4.5
Break index tagged POS tag sequence Break index tagged POS tag sequence POS tag sequence
Trigram (wtag wtag break wtag) Decision tree for error correction
94 IUI 20 0 7 tutoria l
– POS, punctuation type, the length of phrase, onset, nucleus, coda
– POS, the length of phrase, the location in prosodic phrase
9 5 IUI 20 0 7 tutoria l
96 IUI 20 0 7 tutoria l
9 7 IUI 20 0 7 tutoria l
9 8 IUI 20 0 7 tutoria l
– Natural Language Processing – short intro – Automatic Speech Recognition – (Spoken) Language Understanding
– Spoken Dialog Systems – Dialog Management – Dialog Studio – Information Access Dialog – Emotional & Context-sensitive Chatbot – Multi-modal Dialog – Conversational Text-to-Speech
– Statistical Machine Translation – Phrase-based SMT – Speech Translation
99 IUI 20 0 7 tutoria l
10 0 IUI 20 0 7 tutoria l
10 1 IUI 20 0 7 tutoria l
10 2 IUI 20 0 7 tutoria l
10 3 IUI 20 0 7 tutoria l
10 4 IUI 20 0 7 tutoria l
– Makes English fluently
– Makes translation correctly
– Finds best sentence
e e best
10 5 IUI 20 0 7 tutoria l
i i i e
10 6 IUI 20 0 7 tutoria l
– Pruning Reduces search space – Threshold pruning & Beam search algorithms n: f: ----- P: 1.0 n: I f: *---- P: 0.5 n: think f: -*--- P: 0.4 n: am f: *---* P: 0.13 n: think f: **--- P: 0.25
10 7 IUI 20 0 7 tutoria l
– Most famous metric – Range 0~1. – Higher score means better translation
=
N n n n
1
n:
n: weight
−
c r
) / 1 (
10 8 IUI 20 0 7 tutoria l
– Natural Language Processing – short intro – Automatic Speech Recognition – (Spoken) Language Understanding
– Spoken Dialog Systems – Dialog Management – Dialog Studio – Information Access Dialog – Emotional & Context-sensitive Chatbot – Multi-modal Dialog – Conversational Text-to-Speech
– Statistical Machine Translation – Phrase-based SMT – Speech Translation
10 9 IUI 20 0 7 tutoria l
110 IUI 20 0 7 tutoria l
111 IUI 20 0 7 tutoria l
– Not a syntactic phrase – A sequence of contiguous words
112 IUI 20 0 7 tutoria l
113 IUI 20 0 7 tutoria l
– ai : start position of the foreign phrase that was translated into the ith English phrase – bi: end position of the foreign phrase that was translated into the (i-1)th English phrase
= −
I i i i i i I I
1 1 1 1
I
i i
| 1 | 1
1
− − −
−
i i b
a i i
114 IUI 20 0 7 tutoria l
K-E 생맥 주 한 잔 주 세요 . A Draft Beer , Please . E-k 생맥 주 한 잔 주 세요 . A Draft Beer , Please . Inter-sect 생맥 주 한 잔 주 세요 . A Draft Beer , Please . Inter-sect 생 맥 주 한 잔 주 세 요 . A Draft Beer ? , Please ? .
115 IUI 20 0 7 tutoria l
Inter-sect 생 맥 주 한 잔 주 세 요 . A Draft Beer , Please .
116 IUI 20 0 7 tutoria l
117 IUI 20 0 7 tutoria l
– E.g. (morpheme connectivity check)
118 IUI 20 0 7 tutoria l
– Add part-of-speech (POS) tags to the training data
– Distinguish some of the homonyms – Change spacing unit
– For many languages, automatic POS tagging is available. – Spacing unit is changed into unit of meaning
119 IUI 20 0 7 tutoria l
– For some language pairs, there are useless words for translation. – Delete useless words to help word alignment
– Reduce the number of misaligned pairs
– English : the, a, an, -es Korean has a tendency not to distinguish number in noun – Korean : some kinds of post-positions ( 은, 는, 이, 가, 을, 를, …) English does not have case-markers
120 IUI 20 0 7 tutoria l
– Just append the dictionary to the end of parallel corpus
– Add one count for correct phrase pairs in the dictionary – Increase the coverage of vocabulary
– Usually, a dictionary is easily accessible – Already built in web or other applications – Adding dictionary gives significant improvement.
121 IUI 20 0 7 tutoria l
Korean/English Korean/English Bilingual Text Bilingual Text English English Text Text
122 IUI 20 0 7 tutoria l
– Natural Language Processing – short intro – Automatic Speech Recognition – (Spoken) Language Understanding
– Spoken Dialog Systems – Dialog Management – Dialog Studio – Information Access Dialog – Emotional & Context-sensitive Chatbot – Multi-modal Dialog – Conversational Text-to-Speech
– Statistical Machine Translation – Phrase-based SMT – Speech Translation
123 IUI 20 0 7 tutoria l
– Automatic Speech Recognizer – Generate texts of given speech signal.
– Text-To-Speech – Synthesis sounds of given text.
– Translate speech signal in a language into another language – Combining ASR, TTS and Machine Translation
124 IUI 20 0 7 tutoria l
– Connect ASR, SMT and TTS in cascading manner – The ASR result be a input for the SMT system. – Translation result from SMT system be a input for TTS system. – Simple!
125 IUI 20 0 7 tutoria l
ASR ASR Result1 Result1 ASR ASR Result2 Result2 ASR ASR Result3 Result3 ASR ASR Result4 Result4 ASR ASR Result n Result n SMT SMT Result1 Result1 SMT SMT Result2 Result2 SMT SMT Result3 Result3 SMT SMT Result4 Result4 SMT SMT Result n Result n
126 IUI 20 0 7 tutoria l
127 IUI 20 0 7 tutoria l
128 IUI 20 0 7 tutoria l
129 IUI 20 0 7 tutoria l