

SLIDE 1

Captioning Events in Tourist Spots by Neural Language Generation

Mai Nguyen1, Koichiro Yoshino1,2, Yu Suzuki1,3, Satoshi Nakamura1

1Nara Institute of Science and Technology 2Japan Science and Technology Agency 3Gifu University

14 June 2019 @ Mai Nguyen, AHC Lab, NAIST

SLIDE 2

Outline

  • Motivations
  • Related works
  • System Architecture
  • Experiments
  • Conclusion

SLIDE 3

Motivations

Natural Language Generation (NLG) is a good interface for presenting information to users:

  • Users can understand the information immediately
  • A textual description is one of the easiest ways to present information

Example (diagram): data sources → language generation → "Fushimi Inari is very crowded, you should go there in the early morning."

SLIDE 4

Objectives

Task: generating textual descriptions about tourist spots to support user decision making in the tourist domain. Important factors:

  1. Informativeness
  2. Naturalness

SLIDE 5

Related works

Pros: access to millions of traveler reviews. Cons: out-of-date information.

The task of generating text from data has been investigated in many domains

  • Weather forecast (Belz et al., 2008)
  • Navigation assistance (Dale et al., 2003)
  • Sports (Liang et al., 2009)
  • Market comments (Murakami et al., 2017)

NLG approaches:

  • Rule-based: define a set of rules to map frames to natural language
  • Pros: simple, error-free, easy to control
  • Cons: time-consuming, poor scalability
  • => Neural-based NLG
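The rule-based approach can be sketched as a template lookup over frame slots. The `TEMPLATES` table, its keys, and the example frames below are hypothetical illustrations, not the actual baseline used in this work:

```python
# Minimal rule-based NLG sketch: map a semantic frame to text via
# hand-written templates, keyed by which slots the frame contains.
# Template strings here are invented for illustration.
TEMPLATES = {
    ("name", "crowded"): "{name} is {crowded} crowded right now.",
    ("name", "event"): "The {event} is being held at {name}.",
}

def generate(frame):
    """Pick the template whose slot set matches the frame's keys."""
    key = tuple(sorted(frame))
    for slots, template in TEMPLATES.items():
        if tuple(sorted(slots)) == key:
            return template.format(**frame)
    raise KeyError(f"no template for slots {key}")

print(generate({"name": "Fushimi Inari", "crowded": "very"}))
# -> Fushimi Inari is very crowded right now.
```

This illustrates the trade-off on the slide: each output is error-free and controllable, but every new slot combination needs another hand-written rule.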

SLIDE 6

This paper introduces …

  • A system for producing textual descriptions about tourist attractions

  • Integrates a neural language generator
  • Summarizes multiple data sources such as infrared sensors, social media, and user check-ins
  • Coupled with a backend server, forming a real-time application with up-to-date information

SLIDE 7

NLG Pipeline

Pipeline (diagram): data sources (user check-ins, social media, congestion) → semantic representation (what to say) → language generator + language model (how to say) → sentences

SLIDE 8

System Architecture

Architecture (diagram): data sources → backend system ("what-to-say") → pushing notification server → neural language generator ("how-to-say") → mobile application

SLIDE 9

What-to-say

  • Backend server: extracts and summarizes data from several information sources
    • Infrared sensors, Twitter, user check-ins
  • Pushing server:
    • Checks the backend for newly updated information at specific intervals
    • Creates a semantic expression: an N-hot vector of the frames that have slot values
    • Sends the push request to the back-end server
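The N-hot semantic expression can be sketched as follows. The slot vocabulary below is a hypothetical illustration of the idea, not the paper's actual slot inventory:

```python
# Sketch of the "semantic expression" step: encode which frame/slot
# values are active as an N-hot vector over a fixed slot vocabulary.
# SLOT_VOCAB is invented for illustration.
SLOT_VOCAB = [
    "name", "event", "state_event=happening", "state_event=finished",
    "crowded=high", "crowded=low", "recommended=yes", "recommended=no",
]

def to_n_hot(frame):
    """Return 1 for each active slot (or slot=value pair), 0 elsewhere."""
    active = set()
    for slot, value in frame.items():
        active.add(slot)               # delexicalized slots (name, event)
        active.add(f"{slot}={value}")  # valued slots (crowded=high, ...)
    return [1 if entry in active else 0 for entry in SLOT_VOCAB]

vec = to_n_hot({"name": "fushimi inari", "crowded": "high", "recommended": "no"})
print(vec)  # -> [1, 0, 0, 0, 1, 0, 0, 1]
```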
SLIDE 10

How-to-say, Neural Language Generation

  • Utilizes the semantically controlled LSTM (SC-LSTM) proposed by Wen et al. (2015)
  • Core idea: a gating mechanism controls the generated semantics (dialogue act/slots)
  • Works well in a limited domain once the "what to say" is decided
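The gating idea can be written out explicitly. Following the SC-LSTM formulation of Wen et al. (2015), with $d_t$ the dialogue-act vector being consumed as generation proceeds (notation reproduced from the cited paper as a sketch, not verified against the authors' implementation):

```latex
% SC-LSTM reading gate: the dialogue-act vector d_t shrinks word by
% word, steering the cell toward emitting the remaining semantics.
\begin{aligned}
r_t &= \sigma(W_{wr} w_t + \alpha\, W_{hr} h_{t-1}) && \text{reading gate}\\
d_t &= r_t \odot d_{t-1} && \text{remaining semantics}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \hat{c}_t + \tanh(W_{dc}\, d_t) && \text{cell update}
\end{aligned}
```

Because $d_t$ only decreases, the model is penalized for ending a sentence before all required slots have been expressed, which is what makes it controllable in a limited domain.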

SLIDE 11

Experiments

Steps: collect data → build NLG models → integrate NLG into the real-time system

  • Evaluate generated sentences by automatic metrics
  • Evaluate the usefulness of the system with travelers using the application

The target application is a system that describes comprehensive, up-to-date information about tourist attractions by NLG.

SLIDE 12

Dataset

List of possible attributes in our dataset, with data sources and example values:

| Attribute | Data type | Example value | Data source |
| name | verbatim string | kyoto tower | POIs |
| event | verbatim string | cherry blossom, gion festival | official websites |
| state_event | dictionary | happening, finished | official websites |
| crowded | dictionary | high, low, average | infrared sensors |
| recommended | boolean | yes/no | recommendation system |
| time | enumerable | now, holidays | |
| popular | boolean | yes/no | user check-ins |

SLIDE 13

Data collection

  • Corpus: pairs of a meaning representation (MR) and corresponding references in the tourist domain
    • Meaning representation (MR): a set of key-value pairs
    • Reference: a natural language utterance describing the MR
  • Human annotation: workers were recruited via a crowdsourcing service
  • Results: ~3,300 data points collected in English

MR: name=[Fushimi Inari], crowded=[high], time=[festival days], recommended=[no]
Reference: It is a good idea to avoid Fushimi Inari during festival days because it is extraordinarily crowded.
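An MR string like the example above can be turned back into key-value pairs. The regex assumes the `slot=[value]` surface form shown on this slide is used consistently, which is an assumption about the corpus format:

```python
import re

# Parse an MR string such as "name=[Fushimi Inari], crowded=[high]"
# into a dict of slot values. The slot=[value] pattern follows the
# slide's example; real corpus files may differ.
MR_PATTERN = re.compile(r"(\w+)=\[([^\]]*)\]")

def parse_mr(mr):
    return {slot: value for slot, value in MR_PATTERN.findall(mr)}

mr = "name=[Fushimi Inari], crowded=[high], time=[festival days], recommended=[no]"
print(parse_mr(mr))
# -> {'name': 'Fushimi Inari', 'crowded': 'high', 'time': 'festival days', 'recommended': 'no'}
```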

SLIDE 14

Pre-process dataset

  • Remove useless instances: those that do not provide enough information (informativeness)
  • Grammar correction: errors detected by Grammarly and fixed manually, ~15% of instances (naturalness)
  • Delexicalization: properties such as name and event are replaced with placeholders to avoid data sparsity

MR: "name[maizuru park], event[cherry blossoms], event_state[happening], crowdedness[2]"
Reference: "maizuru park is not very crowded, the cherry blossoms festival is currently held there."
Delexicalized: "X-name is not very crowded, the X-event festival is currently held there."
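The delexicalization step can be sketched as plain string substitution over the slide's maizuru park example. The `delexicalize` function and its default slot list are illustrative names, not code from the paper:

```python
# Delexicalization sketch: replace sparse slot values (name, event)
# with X-<slot> placeholders before training, so the model does not
# have to learn every proper noun.
def delexicalize(sentence, frame, slots=("name", "event")):
    for slot in slots:
        value = frame.get(slot)
        if value:
            sentence = sentence.replace(value, f"X-{slot}")
    return sentence

frame = {"name": "maizuru park", "event": "cherry blossoms"}
text = "maizuru park is not very crowded, the cherry blossoms festival is currently held there."
print(delexicalize(text, frame))
# -> X-name is not very crowded, the X-event festival is currently held there.
```

At generation time the substitution runs in reverse: placeholders in the model's output are filled back in with the slot values from the MR.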

SLIDE 15

Evaluation of NLG

  • Corpus split into training, validation, and test sets at a ratio of 85%:6%:9%
  • Training:
    • Character-based SC-LSTM (hidden size 1024, batch size 256)
    • Adam optimizer
    • Dropout
    • Beam search with beam size 10
  • Baseline: rule-based generator
  • Metrics: BLEU, NIST, METEOR, ROUGE-L, CIDEr
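As a sketch of what these metrics measure, BLEU is built on clipped n-gram precision. The implementation below is illustrative only; it is not the paper's evaluation script, and real evaluations should use a full implementation such as sacrebleu:

```python
from collections import Counter

# Modified (clipped) n-gram precision, the core of BLEU
# (Papineni et al., 2002). Each hypothesis n-gram only counts as
# many times as it appears in the reference.
def ngram_precision(hypothesis, reference, n=1):
    hyp = hypothesis.split()
    ref = reference.split()
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    overlap = sum(min(count, ref_ngrams[gram]) for gram, count in hyp_ngrams.items())
    return overlap / max(sum(hyp_ngrams.values()), 1)

hyp = "maizuru park is very crowded"
ref = "maizuru park is not very crowded"
print(ngram_precision(hyp, ref, n=1))  # -> 1.0 (every unigram appears in the reference)
print(ngram_precision(hyp, ref, n=2))  # -> 0.75 (3 of 4 bigrams match)
```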

| | Training | Validation | Test |
| REF | 2800 | 200 | 296 |
| MR | 283 | 80 | 83 |

SLIDE 16

Result

  • Automatic metrics

| System | BLEU | NIST | METEOR | ROUGE-L | CIDEr |
| Rule-based | 0.41 | 5.67 | 0.36 | 0.67 | 2.23 |
| Our generator | 0.43 | 6.03 | 0.37 | 0.64 | 2.70 |

Improvements over the baseline: BLEU +0.02, NIST +0.36, METEOR +0.01, CIDEr +0.47

SLIDE 17

Examples of generated text

| No. of slots | MR | Generated description |
| 3 | name[nanzenji temple], event[autumn leaves], popular[yes] | Autumn leaves in nanzenji temple is popular. |
| 4 | name[Yoyogi Park], crowded[no], time[now], recommended[no] | Yoyogi Park is not crowded right now, but it is not recommended to visit. |
| 5 | name[yanagidani kannon], event[autumn leaves], state[happening], crowded[high], recommended[no] | Autumn leaves is happening in yanagidani kannon, it is extremely crowded, you should not go there. |
| 5 | name[kyoto imperial palace], event[aoi festival], state[happening], crowded[low], recommended[yes] | Aoi festival is happening in kyoto imperial palace, it is medium crowded, you should go there. |

The NLG model can produce fluent text, but it often repeats the same sentence structure.

SLIDE 18

Results analysis

Human evaluation in terms of informativeness:

| No. of slots | Total MRs | Correct generated texts | Percentage |
| 3 | 18 | 17 | 94.4% |
| 4 | 54 | 49 | 90.7% |
| 5 | 11 | 5 | 45.4% |
| Total | 83 | 71 | 85.5% |

Neural LG has difficulty capturing the structure of long MRs.

SLIDE 19

Experiment in Real Field

  • Place: Arashiyama area, Kyoto
  • Participants:
    • 12 students
    • 19 ordinary people
  • Task descriptions:
    • Use the application during their trip
    • Visit at least 3 POIs between 10 AM and 3 PM
    • Answer the questionnaire

SLIDE 20

Application


SLIDE 21

Experimental Results

Usefulness of the pushing function was rated on a 1-5 scale:

  • 5: Very satisfied
  • 4: Rather satisfied
  • 3: Satisfied
  • 2: Not satisfied
  • 1: Not satisfied at all

| Group | 5 | 4 | 3 | 2 | 1 | Ratio satisfied |
| Students | 1 | 1 | 5 | 1 | 4 | 58% |
| Ordinary people | 2 | 1 | 3 | 3 | 1 | 60% |

SLIDE 22

Conclusions

  • Our generation model produces fluent text overall and works smoothly in a real-time application.
  • In terms of usefulness, our system could support decision-making for tourists.
  • Limitations:
    • Neural LG has difficulty capturing long-term structure because the training data is quite small.
    • The application sent too many notifications for travelers.

SLIDE 23

Reference

1. Wen, T.-H., Gašić, M., Mrkšić, N., Su, P.-H., Vandyke, D. and Young, S.: Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1711-1721 (2015).
2. Sripada, S. G., Reiter, E. and Davy, I.: SUMTIME-MOUSAM: Configurable Marine Weather Forecast Generator (2003).
3. Belz, A.: Automatic Generation of Weather Forecast Texts Using Comprehensive Probabilistic Generation-space Models, Nat. Lang. Eng., Vol. 14, No. 4, pp. 431-455, DOI: 10.1017/S1351324907004664 (2008).
4. Dale, R., Geldof, S. and Prost, J.-P.: CORAL: Using Natural Language Generation for Navigational Assistance, Proceedings of the 26th Australasian Computer Science Conference (ACSC '03), Vol. 16, pp. 35-44 (2003).
5. Liang, P., Jordan, M. and Klein, D.: Learning Semantic Correspondences with Less Supervision, Proceedings of ACL-IJCNLP 2009, pp. 91-99 (2009).
