

SLIDE 1

Captioning Events in Tourist Spots by Neural Language Generation

Mai Nguyen1, Koichiro Yoshino1,2, Yu Suzuki1,3, Satoshi Nakamura1

1Nara Institute of Science and Technology 2Japan Science and Technology Agency 3Gifu University

14 June 2019 @ Mai Nguyen, AHC Lab, NAIST

SLIDE 2

Outline

  • Motivations
  • Related works
  • System Architecture
  • Experiments
  • Conclusion

SLIDE 3

Motivations

Natural Language Generation (NLG) is a good interface for presenting information to users:

  • Users can understand the information immediately
  • A textual description is one of the easiest ways to present information

Example (diagram): data sources → language generation → "Fushimi Inari is very crowded, you should go there in the early morning."

SLIDE 4

Objectives

Task: generating textual descriptions about tourist spots to support user decision making in the tourist domain. Important factors:

  1. Informativeness
  2. Naturalness

SLIDE 5

Related works

Pros: access to millions of traveler reviews. Cons: out-of-date information.

The task of generating text from data has been investigated in many domains

  • Weather forecast (Belz et al., 2008)
  • Navigation assistance (Dale et al., 2003)
  • Sports (Liang et al., 2009)
  • Market comments (Murakami et al., 2017)

NLG approaches:

  • Rule-based: define a set of rules to map frames to natural language
  • Pros: simple, error-free, easy to control
  • Cons: time-consuming, poor scalability
  • => Neural-based NLG
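The rule-based approach can be sketched as a template lookup over frame slots. The `TEMPLATES` table, its keys, and the example frames below are hypothetical illustrations, not the actual baseline used in this work:

```python
# Minimal rule-based NLG sketch: map a semantic frame to text via
# hand-written templates, keyed by which slots the frame contains.
# Template strings here are invented for illustration.
TEMPLATES = {
    ("name", "crowded"): "{name} is {crowded} crowded right now.",
    ("name", "event"): "The {event} is being held at {name}.",
}

def generate(frame):
    """Pick the template whose slot set matches the frame's keys."""
    key = tuple(sorted(frame))
    for slots, template in TEMPLATES.items():
        if tuple(sorted(slots)) == key:
            return template.format(**frame)
    raise KeyError(f"no template for slots {key}")

print(generate({"name": "Fushimi Inari", "crowded": "very"}))
# -> Fushimi Inari is very crowded right now.
```

This illustrates the trade-off on the slide: each output is error-free and controllable, but every new slot combination needs another hand-written rule.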

SLIDE 6

This paper introduces …

  • A system for producing textual descriptions about tourist attractions

  • Integrates a neural language generator
  • Summarizes multiple data sources such as infrared sensors, social media, and user check-ins
  • Coupled with a backend server, forming a real-time application with up-to-date information

SLIDE 7

NLG Pipeline

Pipeline (diagram): data sources (user check-ins, social media, congestion) → semantic representation (what to say) → language generator + language model (how to say) → sentences

SLIDE 8

System Architecture

Architecture (diagram): data sources → backend system ("what-to-say") → pushing notification server → neural language generator ("how-to-say") → mobile application

SLIDE 9

What-to-say

  • Backend server: extracts and summarizes data from several information sources
    • Infrared sensors, Twitter, user check-ins
  • Pushing server:
    • Checks the backend for newly updated information at specific intervals
    • Creates a semantic expression: an N-hot vector of the frames that have slot values
    • Sends the push request to the back-end server
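The N-hot semantic expression can be sketched as follows. The slot vocabulary below is a hypothetical illustration of the idea, not the paper's actual slot inventory:

```python
# Sketch of the "semantic expression" step: encode which frame/slot
# values are active as an N-hot vector over a fixed slot vocabulary.
# SLOT_VOCAB is invented for illustration.
SLOT_VOCAB = [
    "name", "event", "state_event=happening", "state_event=finished",
    "crowded=high", "crowded=low", "recommended=yes", "recommended=no",
]

def to_n_hot(frame):
    """Return 1 for each active slot (or slot=value pair), 0 elsewhere."""
    active = set()
    for slot, value in frame.items():
        active.add(slot)               # delexicalized slots (name, event)
        active.add(f"{slot}={value}")  # valued slots (crowded=high, ...)
    return [1 if entry in active else 0 for entry in SLOT_VOCAB]

vec = to_n_hot({"name": "fushimi inari", "crowded": "high", "recommended": "no"})
print(vec)  # -> [1, 0, 0, 0, 1, 0, 0, 1]
```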
SLIDE 10

How-to-say, Neural Language Generation

  • Utilizes the semantically controlled LSTM (SC-LSTM) proposed by Wen et al. (2015)
  • Core idea: a gating mechanism controls the generated semantics (dialogue act/slots)
  • Works well in a limited domain once the "what to say" is decided
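The gating idea can be written out explicitly. Following the SC-LSTM formulation of Wen et al. (2015), with $d_t$ the dialogue-act vector being consumed as generation proceeds (notation reproduced from the cited paper as a sketch, not verified against the authors' implementation):

```latex
% SC-LSTM reading gate: the dialogue-act vector d_t shrinks word by
% word, steering the cell toward emitting the remaining semantics.
\begin{aligned}
r_t &= \sigma(W_{wr} w_t + \alpha\, W_{hr} h_{t-1}) && \text{reading gate}\\
d_t &= r_t \odot d_{t-1} && \text{remaining semantics}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \hat{c}_t + \tanh(W_{dc}\, d_t) && \text{cell update}
\end{aligned}
```

Because $d_t$ only decreases, the model is penalized for ending a sentence before all required slots have been expressed, which is what makes it controllable in a limited domain.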

SLIDE 11

Experiments

Steps: collect data → build NLG models → integrate NLG into the real-time system

  • Evaluate generated sentences by automatic metrics
  • Evaluate the usefulness of the system with travelers using the application

The target application is a system that describes comprehensive, up-to-date information about tourist attractions by NLG.

SLIDE 12

Dataset

List of possible attributes in our dataset, with data sources and example values:

| Attribute | Data type | Example value | Data source |
| name | verbatim string | kyoto tower | POIs |
| event | verbatim string | cherry blossom, gion festival | official websites |
| state_event | dictionary | happening, finished | official websites |
| crowded | dictionary | high, low, average | infrared sensors |
| recommended | boolean | yes/no | recommendation system |
| time | enumerable | now, holidays | |
| popular | boolean | yes/no | user check-ins |

SLIDE 13

Data collection

  • Corpus: pairs of a meaning representation (MR) and corresponding references in the tourist domain
    • Meaning representation (MR): a set of key-value pairs
    • Reference: a natural language utterance describing the MR
  • Human annotation: workers were recruited via a crowdsourcing service
  • Results: ~3,300 data points collected in English

MR: name=[Fushimi Inari], crowded=[high], time=[festival days], recommended=[no]
Reference: It is a good idea to avoid Fushimi Inari during festival days because it is extraordinarily crowded.
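An MR string like the example above can be turned back into key-value pairs. The regex assumes the `slot=[value]` surface form shown on this slide is used consistently, which is an assumption about the corpus format:

```python
import re

# Parse an MR string such as "name=[Fushimi Inari], crowded=[high]"
# into a dict of slot values. The slot=[value] pattern follows the
# slide's example; real corpus files may differ.
MR_PATTERN = re.compile(r"(\w+)=\[([^\]]*)\]")

def parse_mr(mr):
    return {slot: value for slot, value in MR_PATTERN.findall(mr)}

mr = "name=[Fushimi Inari], crowded=[high], time=[festival days], recommended=[no]"
print(parse_mr(mr))
# -> {'name': 'Fushimi Inari', 'crowded': 'high', 'time': 'festival days', 'recommended': 'no'}
```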

SLIDE 14

Pre-process dataset

  • Remove useless instances: those that do not provide enough information (informativeness)
  • Grammar correction: errors detected by Grammarly and fixed manually, ~15% of instances (naturalness)
  • Delexicalization: properties such as name and event are replaced with placeholders to avoid data sparsity

MR: "name[maizuru park], event[cherry blossoms], event_state[happening], crowdedness[2]"
Reference: "maizuru park is not very crowded, the cherry blossoms festival is currently held there."
Delexicalized: "X-name is not very crowded, the X-event festival is currently held there."
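The delexicalization step can be sketched as plain string substitution over the slide's maizuru park example. The `delexicalize` function and its default slot list are illustrative names, not code from the paper:

```python
# Delexicalization sketch: replace sparse slot values (name, event)
# with X-<slot> placeholders before training, so the model does not
# have to learn every proper noun.
def delexicalize(sentence, frame, slots=("name", "event")):
    for slot in slots:
        value = frame.get(slot)
        if value:
            sentence = sentence.replace(value, f"X-{slot}")
    return sentence

frame = {"name": "maizuru park", "event": "cherry blossoms"}
text = "maizuru park is not very crowded, the cherry blossoms festival is currently held there."
print(delexicalize(text, frame))
# -> X-name is not very crowded, the X-event festival is currently held there.
```

At generation time the substitution runs in reverse: placeholders in the model's output are filled back in with the slot values from the MR.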

SLIDE 15

Evaluation of NLG

  • Corpus split into training, validation, and test sets at a ratio of 85%:6%:9%
  • Training:
    • Character-based SC-LSTM (hidden size 1024, batch size 256)
    • Adam optimizer
    • Dropout
    • Beam search with beam size 10
  • Baseline: rule-based generator
  • Metrics: BLEU, NIST, METEOR, ROUGE-L, CIDEr
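As a sketch of what these metrics measure, BLEU is built on clipped n-gram precision. The implementation below is illustrative only; it is not the paper's evaluation script, and real evaluations should use a full implementation such as sacrebleu:

```python
from collections import Counter

# Modified (clipped) n-gram precision, the core of BLEU
# (Papineni et al., 2002). Each hypothesis n-gram only counts as
# many times as it appears in the reference.
def ngram_precision(hypothesis, reference, n=1):
    hyp = hypothesis.split()
    ref = reference.split()
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    overlap = sum(min(count, ref_ngrams[gram]) for gram, count in hyp_ngrams.items())
    return overlap / max(sum(hyp_ngrams.values()), 1)

hyp = "maizuru park is very crowded"
ref = "maizuru park is not very crowded"
print(ngram_precision(hyp, ref, n=1))  # -> 1.0 (every unigram appears in the reference)
print(ngram_precision(hyp, ref, n=2))  # -> 0.75 (3 of 4 bigrams match)
```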

| | Training | Validation | Test |
| REF | 2800 | 200 | 296 |
| MR | 283 | 80 | 83 |

SLIDE 16

Result

  • Automatic metrics

| System | BLEU | NIST | METEOR | ROUGE-L | CIDEr |
| Rule-based | 0.41 | 5.67 | 0.36 | 0.67 | 2.23 |
| Our generator | 0.43 | 6.03 | 0.37 | 0.64 | 2.70 |

Improvements over the baseline: BLEU +0.02, NIST +0.36, METEOR +0.01, CIDEr +0.47

SLIDE 17

Examples of generated text

| No. of slots | MR | Generated description |
| 3 | name[nanzenji temple], event[autumn leaves], popular[yes] | Autumn leaves in nanzenji temple is popular. |
| 4 | name[Yoyogi Park], crowded[no], time[now], recommended[no] | Yoyogi Park is not crowded right now, but it is not recommended to visit. |
| 5 | name[yanagidani kannon], event[autumn leaves], state[happening], crowded[high], recommended[no] | Autumn leaves is happening in yanagidani kannon, it is extremely crowded, you should not go there. |
| 5 | name[kyoto imperial palace], event[aoi festival], state[happening], crowded[low], recommended[yes] | Aoi festival is happening in kyoto imperial palace, it is medium crowded, you should go there. |

The NLG model can produce fluent text, but it often repeats the same sentence structure.

SLIDE 18

Results analysis

Human evaluation in terms of informativeness:

| No. of slots | Total MRs | Correct generated texts | Percentage |
| 3 | 18 | 17 | 94.4% |
| 4 | 54 | 49 | 90.7% |
| 5 | 11 | 5 | 45.4% |
| Total | 83 | 71 | 85.5% |

Neural LG has difficulty capturing the structure of long MRs.

SLIDE 19

Experiment in Real Field

  • Place: Arashiyama area, Kyoto
  • Participants:
    • 12 students
    • 19 ordinary people
  • Task descriptions:
    • Use the application during their trip
    • Visit at least 3 POIs between 10 AM and 3 PM
    • Answer the questionnaire

SLIDE 20

Application


SLIDE 21

Experimental Results

Usefulness of the pushing function was rated on a 1-5 scale:

  • 5: Very satisfied
  • 4: Rather satisfied
  • 3: Satisfied
  • 2: Not satisfied
  • 1: Not satisfied at all

| Group | 5 | 4 | 3 | 2 | 1 | Ratio satisfied |
| Students | 1 | 1 | 5 | 1 | 4 | 58% |
| Ordinary people | 2 | 1 | 3 | 3 | 1 | 60% |

SLIDE 22

Conclusions

  • Our generation model produces fluent text overall and works smoothly in a real-time application.
  • In terms of usefulness, our system could support decision-making for tourists.
  • Limitations:
    • Neural LG has difficulty capturing long-term structure because the training data is quite small.
    • The application sent too many notifications for travelers.

SLIDE 23

Reference

1. Wen, T.-H., Gašić, M., Mrkšić, N., Su, P.-H., Vandyke, D. and Young, S.: Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1711-1721 (2015).
2. Sripada, S. G., Reiter, E. and Davy, I.: SUMTIME-MOUSAM: Configurable Marine Weather Forecast Generator (2003).
3. Belz, A.: Automatic Generation of Weather Forecast Texts Using Comprehensive Probabilistic Generation-space Models, Nat. Lang. Eng., Vol. 14, No. 4, pp. 431-455, DOI: 10.1017/S1351324907004664 (2008).
4. Dale, R., Geldof, S. and Prost, J.-P.: CORAL: Using Natural Language Generation for Navigational Assistance, Proceedings of the 26th Australasian Computer Science Conference (ACSC '03), Vol. 16, pp. 35-44 (2003).
5. Liang, P., Jordan, M. and Klein, D.: Learning Semantic Correspondences with Less Supervision, Proceedings of ACL-IJCNLP 2009, pp. 91-99 (2009).
