No Metrics Are Perfect: Adversarial REward Learning for Visual Storytelling (PowerPoint PPT Presentation)





SLIDE 1

No Metrics Are Perfect: Adversarial REward Learning for Visual Storytelling

Xin Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang


SLIDE 2

Image Captioning

Caption: Two young kids with backpacks sitting on the porch.

SLIDE 3

Visual Storytelling

Story: The brother did not want to talk to his sister. The siblings made up. They started to talk and smile. Their parents showed up. They were happy to see them.

Key challenges: Imagination, Emotion, Subjectiveness

SLIDE 4

Visual Storytelling

Story #2: The brother and sister were ready for the first day of school. They were excited to go to their first day and meet new friends. They told their mom how happy they were. They said they were going to make a lot of new friends. Then they got up and got ready to get in the car.

SLIDE 5


Behavioral cloning methods (e.g., MLE) are not good enough for visual storytelling.

SLIDE 6

Reinforcement Learning


  • Directly optimize the existing metrics (BLEU, METEOR, ROUGE, CIDEr)
  • Reduce exposure bias

Reinforcement Learning (RL): Environment + Reward Function → Optimal Policy

Rennie 2017, “Self-critical Sequence Training for Image Captioning”
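The self-critical training loop can be sketched with a toy example. This is a hypothetical, minimal REINFORCE implementation, not Rennie et al.'s actual code: the policy is a per-position categorical distribution over a two-token vocabulary, the "metric" is token overlap with a reference, and the greedy decode's score serves as the baseline.

```python
import math
import random

random.seed(0)

REF = [1, 1, 1]                      # toy "reference story" over vocabulary {0, 1}
LOGITS = [[0.0, 0.0] for _ in REF]   # per-position policy parameters

def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    z = sum(exps)
    return [e / z for e in exps]

def metric(seq):
    # Stand-in for an automatic metric (e.g., BLEU): fraction of matching tokens.
    return sum(s == r for s, r in zip(seq, REF)) / len(REF)

def sample_seq():
    seq = []
    for row in LOGITS:
        p = softmax(row)
        seq.append(0 if random.random() < p[0] else 1)
    return seq

def greedy_seq():
    return [max((0, 1), key=lambda j: row[j]) for row in LOGITS]

def train_step(lr=0.3):
    seq = sample_seq()
    # Self-critical baseline: advantage = metric(sample) - metric(greedy decode).
    adv = metric(seq) - metric(greedy_seq())
    for row, tok in zip(LOGITS, seq):
        p = softmax(row)
        for j in (0, 1):
            grad = (1.0 if j == tok else 0.0) - p[j]  # d log p(tok) / d logit_j
            row[j] += lr * adv * grad

for _ in range(500):
    train_step()

p_ref = [softmax(row)[1] for row in LOGITS]  # probability of each reference token
```

After training, the greedy decode matches the reference: the policy has learned to maximize the metric directly, which is exactly the behavior that makes metric flaws exploitable.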

SLIDE 7


A degenerate story obtained by directly optimizing METEOR:

"We had a great time to have a lot of the. They were to be a of the. They were to be in the. The and it were to be the. The, and it were to be the."

Average METEOR score: 40.2

(SOTA model: 35.0)

SLIDE 8


A fluent, human-like story:

"I had a great time at the restaurant today. The food was delicious. I had a lot of food. I had a great time."

BLEU-4 score: 0
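This failure mode is easy to reproduce. Below is an illustrative sketch with hypothetical sentences, using bare clipped n-gram precision (the building block of BLEU) rather than a full BLEU implementation: a fluent, on-topic story can share plenty of unigrams with a reference yet no 4-grams at all, and BLEU-4, a geometric mean that includes 4-gram precision, collapses to zero.

```python
from collections import Counter

def ngram_precision(hyp, ref, n):
    """Clipped n-gram precision of hypothesis tokens against reference tokens."""
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    # Each hypothesis n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
    total = sum(hyp_ngrams.values())
    return clipped / total if total else 0.0

hyp = "i had a great time at the restaurant . the food was delicious .".split()
ref = "the family enjoyed a delicious dinner together at their favorite restaurant .".split()

p1 = ngram_precision(hyp, ref, 1)  # many shared words -> well above zero
p4 = ngram_precision(hyp, ref, 4)  # no shared 4-grams -> exactly zero
```

Since BLEU-4 multiplies the 1- through 4-gram precisions together, the single zero at n=4 zeroes out the whole score no matter how good the lower-order overlap is.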

SLIDE 9


No Metrics Are Perfect!

SLIDE 10

Inverse Reinforcement Learning


Reinforcement Learning (RL): Environment + Reward Function → Optimal Policy
Inverse Reinforcement Learning (IRL): Environment + Optimal Policy → Reward Function

SLIDE 11

Adversarial REward Learning (AREL)


Framework: the Policy Model generates stories from the environment (the RL direction), while the Reward Model is trained on generated vs. reference stories under an adversarial objective to provide the reward (the inverse RL direction).

SLIDE 12

Policy Model π_β


Architecture: a CNN encodes the photo stream; an encoder-decoder then generates the story.

Example output: "My brother recently graduated college. It was a formal cap and gown event. My mom and dad attended. Later, my aunt and grandma showed up. When the event was over he even got congratulated by the mascot."

SLIDE 13

Reward Model R_θ


Architecture (a sentence CNN): each generated sentence (e.g., "my mom and dad attended . <EOS>") is embedded, passed through convolution, pooling, and an FC layer, and mapped to a scalar reward.

Kim 2014, “Convolutional Neural Networks for Sentence Classification”
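The reward model can be sketched in pure Python. This is a hypothetical, dependency-free miniature of a Kim-2014-style sentence CNN, not the paper's model (which uses learned embeddings and multiple filter widths): random word vectors are convolved with width-3 filters, max-pooled over time, and mapped to a scalar reward by an FC layer.

```python
import math
import random

random.seed(0)
DIM, WIDTH, N_FILTERS = 8, 3, 4

VOCAB = "my mom and dad attended . <EOS>".split()
# Random word embeddings stand in for learned ones.
EMB = {w: [random.gauss(0, 1) for _ in range(DIM)] for w in VOCAB}

# One convolutional filter = a (WIDTH * DIM)-dimensional weight vector.
FILTERS = [[random.gauss(0, 0.1) for _ in range(WIDTH * DIM)]
           for _ in range(N_FILTERS)]
FC = [random.gauss(0, 0.1) for _ in range(N_FILTERS)]  # final FC layer

def reward(sentence):
    vecs = [EMB[t] for t in sentence.split()]
    pooled = []
    for f in FILTERS:
        # Convolution over time: slide the filter across WIDTH-word windows...
        acts = []
        for i in range(len(vecs) - WIDTH + 1):
            window = [x for v in vecs[i:i + WIDTH] for x in v]
            acts.append(math.tanh(sum(w * x for w, x in zip(f, window))))
        # ...then max-pool each filter's activations over time.
        pooled.append(max(acts))
    # FC layer maps the pooled feature vector to a scalar reward.
    return sum(w * p for w, p in zip(FC, pooled))

r = reward("my mom and dad attended . <EOS>")
```

Max-pooling makes the reward insensitive to where in the sentence a discriminative phrase appears, which is why this architecture is a common choice for sentence-level scoring.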

SLIDE 14

Associating Reward with Story

Energy-based models associate an energy value F_θ(y) with a sample y, modeling the data as a Boltzmann distribution:

  q_θ(y) = exp(−F_θ(y)) / Z

Treating the reward as negative energy, the Reward Model R_θ defines a Reward Boltzmann distribution over stories W:

  p_θ(W) = exp(R_θ(W)) / Z_θ

where Z_θ is the partition function and p_θ approximates the data distribution. The optimal reward function R*(W) is achieved when p_θ matches the empirical data distribution.

LeCun et al. 2006, "A tutorial on energy-based learning"
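Numerically, this construction is just a softmax over rewards. A small sketch with hypothetical reward values shows how the Reward Boltzmann distribution is formed and why higher-reward stories receive higher probability:

```python
import math

def boltzmann(rewards):
    """p_i = exp(R_i) / Z, where Z is the partition function."""
    z = sum(math.exp(r) for r in rewards)  # partition function Z
    return [math.exp(r) / z for r in rewards]

# Hypothetical rewards R_theta(W) for three candidate stories.
rewards = [2.0, 0.5, -1.0]
probs = boltzmann(rewards)
```

In the real model the set of candidate stories is exponentially large, so Z_θ cannot be enumerated like this; that intractability is what motivates the approximate adversarial objective on the next slide.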

SLIDE 15

AREL Objective


  • The objective of the Reward Model R_θ: move the Reward Boltzmann distribution p_θ toward the empirical distribution p_e and away from the policy distribution π_β:

  min_θ  KL(p_e ‖ p_θ) − KL(π_β ‖ p_θ)

  • The objective of the Policy Model π_β: match the Reward Boltzmann distribution:

  min_β  KL(π_β ‖ p_θ)

Therefore, we define an adversarial objective with KL-divergence terms, optimized by alternating the two updates.
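In implementation terms (a hedged sketch of how such an objective is typically optimized, not the paper's exact code), the intractable partition function drops out of the gradients, leaving WGAN-style surrogate losses: the reward model pushes reference stories' rewards up and generated stories' rewards down, while the policy is updated by REINFORCE with the learned reward as its return. All numbers below are hypothetical.

```python
def reward_model_loss(r_reference, r_generated):
    # Surrogate for KL(p_e || p_theta) - KL(pi_beta || p_theta):
    # raise the reward of human stories, lower that of sampled stories.
    return -(r_reference - r_generated)

def policy_loss(r_generated, log_prob_generated, baseline=0.0):
    # REINFORCE surrogate: minimizing this increases the log-probability
    # of stories the reward model scores above the baseline.
    return -(r_generated - baseline) * log_prob_generated

# Toy alternating step with hypothetical reward and log-probability values:
l_rwd = reward_model_loss(r_reference=1.2, r_generated=0.4)    # -> -0.8
l_pol = policy_loss(r_generated=0.4, log_prob_generated=-2.0)  # -> 0.8
```

In a real training loop these two losses are minimized in alternation, each with respect to its own model's parameters, which realizes the min-max structure of the adversarial objective.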

SLIDE 16

Reward Visualization


SLIDE 17

Automatic Evaluation

Method                  BLEU-1  BLEU-2  BLEU-3  BLEU-4  METEOR  ROUGE  CIDEr
Seq2seq (Huang et al.)    –       –       –       –      31.4     –      –
HierAttRNN (Yu et al.)    –       –       –      21.0    34.1    29.5    7.5
XE                       62.3    38.2    22.5    13.7    34.8    29.7    8.7
BLEU-RL                  62.1    38.0    22.6    13.9    34.6    29.0    8.9
METEOR-RL                68.1    35.0    15.4     6.8    40.2    30.0    1.2
ROUGE-RL                 58.1    18.5     1.6     0.0    27.0    33.8    0.0
CIDEr-RL                 61.9    37.8    22.5    13.8    34.9    29.7    8.1
GAN                      62.8    38.8    23.0    14.0    35.0    29.5    9.0
AREL (ours)              63.7    39.0    23.1    14.0    35.0    29.6    9.5

Huang et al. 2016, “Visual Storytelling” Yu et al. 2017, “Hierarchically-Attentive RNN for Album Summarization and Storytelling”

SLIDE 18

Human Evaluation


Turing Test

[Figure: bar chart of Turing Test results, showing Win and Unsure rates (0–50%) for XE, BLEU-RL, CIDEr-RL, GAN, and AREL]
SLIDE 19

Human Evaluation


  • Relevance: the story accurately describes what is happening in the photo stream and covers the main objects.
  • Expressiveness: coherence, grammatically and semantically correct, no repetition, expressive language style.
  • Concreteness: the story should narrate concretely what is in the images rather than giving very general descriptions.

Pairwise Comparison

SLIDE 20


XE-ss: We took a trip to the mountains. There were many different kinds of different kinds. We had a great time. He was a great time. It was a beautiful day.

AREL: The family decided to take a trip to the countryside. There were so many different kinds of things to see. The family decided to go on a hike. I had a great time. At the end of the day, we were able to take a picture of the beautiful scenery.

Human-created Story: We went on a hike yesterday. There were a lot of strange plants there. I had a great time. We drank a lot of water while we were hiking. The view was spectacular.

SLIDE 21

Takeaway


  • Generating and evaluating stories are both challenging due to the complicated nature of stories
  • No existing metrics are perfect for either training or testing
  • AREL is a better learning framework for visual storytelling
    ◦ Can be applied to other generation tasks
  • Our approach is model-agnostic
    ◦ Advanced models → better performance

SLIDE 22

Thanks!


Paper: https://arxiv.org/abs/1804.09160
Code: https://github.com/littlekobe/AREL