Aishwarya Agrawal (Georgia Tech) and Yash Goyal (Georgia Tech)
Outline
Overview of Task and Dataset Overview of Challenge Winner Announcements Analysis of Results
VQA Task

Given an image and a free-form natural-language question ("What is the mustache made of?"), the AI System produces a natural-language answer ("bananas").
VQA v1.0 Dataset

Questions require a range of skills:
- About objects
- Fine-grained recognition
- Counting
- Common sense
VQA v2.0 Dataset

New in VQA v2.0: similar images with different answers to the same question.
"Who is wearing glasses?" is answered "woman" for one image and "man" for the other.
VQA v2.0 Dataset Stats

- >200K images
- >1.1M questions
- >11M answers
- 1.8 x the size of VQA v1.0
Accuracy Metric
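The slide shows the metric as an image; in text form, the standard VQA accuracy metric (from the VQA paper) gives full credit when at least 3 of the 10 human annotators provided the predicted answer. A minimal sketch of the core rule:

```python
from collections import Counter

def vqa_accuracy(predicted, human_answers):
    """VQA accuracy: full credit if at least 3 of the (typically 10)
    human annotators gave the predicted answer; partial credit otherwise.
    The official evaluation additionally normalizes answers (lowercasing,
    stripping articles and punctuation) and averages over annotator
    subsets; this sketch keeps only the core min(count/3, 1) rule."""
    count = Counter(a.lower().strip() for a in human_answers)[predicted.lower().strip()]
    return min(count / 3.0, 1.0)

# Hypothetical annotation set for "What is the mustache made of?"
humans = ["bananas"] * 8 + ["fruit", "banana"]
print(vqa_accuracy("bananas", humans))  # 1.0
print(vqa_accuracy("fruit", humans))    # 0.333...
```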
VQA Challenge on EvalAI

https://evalai.cloudcv.org/
Dataset splits

Split        Images   Questions   Answers
Training     80K      443K        4.4M
Validation   40K      214K        2.1M
Test         80K      447K        (withheld)

Dataset sizes are approximate.
Test Dataset

- 4 splits of approximately equal size
- Test-dev (development)
  – Debugging and validation.
- Test-standard (publications)
  – Used to score entries for the Public Leaderboard.
- Test-challenge (competitions)
  – Used to rank challenge participants.
- Test-reserve (check overfitting)
  – Used to estimate overfitting. Scores on this split are never released.

Slide adapted from: MSCOCO Detection/Segmentation Challenge, ICCV 2015
Challenge Stats

- 40 teams
- ≥40 institutions*
- ≥8 countries*

*Statistics based on teams that have replied
Challenge Runner-Ups

Joint Runner-Up Team 1: SNU-BI (Challenge Accuracy: 71.69)
Jin-Hwa Kim (Seoul National University), Jaehyun Jun (Seoul National University), Byoung-Tak Zhang (Seoul National University & Surromind Robotics)

Joint Runner-Up Team 2: HDU-UCAS-USYD (Challenge Accuracy: 71.91)
Zhou Yu, Jun Yu, Chenchao Xiang, Jianping Fan, Liang Wang (Hangzhou Dianzi University, China); Dalu Guo, Dacheng Tao (The University of Sydney, Australia); Qingming Huang (University of Chinese Academy of Sciences)
Challenge Winner

FAIR-A* (Challenge Accuracy: 72.41)
Yu Jiang† (Facebook AI Research), Vivek Natarajan† (Facebook AI Research), Xinlei Chen† (Facebook AI Research), Dhruv Batra (Facebook AI Research & Georgia Tech), Marcus Rohrbach (Facebook AI Research), Devi Parikh (Facebook AI Research & Georgia Tech)
† equal contribution
Challenge Results

[Bar charts: overall accuracy per team, full range roughly 60–74; zoomed to the top teams, roughly 67–73.]

The 2018 winner improves on the 2017 winner by +3.4% absolute.
Statistical Significance

- Bootstrap samples 5000 times
- @ 95% confidence

[Chart: overall accuracy of the top teams (roughly 67–73) with 95% bootstrap confidence intervals.]
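The bootstrap procedure above can be sketched as follows; `per_question_acc` is a hypothetical array of per-question accuracies for one team, and two teams are judged significantly different when their intervals do not overlap (one common convention, assumed here):

```python
import random

def bootstrap_ci(per_question_acc, n_boot=5000, alpha=0.05, seed=0):
    """95% bootstrap confidence interval for a team's mean accuracy:
    resample questions with replacement and recompute the mean."""
    rng = random.Random(seed)
    n = len(per_question_acc)
    means = sorted(
        sum(rng.choices(per_question_acc, k=n)) / n for _ in range(n_boot)
    )
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Two hypothetical teams' per-question correctness (0/1) over 500 questions
team_a = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1] * 50
team_b = [1, 0, 1, 0, 1, 1, 1, 0, 0, 1] * 50
print(bootstrap_ci(team_a))
print(bootstrap_ci(team_b))
```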
Easy vs. Difficult Questions

[Histogram: percentage of questions (y-axis, 10–70) vs. number of top-10 teams that answered each question correctly (0/10 through 10/10), compared across the 2016, 2017, and 2018 challenges.]

- 82.5% of questions can be answered by at least 1 method!
- Difficult questions with rare answers include question types such as: What is the name of …, What is the number on …, What is written on the …, What does the sign …, What time is it?, What kind of …, What type of …, Why is the …
- The chart also distinguishes difficult questions with frequent answers and easy questions.
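The easy/difficult histogram above can be computed from per-team correctness flags. A minimal sketch, assuming hypothetical dicts mapping question id to 0/1 correctness, one per team:

```python
from collections import Counter

def difficulty_histogram(per_team_correct):
    """per_team_correct: list of dicts {question_id: 0/1}, one per team.
    Returns {k: fraction of questions answered correctly by exactly k teams}."""
    qids = per_team_correct[0].keys()
    counts = Counter(sum(team[q] for team in per_team_correct) for q in qids)
    n = len(qids)
    return {k: counts[k] / n for k in range(len(per_team_correct) + 1)}

# Three hypothetical teams over four questions
teams = [
    {"q1": 1, "q2": 1, "q3": 0, "q4": 1},
    {"q1": 1, "q2": 0, "q3": 0, "q4": 1},
    {"q1": 1, "q2": 1, "q3": 0, "q4": 0},
]
hist = difficulty_histogram(teams)
print(hist)  # {0: 0.25, 1: 0.0, 2: 0.5, 3: 0.25}
print(1 - hist[0])  # fraction answerable by at least one team: 0.75
```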
Answer Type Analyses

- SNU-BI performs the best for "number" questions

["number" accuracy per team, axis roughly 30–60, across all entries (FAIR-A*, HDU-UCAS-USYD, SNU-BI, …).]

- No team is statistically significantly better than the winning team for "yes/no" and "other" questions
Are models sensitive to subtle changes in images?

Recall the complementary pairs in VQA v2.0: similar images with different answers to the same question ("Who is wearing glasses?": "woman" vs. "man").

- Are predictions different for complementary images?
- Are predictions accurate for complementary images?

[Bar charts per team: percentage of complementary pairs with different predictions (axis roughly 40–70) and accuracy on complementary pairs (axis roughly 40–60). The 2017 winner's 52.7% improves by +4.8% absolute in 2018.]
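The two pair-level analyses above can be sketched as follows, assuming (as one plausible reading) that pair accuracy requires both predictions in a complementary pair to be correct; the data structures are hypothetical:

```python
def pair_metrics(pairs):
    """pairs: list of ((pred1, gt1), (pred2, gt2)) for complementary images
    sharing one question. Returns (fraction of pairs with different
    predictions, fraction of pairs with both predictions correct)."""
    diff = sum(p1 != p2 for (p1, _), (p2, _) in pairs) / len(pairs)
    both = sum(p1 == g1 and p2 == g2 for (p1, g1), (p2, g2) in pairs) / len(pairs)
    return diff, both

# Hypothetical predictions for two complementary pairs
pairs = [
    (("woman", "woman"), ("man", "man")),  # different predictions, both correct
    (("yes", "yes"), ("yes", "no")),       # same prediction, one wrong
]
print(pair_metrics(pairs))  # (0.5, 0.5)
```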
Are models driven by priors?

Only consider test questions whose answers are not popular (given the question type) in training:
- 1-Prior: The test answer is not the top-1 most common answer in training
- 2-Prior: The test answer is not among the top-2 most common answers in training
Agrawal et al., CVPR 2018
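The 1-Prior/2-Prior filters above can be sketched as follows (hypothetical data; a question type is a prefix such as "what color is the"):

```python
from collections import Counter, defaultdict

def non_k_prior_questions(train, test, k):
    """Keep test questions whose answer is NOT among the top-k most
    common training answers for the same question type.
    train/test: lists of (question_type, answer) pairs."""
    by_type = defaultdict(Counter)
    for qtype, ans in train:
        by_type[qtype][ans] += 1
    kept = []
    for qtype, ans in test:
        top_k = {a for a, _ in by_type[qtype].most_common(k)}
        if ans not in top_k:
            kept.append((qtype, ans))
    return kept

train = ([("what color is the", "red")] * 5
         + [("what color is the", "blue")] * 3
         + [("what color is the", "green")])
test = [("what color is the", "red"),
        ("what color is the", "blue"),
        ("what color is the", "green")]
print(non_k_prior_questions(train, test, 1))  # blue and green survive
print(non_k_prior_questions(train, test, 2))  # only green survives
```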
Are models driven by priors?

[Bar charts per team: accuracy on all questions vs. Non-1-Prior questions (a 5–6% drop) and vs. Non-2-Prior questions (a 15–16% drop); a zoomed view spans roughly 52–58.]
Improvement from 2017 challenge
- 1-Prior: Best performance improved by 3.8%
- 2-Prior: Best performance improved by 3.3%
Are models compositional?

Only consider questions that are compositionally novel:
- The QA pair is not seen in training
- Its constituent concepts are seen in training

Agrawal et al., arXiv 2018
Are models compositional?

[Bar charts per team: accuracy on all questions vs. compositionally novel questions (a 12–13% drop); a zoomed view spans roughly 53–61. The 2017 winner's 56.5% improves by +3.4% absolute in 2018.]
Average answer recall

- New accuracy metric proposed by Kafle and Kanan, ICCV 17
  – Also known as "normalized accuracy"
- Method:
  – Compute accuracy for each unique answer
  – Take the mean over all unique answers
- Rewards models that perform well on rare answers
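The metric above can be sketched as follows; `predictions` is a hypothetical list pairing each question's ground-truth answer with the model's accuracy on that question:

```python
from collections import defaultdict

def average_answer_recall(predictions):
    """predictions: list of (ground_truth_answer, per_question_accuracy).
    Mean of per-answer mean accuracies, so each unique answer counts
    equally regardless of how often it occurs."""
    by_answer = defaultdict(list)
    for gt, acc in predictions:
        by_answer[gt].append(acc)
    return sum(sum(v) / len(v) for v in by_answer.values()) / len(by_answer)

# Hypothetical: "yes" is frequent and easy; "unicycle" is rare and hard
preds = [("yes", 1.0), ("yes", 1.0), ("yes", 1.0), ("unicycle", 0.0)]
print(average_answer_recall(preds))  # 0.5, vs. plain mean accuracy 0.75
```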
[Bar chart: average answer recall per team, axis roughly 18–30, across all entries.]
Progress in VQA

[Timeline chart: accuracy on VQA v2 (y-axis 50–75) from Dec 2015 through mid-2018, marking the ICCV 15 baseline, the 2016 challenge winner (+7.0% absolute), the 2017 challenge winner (+6.7% absolute, at the Challenge 2017 deadline), and the 2018 challenge winner (+3.4% absolute, at the Challenge 2018 deadline).]
Visual Dialog Challenge 2018

- Deadline: mid-August 2018
- Results: September 8th, 2018 at ECCV 2018
- visualdialog.org/challenge/2018

Dataset:
- ~130K images (COCO)
- 10-round dialog per image
- ~1.3M QA pairs

Evaluation:
- Automatic metrics
- Human annotations