What Does BERT with Vision Look At? (PowerPoint PPT presentation by Liunian Harold Li, Mark Yatskar, et al.)



SLIDE 1

What Does BERT with Vision Look At?

A long version, “VisualBERT: A Simple and Performant Baseline for Vision and Language”, is on arXiv (Aug 2019).

Liunian Harold Li (UCLA), Mark Yatskar (AI2), Da Yin (PKU), Cho-Jui Hsieh (UCLA), Kai-Wei Chang (UCLA)

SLIDE 2

BERT with Vision: Pre-trained Vision-and-language (V&L) Models

Masked: Several people [MASK] on a [MASK] in the [MASK] with [MASK].
Predicted: Several people walking on a sidewalk in the rain with umbrellas.

a) Yes, it is snowing.
b) Yes, [person8] and [person10] are outside.
c) No, it looks to be fall.
d) Yes, it is raining heavily.

Pre-train on image captions and transfer to visual question answering

SLIDE 3

BERT with Vision: Pre-trained Vision-and-language (V&L) Models

- Mask and predict on image captions
- Transformer over image regions and text
- Significant improvement over baselines: ViLBERT, B2T2, LXMERT, VisualBERT, Unicoder-VL, VL-BERT, UNITER, …

Performance of VisualBERT compared to strong baselines
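The mask-and-predict pre-training step can be sketched in a few lines: mask caption tokens, then feed the remaining text together with image-region features to a transformer that must recover the masked words. The helper names and the region-token encoding below are illustrative assumptions, not VisualBERT's actual API:

```python
import random

MASK = "[MASK]"

def mask_caption(tokens, mask_prob=0.15, rng=None):
    """BERT-style masking for the mask-and-predict objective:
    each caption token is replaced by [MASK] with probability
    mask_prob; the model must predict the originals from the
    surviving text and the image regions."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    masked, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets.append(tok)   # token the model must predict
        else:
            masked.append(tok)
            targets.append(None)  # not a prediction target
    return masked, targets

def build_input(text_tokens, region_ids):
    """The transformer input is simply text tokens followed by
    region tokens, so self-attention can flow across modalities.
    (Real models use region feature vectors, not string tokens.)"""
    return text_tokens + [f"[REGION_{i}]" for i in region_ids]
```

In the real model, each `[REGION_i]` slot would carry an object-detector feature vector plus a positional/segment embedding; the string placeholders here only show how the two sequences are concatenated.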

SLIDE 4

What does BERT with Vision learn during pre-training?

Entity grounding

Map entities to regions

SLIDE 5

Probing attention maps of VisualBERT: Entity Grounding

- Certain heads can perform entity grounding
- Accuracy peaks in higher layers

[Figure: entity grounding accuracy by layer; best head reaches 50.77]
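The entity-grounding probe can be sketched as follows: for each entity word, check whether a head's highest-weight image region is the gold-aligned region. This is a minimal sketch assuming attention weights and gold alignments are already extracted; the data layout is hypothetical, not the paper's code:

```python
def grounding_accuracy(attention, entity_to_region):
    """Probe one attention head for entity grounding.

    attention: dict mapping a text-token index to its list of
        attention weights over image regions (one head, one layer).
    entity_to_region: dict mapping each entity-token index to its
        gold-aligned region index (Flickr30k Entities-style
        annotations; this layout is an illustrative assumption).

    A head "grounds" an entity if its highest-weight region
    is the gold region.
    """
    correct = 0
    for tok, gold_region in entity_to_region.items():
        weights = attention[tok]
        pred = max(range(len(weights)), key=weights.__getitem__)
        correct += (pred == gold_region)
    return correct / len(entity_to_region)
```

Running this probe per head and per layer is what produces the accuracy-by-layer curve the slide summarizes.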

SLIDE 6

Syntactic grounding

Map w1 to the regions of w2, if w1 → w2 is a dependency relation

What does BERT with Vision learn during pre-training?

SLIDE 7

Probing attention maps of VisualBERT: Syntactic Grounding

For each dependency relationship, there exists at least one accurate syntax grounding head
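The syntactic probe extends the entity probe: for a dependency edge w1 → w2 where w2 is grounded to a gold region, check whether w1's attention over regions also peaks at that region. As before, the data layout below is an illustrative assumption, not the paper's code:

```python
def syntactic_grounding_accuracy(attention, dep_edges, entity_to_region):
    """Probe one attention head for syntactic grounding.

    attention: dict mapping a text-token index to its attention
        weights over image regions (one head, one layer).
    dep_edges: list of (w1, w2) token-index pairs for one
        dependency relation (e.g. all nsubj edges).
    entity_to_region: gold region alignment for entity tokens
        (hypothetical layout, as in the entity probe).

    An edge counts as correct if w1's highest-weight region is
    the gold region of its dependency partner w2.
    """
    correct, total = 0, 0
    for w1, w2 in dep_edges:
        if w2 not in entity_to_region:
            continue  # w2 has no gold region; skip this edge
        weights = attention[w1]
        pred = max(range(len(weights)), key=weights.__getitem__)
        correct += (pred == entity_to_region[w2])
        total += 1
    return correct / total if total else 0.0
```

Evaluating this per dependency label (pobj, nsubj, …) is what supports the claim that each relation has at least one accurate grounding head.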

SLIDE 8

Probing attention maps of VisualBERT: Syntactic Grounding

Syntactic grounding accuracy peaks in higher layers

[Figure: syntactic grounding accuracy by layer, shown for the pobj and nsubj relations]

SLIDE 9

Probing attention maps of VisualBERT: Qualitative Example

[Figure: attention maps for “woman”, “sweater”, and “husband” across layers 3, 4, 5, 6, 10, and 11]

- Accurate entity and syntax grounding
- Refined understanding over the layers

SLIDE 10

Discussion

Previous work:
- Pre-trained language models learn the classical NLP pipeline (Peters et al., 2018; Liu et al., 2019; Tenney et al., 2019)
- Qualitatively, V&L models learn some entity grounding (Yang et al., 2016; Anderson et al., 2018; Kim et al., 2018)
- Grounding can be learned using dedicated methods (Xiao et al., 2017; Datta et al., 2019)

Our paper:
- BERT with Vision learns grounding through pre-training
- We quantitatively verify both entity and syntactic grounding

https://github.com/uclanlp/visualbert