

Jul 09, 2023


SLIDE 1

Situation Recognition: Visual Semantic Role Labelling for Image Understanding

Mark Yatskar, Luke Zettlemoyer, Ali Farhadi

Experiment presentation by Nayan Singhal

SLIDE 2

Motivation

Human understanding of images
Verbs in the English language

SLIDE 3

Approach

CRF with CNN
Log-linear loss

[Figure: CNN -> 1024 -> CRF. Example output: CARRYING (AGENT: WOMAN, ITEM: JAR, AGENTPART: HEAD, PLACE: OUTDOOR)]
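
The approach named above (a CRF on top of CNN features, trained with a log-linear loss) can be illustrated with a minimal sketch. It assumes the situation score factorizes into one verb potential plus one potential per (verb, role, noun) triple, and it normalizes over a small explicit candidate list instead of the full output space. All numbers, the candidate list, and the function names are hypothetical stand-ins for CNN-derived potentials; this is not the authors' implementation.

```python
import math

def situation_log_score(verb, role_nouns, verb_pot, role_noun_pot):
    """Unnormalized log-score: verb log-potential plus one log-potential per role."""
    score = verb_pot[verb]
    for role, noun in role_nouns.items():
        score += role_noun_pot[(verb, role, noun)]
    return score

def situation_probability(target, candidates, verb_pot, role_noun_pot):
    """Probability of `target` under a softmax over the explicit candidate list."""
    scores = [situation_log_score(v, rn, verb_pot, role_noun_pot)
              for v, rn in candidates]
    top = max(scores)  # subtract the max for numerical stability
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    verb, role_nouns = target
    return math.exp(
        situation_log_score(verb, role_nouns, verb_pot, role_noun_pot) - log_z)

# Hypothetical log-potentials (in a real model these would come from the CNN).
verb_pot = {"carrying": 2.0, "holding": 1.0}
role_noun_pot = {
    ("carrying", "agent", "woman"): 1.5,
    ("carrying", "agent", "man"): 0.5,
    ("carrying", "item", "jar"): 1.0,
    ("holding", "agent", "woman"): 1.0,
    ("holding", "item", "jar"): 0.5,
}
# Hypothetical candidate situations: (verb, {role: noun}).
candidates = [
    ("carrying", {"agent": "woman", "item": "jar"}),
    ("carrying", {"agent": "man", "item": "jar"}),
    ("holding", {"agent": "woman", "item": "jar"}),
]

probs = [situation_probability(c, candidates, verb_pot, role_noun_pot)
         for c in candidates]
# Log-linear (negative log-likelihood) loss if the first candidate is the label.
loss = -math.log(probs[0])
```

In this toy setup the first candidate has the highest total log-score, so it also receives the highest probability and the loss is small; lowering its potentials would raise the loss.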

SLIDE 4

What role do objects play in image understanding?

SLIDE 5

neighboring images

SLIDE 6

Remove Cliff

neighboring images Removing Cliff

SLIDE 7

Remove person

neighboring images Removing Man

SLIDE 8

Remove Sky

neighboring images Removing Sky

SLIDE 9

Image (2)

neighboring images

SLIDE 10

Remove Person

neighboring images Removing man

SLIDE 11

Remove Background

neighboring images Removing Sky and Man

SLIDE 12

Conclusion

Each object plays a significant role in image understanding.

SLIDE 13

Experiment

1) Analyzing Failure Cases
2) Different moods of faces

SLIDE 14

Experiment 1: Analyzing Failure Cases

SLIDE 15

Object Recognition (1)

imSitu Result

SLIDE 16

Object Recognition (2)

imSitu Result

SLIDE 17

Object Recognition (3)

imSitu Result

SLIDE 18

Object Recognition (4)

imSitu Result

SLIDE 19

Object Recognition (5)

imSitu Result

SLIDE 20

Object Recognition (6)

imSitu Result

SLIDE 21

Why is it happening? Are these images difficult to categorize?

SLIDE 22

Let’s analyze these with ImageNet

SLIDE 23

Object Recognition (1)

imSitu Result

SLIDE 24

ImageNet classification

SLIDE 25

Object Recognition (2)

imSitu Result

SLIDE 26

Object Recognition (2)

ImageNet classification

SLIDE 27

Object Recognition (3)

imSitu Result

SLIDE 28

Object Recognition (3)

ImageNet classification

SLIDE 29

Object Recognition (4)

imSitu Result

SLIDE 30

Object Recognition (4)

ImageNet classification

SLIDE 31

Object Recognition (5)

imSitu Result

SLIDE 32

Object Recognition (5)

ImageNet classification

SLIDE 33

Object Recognition (6)

imSitu Result

SLIDE 34

Object Recognition (6)

ImageNet classification

SLIDE 35

Object Recognition (1)

Verb-Role-Noun Potential, Labels

A (Verb-Role-Noun Potential) + B (Labels)

Post-processing: Slot Noun
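
The weighted combination on this slide can be sketched as follows. This is a minimal illustration that assumes A and B are scalar weights, that both sources give a score per candidate noun for a single role slot, and that a noun missing from one source contributes 0 from that source. The function names and all scores are hypothetical, not values from the experiment.

```python
def rescore_nouns(noun_potentials, label_scores, a=1.0, b=1.0):
    """Combine per-noun scores as a * potential + b * label score."""
    nouns = set(noun_potentials) | set(label_scores)
    return {
        noun: a * noun_potentials.get(noun, 0.0) + b * label_scores.get(noun, 0.0)
        for noun in nouns
    }

def best_noun(combined):
    """Pick the noun with the highest combined score for the slot."""
    return max(combined, key=combined.get)

# Hypothetical scores for one role slot of one image.
noun_potentials = {"dog": 0.40, "cat": 0.45}   # from the verb-role-noun potential
label_scores = {"dog": 0.30, "cat": 0.05}      # from an ImageNet classifier

# With both sources weighted equally, the classifier label tips the choice.
with_labels = best_noun(rescore_nouns(noun_potentials, label_scores, a=1.0, b=1.0))
# With b = 0 the result falls back to the potential alone.
without_labels = best_noun(rescore_nouns(noun_potentials, label_scores, a=1.0, b=0.0))
```

Setting `b=0.0` reproduces the unmodified prediction, which makes it easy to compare the slot noun before and after the labels are added.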

SLIDE 36

Object Recognition (1)

imSitu Result

SLIDE 37

Object Recognition (1)

VGG, Verb Potential, Verb-Role-Noun Potential, ImageNet Labels

Preprocessing

SLIDE 38

Future Work

Add labels in preprocessing.

SLIDE 39

Experiment 2: Different moods

Laughing
Smiling
Frowning
Grimacing
Winking
Squinting
Shouting
Puckering

[Example images: Laughing, Smiling, Frowning, Puckering, Squinting, Winking]

SLIDE 40

Success Case

Verb      Roles         Noun  Value
Smiling   Agent, Place  man   0.35967
Laughing  Agent, Place  man   0.35777
Shouting  Agent, Place  man   0.37531

SLIDE 41

Success Case

Verb       Roles         Noun  Value
Frowning   Agent, Place  man   0.24378
Grimacing  Agent, Place  man   0.21052

SLIDE 42

Failure Case

Verb       Roles         Noun   Value
Winking    Agent, Place  man    0.20954
Puckering  Agent, Place  woman  0.21052

SLIDE 43

Test Images (25)

Conclusion: Detects different moods of faces with slight variation.

SLIDE 44

Some Interesting Categorizations

SLIDE 45

Some Interesting Categorizations

Camouflaging: Agent = frog, Hiding Item = pebble, Place = (not given)

Camouflaging: Agent = (not given), Hiding Item = tree, Place = outdoors

SLIDE 46

neighboring images

SLIDE 47

neighboring images

SLIDE 48

Thank You

SLIDE 49

No Agent (2)

Watering: Agent = person, Tool = bucket, Place = garden
Shredding: Agent = person, Tool = shredder, Item = paper, Place = (not given)