Fine-grained Video-Text Retrieval


Apr 26, 2023

Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning. Shizhe Chen (1), Yida Zhao (1), Qin Jin (1), Qi Wu (2). (1) Renmin University of China, (2) University of Adelaide

SLIDE 1

Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning

Shizhe Chen1, Yida Zhao1, Qin Jin1, Qi Wu2

1Renmin University of China, 2University of Adelaide

SLIDE 2

Video-Text Cross-modal Retrieval

Dominant approach: learning a joint embedding space
Global visual-semantic matching
Limitation: a single vector can hardly encode fine-grained details
Local visual-semantic matching
Limitation: relationships between local vectors are not well captured by sequential modeling
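
As a rough illustration of global matching in a joint embedding space, the sketch below ranks candidate sentences for one video by cosine similarity. The vectors and captions are made-up toy values, and the helper names (`cosine`, `rank_sentences`) are hypothetical; in practice the embeddings would come from learned video and text encoders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_sentences(video_vec, sentence_vecs):
    """Return sentence keys sorted from best to worst match."""
    scores = {key: cosine(video_vec, vec) for key, vec in sentence_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy embeddings (assumed values, not taken from the slides).
video = [0.9, 0.1, 0.3]
sentences = {
    "a dog catches a frisbee": [0.8, 0.2, 0.4],
    "a man plays the guitar": [0.1, 0.9, 0.2],
}

ranking = rank_sentences(video, sentences)
print(ranking[0])
```

Because each side is compressed into a single vector, two sentences that differ only in one entity or action can end up with nearly identical scores, which is the limitation noted for global matching.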

SLIDE 3

Hierarchical Graph Reasoning Model (HGR)

Hierarchical Textual Encoding
Decompose sentence into semantic role graph
Capture relationships via graph reasoning
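
To make the textual encoding idea more concrete, here is a minimal sketch of a semantic role graph and one round of graph reasoning. The graph is hand-built for a made-up sentence (a real system would derive it with a semantic role labeling tool), the role labels and feature values are illustrative, and the update rule is plain mean-aggregation message passing chosen for simplicity. It should not be read as the authors' actual graph layer.

```python
# Hypothetical semantic role graph for the sentence
# "a dog catches a frisbee in the park".
# Edges: (source, target, role); the role labels are illustrative only.
edges = [
    ("event", "catches", "predicate"),
    ("catches", "dog", "agent"),
    ("catches", "frisbee", "patient"),
    ("catches", "park", "location"),
]

# Toy 2-d node features (assumed values); real features would be
# learned word or phrase embeddings.
features = {
    "event": [1.0, 0.0],    # whole sentence (event level)
    "catches": [0.0, 1.0],  # action level
    "dog": [1.0, 1.0],      # entity level
    "frisbee": [0.0, 0.0],  # entity level
    "park": [1.0, 0.0],     # entity level
}

def neighbours(node, edges):
    """Nodes directly connected to `node`, ignoring edge direction."""
    found = []
    for src, dst, _role in edges:
        if src == node:
            found.append(dst)
        elif dst == node:
            found.append(src)
    return found

def reasoning_step(features, edges):
    """One round of message passing: mix each node with its neighbour mean."""
    updated = {}
    for node, vec in features.items():
        nbrs = neighbours(node, edges)
        if not nbrs:
            updated[node] = list(vec)
            continue
        mean = [sum(features[n][i] for n in nbrs) / len(nbrs)
                for i in range(len(vec))]
        updated[node] = [0.5 * own + 0.5 * m for own, m in zip(vec, mean)]
    return updated

updated = reasoning_step(features, edges)
print(updated["catches"])
```

After one step the action node carries information from the event and from all three entities, which is the kind of relationship that sequential modeling alone captures poorly.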

Multi-level Video-Text Matching
Levels, from global to local: Event, Actions, Entities
Hierarchical Video Encoding
Guided by different levels of text to learn diverse video representations
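
The sketch below shows one way multi-level matching could be combined into a single score, assuming the video has already been encoded into three level-specific representations. All names (`local_score`, `multi_level_score`, the dictionary keys) and all vectors are hypothetical, and the local alignment rule (best-matching frame per text node, averaged) is a simplification rather than the model's actual matching function.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def local_score(frame_vecs, node_vecs):
    """Each text node picks its best-matching frame; average over nodes."""
    best = [max(cosine(frame, node) for frame in frame_vecs)
            for node in node_vecs]
    return sum(best) / len(best)

def multi_level_score(video, text):
    """Average the event (global), action and entity (local) similarities."""
    event = cosine(video["event"], text["event"])
    action = local_score(video["action_frames"], text["actions"])
    entity = local_score(video["entity_frames"], text["entities"])
    return (event + action + entity) / 3.0

# Toy level-specific embeddings (assumed values).
video = {
    "event": [1.0, 0.0],                        # one global video vector
    "action_frames": [[1.0, 0.0], [0.0, 1.0]],  # per-frame action features
    "entity_frames": [[1.0, 0.0], [0.0, 1.0]],  # per-frame entity features
}
text = {
    "event": [1.0, 0.0],
    "actions": [[0.0, 1.0]],
    "entities": [[1.0, 0.0], [1.0, 1.0]],
}

score = multi_level_score(video, text)
print(round(score, 4))
```

Keeping the three levels separate means a sentence with the right overall event but a wrong entity is penalised at the entity level instead of being averaged away in one global vector.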

SLIDE 4

Experiments

In-domain Cross-modal Retrieval
Better performance across three datasets
Cross-domain Generalization
Generalizes better across datasets
Fine-grained Binary Selection
Differentiates fine-grained differences between positive and negative sentences
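
For the fine-grained binary selection task, a plausible way to score a model is the fraction of videos for which the positive sentence receives a higher matching score than its negative counterpart. The snippet below sketches that metric with made-up scores; the function name and the numbers are assumptions, not the evaluation code or results from the presentation.

```python
def binary_selection_accuracy(score_pairs):
    """Fraction of (positive, negative) score pairs where positive wins."""
    if not score_pairs:
        raise ValueError("score_pairs must not be empty")
    correct = sum(1 for pos, neg in score_pairs if pos > neg)
    return correct / len(score_pairs)

# Each pair: (score of the correct sentence, score of a minimally
# altered negative sentence) for the same video. Values are invented.
pairs = [(0.82, 0.40), (0.55, 0.61), (0.73, 0.70), (0.90, 0.15)]

accuracy = binary_selection_accuracy(pairs)
print(accuracy)
```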

SLIDE 5

Conclusion

Decompose videos and texts into hierarchical semantic levels
Utilize graph reasoning to generate hierarchical embeddings
Evaluate on in-domain, cross-domain and fine-grained binary selection to demonstrate the model's effectiveness

Code and datasets will be released at: https://github.com/cshizhe/hgr_v2t