Sep 17, 2023

WHU_NERCMS at TRECVID2018: INS. Dongshu Xu, Longxiang Jiang, Xiaoyu Chai, Jin Chen, Han Fang, Li Jiao, Jiaqi Li, Shichen Lu, and Chao Liang. National Engineering Research Center for Multimedia Software, Wuhan University, Wuhan, 430072, China.

SLIDE 1

WHU_NERCMS at TRECVID2018: INS

National Engineering Research Center for Multimedia Software

Wuhan University, Wuhan, 430072, China
cliang@whu.edu.cn

Dongshu Xu, Longxiang Jiang, Xiaoyu Chai, Jin Chen, Han Fang, Li Jiao, Jiaqi Li, Shichen Lu, and Chao Liang

SLIDE 2

Contents

1 Introduction
2 Our approach
3 Results & conclusions

SLIDE 3

Introduction

TRECVID 2018 INS Task

Person (e.g. Jane): given the person's name, example images, and shots
Scene (e.g. cafe2): given the scene's name, example images, and shots
Task: retrieve the specific person in the specific scene

SLIDE 4

1 Introduction
2 Our approach
3 Results & conclusions

SLIDE 5

Framework

Components shown in the framework diagram:
Face features f_face (MTCNN)
Re-id features f_reid (SSD)
Local scene features f_local_scene (SSD)
Global scene features f_global_scene
Score fusion -> ranking list

SLIDE 6

Local scene retrieval

Framework

[Diagram labels (SSD): query category, input image, trained SSD network, expected results, input keyframes, initial pedestrian features, stage 1, stage 2]

SLIDE 7

Global scene retrieval

Places365-CNN Network

The Places365 dataset covers 365 scene categories and provides pre-trained models for several network architectures.

ResNet50 pipeline: input images -> pretrained Places365-CNNs -> global features -> sort
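The "global features, then sort" step could be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes cosine similarity as the ranking measure and takes the best match over all query examples, neither of which the slide specifies. The function names and the toy 3-d vectors are hypothetical stand-ins for real CNN features.

```python
import math

def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def rank_by_global_feature(query_feats, gallery_feats):
    """Return gallery ids sorted from most to least similar to the query scene.

    query_feats: list of feature vectors from the scene example images.
    gallery_feats: dict mapping a keyframe id to its feature vector.
    """
    queries = [l2_normalize(q) for q in query_feats]
    scores = {}
    for key, feat in gallery_feats.items():
        g = l2_normalize(feat)
        # Assumption: score a keyframe by its best match over all query examples.
        scores[key] = max(sum(a * b for a, b in zip(q, g)) for q in queries)
    return sorted(scores, key=scores.get, reverse=True)

# Toy 3-d vectors stand in for the real global scene features.
ranking = rank_by_global_feature(
    [[1.0, 0.0, 0.0]],
    {"kf_a": [0.9, 0.1, 0.0], "kf_b": [0.0, 1.0, 0.0], "kf_c": [0.5, 0.5, 0.0]},
)
```

Normalizing first makes the dot product equal to the cosine similarity, so keyframes are ordered by direction of the feature vector rather than its magnitude.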

SLIDE 8

Training samples of scene retrieval

Training dataset
From different views
From different objects
Examples shown: cafe2, laun

Dataset production
Keyframes are labelled with landmarks.
Scene landmarks: Pub, Cafe2, Laun, Market

SLIDE 9

Face recognition

Face Detection -> Face Alignment -> Feature Extraction -> Distance Measure

Face Detection: MTCNN

SLIDE 10

Face recognition

Face Detection -> Face Alignment -> Feature Extraction -> Distance Measure

Face Alignment: similarity transformation
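As an illustration of alignment by similarity transformation, the sketch below fits the least-squares rotation, uniform scale, and translation that map detected landmarks onto a template. It is not taken from the slides: the five template coordinates are hypothetical, and the use of a 5-point landmark set is an assumption. A 2-D similarity transform is written as z -> a * z + b on complex numbers.

```python
def fit_similarity(src, dst):
    """Least-squares similarity transform mapping src points onto dst points.

    Points are (x, y) tuples. The transform is z -> a * z + b on complex
    numbers: a encodes rotation and uniform scale, b the translation.
    """
    p = [complex(x, y) for x, y in src]
    q = [complex(x, y) for x, y in dst]
    p_mean = sum(p) / len(p)
    q_mean = sum(q) / len(q)
    # Closed-form least-squares solution on the centered point sets.
    num = sum((pi - p_mean).conjugate() * (qi - q_mean) for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    a = num / den
    b = q_mean - a * p_mean
    return a, b

def apply_similarity(a, b, point):
    """Apply the fitted transform to one (x, y) point."""
    z = a * complex(*point) + b
    return (z.real, z.imag)

# Hypothetical 5-point template (eyes, nose, mouth corners) in pixels.
template = [(30.0, 50.0), (70.0, 50.0), (50.0, 70.0), (35.0, 90.0), (65.0, 90.0)]
# A detected face that is the template scaled by 2 and shifted by (10, 20).
detected = [(2 * x + 10, 2 * y + 20) for x, y in template]
a, b = fit_similarity(detected, template)
aligned = [apply_similarity(a, b, pt) for pt in detected]
```

In practice the fitted transform would be used to warp the face crop, so that every face reaches the feature extractor in the same pose and scale.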

SLIDE 11

Face recognition

Face Detection -> Face Alignment -> Feature Extraction -> Distance Measure

Feature Extraction: Face-ResNet

[Architecture diagram: Res Block (4), Res Block (10), Res Block (6); legend: C, P, Res Block]

SLIDE 12

Face recognition

Face Detection -> Face Alignment -> Feature Extraction -> Distance Measure

Distance Measure: cosine distance

SLIDE 13

Face recognition

Pipeline: the topic identity and the extended reference identity map are compared with the gallery set (Shot 1 ... Shot n, features f1, f2, ..., fn) by cosine distance to produce a score.

for i = 1:n { process the i-th shot }

Example: identity Max has the highest score.
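The per-shot loop on this slide could look roughly like the sketch below. It is an assumption-laden illustration rather than the authors' code: the data layout (dicts of feature lists), the function names, and the choice to keep the single best-scoring identity per shot are all hypothetical. Cosine similarity is used as the score, so a higher value means a smaller cosine distance.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def identify_shots(shot_faces, reference_map):
    """Assign each shot the reference identity with the highest cosine score.

    shot_faces: dict shot id -> list of face feature vectors found in the shot.
    reference_map: dict identity name -> list of reference feature vectors.
    Returns dict shot id -> (best identity, best score); shots without
    faces are skipped.
    """
    results = {}
    for shot_id, faces in shot_faces.items():  # "for i = 1:n" on the slide
        best = None
        for face in faces:
            for name, refs in reference_map.items():
                for ref in refs:
                    score = cosine_similarity(face, ref)
                    if best is None or score > best[1]:
                        best = (name, score)
        if best is not None:
            results[shot_id] = best
    return results

# Toy 2-d vectors stand in for real face features.
reference_map = {"Max": [[1.0, 0.0]], "Jane": [[0.0, 1.0]]}
shots = {"shot1": [[0.9, 0.1]], "shot2": [[0.2, 0.8]], "shot3": []}
results = identify_shots(shots, reference_map)
```

Here shot1 is assigned to Max and shot2 to Jane, while shot3 has no detected face and receives no identity.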

SLIDE 14

Person re-identification based person search

We apply a person re-identification technique based on Aligned Re-id.

Person search pipeline: query person examples -> person detection (SSD) -> Aligned Re-id [1] global feature (2048-d) -> similarity score -> rank

[1] X. Zhang, H. Luo, et al. AlignedReID: Surpassing Human-Level Performance in Person Re-Identification. arXiv:1711.08184v2, 2017.

SLIDE 15

Person re-identification based person search

[Diagram labels: image set; face bounding box (with id); person bounding box (without id); k-means retag; person bounding box (with id); training dataset]
For example: ids 76, 7, 98

Details of the training dataset
Number of images: 2,486,571
Number of ids: 194
Number of clusters: 24,864

How to get training dataset
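One way to pass an id from a face bounding box to a person bounding box is by box overlap, sketched below. This is only a guess at the mechanism: the slides mention face boxes with ids, person boxes without ids, and (in the results analysis) an IoU computation, but not how they are combined. The functions, the (x1, y1, x2, y2) box format, and the "largest overlap wins" rule are hypothetical, and the k-means retagging step is not modelled.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def transfer_ids(face_boxes, person_boxes):
    """Give each person box the id of the face box it overlaps most.

    face_boxes: list of (box, identity id) pairs.
    person_boxes: list of boxes without ids.
    Returns a list of (person box, id), with id None when no face overlaps.
    """
    tagged = []
    for person in person_boxes:
        best_id, best_iou = None, 0.0
        for face, face_id in face_boxes:
            overlap = iou(face, person)
            if overlap > best_iou:
                best_id, best_iou = face_id, overlap
        tagged.append((person, best_id))
    return tagged

# Toy boxes; the ids 76 and 7 echo the example ids on the slide.
faces = [((10, 10, 30, 30), 76), ((110, 10, 130, 30), 7)]
persons = [(0, 0, 40, 100), (100, 0, 140, 100), (200, 0, 240, 100)]
tagged = transfer_ids(faces, persons)
```

Because a face box is much smaller than the person box around it, the IoU values stay low even for correct pairs (0.1 in this toy case), which is one reason a fixed IoU threshold would be fragile.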

SLIDE 16

Person re-identification based person search

Visualization results

[Figure: Aligned re-id results for a good and a bad query, each showing the probe and the rank list (top 6)]

√ √ √ √ √ √ √ × √ × √ √

The reason for the bad query is that the clothes are too similar,

SLIDE 17

Score fusion

Weight-based score fusion

[Diagram labels: f_scene, f_face, f_topic; branches: false, true]
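A weight-based fusion of the face and scene scores might be sketched as a convex combination, as below. The slide does not give the formula, the weight value, or the meaning of the true/false branch, so the linear form, the `face_weight` parameter, and the treatment of missing scores as 0 are all assumptions.

```python
def fuse_scores(face_scores, scene_scores, face_weight=0.5):
    """Rank shots by a weighted sum of face and scene scores.

    face_scores, scene_scores: dict shot id -> score.
    face_weight: weight of the face score; the scene score gets 1 - face_weight.
    Shots missing from one dict contribute 0 for that modality (assumption).
    Returns a list of (shot id, fused score), best first.
    """
    if not 0.0 <= face_weight <= 1.0:
        raise ValueError("face_weight must lie in [0, 1]")
    shots = set(face_scores) | set(scene_scores)
    fused = {
        s: face_weight * face_scores.get(s, 0.0)
        + (1.0 - face_weight) * scene_scores.get(s, 0.0)
        for s in shots
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

# Toy scores for three shots.
ranked = fuse_scores(
    {"s1": 0.9, "s2": 0.2},
    {"s1": 0.4, "s2": 0.8, "s3": 0.6},
    face_weight=0.5,
)
```

With equal weights, s1 (strong face match, moderate scene match) ranks ahead of s2 and s3; shifting `face_weight` toward 0 would favour the scene score instead.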

SLIDE 18

Score fusion

Face filter and person expansion

Face Library: detected face -> assign id -> filter: drop shots without target person id
Person Library: detected person -> assign id -> expand: expand shots with target person id
Ranking: rank with scene score -> ranking list
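The filter-then-expand idea can be sketched as set operations followed by a sort on the scene score. This is one reading of the diagram, not the authors' implementation: the data layout (sets of ids per shot), the function name, and the default scene score of 0 for unscored shots are hypothetical.

```python
def filter_and_expand(scene_scores, face_ids, person_ids, target):
    """Select shots containing the target person and rank them by scene score.

    scene_scores: dict shot id -> scene score.
    face_ids: dict shot id -> set of ids assigned to detected faces.
    person_ids: dict shot id -> set of ids assigned to detected persons.
    target: the target person id.
    """
    # Face filter: drop shots whose faces do not carry the target id.
    kept = {s for s, ids in face_ids.items() if target in ids}
    # Person expansion: add shots whose person detections carry the target id,
    # for example when the face is not visible.
    kept |= {s for s, ids in person_ids.items() if target in ids}
    # Rank the surviving shots by scene score (assumed 0 if unscored).
    return sorted(kept, key=lambda s: scene_scores.get(s, 0.0), reverse=True)

# Toy data: s4 has the best scene score but never shows Jane.
scene_scores = {"s1": 0.9, "s2": 0.7, "s3": 0.8, "s4": 0.95}
face_ids = {"s1": {"Jane"}, "s2": {"Max"}, "s4": set()}
person_ids = {"s3": {"Jane"}, "s4": {"Max"}}
ranking = filter_and_expand(scene_scores, face_ids, person_ids, "Jane")
```

In this toy case s3 is recovered only through the person library, which is the situation the expansion step is meant to cover.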

SLIDE 19

Contents

1 Introduction
2 Our approach
3 Results & conclusions

SLIDE 20

Results & conclusions

[Result charts: Auto, Interactive]

Results analysis

The ineffectiveness of re-id:
  IoU computation
  Cluster strategy
The effectiveness of fine-tuning:
  Fine-tuned on some scenes

SLIDE 21

Results & conclusions

Conclusions

Face recognition is a key method for identifying a person. A new person search method should be introduced for person images with back or side views or in low resolution.
The training dataset of the scene model needs more effective images, including different views of positive and negative scenes.
The score fusion and expansion method is useful for retrieving hard samples.

