VCI²R at the NTCIR-13 Lifelog-2 LSAT Task

Presented by: Qianli Xu
Co-authors: Jie Lin, Ana del Molino, Qianli Xu, Fen Fang, V. Subbaraju, Joo-Hwee Lim, Liyuan Li, V. Chandrasekhar
Organization: Institute for Infocomm Research, A*STAR, Singapore

About VCI²R
• Institute for Infocomm Research (I²R), A*STAR, Singapore
  – Visual Computing
  – Human Language Tech
  – Data Analytics
  – Neural Biomedical Tech
  – etc.
• Visual Computing Department
  – Video/image analytics & search
  – Augmented visual intelligence
  – Visual inspection
Website: www.a-star.edu.sg/i2r/

LSAT Framework
• Relevant concepts: Which CNN predictions are relevant to the query topics?
• Feature weighting: Which features contribute the most?
• Temporal smoothing: Enforce temporal coherence, remove outliers
• Post filtering: Refine the search using location (GPS) and time

[Framework diagram. Query topics (“Castle @ Night”, “Working in a coffee shop”, “Gardening in my home”) and images + metadata are separated by a semantic gap. Offline: training images. Online: query topic, images, temporal smoothing. Features and weights: CNN object classifier (w1), CNN places classifier (w2), Faster RCNN object detector (w3), NTCIR-13 Lifelog classifier (w4), user-given time tag (w5) and location tag (w6), # people (w7).]

del Molino, et al., 2017, VC-I2R at ImageCLEF2017: Ensemble of deep learned features for lifelog video summarization. CLEF Working Notes, CEUR.

1. Getting the Basic Semantics
• CNN classifiers
  – Object: ResNet152, trained on ImageNet1K
  – Place: ResNet152, trained on Places365
• CNN detector
  – Faster R-CNN, trained on MSCOCO (80 classes)
• NTCIR-13 classifier
  – VGG-16, pretrained on ImageNet1K
  – Replace the last layer (1K neurons) with 634 neurons
  – Sigmoid as the activation function
• Human detection and counting
  – Sighthound (https://www.sighthound.com)
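A sigmoid output layer scores each concept independently, which suits lifelog images that contain several concepts at once. The sketch below contrasts it with softmax on made-up logits; the concept names and logit values are hypothetical and only illustrate the difference, they are not taken from the NTCIR-13 classifier.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, applied independently to each concept logit."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    """Softmax over all logits; the scores compete and sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three concepts in a single image.
concepts = ["computer", "coffee cup", "castle"]
logits = [3.0, 2.5, -4.0]

sigmoid_scores = [sigmoid(v) for v in logits]
softmax_scores = softmax(logits)

for name, s, p in zip(concepts, sigmoid_scores, softmax_scores):
    print(f"{name:12s} sigmoid={s:.3f} softmax={p:.3f}")
```

With sigmoid, both "computer" and "coffee cup" can score high in the same image; with softmax, a high score for one pushes the other down.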

2. Aggregating & Weighing Features

Relevance mapping for each topic
Columns: Objects (Relevant) | Objects (Avoid) | Places (Relevant) | Places (Avoid) | MSCOCO (Relevant)
Task 1
  computer | - | computer | - | laptop
  group meeting | group meeting | keyboard
Task 2
  television | computer | living room | conference room | tv
  food | group meeting | television room | lecture room | remote
  glass | etc. | etc. | etc.
Task 3
  computer | office | coffee shop | conference room | laptop
  group meeting | living room | office | keyboard
  etc. | etc.
Task 4
  computer | office | living room | conference room | laptop
  pencil | hotel room | office | book
  notebook | etc. | etc. | etc.
Task 5
  food | drum | food court | - | fork
  glass | white goods | restaurant | sandwich
  menu | etc. | etc.

[Diagram: training images; feature sources ImageNet1K, Places365, MSCOCO, NTCIR, Time, # People, Location tag; feature weights w1 to w12; relevant concepts.]

CRF for feature weighing that accommodates individual differences

$$E_\theta(s) = \lambda \underbrace{\sum_{i} \phi_u(s_i)}_{\text{unary}} + \underbrace{\sum_{ij} \phi_p(s_i, s_j)}_{\text{pairwise}}$$

the unary potentials enforce the selection of static
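A minimal sketch of how per-source activations might be combined into one topic score, assuming a plain weighted sum in which "relevant" concepts add to the score and "avoid" concepts subtract from it. The function name, the fixed weights and the activation values are all hypothetical; the slides learn the weights with a CRF, which this sketch does not attempt to reproduce.

```python
from typing import Dict, List


def topic_score(
    activations: Dict[str, Dict[str, float]],
    relevant: Dict[str, List[str]],
    avoid: Dict[str, List[str]],
    weights: Dict[str, float],
) -> float:
    """Weighted sum over feature sources of (relevant - avoided) activation."""
    score = 0.0
    for source, w in weights.items():
        acts = activations.get(source, {})
        pos = sum(acts.get(c, 0.0) for c in relevant.get(source, []))
        neg = sum(acts.get(c, 0.0) for c in avoid.get(source, []))
        score += w * (pos - neg)
    return score


# Hypothetical mapping for a "working in a coffee shop" style topic.
relevant = {
    "objects": ["computer"],
    "places": ["coffee shop"],
    "mscoco": ["laptop", "keyboard"],
}
avoid = {"places": ["conference room", "office"]}
weights = {"objects": 0.5, "places": 0.3, "mscoco": 0.2}  # assumed, not learned

image_a = {
    "objects": {"computer": 0.9},
    "places": {"coffee shop": 0.8},
    "mscoco": {"laptop": 0.7},
}
image_b = {
    "objects": {"computer": 0.9},
    "places": {"office": 0.9},
    "mscoco": {"laptop": 0.7},
}

score_a = topic_score(image_a, relevant, avoid, weights)
score_b = topic_score(image_b, relevant, avoid, weights)
print(score_a, score_b)
```

Both images show a computer and a laptop, but the office scene is penalised by the "avoid" list, so the coffee-shop image ranks higher.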

3. Temporal Smoothing
• Adjacent lifelog images often belong to the same event.
• Temporal smoothing is used to ensure semantic coherence.
• A triangular window of size w is used. w is adapted to the event topic.

4. Post-filtering
• Increase the diversity of retrieved images (avoid retrieving images of the same event)
• Use time and location (GPS) to filter images
• Exclude images that are close in time and location.
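The two steps can be sketched as follows, under stated assumptions: the triangular window is truncated and renormalised at the sequence boundaries, locations are planar coordinates in metres rather than raw GPS, and an image is dropped only when it is close in both time and location to an image already kept. The function names and thresholds are hypothetical.

```python
import math
from typing import List, Tuple

Candidate = Tuple[str, float, Tuple[float, float]]  # (id, time in s, (x, y) in m)


def triangular_smooth(scores: List[float], w: int) -> List[float]:
    """Smooth time-ordered scores with a triangular window of odd size w."""
    if w < 1 or w % 2 == 0:
        raise ValueError("w must be a positive odd integer")
    half = w // 2
    smoothed = []
    for i in range(len(scores)):
        total, norm = 0.0, 0.0
        for offset in range(-half, half + 1):
            j = i + offset
            if 0 <= j < len(scores):
                weight = half + 1 - abs(offset)  # peak at the centre
                total += weight * scores[j]
                norm += weight
        smoothed.append(total / norm)
    return smoothed


def diversity_filter(
    ranked: List[Candidate], min_gap_s: float, min_dist_m: float
) -> List[str]:
    """Walk the ranked list and keep images not near any kept image."""
    kept: List[Candidate] = []
    for cand in ranked:
        _, t, (x, y) = cand
        too_close = any(
            abs(t - kt) < min_gap_s and math.hypot(x - kx, y - ky) < min_dist_m
            for _, kt, (kx, ky) in kept
        )
        if not too_close:
            kept.append(cand)
    return [image_id for image_id, _, _ in kept]


# An isolated low score inside a coherent event is pulled up by its neighbours.
smoothed = triangular_smooth([0.9, 0.9, 0.1, 0.9, 0.9], w=3)

# Ranked candidates, best first (hypothetical values).
ranked = [
    ("a", 0.0, (0.0, 0.0)),
    ("b", 60.0, (5.0, 0.0)),      # same place, one minute later
    ("c", 7200.0, (3.0, 0.0)),    # same place, two hours later
    ("d", 90.0, (5000.0, 0.0)),   # same time, far away
]
selected = diversity_filter(ranked, min_gap_s=600.0, min_dist_m=100.0)
print(smoothed, selected)
```

A larger w smooths over longer events, which is one way to read "w is adaptive to event topics"; how w is chosen per topic is not specified in the slides.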

Result
• Official score (precision): 57.6%

[Figure: mAP (0 to 1) per query topic for User 1 and User 2. Topics: Eat Lunch, Coffee, Graveyard, Working Late, Juice, Work w Coffee, Painting Walls, Eating Pasta, Exercises, Turtles, Gardening, Castle at Night, Sunset, Lecturing, Shopping, On Computer, Cooking, Flying, Photo of Sea, Beers in Bar, Greek Amphit, TV Recording, Mountain Hiking.]

Analysis (Fine-tuning)

[Figure: three panels showing mAP for User 1 and User 2. Value labels on the bars, in slide order: 0.826, 0.789, 0.761, 0.748, 0.761, 0.654, 0.528, 0.543, 0.502, 0.528.]

• Feature importance: decrease in performance when one type of feature is removed. The bigger the decrease, the more important the feature. Conditions: All, −NTCIR-13, −ImageNet1K, −Places365, −MSCOCO, −Location, −Time, −#People.
• Effect of threshold for relevant concept searching: semantic concepts whose activation level is above the threshold are considered relevant to the query topic. Conditions: Fixed, Adaptive (User), Adaptive (User + Event).
• Effect of temporal smoothing: whether temporal smoothing is performed or not. Conditions: No smoothing, Temporal smoothing.
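The feature-importance analysis is a leave-one-out ablation, which can be sketched as below. The `toy_map` evaluator and its contribution values are invented purely to make the example runnable; they are not the reported results, and a real evaluator would rerun retrieval and compute mAP against the ground truth.

```python
from typing import Callable, Dict, FrozenSet, Iterable


def ablation_importance(
    features: Iterable[str],
    evaluate: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Importance of a feature = score with all features minus score without it."""
    all_features = frozenset(features)
    full = evaluate(all_features)
    return {f: full - evaluate(all_features - {f}) for f in all_features}


# Toy stand-in for a retrieval run: each feature adds a fixed, made-up amount.
TOY_CONTRIBUTION = {"NTCIR-13": 0.30, "Places365": 0.15, "Time": 0.05}


def toy_map(active: FrozenSet[str]) -> float:
    return sum(TOY_CONTRIBUTION[f] for f in active)


importance = ablation_importance(TOY_CONTRIBUTION.keys(), toy_map)
most_important = max(importance, key=importance.get)
print(importance, most_important)
```

In a real system the features interact, so the drops need not add up to the full score the way they do in this additive toy.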

Summary
• A lot of fine-tuning and manual intervention is involved in the retrieval → Over-fitting?
• “Relevant” concepts may not be contributing, and vice versa.
• Interactive retrieval is probably a good intermediate solution.

[Diagram: Effective Lifelog Image Retrieval, surrounded by: Reasonable Ground Truth; Intelligence in Interpretation of Query Topics; Good Semantic Features; Intelligence in Model Fine-tuning; High Quality Data.]

Email: qxu@i2r.a-star.edu.sg
