THUIR at the NTCIR-14 Lifelog-3 (LIT Task): How does lifelog help - - PowerPoint PPT Presentation
THUIR at the NTCIR-14 Lifelog-3 (LIT Task): How does lifelog help the user's status recognition
Isadora Nguyen Van Khan, Pranita Shrestha, Min Zhang, Yiqun Liu and Shaoping Ma
Tsinghua University
z-m@tsinghua.edu.cn
June 12, 2019, Tokyo
Ø A user's statuses can be used for self-monitoring or as features for other work
Ø Previous research focused on physical or health statuses
Ø Continue to enhance the user's life based on his/her personal data
Ø Statuses are good indicators of both the mental and physical health of a user
Introduction
Ø Activity analysis
  Ø Characterizing [Wang 2013], matching [Yamauchi 2016], annotation [Xu 2017], recognition [Doherty 2011]
  Ø Linking lifestyle traits to groups of people [Doherty 2011]
  Ø Statistics to give meaningful results to users
Ø Health analysis
  Ø Sleep, psychology, mood … [Soleimanidjan 2017-2018, Iijima 2016]
  Ø Detection and prediction
  Ø Personalized advice
Ø User status has never been studied before in lifelogging:
  Ø Indoor vs. outdoor, alone vs. not alone, working vs. not working
  Ø Can be used as the basis of activity and health analysis
Related Work
Lifelog Insight task: User Status Detection
Is the user inside or outside?
Is the user alone, or is there at least one person in his/her surroundings?
Is the user working?
Ø Features
  Ø Non-visual features (e.g. biometrics, activity, environment)
  Ø Visual-based features
  Ø Merged features
Category | One Sample per Image Features | One Sample per Segmentation Features
Users | UserID | UserID
Biometrics | Heart Rate, Calories | Average heart rate, Average calories, Sum of calories
Activities | Steps, Activity | Sum of steps, Average steps, Activity
Environment | Location, City, Longitude, Latitude, Time | Location, City, Longitude, Latitude, Begin time, End time
Segmentation: a subset of consecutive images that describe the same scene
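The step from one sample per image to one sample per segmentation can be sketched as below. This is a minimal illustration, not the authors' code: the dictionary keys, the sample values, the choice of the most frequent activity label, and taking the location fields from the first image are all assumptions, since the slides only name the resulting features.

```python
from collections import Counter
from statistics import mean

def aggregate_segment(images):
    """Collapse the per-image samples of one segment into a single sample."""
    first, last = images[0], images[-1]
    return {
        "user_id": first["user_id"],
        "avg_heart_rate": mean(img["heart_rate"] for img in images),
        "avg_calories": mean(img["calories"] for img in images),
        "sum_calories": sum(img["calories"] for img in images),
        "sum_steps": sum(img["steps"] for img in images),
        "avg_steps": mean(img["steps"] for img in images),
        # Assumption: the segment keeps its most frequent activity label.
        "activity": Counter(img["activity"] for img in images).most_common(1)[0][0],
        # Assumption: location fields are copied from the first image.
        "location": first["location"],
        "city": first["city"],
        "longitude": first["longitude"],
        "latitude": first["latitude"],
        # Begin and end time come from the first and last image of the segment.
        "begin_time": first["time"],
        "end_time": last["time"],
    }

# Made-up samples for three consecutive images of the same scene.
base = {"user_id": "u1", "location": "Home", "city": "CityA",
        "longitude": 0.0, "latitude": 0.0}
segment = [
    dict(base, heart_rate=70, calories=1.5, steps=10, activity="walking", time="09:00"),
    dict(base, heart_rate=74, calories=2.0, steps=0, activity="walking", time="09:01"),
    dict(base, heart_rate=72, calories=2.5, steps=20, activity="transport", time="09:02"),
]
sample = aggregate_segment(segment)
```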
Features
Non-Visual Features
{"categories": [{"name": "others_", "score": 0.046875}, {"name": "object_screen", "score": 0.85546875}],
 "color": {"dominantColorForeground": "Black", "dominantColorBackground": "Grey", "dominantColors": ["Grey", "Black"], "accentColor": "858246", "isBwImg": false, "isBWImg": false},
 "tags": [{"name": "indoor", "confidence": 0.9856416583061218}, {"name": "computer", "confidence": 0.9771684408187866}, {"name": "desk", "confidence": 0.8973867297172546}, {"name": "keyboard", "confidence": 0.8705658316612244}, {"name": "electronics", "confidence": 0.8533153533935547}, {"name": "display", "confidence": 0.3655721843242645}, {"name": "desktop", "confidence": 0.2967212498188019}, {"name": "mac", "confidence": 0.2967212498188019}, {"name": "laptop", "confidence": 0.0638200324050663}, {"name": "office", "confidence": 0.0339242832307443}, {"name": "monitor", "confidence": 0.023137086807164493}],
 "description": {"tags": ["indoor", "computer", "desk", "keyboard", "electronics", "table", "monitor", "sitting", "laptop", "white", "small", "desktop", "mouse", "holding", "standing", "video"], "captions": [{"text": "a desktop computer sitting on top of a desk", "confidence": 0.888279407495522}]},
 "requestId": "a32a79a3-e008-430f-a265-a5de764ee846",
 "metadata": {"width": 3264, "height": 2448, "format": "Jpeg"}}
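A response like the one above can be reduced to a list of words per image. The sketch below is only an illustration: the slides do not say whether the "tags" field or "description.tags" was used, nor whether any confidence threshold was applied, so the function name and the min_confidence parameter are hypothetical. The response is abbreviated to four of the tags shown above.

```python
import json

# Abbreviated copy of the response above (only four of its tags are kept).
response_text = """
{"tags": [{"name": "indoor", "confidence": 0.9856416583061218},
          {"name": "computer", "confidence": 0.9771684408187866},
          {"name": "display", "confidence": 0.3655721843242645},
          {"name": "laptop", "confidence": 0.0638200324050663}]}
"""

def extract_tags(response, min_confidence=0.5):
    """Return tag names whose confidence reaches the (assumed) threshold."""
    return [tag["name"] for tag in response.get("tags", [])
            if tag["confidence"] >= min_confidence]

response = json.loads(response_text)
words = extract_tags(response)
```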
Nodes: words
Edges: words that describe the same segmentation
Clustering: Markov Cluster Algorithm (S. Van Dongen 2000)
Clusters
Cooking, teeth, brushing, covered, dirty, cluttered, hot, preparing, blender, pot, pan, toothbrush
Riding, sidewalk, walking, carrying
Boy, child, little, baby
Pizza, fruit, blurry, restaurant, image, commercial, bar
Features
Visual Features
(By MS Vision API)
Pre-processing: cleaning/edge elimination/triangle elimination
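The graph construction and clustering described on this slide can be sketched as follows. This is a minimal pure-Python version of the Markov Cluster Algorithm (expansion by matrix squaring, inflation by element-wise power), written under assumptions: the inflation value, the stopping tolerance, the keep threshold and the sample segments are illustrative, and the pre-processing steps (cleaning, edge elimination, triangle elimination) are not reproduced because the slides do not detail them.

```python
from itertools import combinations

def build_word_graph(segment_tags):
    """Nodes are words; an edge links two words that describe the same segment."""
    words = sorted({w for tags in segment_tags for w in tags})
    index = {w: i for i, w in enumerate(words)}
    n = len(words)
    adj = [[0.0] * n for _ in range(n)]
    for tags in segment_tags:
        for a, b in combinations(sorted(set(tags)), 2):
            adj[index[a]][index[b]] += 1.0
            adj[index[b]][index[a]] += 1.0
    return words, adj

def normalize_columns(m):
    """Scale every column so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]

def markov_cluster(adj, inflation=2.0, max_iter=100, tol=1e-9, keep=1e-3):
    n = len(adj)
    # Add self-loops, then make the matrix column-stochastic.
    m = normalize_columns([[adj[i][j] + (1.0 if i == j else 0.0)
                            for j in range(n)] for i in range(n)])
    for _ in range(max_iter):
        # Expansion: square the matrix (two steps of the random walk).
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n))
                     for j in range(n)] for i in range(n)]
        # Inflation: element-wise power, then renormalize each column.
        nxt = normalize_columns([[v ** inflation for v in row] for row in expanded])
        delta = max((abs(nxt[i][j] - m[i][j])
                     for i in range(n) for j in range(n)), default=0.0)
        m = nxt
        if delta < tol:
            break
    # Each row that still carries mass groups the columns (nodes) it attracts.
    clusters = {frozenset(j for j in range(n) if m[i][j] > keep) for i in range(n)}
    return sorted(sorted(c) for c in clusters if c)

# Made-up segments whose words are borrowed from the cluster examples above.
segment_tags = [
    ["cooking", "pot", "pan"],
    ["cooking", "pan", "blender"],
    ["walking", "sidewalk"],
    ["walking", "sidewalk", "riding"],
]
words, adj = build_word_graph(segment_tags)
clusters = [[words[j] for j in c] for c in markov_cluster(adj)]
```

Words that never share a segment have no path between them in the graph, so they cannot end up in the same cluster.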
Features
Merged Features
Ø Non-visual features and merged features:
  Ø Adaptive Boosting combined with C4.5 or Random Tree
  Ø Bagging combined with C4.5 or LMT
  Ø Random Forest
Ø Visual features:
  Ø Each tag describing an image is given the value of its cluster
  Ø The most frequent value is taken as the prediction
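The visual-feature rule can be read as a majority vote over the cluster values of an image's tags. The sketch below follows that reading; the function name, the two lookup tables and their contents are hypothetical, since the slides do not show how a value was assigned to each cluster.

```python
from collections import Counter

def predict_status(image_tags, tag_to_cluster, cluster_value):
    """Give each tag the value of its cluster and return the most frequent value."""
    votes = Counter()
    for tag in image_tags:
        cluster = tag_to_cluster.get(tag)
        if cluster is not None and cluster in cluster_value:
            votes[cluster_value[cluster]] += 1
    # Images whose tags belong to no known cluster get no prediction.
    return votes.most_common(1)[0][0] if votes else None

# Hypothetical lookup tables for the working / not working question.
tag_to_cluster = {"computer": 0, "desk": 0, "keyboard": 0, "sidewalk": 1, "walking": 1}
cluster_value = {0: "working", 1: "not working"}

prediction = predict_status(["indoor", "computer", "desk", "sidewalk"],
                            tag_to_cluster, cluster_value)
```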
Models
Ø Accuracy:
  Ø Using non-visual features:
Experiment | Highest Accuracy (images) | Highest Accuracy (segmentation*)
Inside or Outside | 88.6% | 79.3%
Alone or not Alone | 72.1% | 54.5%
Working or not Working | 79.1% | 70.5%
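$$\mathrm{Accuracy} = \frac{1}{N_{\mathrm{classes}}} \sum_{i \in \mathrm{classes}} \frac{C^{i}_{\mathrm{correct}}}{C^{i}_{\mathrm{correct}} + C^{i}_{\mathrm{incorrect}}}$$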
*Segmentation: a series of continuous similar images
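The accuracy measure on this slide reads as a per-class ratio of correct predictions, averaged over the classes. A minimal sketch under that reading is given below; it assumes that the correct and incorrect counts of a class are taken over the samples whose true label is that class, which the slide does not state explicitly. The labels are made up.

```python
def class_averaged_accuracy(y_true, y_pred):
    """Average, over classes, of correct / (correct + incorrect) for each class."""
    classes = sorted(set(y_true))
    ratios = []
    for c in classes:
        # Assumption: counts are taken over samples whose true label is c.
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        incorrect = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        ratios.append(correct / (correct + incorrect))
    return sum(ratios) / len(classes)

# Made-up labels: 3 of 4 "inside" and 1 of 2 "outside" samples are correct.
y_true = ["inside", "inside", "inside", "inside", "outside", "outside"]
y_pred = ["inside", "inside", "inside", "outside", "outside", "inside"]
score = class_averaged_accuracy(y_true, y_pred)
```

Unlike the plain share of correct samples (4 of 6 here), this average gives each class the same weight regardless of how many samples it has.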
Results and Analysis
Adaboost + RT
Ø Accuracy:
  Ø Using non-visual features:
Experiment | Highest Accuracy (images) | Highest Accuracy (segmentation*) | Effective Feature Categories | Non-Effective Feature Categories
Inside or Outside | 88.6% | 79.3% | 1-Time, 2-Calories, 3-Latitude/Longitude | Steps and Activity
Alone or not Alone | 72.1% | 54.5% | 1-Time, 2-Latitude/Longitude, 3-Heart Rate | UserID and City
Working or not Working | 79.1% | 70.5% | 1-Heart Rate, 2-Time, 3-Latitude/Longitude | UserID and Steps
Results and Analysis
Adaboost + RT
Ø Using visual features (image-based):
  Ø Inside/Outside recognition: 95.9%
  Ø Alone/Not Alone recognition: 55.1%
  Ø Working/Not Working recognition: 76.4%
Ø Using merged features (image-based):
Experiment | Highest Accuracy | Corresponding Model
Inside or Outside Recognition | 99.7% | Bagging + LMT
Alone or not Alone Recognition | 66.2% | Random Forest
Working or not Working Recognition | 76.5% | Bagging + LMT
Results and Analysis
Summary
Ø First work on user status recognition with lifelog data
Ø Successful status recognition
  Ø Inside or outside: 99.6% with merged features
  Ø Alone or not alone: 72% with non-visual features
  Ø Working or not working: 79% with non-visual features
Ø Limitations: for visual and merged features, the number of annotated samples is insufficient
More on Life Moment Search