Analyzing large multimedia collections in an urban context


SLIDE 1

12-7-2016

Amsterdam Data Science

Marcel Worring

Analyzing large multimedia collections in an urban context

Marcel Worring, Stevan Rudinac, Jan Zahalka, Dennis Koelma, Joost Boonzajer Flaes, Jorrit van den Berg

Informatics Institute, Amsterdam Data Science

  • MSc: VU Computer Science
  • PhD: UvA Informatics Institute
  • Now: 0.8 fte Informatics Institute, 0.2 fte Amsterdam Business School
  • Associate Director, Amsterdam Data Science

Objective and Subjective data

  • Image data
  • Numeric data
  • Geographic data
  • Structured data
  • Unstructured data
  • Temporal data
  • Textual data

Open data

Geo location
  • Amsterdam, Netherlands

Exif
  • Camera: Nikon N60
  • Focal length: 55 mm
  • Exposure time: 1/200
  • Flash: off

Author
  • josemanuelerre (Flickr)
  • José Manuel Ríos Valiente

Tags
  • cyclist
  • bike
  • street

Comments
  • “I love Amsterdam! great photo!”
  • “Great compostion, beautiful B&W!!”
  • “Estupendo B&N, bella imagen.”
  • . . .

Data Sources

SLIDE 2

  • “Koningsdag, or ‘King’s Day,’ is one of the principal holidays of the Netherlands. . . ”
  • In this case, the image says more than the text

Photo: quantz @ Flickr

Data Sources Objective and Subjective data

Open data

Open Data

+ Content Analysis

WHAT DOES IT BRING?

Professional Recommender Systems

Recommender system for tourists


Touristic Routing

SLIDE 3

City Sentiment City Marketing Analytics

ALGORITHMS

Ranking of data

A query on an image/video/text collection defines the starting point and the order of the result, from best to worst

For Social Media

  • The ranking can be based on
    – The objective content of the comments
    – The subjective content of the comments
    – The objective visual content
    – The subjective visual content
    – ...
  • Or any combination of the above

Concept detection

  • Visual examples (positive and negative) → learn model
  • Unknown images → score of presence → ranking
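The learn-score-rank loop above can be sketched in a few lines. This is a minimal illustration only: it assumes each image is already represented by a small feature vector, and it uses logistic regression as the learner, which the slides do not specify. The feature values and the names `img_a`, `img_b`, `img_c` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(examples, labels, epochs=200, lr=0.5):
    """Fit weights and bias with per-example (stochastic) gradient steps."""
    w, b = [0.0] * len(examples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def presence_score(model, x):
    """Score of presence of the concept, between 0 and 1."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Hypothetical 2-D features for annotated examples (1 = concept present).
positives = [[0.9, 0.1], [0.8, 0.2]]
negatives = [[0.1, 0.9], [0.2, 0.7]]
model = train_logistic(positives + negatives, [1, 1, 0, 0])

# Unknown images are scored, and the scores induce the ranking.
unknown = {"img_a": [0.85, 0.15], "img_b": [0.15, 0.8], "img_c": [0.5, 0.5]}
ranking = sorted(unknown, key=lambda n: presence_score(model, unknown[n]), reverse=True)
print(ranking)
```

The point of the sketch is the last two lines: once a detector produces a score per image, ranking is just a sort on that score.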
SLIDE 4

Zebu

Requires annotation to learn

Animals People Lions Lemurs

What do we learn?

  • 14,197,122 images, 21,841 synsets indexed
  • 1,200 trained visual concept detectors for adjective-noun pairs

The new trend: Deep learning

Krizhevsky, NIPS 2012

Start with raw pixels, learn all parameters

The learned filters

Zeiler and Fergus

The layered network

Krizhevsky, NIPS 2012

Convolution + pooling + fully connected layers + output layers

60,000,000 parameters to learn
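The order of magnitude can be checked with simple arithmetic. The sketch below assumes the layer shapes commonly quoted for the Krizhevsky et al. network (five convolutional layers, three of them split over two groups, followed by three fully connected layers). These shapes are not on the slide and are stated here as assumptions.

```python
# Parameter count for a Krizhevsky-style network.
# All layer shapes below are assumptions, not taken from the slides.

def conv_params(kernel, in_channels, out_channels):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels

def fc_params(n_in, n_out):
    """Weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out

layers = {
    "conv1": conv_params(11, 3, 96),
    "conv2": conv_params(5, 48, 256),    # 48 inputs per filter: 96 channels in two groups
    "conv3": conv_params(3, 256, 384),
    "conv4": conv_params(3, 192, 384),   # grouped again
    "conv5": conv_params(3, 192, 256),   # grouped again
    "fc6": fc_params(6 * 6 * 256, 4096),
    "fc7": fc_params(4096, 4096),
    "fc8": fc_params(4096, 1000),        # 1000 output classes
}

total = sum(layers.values())
for name, count in layers.items():
    print(f"{name}: {count:,}")
print(f"total: {total:,}")
```

Under these assumptions the total comes to roughly 61 million, and the three fully connected layers account for most of it.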

But what do all these layers do?

SLIDE 5

Visualizing deep networks

Zeiler and Fergus

State-of-the-art: GoogLeNet

... and growing

Makes image search keyword driven

Text Analysis

Latent Dirichlet Allocation

  • D. Blei, 2003

SLIDE 6

Latent Dirichlet Allocation

  • Generative model, discovers topics and scores them
  • 100 topics are enough to sufficiently cover the entire Wikipedia
  • Input: raw text
  • Output: topic scores per document

Example topic: 0.054*mexico + 0.049*forest + 0.024*argentina + 0.022*islands + ... + 0.014*aires

We treat comments or sets of tags as documents
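To make the topic notation concrete: a topic is a weighted list of words, and a document (here a set of tags or a comment) can be scored against it. The sketch below is not LDA inference. It is a deliberately crude stand-in that averages the topic weights of a document's tokens, using only the four leading terms printed on the slide. The `photo_tags` document is hypothetical.

```python
# A topic as word -> weight; only the four leading terms shown on the slide.
topic = {"mexico": 0.054, "forest": 0.049, "argentina": 0.024, "islands": 0.022}

def topic_affinity(tokens, topic):
    """Average topic weight of the tokens (crude proxy, not LDA inference)."""
    if not tokens:
        return 0.0
    return sum(topic.get(token, 0.0) for token in tokens) / len(tokens)

# Tags and comments are both treated as small documents.
documents = {
    "photo_tags": ["forest", "mexico", "hiking"],  # hypothetical tag set
    "comment": "I love Amsterdam! great photo!".lower().replace("!", "").split(),
}

scores = {name: topic_affinity(tokens, topic) for name, tokens in documents.items()}
for name, value in scores.items():
    print(name, round(value, 4))
```

A real LDA model would instead infer a full distribution over all topics for each document. The sketch only shows why a travel-related tag set lands closer to this topic than an unrelated comment.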

VENUE RECOMMENDER

  • Venue recommendation: suggesting places of interest (venues) based on user preferences
  • The classic approach is collaborative filtering utilizing the user-item matrix

The task
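For reference, the classic approach can be sketched as user-based collaborative filtering on a tiny user-item matrix. This is one common variant, chosen here for brevity. The users, venues and visits are hypothetical toy data.

```python
import math

# User-item matrix: rows are users, columns are venues, 1 = visited.
# All names and values are hypothetical.
venues = ["museum", "cafe", "park", "club"]
visits = {
    "ann":  [1, 1, 0, 0],
    "bob":  [1, 1, 1, 0],
    "cara": [0, 0, 1, 1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recommend(user):
    """Rank the user's unvisited venues by similarity-weighted visits of others."""
    scores = [0.0] * len(venues)
    for other, row in visits.items():
        if other == user:
            continue
        similarity = cosine(visits[user], row)
        for j, visited in enumerate(row):
            scores[j] += similarity * visited
    unseen = [j for j, visited in enumerate(visits[user]) if not visited]
    return [venues[j] for j in sorted(unseen, key=lambda j: scores[j], reverse=True)]

print(recommend("ann"))
```

Note that this approach needs visit histories only and never looks at images or text, which is the contrast with the content-based system described next.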

  • City Melange: a venue explorer utilizing multimedia analytics techniques
  • Content-based: based solely on the content of venue-related social media
  • Multimodal: combining content from images and the associated text
  • Interactive: user preferences are modelled on the fly as you explore the city
  • Cross-platform: integrates data from diverse social platforms

City Melange Characteristics

Venue information Venue images Images, metadata User data Q(venue name,geo)

Data Gathering

SLIDE 8

Content
  • Images (V)
  • Tags, comments, ... (T)
  • Venues (VC)
  • Users (U)

Features
  • ConvNet → VF
  • LDA → TF

Clustering

Processed data
  • Visual venue topics (V^T_V)
  • Visual user topics (V^T_U)
  • Text venue topics (T^T_V)
  • Text user topics (T^T_U)
  • User-venue matrix (UV)

Data Analysis

  • ACM Multimedia Grand Challenge 2014, 1st Prize
  • newyorkermelange.com


SLIDE 10

  • Inputs: venue topics (V^T_V, T^T_V), user topics (V^T_U, T^T_U), users (U), user-venue matrix (UV)
  • Grid: relevance indication
  • Relevant venues: positives (V^T_+, T^T_+)
  • Random sample: negatives (V^T_−, T^T_−)
  • Linear SVM: user ranking, suggested users (US)
  • Venue ranking: suggested venues (VS)
  • Suggestions (US, VS)
  • Map

Interactive Recommendation

Recommender system for tourists
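The user-ranking step of the interactive loop can be sketched as follows, assuming positives come from the venues marked relevant and negatives from a sample of the collection. The linear SVM is fitted here with plain hinge-loss gradient steps to keep the example dependency-free. The topic vectors, user names and hyperparameters are hypothetical, and the negatives are fixed rather than drawn at random so that the example is reproducible.

```python
def train_linear_svm(positives, negatives, epochs=100, lr=0.1, reg=0.01):
    """Linear SVM fitted with hinge-loss stochastic gradient descent."""
    data = [(x, 1.0) for x in positives] + [(x, -1.0) for x in negatives]
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        for x, y in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1.0 - lr * reg) for wi in w]      # L2 regularization
            if margin < 1.0:                             # violates the margin: update
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def rank(candidates, model):
    """Order candidate names by decreasing SVM decision value."""
    w, b = model
    def decision(name):
        return sum(wi * xi for wi, xi in zip(w, candidates[name])) + b
    return sorted(candidates, key=decision, reverse=True)

# Hypothetical topic vectors of users (three topics each).
users = {
    "user_1": [0.9, 0.1, 0.0],
    "user_2": [0.8, 0.0, 0.2],
    "user_3": [0.1, 0.7, 0.2],
    "user_4": [0.0, 0.2, 0.8],
    "user_5": [0.7, 0.2, 0.1],
}
positives = [[0.85, 0.10, 0.05], [0.90, 0.05, 0.05]]  # topics of venues marked relevant
negatives = [[0.10, 0.60, 0.30], [0.05, 0.25, 0.70]]  # stands in for the random sample

model = train_linear_svm(positives, negatives)
suggested_users = rank(users, model)[:3]
print(suggested_users)
```

In each interaction round the positives grow, the model is refitted, and the ranking is refreshed. The venue ranking would then follow from the suggested users through the user-venue matrix, which is not shown here.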


  1. Can we recommend the right type of venue?
  2. Can we recommend mainstream venues to mainstream tourists and specialized venues to aficionados?

Evaluation

  • 621 fine-grained venue types (Japanese restaurant, skate park, ...)
  • 100 artificial actors, use 75% of the data to seed Melange
  • Perform 10 interaction rounds

Evaluation
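The slides do not define how the venue type measures are computed. One plausible reading, sketched below purely as an assumption, scores each interaction round by comparing the types of the suggested venues with the set of types the artificial actor is interested in. The function name and the example values are hypothetical.

```python
def type_precision_recall(suggested_types, relevant_types):
    """suggested_types: venue type of each suggested venue, in order.
    relevant_types: set of venue types the actor is interested in.
    Assumed definitions; the talk does not spell them out."""
    if not suggested_types or not relevant_types:
        return 0.0, 0.0
    hits = [t for t in suggested_types if t in relevant_types]
    precision = len(hits) / len(suggested_types)       # share of suggestions of a wanted type
    recall = len(set(hits)) / len(relevant_types)      # share of wanted types that were covered
    return precision, recall

precision, recall = type_precision_recall(
    ["Japanese restaurant", "skate park", "Japanese restaurant", "bar"],
    {"Japanese restaurant", "museum"},
)
print(precision, recall)
```

Repeating this for each of the 10 rounds and averaging over the 100 actors would give one curve per method.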

  • City Melange
    – Visual modality only
    – Text modality only
    – Multimedia (vis + txt)
  • Recommender baselines
    – WRMF: weighted regularized matrix factorization
    – BPRMF: Bayesian personalized ranking matrix factorization
  • Popularity ranking (PopRank): most visited venues according to Foursquare

Methods Compared

  • New York: 1.07M images and associated text from Foursquare, Flickr, and Picasa
  • Amsterdam: 56K images and associated text from Foursquare and Flickr

Data Collection

SLIDE 11

[Figure: venue type precision and venue type recall per interaction round (1 to 10), for New York and for Amsterdam. Methods: poprank, bprmf, wrmf, melange_vis, melange_txt, melange_mm]

[Figure: Distribution of recommendations. Density difference against the true user-venue distribution density, for melange mm, wrmf, bprmf and poprank]

TOURIST ROUTING

SLIDE 12

SceneMash

  • Data collection
    – 150,000 geotagged Flickr and Foursquare images from the region of Amsterdam
  • Metadata associated with the images:
    – image title
    – description
    – tags
    – geotags


Demo

CITY SENTIMENT

Data Collection

  • 64K geotagged tweets with images
  • 64K geotagged images and comments
  • Various neighborhood statistics (17 variables)
  • Amsterdam neighborhoods

SLIDE 13

Methodology

  • Sentiment analysis: textual and visual content
  • Finding correlations: various statistics

Sentiment Maps

Correlation Analysis

  • Flickr, Twitter
  • Correlations are only found with multimodal sentiment
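The slides do not name the correlation measure. Assuming a per-neighborhood sentiment score is compared with one neighborhood statistic using Pearson's r, the computation looks like this. The numbers are invented for illustration and are not results from the talk.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# One value per neighborhood; hypothetical toy data only.
multimodal_sentiment = [0.2, 0.4, 0.6, 0.8]
neighborhood_statistic = [10.0, 20.0, 30.0, 40.0]

r = pearson(multimodal_sentiment, neighborhood_statistic)
print(round(r, 3))
```

With 17 statistics and several sentiment variants, the same function would be applied to every pair, which is where a correction for multiple comparisons becomes relevant.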

Redefined Neighborhoods

People with similar social media interests

SLIDE 14

MARKETING ANALYTICS

WHAT WE HAVE

“The purpose of computing is insight, not numbers.” Richard Hamming 1962

So what do we want?

Insight

What is insight?

Insight

Complex

Insight is complex, involving all or large amounts of the given data in a synergistic way, not simply individual data values.

Deep

Insight builds up over time, accumulating and building on itself to create depth often generating further questions and, hence, further insight.

Qualitative

Insight is not exact, can be uncertain and subjective, and can have multiple levels of resolution.

Unexpected

Insight is often unpredictable, serendipitous, and creative.

Relevant

Insight is deeply embedded in the data domain, connecting the data to existing domain knowledge and giving it relevant meaning going beyond dry data analysis, to relevant domain impact.

North CG&A, 2006

“Computers are incredibly fast, accurate, and stupid. Humans are incredibly slow, inaccurate and brilliant. The marriage of the two is beyond imagination” Leo Cherne 1968

SLIDE 15

Visual Analytics

  • Combine the power of computer and human
  • Compute power
  • Storage capacity
  • Flexibility
  • Creativity
  • Expert knowledge

Definition

Multimedia Analytics = Multimedia Analysis + Visual Analytics

Ref:Chinchor2010

Multimedia Analytics

INSIGHT

Analytics

  • What is the best-known analytic tool?

Yes, the spreadsheet

Analytics

Fischer et al., TVCG 2010.

MediaTable

  • Columns denote concept scores and can be used for sorting
  • Colors denote categories, and buckets are used to collect elements of a (sub-)category
  • Heatmap-like visualization
  • Grey values denote values between 0 and 1
  • Allows correlations to be seen
  • Filters and sort order can be specified

Refs: deRooij2010b, deRooij2013

SLIDE 16

Multimedia Pivot Tables

  • FILTER VARIABLES: Define active data set
  • ROW VARIABLE: Decompose
  • COLUMN VARIABLES: Sort and Weight
  • Concepts, Tags, Nominals, Integers
  • VALUE, ROW AGGREGATION, COLUMN AGGREGATION

Visualizations

Type     | Filter              | Column  | Row                | Value visualization
Images   | Selection to bucket | x       | Individual images  | Sorted list of images
Nominal  | Label selection     | x       | Individual labels  | Sorted and weighted text histogram
Buckets  | Bucket selection    | x       | Individual buckets | Weighted histogram
Geo      | Selection to bucket | x       | x                  | Map with weighted elements
Numeric  | Range selection     | Weights | 7-point summary    | Sum, max, avg, weighted distribution
Concepts | Range selection     | Weights | 7-point summary    | Weighted distribution
Tags     | Tag selection       | Weights | Individual tags    | Sorted and weighted tag histogram

Statistics driven decomposition Column aggregation Row aggregation Top-N Concepts Row specific concepts Concept based sorting Relevance based sorting

SLIDE 17

BM-25 BASED RANKING
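As a reminder of what a BM25 score is, the sketch below implements the standard Okapi BM25 formula over token lists. How the score is wired into the pivot tables is not described on the slide, so the use of tag lists as documents, the parameter values `k1=1.2` and `b=0.75`, and the non-negative idf variant are all assumptions.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Okapi BM25 score of every document (a list of tokens) for the query terms."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Rare terms weigh more; the +1 keeps the idf non-negative.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term frequency saturates and is normalized by document length.
            length_norm = k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / (tf[term] + length_norm)
        scores.append(score)
    return scores

# Hypothetical tag lists, one per image.
docs = [
    ["cyclist", "bike", "street"],
    ["canal", "boat", "street", "street"],
    ["museum", "art"],
]
scores = bm25_scores(["bike", "street"], docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, [round(s, 3) for s in scores])
```

The first tag list wins because it matches both query terms, and the rarer term `bike` contributes more than the common term `street`.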

Demo

https://staff.fnwi.uva.nl/m.worring/pivot-tables.html

Learning from interaction Employing user interaction

pos neg

Selection of pos/neg examples

Some elements in the collection are labeled Many are not

SLIDE 18

Employing user interaction

User Pool-Query Set Labeled Resultant set Learning Algorithm Interactive Learning Strategy Active Learning

Chen in 2005 was the first to explore this for Video Retrieval

Relevance feedback Ref: Huang2008

Relevance feedback

Try to find the boundary in feature space (axes F1, F2) that best separates positive from negative examples

Measure of class membership probability

Relevance feedback

In the next iteration I will have more samples, hence a better estimate of the boundary

This process is usually known as relevance feedback

Active Learning

In active learning the system decides which elements to show for feedback and which not.

  • For the system it is relevant to know this label
  • The system can safely assume this sample is also negative
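A minimal sketch of that decision, assuming uncertainty sampling (one common active learning strategy; the slide does not name one): items closest to the current boundary are shown for feedback, and items far on the negative side are assumed negative without asking. The decision values and the `margin` threshold are hypothetical.

```python
def split_for_feedback(decision_values, n_ask=2, margin=1.0):
    """decision_values maps item -> signed distance to the current boundary.
    Returns (items to show for feedback, items assumed negative)."""
    by_uncertainty = sorted(decision_values, key=lambda k: abs(decision_values[k]))
    ask = by_uncertainty[:n_ask]                       # closest to the boundary
    assume_negative = [
        k for k in by_uncertainty[n_ask:]
        if decision_values[k] < -margin                # clearly on the negative side
    ]
    return ask, assume_negative

# Hypothetical decision values from the current classifier.
values = {"a": 2.3, "b": -0.1, "c": 0.4, "d": -3.0, "e": -0.9}
ask, assumed_negative = split_for_feedback(values)
print(ask, assumed_negative)
```

The labels collected for the asked items move the boundary the most, which is why fewer feedback rounds are needed than when items are shown in arbitrary order.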

Automatic AND interactive

SVM based relevance feedback Interactive categorization

Three interactive strategies

  • Fully interactive

– User is interactively performing the sort/select/categorize process

  • Manual relevance feedback

– In addition to the above the user can perform relevance feedback on any of the categories

  • Unobtrusive relevance feedback

– In addition to the above the system automatically indicates new potentially relevant elements

SLIDE 19

Fully interactive

On demand suggestions

  • After categorizing some elements
  • Learn and apply model for user selected bucket
  • Uncategorized images
  • Category suggestions

Unobtrusive assistance

  • Continuously observe what happens
  • Learn and apply model for system selected bucket
  • Uncategorized images
  • Category suggestions

Results: elements found

  • significant at the p=0.01 level compared to baseline
  • significant at the p=0.01 level compared to manual

  • Task 1: specific, high visual similarity
  • Task 2: generic visually diverse, concept available
  • Task 3: generic visually diverse, concept available
  • Task 4: generic visually diverse, no concept available

SCALABILITY

SLIDE 20

[Zahálka and Worring, VAST 2014]

B.P. Jonsson et.al. MMM 2016

WRAP-UP

Objective and Subjective data

Image data Numeric data Geographic data Structured data Unstructured data Temporal data Textual data

Open data

Open Data

The applications The Algorithms

And its variations

SLIDE 21

www.amsterdamdatascience.nl m.worring@uva.nl