High Level Semantic Modeling Shih Fu Chang Digital Video Multimedia - - PowerPoint PPT Presentation



slide-1
SLIDE 1

High‐Level Semantic Modeling

Shih‐Fu Chang

Digital Video Multimedia Lab, Columbia University CVPR Tutorial, June 2014

1200 SentiBank Concepts Predict Sentiment

slide-2
SLIDE 2

Beyond Semantics

Aesthetics Interestingness Emotion Style

Others: Creativity, Intent, Memorable …

slide-3
SLIDE 3

Difficult Problems ‐ but interesting datasets & results emerging

slide-4
SLIDE 4

Visual Aesthetics

  • Datta et al ECCV 2006, Naila Murray et al CVPR 2012 (AVA)
  • AVA Dataset: 250,000 images from 963 dp‐challenges with aesthetics scores and semantic/style labels

slide-5
SLIDE 5

Aesthetics is Subjective

‐ Non‐Conventional Style/Subject Tends to Cause Large Score Variations

Murray et al CVPR 2012 (AVA)

slide-6
SLIDE 6

Score Distributions of Each Image Vary but Form Patterns

Gaussian distribution is a reasonable fit

Murray et al CVPR 2012 (AVA)

slide-7
SLIDE 7

Semantics Also Plays An Important Role

Murray et al CVPR 2012

‐ Many less attractive classes are associated with negative semantics

slide-8
SLIDE 8

Despite these… Big Dataset/Complex Model Help

Fisher Vector #Components

Murray et al CVPR 2012

slide-9
SLIDE 9

What makes video pleasing

– NHK 1000 Videos Aesthetics Ranking at ACMMM13

  • Cinematographic Evaluation
  • Additional Filter for Web video search
  • Personal Video Collection

Aesthetically Pleasing vs. Not‐so Pleasing

(Bhattacharya et al, ACMMM 2013)

slide-10
SLIDE 10

Computational Video Aesthetics

Input Video → Shot → Keyframe → Cell

  • Camera motion
  • Foreground Motion
  • Texture Dynamics
  • Semantics
  • Sentiments
  • Sharpness
  • Eye Sensitivity
  • Dark Channel

Shot Level Aesthetic Models + Frame Level Aesthetic Models + Cell Level Aesthetic Models → Fused Aesthetic Model

Query Video → Predicted Appeal

(Bhattacharya et al, ACMMM 2013)
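One way to picture the fusion step is a weighted average of the scores produced at each level. The sketch below is only an illustration: the slide does not say how the shot, frame, and cell models are combined, so the function name `fuse_appeal`, the weights, and the example scores are all assumptions.

```python
from typing import Dict, Optional


def fuse_appeal(level_scores: Dict[str, float],
                weights: Optional[Dict[str, float]] = None) -> float:
    """Late fusion: weighted average of per-level aesthetic scores."""
    if not level_scores:
        raise ValueError("need at least one level score")
    if weights is None:
        # Assumption: equal trust in every level when no weights are given.
        weights = {name: 1.0 for name in level_scores}
    total = sum(weights[name] for name in level_scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(weights[name] * score for name, score in level_scores.items())
    return weighted / total


# Made-up scores for one query video, one per model level.
scores = {"shot": 0.8, "frame": 0.6, "cell": 0.4}
equal = fuse_appeal(scores)
shot_heavy = fuse_appeal(scores, {"shot": 2.0, "frame": 1.0, "cell": 1.0})
print(round(equal, 3), round(shot_heavy, 3))
```

Raising the weight of one level pulls the fused appeal toward that level's score, which is the usual reason to learn the weights on held-out rankings instead of fixing them by hand.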

slide-11
SLIDE 11

Predicted Ranking of Video Aesthetics

(Bhattacharya et al, ACMMM 2013)

slide-12
SLIDE 12


Modeling Interestingness (Flickr Explore Rank Order)

Sagnik Dhar et al, 2011

slide-13
SLIDE 13

Which Images are More Memorable (Phillip Isola et al, 2011)

+ person, floor, car
− sky, tree, mountain

slide-14
SLIDE 14


Interestingness vs. Memorability vs. Aesthetics

Michael Gygli et al, 2013

slide-15
SLIDE 15

For Content to be Viral, it Needs to be Emotional

Psychology emotion wheel (8 emotions, by Robert Plutchik)

Plenty on the Web: “For content to go viral, it needs to be emotional” ‐ Dan Jones

slide-16
SLIDE 16

The Power of Social Visual Multimedia

@BarackObama: Four more years. @Brynn4NY: Rollercoaster at sea.

2012 Tweets of the Year


slide-17
SLIDE 17


Classifying Image Emotions

Machajdik and Hanbury, ACMMM 2010

IAPS Affect Dataset, Art Affect Dataset

slide-18
SLIDE 18

How Do People Describe Emotions in Web Photos?

‐‐ Web mining to discover visual emotions in social media

Build Sentiment Ontology

  • Psychology emotion wheel (8 emotions), Robert Plutchik, ‘91
  • Discover sentiment words: analyze tags with strong sentiments
  • Select Concepts, e.g. SAD EYES, MISTY WOODS

Borth, Ji, Chen, Breuel, Chang, Large‐Scale Visual Sentiment Ontology, ACM Multimedia 2013

slide-19
SLIDE 19

Concurrent tags with emotions

From 6 million tags on Flickr and YouTube

Color code: text sentiment values
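A minimal sketch of how tags co-occurring with emotion keywords could be counted. The tag lists are made up, and the function name `cooccurring_tags` is hypothetical; the slide does not describe the actual mining code.

```python
from collections import Counter, defaultdict


def cooccurring_tags(photos, emotions):
    """Count the tags that appear alongside each emotion keyword."""
    emotions = set(emotions)
    counts = defaultdict(Counter)
    for tags in photos:
        tag_set = {tag.lower() for tag in tags}
        # Every emotion keyword on this photo gets credit for the other tags.
        for emotion in tag_set & emotions:
            counts[emotion].update(tag_set - {emotion})
    return counts


# Toy stand-in for tag lists retrieved from a photo-sharing site.
photos = [
    ["sad", "eyes", "portrait"],
    ["sad", "eyes", "rain"],
    ["joy", "beach", "sunset"],
    ["joy", "beach"],
    ["cat", "window"],
]
counts = cooccurring_tags(photos, ["sad", "joy"])
print(counts["sad"].most_common(1))
```

Tags that co-occur often with an emotion keyword (here "eyes" with "sad") are the raw material from which sentiment-bearing adjectives and nouns can then be picked.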

slide-20
SLIDE 20

Frequent Photo Tags Related to Emotions


slide-21
SLIDE 21

From Machine Vision Perspective: Not all concepts/entities are detectable!

‐‐ which 1000 concepts to focus on in pictures?

slide-22
SLIDE 22

Computational Focus – Adjective‐Noun Pair (ANP)

  • Adjectives (268): express emotions
    – positive: beautiful, amazing, cute
    – negative: sad, angry, dark
  • Nouns (1187): possible detection
    – people, places, animals, food, objects, weather
  • Standard steps:
    – remove entities like “hot dog” via Wikipedia
    – choose sentiment‐rich ANP concepts with NLP tools “SentiWordNet” and “SentiStrength”
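The steps above can be sketched as a small filter over adjective-noun combinations. Everything here is a toy stand-in: the sentiment values are invented and are not SentiWordNet or SentiStrength output, the entity set stands in for a Wikipedia lookup, and `candidate_anps` is a hypothetical name.

```python
from itertools import product

# Assumption: made-up adjective sentiment scores in [-1, 1].
ADJ_SENTIMENT = {"beautiful": 0.8, "cute": 0.7, "sad": -0.7, "dark": -0.4, "hot": 0.4}
NOUNS = ["dog", "eyes", "sky"]
# Assumption: phrases known to be entities rather than adjective + noun.
NAMED_ENTITIES = {"hot dog"}


def candidate_anps(adjectives, nouns, named_entities, min_strength=0.3):
    """Pair adjectives with nouns, drop entities, keep sentiment-rich pairs."""
    anps = []
    for adj, noun in product(adjectives, nouns):
        phrase = f"{adj} {noun}"
        if phrase in named_entities:
            continue  # e.g. "hot dog" names a food, not a hot dog
        score = adjectives[adj]
        if abs(score) >= min_strength:
            anps.append((phrase, score))
    # Strongest sentiment first, positive or negative.
    return sorted(anps, key=lambda item: -abs(item[1]))


anps = candidate_anps(ADJ_SENTIMENT, NOUNS, NAMED_ENTITIES)
print(len(anps), anps[0])
```

The sketch scores a pair by its adjective alone; a fuller version would also account for the noun and for how often the pair actually occurs as a tag.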


slide-23
SLIDE 23

ANP Ontology (noun)

  • 6 levels

– ANIMALS
– FLORA
– PERSON
– OBJECTS
– NATURAL PLACES
– MAN‐MADE PLACES
– VEHICLE
– FICTIONAL_CREATURES
– FOOD
– ABSTRACT_CONCEPTS
– ART_PHOTOGRAPHY
– EVENTS
– ACTION
– WEATHER_CONDITIONS
– TIME

slide-24
SLIDE 24

ANP Ontology (adjective)

  • adjectives (2 levels)

– weather (stormy, cold, sunny)
– people (young, attractive)
– animals (cute, fluffy)
– places (haunted, misty)
– food (yummy, salty)
– object (colorful, beautiful)

slide-25
SLIDE 25

Visual Sentiment Ontology (Browser)

slide-26
SLIDE 26

Visual Sentiment Ontology (Browser)

slide-27
SLIDE 27

Open Issues …

  • How will the visual sentiment ontology change over different domains?

– Differences in photo style, quality, user groups, culture, tasks

  • How to link mid‐level concepts to high‐level emotions?

– currently based on association

slide-28
SLIDE 28

Next Step: Teach Machine to Recognize Visual Sentiments

Build Sentiment Ontology: Psychology emotion wheel (24 emotions) → Discover sentiment words → Select Adj‐Noun Pairs (e.g. SAD EYES, MISTY WOODS)

Train Classifiers → Performance Filtering → SentiBank (1200 Detectors) → Sentiment Prediction

slide-29
SLIDE 29

Image Features

  • Generic features

– Color Histogram (3x256 dim.)
– GIST descriptor (512 dim.)
– Local Binary Pattern (52 dim.)
– SIFT Bag‐of‐Words (1,000 codewords, 2‐layer spatial pyramid, max pooling)
– Attribute descriptor (2,000 dim.)

  • Special features

– Object detection (people, objects, etc.)
– Aesthetics features (color schemes, layout, etc.)
– Face and attributes
– Improve accuracy 9%‐30%
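As an illustration of the simplest generic feature, here is a sketch of a 3x256-dimensional color histogram: one 256-bin histogram per RGB channel, concatenated. The per-channel normalization and the name `color_histogram` are assumptions; the slide gives only the dimensionality.

```python
def color_histogram(pixels, bins=256):
    """Concatenate per-channel histograms of (r, g, b) pixels, values 0..255."""
    hist = [0.0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Map the 0..255 value to a bin, then offset by the channel block.
            hist[channel * bins + value * bins // 256] += 1.0
    n = len(pixels)
    # Assumption: normalize so each channel's histogram sums to 1.
    return [h / n for h in hist] if n else hist


# Four made-up pixels standing in for a decoded image.
pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 128, 0)]
feature = color_histogram(pixels)
print(len(feature))  # 768 = 3 x 256
```

Because the three channels are histogrammed independently, the feature captures overall color distribution but not which colors occur together in the same pixel.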


slide-30
SLIDE 30

Aesthetic Features

  • Dark Channel [He et al. ‘09]
    – minimum of local intensity
  • Sharpness [Vu et al. ‘11]
    – sharpness of local image regions by spectral and spatial measures
  • Depth of Field
    – wavelet decomposition in HSV color space (low vs high)
  • Color Harmony [Nishiyama et al. ‘11]
    – using local histogram of Moon‐Spencer model, which defines compatibility of two color values (example of compatible color)
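To make "minimum of local intensity" concrete, here is a minimal sketch of a dark channel computation: take the minimum over the color channels at each pixel, then the minimum over a small neighbourhood. The patch radius, the border handling, and the name `dark_channel` are assumptions, and how the resulting map is turned into an aesthetic feature is not shown on the slide.

```python
def dark_channel(image, radius=1):
    """image: H x W nested list of (r, g, b) tuples in [0, 1]."""
    height, width = len(image), len(image[0])
    # Step 1: per-pixel minimum over the colour channels.
    min_rgb = [[min(pixel) for pixel in row] for row in image]
    # Step 2: minimum over a (2*radius+1)^2 window, clipped at the borders.
    dark = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            rows = range(max(0, y - radius), min(height, y + radius + 1))
            cols = range(max(0, x - radius), min(width, x + radius + 1))
            dark[y][x] = min(min_rgb[j][i] for j in rows for i in cols)
    return dark


# Made-up 3x3 image: bright everywhere except one darker centre pixel.
bright, dim = (0.9, 0.8, 0.7), (0.5, 0.2, 0.6)
image = [[bright] * 3, [bright, dim, bright], [bright] * 3]
print(dark_channel(image, radius=1)[0][0], dark_channel(image, radius=0)[0][0])
```

With a radius of 1 the single dark pixel dominates every window of this tiny image, which shows why the dark channel responds strongly to any locally dark region.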

slide-31
SLIDE 31

What Do Humans Expect to See?

‐ from small annotation experiments

masochismtango@flickr houseofduke@flickr HIKARU Pan@flickr Jules3000@flickr springlakecake@flickr ebonique2007@flickr hurlham@flickr houseofduke@flickr

“Smiling dog”: tongue visible, mouth open, face camera, close shot, pink tongue, open mouth, frontal dog face

“Tired dog”: lying on floor or surface, closed eyes, yawning, resting, no action, fore legs, paws, face on floor

slide-32
SLIDE 32

So, Need to Link to Objects + Attributes

Training Images of Same Noun (e.g. dog)

Object (noun) Detector: HOG, DPM (Yes: keep; No: discard)

Feature Extraction

  • object/background/whole
  • SIFT, GIST, LBP, color
  • aesthetics (symmetry, white balance, etc.)
  • composition (object size, position)

Soft Adj. Labels: ConceptNet, SentiStrength, human labels

Weighted SVMs: ANP Classifiers + Feature selection

Adjectives: cute, sad, wet → cute dog, sad dog, wet dog

SamFan1@flickr Bahman Farzad@flickr zoompict@flickr BloodyGoku21@flickr

slide-33
SLIDE 33

Hierarchical: Object + Affect Attributes

  • Testing

Testing Images → dog? face? car? → Candidates + Features → ANP Classifiers:

– dog: cute dog? sad dog? wet dog?
– face: mad face? silly face? face?
– car: hot car? tiny car? safe car?

Fuse Noun Score → Max Score Output: sweet

epSos.de@flickr Karf Oohlu@flickr rollinoldman@flickr green_lover@flickr NiH@google+ houseofduke@flickr paevalill@flickr ccdoh1@flickr flatworldsedge@flickr
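The hierarchical test flow on this slide can be sketched as: run the noun detectors first, evaluate ANP classifiers only for the nouns that fire, fuse each ANP score with its noun score, and output the maximum. The product used for fusion, the threshold, the scores, and the name `hierarchical_anp` are assumptions for illustration.

```python
def hierarchical_anp(noun_scores, anp_scores, noun_threshold=0.5):
    """Return (anp, fused_score) for the best ANP, or None if no noun fires."""
    best = None
    for noun, noun_score in noun_scores.items():
        if noun_score < noun_threshold:
            continue  # noun not detected: skip its ANP classifiers entirely
        for anp, anp_score in anp_scores.get(noun, {}).items():
            fused = noun_score * anp_score  # assumption: fuse by product
            if best is None or fused > best[1]:
                best = (anp, fused)
    return best


# Made-up detector outputs for one test image.
noun_scores = {"dog": 0.9, "face": 0.2, "car": 0.6}
anp_scores = {
    "dog": {"cute dog": 0.8, "sad dog": 0.3, "wet dog": 0.1},
    "face": {"mad face": 0.95, "silly face": 0.4},
    "car": {"hot car": 0.7, "tiny car": 0.5, "safe car": 0.2},
}
best = hierarchical_anp(noun_scores, anp_scores)
print(best[0])  # cute dog
```

Note that "mad face" has the highest raw ANP score but is never considered, because the face detector did not fire; gating on the noun is what makes the scheme hierarchical.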

slide-34
SLIDE 34

Tricky Issue: Concept Subjectivity

  • Attributes are subjective, ambiguous, and overlapping
    – E.g., cute dog, fluffy dog, cuddly dog
  • Solution
    – Need a way to handle soft label overlap
    – Model overlap proportion in SVM

Cute dog, Tiny dog, Fluffy dog

slide-35
SLIDE 35

The ∝SVM Algorithm

  • F. Yu; D. Liu; S. Kumar; T. Jebara; S.‐F. Chang. ∝SVM for learning with label proportions. ICML13
  • Formulation: label prediction loss + proportion loss
  • Learned with alternate optimization or a relaxed convex form

Image set of “Fluffy Dog” has proportion pk being “Cute Dog”
Image set of “Tiny Dog” has proportion pk being “Cute Dog”
Proportions can be approximated by ConceptNet
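The formula itself did not survive on this slide. As a hedged reconstruction from the cited ICML 2013 paper as best recalled, the objective combines a standard SVM term, a label prediction loss over the unknown instance labels, and a proportion loss over image sets (bags). The symbols C, C_p, L, L_p, φ, and B_k are notation assumed here, not taken from the slide.

```latex
\[
\min_{\mathbf{y},\,\mathbf{w},\,b}\;
\frac{1}{2}\,\mathbf{w}^{\top}\mathbf{w}
\;+\; C \sum_{i=1}^{N} L\big(y_i,\ \mathbf{w}^{\top}\varphi(\mathbf{x}_i) + b\big)
\;+\; C_p \sum_{k=1}^{K} L_p\big(\tilde{p}_k(\mathbf{y}),\ p_k\big)
\quad \text{s.t. } y_i \in \{-1, 1\}\ \ \forall i,
\]
\[
\tilde{p}_k(\mathbf{y}) \;=\; \frac{\big|\{\, i \in B_k : y_i = 1 \,\}\big|}{|B_k|}.
\]
```

Here B_k is the k-th image set (for example, all "Fluffy Dog" images), p_k is its given proportion of positives (for example, the share that are also "Cute Dog"), and the second sum is the proportion loss named on the slide. Because the labels y are optimized jointly with (w, b), the problem is non-convex, which is consistent with the slide's mention of alternating optimization or a convex relaxation.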

slide-36
SLIDE 36

Example: ∝SVM for Video Event Recognition

  • Model the proportion of positive instances in each event
  • Detecting complex events in ~ 100,000 videos with 20% gain

K.‐T. Lai; F. X. Yu; M.‐S. Chen; S.‐F. Chang. CVPR 2014

slide-37
SLIDE 37

SentiBank 2: Objects w. Attributes

SentiBank with Object/Attribute vs. SentiBank 1

slide-38
SLIDE 38

Detection Examples


slide-39
SLIDE 39

VSO/SentiBank Resources

Ontology and 1,200 Classifiers

http://visual-sentiment-ontology.appspot.com/


slide-40
SLIDE 40

Application: Live Sentiment Prediction

1200 Classifiers Predict Sentiment

PhotoTweet Stream:

  • True stuff. I have mad respect for all the ladies that DO NOT give in to abortion.
  • #groundzero #hurricanesandy #newjersey
  • Ouch mr police man

@nickespo89 @charleslawrence @radiodario

Positive? Neutral? Negative?

slide-41
SLIDE 41

Viewer Response Depends …

  • Responses depend on viewer’s perspective
  • Mechanical Turk sentiment labeling over 2000 photo tweets

True stuff. I have mad respect for all the ladies that DO NOT give in to abortion.
@nickespo89

Amazon Mechanical Turk Sentiment/Emotion Label:

(image‐based labeling)
– worker 1: Positive, trust:acceptance
– worker 2: Neutral, interest:unlabeled, sad:pensiveness
– worker 3: Positive, interest:interest

(text‐based labeling)
– worker 1: Positive, joy:serenity, trust:acceptance
– worker 2: Positive, anger:neutral, interest:interest, joy:serenity, trust:acceptance
– worker 3: Negative, sad:sadness

(text‐image‐based labeling)
– worker 1: Positive, joy:serenity, sad:neutral
– worker 2: Positive, interest:interest, joy:joy, sad:neutral, surprise:distraction
– worker 3: Positive, joy:serenity, surprise:neutral, trust:trust

slide-42
SLIDE 42

Response also Depends on Topic

  • Viewer disagreement varies across topics
  • Text more controversial than image in invoking responses


% sentiment labels disagreed by all viewers

slide-43
SLIDE 43

Photo Tweet Sentiment Tracking (during Hurricane Sandy)

  • Goal: track sentiment evolution during natural disaster
  • Data collection:
    – Oct 25 – Nov 02, 2012
    – Related popular hashtags: #prayforusa, #frankenstorm, #nyc, #hurricane, #sandy, #hurricanesandy, #staysafe, #redcross, #myheartgoesouttoyou, …
    – 2000 Photo Tweets collected
  • Ground Truth Labeling:
    – 1340 labels (positive or negative) agreed by 2 annotators
  • Training Classifier:
    – Text (SentiStrength)
    – Visual (SentiBank, Logistic Regr.)
    – Training/Testing ratio: 4:1
    – 5‐fold cross‐validation
  • Accuracy: 60.7% (text), 66.4% (visual), 72% (Text‐Visual Combined)
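A small sketch of the evaluation protocol: 5-fold cross-validation over the 1340 agreed labels gives the 4:1 training/testing ratio stated above. The `fuse` function shows one possible way to combine text and visual scores; the slide does not say how the two were combined, so the convex combination and both function names are assumptions.

```python
def k_fold_splits(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def fuse(text_prob, visual_prob, visual_weight=0.5):
    """Late fusion: convex combination of two positive-class probabilities."""
    return visual_weight * visual_prob + (1.0 - visual_weight) * text_prob


splits = list(k_fold_splits(1340, k=5))
train, test = splits[0]
print(len(train), len(test))  # 1072 268, i.e. a 4:1 ratio
```

In practice the samples would be shuffled before splitting, and the fusion weight would be chosen on the training folds only, so that the reported accuracy stays an honest held-out estimate.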


slide-44
SLIDE 44

Publisher (expressed) vs. Viewer (evoked) Affects


Discover sentiment words SAD EYES MISTY WOODS BEAUTIFUL DOG MOODY

Publisher Viewer

slide-45
SLIDE 45

Viewer Affect Concepts (VAC)

What do viewers say about images of different emotions?

  • Popular Responses
  • Responses for “SAD” images

slide-46
SLIDE 46

Publisher Affect vs. Viewer Response

  • How do Publisher Affect Concepts evoke different Viewer Affect Concepts?

Viewer comments: “Great. now i’m hungry.”, “Looks so delicious!!”

PAC‐VAC Correlation Models

Publisher Affect Concept (PAC), SentiBank [Borth, ACM MM’13]: yummy meat, traditional celebration, hot food, delicious food, great food, outdoor party

Viewer Affect Concept (VAC): delicious, hungry, yummy, nice, happy, tasty, fat, cool

slide-47
SLIDE 47

Probabilistic PAC‐VAC Correlation Models

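Notation: v_j is a VAC, p_k is a PAC, d_i is an image.

  • Measuring PAC-VAC Co-occurrences:

\[ P(p_k \mid v_j; \theta) = \frac{\sum_{i=1}^{D} B_{ik}\, P(v_j \mid d_i)}{\sum_{i=1}^{D} P(v_j \mid d_i)} \]

where B_{ik} is the presence of p_k in the metadata of d_i.

  • Recommending images by Multivariate Bernoulli formulation:

\[ P(d_i \mid v_j; \theta) = \prod_{k=1}^{A} \Big( P(p_k \mid d_i)\, P(p_k \mid v_j; \theta) + \big(1 - P(p_k \mid d_i)\big)\big(1 - P(p_k \mid v_j; \theta)\big) \Big) \]

where P(p_k \mid d_i) is the visual‐based detection score of p_k in d_i.

  • Predicting VACs for a given image by Bayes model:

\[ P(v_j \mid d_i; \theta) = \frac{P(v_j \mid \theta)\, P(d_i \mid v_j; \theta)}{P(d_i \mid \theta)} \]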
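A minimal sketch of the model on this slide, under stated assumptions: the co-occurrence estimate, the multivariate Bernoulli likelihood with soft PAC detection scores, and the Bayes step. All numbers are made up, the function names are hypothetical, and computing the evidence by summing over the listed VACs treats them as mutually exclusive, which is a simplification made here.

```python
def estimate_pac_given_vac(presence, vac_given_image):
    """P(p | v): presence[i] is the PAC set of image i, weighted by P(v | d_i)."""
    total = sum(vac_given_image)
    pacs = set().union(*presence)
    return {p: sum(w for tags, w in zip(presence, vac_given_image) if p in tags) / total
            for p in pacs}


def likelihood(pac_scores, pac_given_vac):
    """P(d | v): multivariate Bernoulli over PACs with soft presence scores."""
    prob = 1.0
    for pac, q in pac_given_vac.items():
        s = pac_scores.get(pac, 0.0)  # detection score P(p_k | d)
        prob *= s * q + (1.0 - s) * (1.0 - q)
    return prob


def vac_posterior(pac_scores, vac_prior, table):
    """P(v | d) by Bayes' rule, normalizing over the VACs in vac_prior."""
    joint = {v: vac_prior[v] * likelihood(pac_scores, table[v]) for v in vac_prior}
    evidence = sum(joint.values())  # assumption: P(d) = sum_j P(v_j) P(d | v_j)
    return {v: value / evidence for v, value in joint.items()}


table = {"delicious": {"yummy food": 0.9, "misty woods": 0.1},
         "peaceful": {"yummy food": 0.1, "misty woods": 0.8}}
prior = {"delicious": 0.5, "peaceful": 0.5}
posterior = vac_posterior({"yummy food": 0.9, "misty woods": 0.1}, prior, table)
print(max(posterior, key=posterior.get))  # delicious
```

An image on which the "yummy food" detector fires strongly ends up with most of the posterior mass on "delicious", which is the kind of PAC-to-VAC link the correlation model is meant to capture.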

slide-48
SLIDE 48

Application: Comment Assistant

SentiBank Detection → PAC → PAC‐VAC Correlation Models → Predicted VACs: “wonderful,” “lovely,” “nice,” “peaceful,” “moody,” “serene,” …

Candidate Comment Collection → Comment Selection → Comment Tutor

Example comment: lovely moody shot ‐ so peaceful!

slide-49
SLIDE 49

Summary: Affect/Emotion Attributes

1200 Classifiers Predict Sentiment

Publisher Affect Concept: yummy food

Viewer Affect Concept: great . . . now i'm hungry . . .

  • Retrieval
  • Authoring
  • Recommendation
  • Social communication
slide-50
SLIDE 50

Beyond Semantics

Aesthetics Interestingness Emotion Style

Others: Creativity, Intent, Memorable …

slide-51
SLIDE 51

References

Visual Affect

  • Machajdik, Jana, and Allan Hanbury. "Affective image classification using features inspired by psychology and art theory." In ACM Multimedia, 2010.

Visual Emotion and Sentiment

  • Borth, Damian, Rongrong Ji, Tao Chen, Thomas Breuel, and Shih‐Fu Chang. "Large‐scale visual sentiment ontology and detectors using adjective noun pairs." In ACM Multimedia, 2013.
  • Chen, Yan‐Ying, Tao Chen, Winston H. Hsu, Hong‐Yuan Mark Liao, and Shih‐Fu Chang. "Predicting Viewer Affective Comments Based on Image Content in Social Media." In ACM ICMR, 2014.

Visual Aesthetics

  • Datta, Ritendra, Dhiraj Joshi, Jia Li, and James Z. Wang. "Studying aesthetics in photographic images using a computational approach." In ECCV, 2006.
  • Murray, Naila, Luca Marchesotti, and Florent Perronnin. "AVA: A Large‐Scale Database for Aesthetic Visual Analysis." In CVPR, 2012.

Visual Aesthetics, Interestingness, Memorability

  • Dhar, Sagnik, Vicente Ordonez, and Tamara L. Berg. "High level describable attributes for predicting aesthetics and interestingness." In CVPR, 2011.
  • Gygli, Michael, Helmut Grabner, Hayko Riemenschneider, Fabian Nater, and Luc Van Gool. "The interestingness of images." In ICCV, 2013.
  • Bhattacharya, Subhabrata, Behnaz Nojavanasghari, Tao Chen, Dong Liu, Shih‐Fu Chang, and Mubarak Shah. "Towards a comprehensive computational model for aesthetic assessment of videos." In ACM Multimedia, 2013.
  • Isola, Phillip, Jianxiong Xiao, Antonio Torralba, and Aude Oliva. "What makes an image memorable?" In CVPR, 2011.

Image Style

  • Karayev, Sergey, Aaron Hertzmann, Holger Winnemoeller, Aseem Agarwala, and Trevor Darrell. "Recognizing Image Style." arXiv preprint arXiv:1311.3715 (2013).

slide-52
SLIDE 52

Learning Visual Semantics: Models, Massive Computation, and Innovative Applications