

slide-1
SLIDE 1

Sreyasi Nag Chowdhury, Niket Tandon, Gerhard Weikum

Max Planck Institute for Informatics, Saarbrücken, Germany

slide-2
SLIDE 2

17/06/2016 1 Sreyasi Nag Chowdhury, AKBC 2016

User Query Concrete Abstract Q: bicycle in street Q: environment friendly traffic

slide-11
SLIDE 11

User Query
  • Concrete: Q: bicycle in street
  • Abstract: Q: environment friendly traffic

Example documents
  • “Wow! Double-decker buses still run!” Visual objects: bicycle, bus, car
      Systems: Text-only, Text + visual
  • “Riding for a cause.” Visual objects: person, bicycle. CSK: (riding bicycle, be, environment friendly)
      Systems: Text/visual, Text + visual + CSK
  • “Biking by the river” Visual objects: train, piano
      Systems: Text-only, Text + visual

Our contribution: Text + visual + CSK

slide-12
SLIDE 12
  • CSK: Where do we get it from?
  • CSK: How do we use it?
  • CSK: How to combine noisy signals?
  • CSK: Does it help?


slide-16
SLIDE 16
CSK: WHERE DO WE GET IT FROM?

  • Existing CSK knowledge bases: WordNet, ConceptNet, WebChild, Knowlywood
  • Our corpus: Wiki articles from domain ‘tourism’
  • Pruned by Jaccard Similarity
  • ~22,000 CSK triples

Domain-specific ReVerb triples:
  • (“tourism”, “be travel for”, “recreation, leisure, family, business purposes”)
  • (“people”, “fall in”, “love”)
  • (“the bloody hell”, “be”, “you”)
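
The slide does not say what the Jaccard similarity is computed between. As a minimal sketch only, the block below assumes it is a word-set overlap used to drop near-duplicate triples; the 0.7 threshold, the helper names and the second example triple are hypothetical and not taken from the talk.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two token collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def tokens(triple):
    """Lowercased word set of a (subject, predicate, object) triple."""
    return {w for part in triple for w in part.lower().split()}


def prune_near_duplicates(triples, threshold=0.7):
    """Keep a triple only if it is not too similar to one already kept.

    Assumption: pruning means near-duplicate removal; the slide does not
    confirm this reading.
    """
    kept = []
    for t in triples:
        if all(jaccard(tokens(t), tokens(k)) < threshold for k in kept):
            kept.append(t)
    return kept


triples = [
    ("tourists", "carry", "backpack"),
    ("tourists", "carry", "a backpack"),  # hypothetical near-duplicate
    ("people", "fall in", "love"),
]
pruned = prune_near_duplicates(triples)
print(pruned)
```

The same `jaccard` helper would serve equally well for other readings, for example comparing a triple's words against a domain vocabulary.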

slide-19
SLIDE 19
CSK: HOW DO WE USE IT?

  • Query string: travel with backpack
  • CSK to expand query
      • t1: (tourists, use, travel maps)
      • t2: (tourists, carry, backpack)
      • t3: (backpack, is a type of, bag)
  • Document x with features
      • Textual: “A tourist reading a map by the road”
      • Visual: person, bag, bottle, bus

Systems compared: Text-only systems; Text + visual + CSK systems

  • CSK bridges the vocabulary gap between query and document
  • CSK establishes relations between concepts
  • CSK diminishes noise from modalities (ensemble effect)
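
To make the vocabulary-gap point concrete, here is a rough sketch of the example on this slide. It assumes a triple is used for expansion when it shares a content word with the query, and that matching is plain word overlap after a crude plural-stripping step; both choices, the stop-word list and all helper names are illustrative assumptions, not the system's actual matching.

```python
STOP = {"a", "the", "with", "by", "of", "is"}  # tiny illustrative stop list


def norm(word):
    """Crude normalization: lowercase and strip a plural 's' (illustrative only)."""
    w = word.lower().strip('.,"')
    return w[:-1] if len(w) > 3 and w.endswith("s") else w


def content_words(text):
    """Normalized, stop-word-free word set of a string."""
    return {norm(w) for w in text.split()} - STOP


query = "travel with backpack"
triples = {
    "t1": ("tourists", "use", "travel maps"),
    "t2": ("tourists", "carry", "backpack"),
    "t3": ("backpack", "is a type of", "bag"),
}
doc_text = "A tourist reading a map by the road"
doc_visual = ["person", "bag", "bottle", "bus"]

query_terms = content_words(query)

# Assumption: a triple is selected if it shares a content word with the query.
selected = {
    name: t for name, t in triples.items()
    if content_words(" ".join(t)) & query_terms
}
expanded_terms = query_terms.union(
    *(content_words(" ".join(t)) for t in selected.values())
)

doc_text_terms = content_words(doc_text)
doc_visual_terms = {norm(v) for v in doc_visual}

text_only_hits = query_terms & doc_text_terms          # original query vs. caption
expanded_text_hits = expanded_terms & doc_text_terms   # expanded query vs. caption
expanded_visual_hits = expanded_terms & doc_visual_terms  # expanded query vs. detected objects

print(text_only_hits, expanded_text_hits, expanded_visual_hits)
```

In this toy run the original query shares no word with the caption, while the expanded query reaches the caption through t1 and t2 and the detected object "bag" through t3.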

slide-24
SLIDE 24

CSK: HOW DO WE USE IT?

Document x
  • Textual features x_x: “A tour group is standing on the grass with ruins in the background.” “Group of people standing in front of a stone structure.”
  • Visual features x_v: backpack, bag, container; person, causal agent, organism

Query: “group excursion”
  • Query expansion (CSK features):
      • (an excursion, be trip by, a group of people)
      • (organized excursions, book through, a tour company)
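
The visual feature list reads like each detected object followed by progressively more general terms. A minimal sketch of that kind of expansion follows, assuming a small hand-written hypernym table as a stand-in for a lexical resource; the table, the `expand` helper and the two-step limit are hypothetical.

```python
# Hand-written stand-in for a lexical resource; entries mirror the slide's example.
HYPERNYMS = {
    "backpack": "bag",
    "bag": "container",
    "person": "causal agent",
    "causal agent": "organism",
}


def expand(label, max_steps=2):
    """Return the label followed by up to max_steps more general terms."""
    chain = [label]
    while label in HYPERNYMS and len(chain) <= max_steps:
        label = HYPERNYMS[label]
        chain.append(label)
    return chain


detected = ["backpack", "person"]  # objects reported by a detector
visual_features = [term for label in detected for term in expand(label)]
print(visual_features)
```

Adding the general terms gives a query word such as "bag" a chance to match a document whose detector only reported "backpack".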

slide-28
SLIDE 28

CSK: HOW TO COMBINE NOISY SIGNALS?

  • Mixture LM: [formula not preserved in this export]
  • Commonsense-aware LM: [formula not preserved in this export]
  • Smoothed LM: [formula and definitions not preserved in this export]
  • Basic LM: [formula and definitions not preserved in this export]

Callouts shown on the formulas:
  • CSK triple
  • Probabilities based on word-wise overlaps
  • Background corpus: co-occurring Flickr tags
  • Textual and visual features
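
The formulas on this slide did not survive, so the block below is not the authors' model. It is a generic sketch of how four language-model layers with these names are commonly stacked: a word-overlap estimate, linear interpolation with a background corpus, a per-triple score, and a weighted mixture over triples. The interpolation weight `alpha`, the triple weights, the averaging inside `triple_score` and all example data are assumptions.

```python
from collections import Counter


def basic_lm(word, terms):
    """P(word | terms): relative frequency of the word among the given terms."""
    if not terms:
        return 0.0
    return Counter(terms)[word] / len(terms)


def smoothed_lm(word, doc_terms, background, alpha=0.8):
    """Interpolate the document estimate with a background-corpus estimate."""
    return alpha * basic_lm(word, doc_terms) + (1 - alpha) * basic_lm(word, background)


def triple_score(triple, doc_terms, background, alpha=0.8):
    """Assumed commonsense-aware score: mean smoothed probability of the triple's words."""
    words = [w for part in triple for w in part.split()]
    return sum(smoothed_lm(w, doc_terms, background, alpha) for w in words) / len(words)


def mixture_score(weighted_triples, doc_terms, background, alpha=0.8):
    """Weighted mixture of triple scores; weights are normalized to sum to 1."""
    total = sum(weight for _, weight in weighted_triples)
    return sum(
        weight * triple_score(t, doc_terms, background, alpha)
        for t, weight in weighted_triples
    ) / total


# Hypothetical data: document terms pool textual and visual features.
doc_a = ["tour", "group", "people", "standing", "grass", "ruins", "backpack", "person"]
doc_b = ["cat", "sleeping", "sofa", "cat"]
background = ["excursion", "trip", "group", "people", "tour", "beach", "cat", "travel"]
weighted_triples = [
    (("excursion", "be trip by", "group of people"), 0.7),
    (("excursions", "book through", "tour company"), 0.3),
]

score_a = mixture_score(weighted_triples, doc_a, background)
score_b = mixture_score(weighted_triples, doc_b, background)
print(score_a, score_b)
```

With this toy data the tour-group document scores higher than the unrelated one, because more triple words overlap with its features; the background term keeps unseen words from zeroing out a score.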

slide-36
SLIDE 36

CSK: DOES IT HELP?

  • Image Dataset
      • Flickr30k
      • MS COCO captioned dataset
      • Pascal Sentence Dataset
      • SBU captioned dataset
  • ~50,000 images with captions

Example captions:
  • “A group of tourists is crossing a bridge that connects a walking path to a trail of nature. Many people cross a very tall footbridge with a tree-covered hill in the background. This shows a group of people walking over an arched red bridge. People cross a large bridge to get over the body of water. People walking over a white and red bridge over a pond.”
  • “Boat trip to see the mythical pink dolphins... this is John checking in with the office for that day.”

Labels on the examples: social media post, blog post

slide-40
SLIDE 40
CSK: DOES IT HELP?

  • Baselines: Text-only and Text + Visual search approaches
  • Evaluation metric: Average Precision @ 10
  • Example queries:
      • Concrete: ball park, bridge road, table home, bicycle road
      • Abstract: diesel transport, housing town
      • Mixed: old clock, backpack travel, boat tour

Results (Average Precision @ 10):
  • Text-only: 47%
  • Text + Visual: 64%
  • Text + Visual + CSK (Know2Look): 85%

Chart annotation: Co-occurring Flickr tags
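
The slide does not define the metric beyond its name. One common reading is precision over the top 10 results, averaged across queries; the sketch below assumes that reading, and the relevance judgments in it are made up for illustration, not data from the talk.

```python
def precision_at_k(ranked_relevance, k=10):
    """Fraction of the top-k results judged relevant (1 = relevant, 0 = not)."""
    return sum(ranked_relevance[:k]) / k


def mean_precision_at_k(runs, k=10):
    """Average precision@k over all queries in `runs`."""
    return sum(precision_at_k(r, k) for r in runs.values()) / len(runs)


# Hypothetical relevance judgments for the top 10 results of two queries.
runs = {
    "backpack travel": [1, 1, 0, 1, 1, 1, 0, 1, 1, 1],
    "boat tour": [1, 0, 1, 1, 0, 1, 1, 1, 0, 1],
}

per_query = {q: precision_at_k(r) for q, r in runs.items()}
overall = mean_precision_at_k(runs)
print(per_query, overall)
```

A rank-sensitive variant would instead average the precision at each relevant position; the slide gives no way to tell which one was used.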

slide-41
SLIDE 41

CSK: DOES IT HELP?

Query: “group excursion”

Systems compared: Text-only, Text + Visual, Text + Visual + CSK (Know2Look)

  • x_xj: “A small excursion boat anchored on the beach at the resort in Mexico.”
  • x_vj: lunar excursion module, conveyance
  • x_xj: “A group of people riding camels.”
  • y_k: (an excursion, be trip by, a group of people)

slide-45
SLIDE 45
  • Noisy OpenIE triples capture commonsense knowledge
  • Noisy textual cues + noisy visual object detection + noisy commonsense knowledge → ensemble effect → better results for multimodal document retrieval
  • CSK acts as a bridge between text and vision
  • Do word co-occurrences or word embeddings provide similar results?
  • Does structured commonsense knowledge improve retrieval?

Thank you