Medical visual information retrieval Henning Mller HES-SO//Valais - - PowerPoint PPT Presentation

medical visual information retrieval
SMART_READER_LITE
LIVE PREVIEW

Medical visual information retrieval Henning Mller HES-SO//Valais - - PowerPoint PPT Presentation

Medical visual information retrieval Henning Mller HES-SO//Valais Sierre, Switzerland Overview Motivation for visual retrieval Differences between text and visual search A model for CBIR Main techniques Context vs. content


slide-1
SLIDE 1

Medical visual information retrieval

Henning Müller HES-SO//Valais Sierre, Switzerland

slide-2
SLIDE 2

Overview

  • Motivation for visual retrieval

– Differences between text and visual search – A model for CBIR

  • Main techniques

– Context vs. content – Detection vs. retrieval – Semantics vs. free text

  • Current example systems

– Springer images, Goldminer, MedSearch, Yottalook

  • Upcoming challenges
  • Conclusions
slide-3
SLIDE 3

Motivation

  • Enormous amounts of images are

being produced digitally and are available for analysis

– Part of patient records – Part of scientific articles, videos in addition to content

  • New machines and protocols produce an increasing

variety of image types that require explanations

  • Images are needed and often require visualization

– Thin slices, reconstruction in the brain of clinicians

  • Annotated data can be used to train detection

– Data from detection can then also be used for retrieval

slide-4
SLIDE 4

Differences text and visual search

  • Text contains semantic information directly

– … mapping on controlled vocabularies may help retrieval

  • Visual features are fairly low-level

– Colors, textures, basic shapes – Automatic segmentation is an ill-posed problem – Visual words can help solving some of the problems – Simple mapping to terminologies can also help

  • Annotation of images improves retrieval quality

– But most often annotations is not of the images but rather puts the images into a specific context

slide-5
SLIDE 5

Content-based image retrieval

Feature 1 0.5 Feature 2 0.7 Feature 15 0.1 Feature 25 0.3 …

Similarity calculation Relevance feedback Feature extraction Image feature Database

?

slide-6
SLIDE 6

Content vs. context

  • Visual features

describe global or regional content

– Annotation of the content exists only rarely

  • Clinical records and articles often describe small

parts of the content and rather put them in context

– Which were the problems? – Why was the image taken? – Age of patient, anamnesis, …

  • A partial description of the ROI is often available
  • Figure captions describe also part of the content
  • Both content and context are complementary
slide-7
SLIDE 7

Detection vs. retrieval

  • Detection

– Classification of small anomalies in usually a region of interest (ROI) or globally in the image – Annotated training data is necessary – Interpretation needs to be given, so the system is not a black box

  • Retrieval

– Training data is not necessary, or not available – Search for similarity

  • For clearly defined concepts or visual data
  • Can be seen as a classification relevant/non-relevant
slide-8
SLIDE 8

Visual words

Salient regions All pixes, grid, high gradient Original feature space
  • N-dimensional feature vector for
each salient region Quantization of the feature space
  • Division of the feature space into
M groups: visual words
  • Clustering (k-means)
  • Cluster centers are the words
Visual words space
  • Optimal number of words needs
to be found
  • For each image a histogram can
be created
  • Analogy to text words
Bag of Visual Words
  • Histogram of words for an image
  • r an image region based of the
salient points
  • M-Dimensional feature vector
slide-9
SLIDE 9

Semantics vs. free text

  • Free text is also treated before retrieval

– Stemming, stop word removal, phonetic retrieval – Translation is possible but sometimes hard – All data are readily available, little treatment

  • Semantic data

– Extracted from text

  • Probability to be true, relatively high for a few terms in

medical texts

– Synonyms, hypernyms, hyponyms can then be used – Existing ontologies are available (LinkedLifeData) – Understanding what particular parts are about

slide-10
SLIDE 10

Goldminer.arrs.org (249,000 images)

slide-11
SLIDE 11

www.springerimages.com (3.3 mio images)

slide-12
SLIDE 12

medgift.hevs.ch/demos/ (231,000 images)

slide-13
SLIDE 13

www.yottalook.com (70,000 images)

slide-14
SLIDE 14

Upcoming challenges

  • Develop simple interfaces for many types of

browsing and limiting the results set

– Faceted browsing, interactive changes, propositions – Also on small screens for mobile devices – Include 3D and 4D data

  • Linking visual retrieval/detection with semantic data

– Better understanding visual data and its context – Filtering out unwanted information

  • Finding small ROIs for an in depth analysis
  • Treating extremely large databases
  • Quickly changing modalities (limited training data)
slide-15
SLIDE 15

Regions of interest are often small

slide-16
SLIDE 16

Conclusions

  • Images are still not fully integrated into retrieval

– Not only in the medical field – Mobile devices and detection techniques change this – Visual data is analyzed much faster than text

  • Regions of interest are often very small

– Context is needed to find and analyze them – Interactivity can help as well

  • Many new tools are becoming available

– LinkedLife data and semantics are required for images – Many publishers are now making also images available as these are valuable

slide-17
SLIDE 17

Questions

  • ?

http://www.khresmoi.eu/ http://medgift.hevs.ch/