Learning in Parallel Universes, Bernd Wiswedel, 15 September, 2008


Nycomed Chair for Bioinformatics & Information Mining

Learning in Parallel Universes Bernd Wiswedel

15 September, 2008

Overview

  • What are Parallel Universes?
  • Application Scenarios
  • One sample approach: Neighborgrams
  • Connection to LeGo

15 September, 2008 Bernd Wiswedel: "Learning in Parallel Universes"


Motivation

  • Data Mining as application to analyse huge amounts of data
  • One focus of Data Mining: Find interesting patterns in a data set, e.g. clusters
  • Often data very complex, sometimes multiple representations of data available: Parallel Universes


What are Parallel Universes?

  • Usually: Data given in a single feature space

– Mostly high-dimensional and numeric representation
– Definition of one, global distance measure


What are Parallel Universes?

  • Parallel Universes

– Different object representations
– Different distance measures
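
A minimal sketch of this setting, assuming each universe can be modelled as a set of per-object representations paired with its own distance measure. The class name `Universe`, the helper functions and the toy data are illustrative assumptions, not taken from the talk.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Any, Callable, Dict


@dataclass
class Universe:
    """One descriptor space: per-object representations plus a distance measure."""
    name: str
    data: Dict[str, Any]                   # object id -> representation
    distance: Callable[[Any, Any], float]  # only meaningful inside this universe

    def dist(self, a: str, b: str) -> float:
        return self.distance(self.data[a], self.data[b])


def euclidean(x, y):
    return sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def jaccard_distance(x, y):
    union = x | y
    return 1.0 - len(x & y) / len(union) if union else 0.0


# Hypothetical toy data: the same three objects described in two universes.
universes = [
    Universe("numeric", {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (0.0, 1.0)}, euclidean),
    Universe("tags", {"a": {"x", "y"}, "b": {"x", "y"}, "c": {"z"}}, jaccard_distance),
]

for u in universes:
    print(u.name, u.dist("a", "b"), u.dist("a", "c"))
```

In this toy data, objects "a" and "b" are far apart in the numeric universe but identical in the tag universe, which is the kind of disagreement between universes the slides refer to.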

Why Parallel Universes?

  • Example 1: Chemistry - universes encode, e.g.

– shape (3D)
– graph structure
– properties…

see also: A. Bender, R. Glen: Molecular similarity: a key technique in molecular informatics, Org. Biomol. Chem., 2:3204-3218, 2004

Why Parallel Universes?

  • Example 2: Web - universes encode, e.g.

– link structure
– meta information (categories, tags)
– content (bag of words…)

Why Parallel Universes?

  • More examples:

– Music - universes encode

  • semantic meta information (composer, artist, genre, …)
  • groupings (style, category, …)
  • other properties (tempo, beat, key, …)

– Image or 3D object recognition – universes encode

  • properties (has door, has wheels…)
  • texture information
  • histogram or intensity/color distributions


Learning in Parallel Universes

  • Naive Approach:

– Consider only one universe at a time: Ignores information in other universes
– Construct joint feature space: Often impossible, introduces artifacts.

  • Better:

– Consider all universes at once
– Allow identification of (local) models that occur in only a few universes (or just one)
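
One kind of artifact from a naively joined feature space can be sketched as follows. The toy data and helper names are assumptions made for illustration: two numeric universes with very different scales are concatenated, and the neighborhood in the joint space simply follows the large-scale universe.

```python
from math import sqrt


def euclidean(x, y):
    return sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


# Hypothetical toy data: universe A has values in [0, 1],
# universe B has values in the thousands.
universe_a = {"q": (0.10,), "r": (0.12,), "s": (0.90,)}
universe_b = {"q": (1000.0,), "r": (5000.0,), "s": (1010.0,)}


def nearest(data, query):
    """Return the id of the object closest to `query` in the given space."""
    others = [k for k in data if k != query]
    return min(others, key=lambda k: euclidean(data[query], data[k]))


# Naive joint space: concatenate the feature vectors of both universes.
joint = {k: universe_a[k] + universe_b[k] for k in universe_a}

print(nearest(universe_a, "q"))  # neighbor according to universe A
print(nearest(universe_b, "q"))  # neighbor according to universe B
print(nearest(joint, "q"))       # neighbor in the concatenated space
```

Here the structure of universe A is invisible in the joint space. Rescaling would reduce this particular effect, but it does not help when the universes hold representations that cannot be concatenated at all, such as graphs next to numeric vectors.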


Related Approaches: Subspace Clustering

  • choose subset of data and attributes for each cluster

– usually no interpretation of subspaces possible
– selects from one, large universe
– first finds also overlapping clusters
– most prominent approaches: CLIQUE, COSA


Related Approaches: Multi-Instance Learning

  • each object has several possible representations in same space (e.g. molecular conformations in 3D)

– universes all possess the same semantics
– two extremes: similar in all universes, similar in at least one universe.
– number of universes per object can vary.
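
The two extremes can be sketched as two ways of aggregating instance-level distances. This pairwise reading (maximum versus minimum over all instance pairs) is an assumption made for illustration, as are the function names and the toy coordinates.

```python
from math import sqrt


def euclidean(x, y):
    return sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def dist_all(instances_a, instances_b):
    """'Similar in all': objects are close only if every instance pair is close (max)."""
    return max(euclidean(x, y) for x in instances_a for y in instances_b)


def dist_any(instances_a, instances_b):
    """'Similar in at least one': one close instance pair is enough (min)."""
    return min(euclidean(x, y) for x in instances_a for y in instances_b)


# Hypothetical objects, each with two instances in the same 2D space.
# The number of instances per object is allowed to differ.
mol_1 = [(0.0, 0.0), (1.0, 0.0)]
mol_2 = [(0.0, 0.0), (4.0, 3.0)]

print(dist_any(mol_1, mol_2))
print(dist_all(mol_1, mol_2))
```

With this toy data the two objects share one identical instance, so they are at distance 0 under the "at least one" view, while the "all" view is determined by their most dissimilar instance pair.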


Related Approaches: Multi-View Learning

  • each object has several possible representations in different spaces

– universes with different semantics
– independent and complete models in each universe (learning algorithms may assist each other)


Learning in Parallel Universes

  • Clear separation of Universes (a-priori given)
  • Each individual universe does not suffice for learning
  • Allow identification of (local) models that occur in only a few universes (or just one)

  • Identify overlaps


One sample approach: Neighborgrams

  • Supervised approach
  • Construct local neighborhood histograms ("Neighborgrams") for objects of interest in all universes
  • Derive quality values for individual neighborgrams
  • Covering-like approach to construct classification model
  • Intuitive visualization allows for interactive exploration and user-controlled model construction
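
The first steps can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the implementation from the talk: a neighborgram is taken to be the sorted list of the r nearest neighbors of an object of interest, and the quality value is assumed to be the number of same-class neighbors seen before the first neighbor of another class. All names and the toy data are hypothetical.

```python
from math import sqrt


def euclidean(x, y):
    return sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def neighborgram(universe, labels, centroid, r):
    """Return the r nearest neighbors of `centroid` as (distance, id, label) tuples."""
    others = [k for k in universe if k != centroid]
    ranked = sorted(
        (euclidean(universe[centroid], universe[k]), k, labels[k]) for k in others
    )
    return ranked[:r]


def quality(ngram, target_label):
    """Assumed quality: same-class neighbors before the first other-class neighbor."""
    count = 0
    for _, _, label in ngram:
        if label != target_label:
            break
        count += 1
    return count


# Hypothetical toy data: five labelled objects described in two universes.
labels = {"c": "pos", "p1": "pos", "p2": "pos", "n1": "neg", "n2": "neg"}
universes = {
    "u1": {"c": (0.0, 0.0), "p1": (1.0, 0.0), "p2": (0.0, 2.0),
           "n1": (3.0, 0.0), "n2": (4.0, 0.0)},
    "u2": {"c": (0.0,), "p1": (5.0,), "p2": (1.0,), "n1": (2.0,), "n2": (6.0,)},
}

# One neighborgram per universe for the object of interest "c".
qualities = {
    name: quality(neighborgram(data, labels, "c", r=4), labels["c"])
    for name, data in universes.items()
}
print(qualities)
```

In this toy data the neighborhood of "c" is purer in universe u1 than in u2, so a cluster around "c" would be better placed in u1.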


Neighborgrams on KN-DB

[Figure: annotated Neighborgram. Labels: centroid of Neighborgram; closest neighbor at distance d (same class as centroid); more neighbors of same class; the remainder of the 100 closest neighbors; best (fuzzy) cluster suggested by interactive clustering algorithm.]

Neighborgrams on KN-DB

[Figure: Neighborgrams in three universes. First universe (Image Based), second universe (Surface Based), third universe (Volume Based).]


Summary Neighborgrams

  • Visualization tool for interactive exploration of clusters
  • Works well for small data sets or to model a minority class
  • Manual clustering
  • Semi-automatic clustering

– Inspect proposed cluster
– Discard, accept or fine-tune cluster

  • Fully automatic clustering

– Sequential covering approach
– Greedily identify the next best cluster, remove covered objects, restart
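
The fully automatic mode can be sketched as a greedy covering loop. This is a simplified illustration: the candidate clusters are assumed to be precomputed and fixed, whereas a real restart would recompute them after covered objects are removed. The function name, the `min_gain` parameter and the toy candidates are hypothetical.

```python
def sequential_covering(candidates, targets, min_gain=1):
    """Greedily select clusters until no candidate covers `min_gain` new objects.

    candidates: dict mapping (universe, centroid) -> set of covered object ids
    targets:    ids of the objects that should be covered
    """
    uncovered = set(targets)
    model = []
    while uncovered:
        best_key, best_gain = None, 0
        for key, members in candidates.items():
            gain = len(members & uncovered)  # newly covered objects only
            if gain > best_gain:
                best_key, best_gain = key, gain
        if best_key is None or best_gain < min_gain:
            break
        model.append(best_key)
        uncovered -= candidates[best_key]  # remove covered objects, then restart
    return model, uncovered


# Hypothetical candidate clusters spread over three universes.
candidates = {
    ("image", "o1"): {"o1", "o2", "o3"},
    ("surface", "o4"): {"o3", "o4", "o5"},
    ("volume", "o2"): {"o2", "o3"},
}
targets = {"o1", "o2", "o3", "o4", "o5", "o6"}

model, uncovered = sequential_covering(candidates, targets)
print(model)
print(uncovered)
```

The selected clusters come from different universes, and any object that no candidate covers is simply left over.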


Connection to LeGo

  • Output is a selected set of Neighborgram clusters, spread over different universes
  • Such clusters can be considered as local patterns
  • Open problem: Construction of a global model as opposed to a simple aggregation of clusters
  • Special focus on identifying overlaps among universes (often of special interest)


Summary

  • Learning in Parallel Universes as simultaneous analysis of multiple descriptor spaces
  • Encompasses identification of patterns that:

– are specific to individual universes and
– span multiple universes (not necessarily all)

  • Final model construction comprises all previously identified patterns


Thanks!
