
Learning in Parallel Universes. Bernd Wiswedel, 15 September, 2008



  1. Nycomed Chair for Bioinformatics & Information Mining
     Learning in Parallel Universes
     Bernd Wiswedel, 15 September, 2008

     Overview
     • What are Parallel Universes?
     • Application Scenarios
     • One sample approach: Neighborgrams
     • Connection to LeGo

  2. Motivation
     • Data Mining as an application to analyse huge amounts of data
     • One focus of Data Mining: find interesting patterns in a data set, e.g. clusters
     • Data are often very complex; sometimes multiple representations of the data are available
       → Parallel Universes

     What are Parallel Universes?
     • Usually: data given in a single feature space
       – Mostly a high-dimensional, numeric representation
       – Definition of one global distance measure

  3. What are Parallel Universes?
     • Parallel Universes
       – Different object representations
       – Different distance measures

     Why Parallel Universes?
     • Example 1: Chemistry - universes encode, e.g.
       – shape (3D)
       – graph structure
       – properties, …
     See also: A. Bender, R. Glen: Molecular similarity: a key technique in molecular informatics, Org. Biomol. Chem., 2:3204-3218, 2004
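
The idea of several object representations, each with its own distance measure, can be sketched as a small data structure. This is only an illustration, not the author's implementation: the `Universe` class, the two distance functions, and the toy molecules `m1`, `m2`, `m3` are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Distance for numeric property vectors (assumed for this sketch)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def tanimoto_distance(a, b):
    """1 - |a ∩ b| / |a ∪ b| for sets of substructure keys (assumed for this sketch)."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return 1.0 - len(a & b) / union


class Universe:
    """One representation of all objects plus the distance measure that fits it."""

    def __init__(self, name, distance, data):
        self.name = name
        self.distance = distance
        self.data = data  # object id -> representation in this universe

    def dist(self, i, j):
        return self.distance(self.data[i], self.data[j])


# Hypothetical toy data: the same three molecules described in two universes.
universes = [
    Universe("properties", euclidean,
             {"m1": (1.0, 2.0), "m2": (4.0, 6.0), "m3": (1.0, 2.5)}),
    Universe("substructures", tanimoto_distance,
             {"m1": frozenset({"ring", "OH"}),
              "m2": frozenset({"ring", "OH"}),
              "m3": frozenset({"NH2"})}),
]

# m1 and m2 are far apart in one universe and identical in the other.
distances = {u.name: u.dist("m1", "m2") for u in universes}
print(distances)
```

In this toy data, `m1` and `m2` differ strongly in the property universe but coincide in the substructure universe, which is the kind of situation a single global distance measure would hide.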

  4. Why Parallel Universes?
     • Example 2: Web - universes encode, e.g.
       – link structure
       – meta information (categories, tags)
       – content (bag of words, …)

     Why Parallel Universes?
     • More examples:
       – Music - universes encode
         • semantic meta information (composer, artist, genre, …)
         • groupings (style, category, …)
         • other properties (tempo, beat, key, …)
       – Image or 3D object recognition - universes encode
         • properties (has door, has wheels, …)
         • texture information
         • histogram or intensity/color distributions

  5. Learning in Parallel Universes
     • Naive approach:
       – Consider only one universe at a time: ignores information in other universes
       – Construct a joint feature space: often impossible, introduces artifacts
     • Better:
       – Consider all universes at once
       – Allow identification of (local) models that occur in only a few universes (or just one)

  6. Related Approaches: Subspace Clustering
     • Choose a subset of data and attributes for each cluster
       – usually no interpretation of subspaces possible
       – selects from one large universe
       – also finds overlapping clusters
       – most prominent approaches: CLIQUE, COSA

  7. Related Approaches: Multi-Instance Learning
     • Each object has several possible representations in the same space (e.g. molecular conformations in 3D)
       – universes all possess the same semantics
       – two extremes: similar in all universes, similar in at least one universe
       – number of universes per object can vary

     Related Approaches: Multi-View Learning
     • Each object has several possible representations in different spaces
       – universes with different semantics
       – independent and complete models in each universe (learning algorithms may assist each other)
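
The two extremes named for multi-instance learning ("similar in all universes" versus "similar in at least one universe") can be read as two ways of aggregating per-representation distances. The following is a minimal sketch under that reading; the function names, the threshold, and the distance values are assumptions chosen for illustration and do not come from the slides.

```python
def similar_in_all(distances, threshold):
    """Strict extreme: every representation must be within the threshold."""
    return max(distances) <= threshold


def similar_in_any(distances, threshold):
    """Lenient extreme: one representation within the threshold is enough."""
    return min(distances) <= threshold


# Hypothetical distances between two molecules, one value per conformation pair.
conformation_distances = [0.2, 0.9, 0.4]

strict = similar_in_all(conformation_distances, threshold=0.5)
lenient = similar_in_any(conformation_distances, threshold=0.5)
print(strict, lenient)
```

Because the number of representations per object can vary, both aggregations are written over a plain list rather than a fixed-size vector.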

  8. Learning in Parallel Universes
     • Clear separation of universes (given a priori)
     • No individual universe suffices for learning
     • Allow identification of (local) models that occur in only a few universes (or just one)
     • Identify overlaps

     One sample approach: Neighborgrams
     • Supervised approach
     • Construct local neighborhood histograms ("Neighborgrams") for objects of interest in all universes
     • Derive quality values for individual Neighborgrams
     • Covering-like approach to construct a classification model
     • Intuitive visualization allows for interactive exploration and user-controlled model construction
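
The Neighborgram bullets above can be sketched as follows: for one object of interest (the centroid), rank the other objects by distance in each universe, keep the closest `k`, and derive a quality value from the ranked class labels. The slides do not define the quality value, so the `coverage` function here (number of same-class neighbors before the first neighbor of another class) is an assumption, as are the function names and the toy data.

```python
import math


def euclidean(a, b):
    """Distance measure assumed for both toy universes."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def neighborgram(centroid, data, labels, distance, k=100):
    """Return the k nearest neighbors of the centroid as (distance, id, label)."""
    c = data[centroid]
    ranked = sorted(
        (distance(c, vec), obj, labels[obj])
        for obj, vec in data.items()
        if obj != centroid
    )
    return ranked[:k]


def coverage(ngram, centroid_label):
    """Assumed quality value: same-class neighbors before the first conflict."""
    count = 0
    for _, _, label in ngram:
        if label != centroid_label:
            break
        count += 1
    return count


# Hypothetical toy data: five objects, two classes, two universes.
labels = {"a": "pos", "b": "pos", "c": "pos", "d": "neg", "e": "neg"}
universes = {
    "universe_1": {"a": (0.0,), "b": (1.0,), "c": (2.0,), "d": (5.0,), "e": (6.0,)},
    "universe_2": {"a": (0.0,), "b": (4.0,), "c": (5.0,), "d": (1.0,), "e": (2.0,)},
}

# One Neighborgram per universe for the same centroid "a".
quality = {
    name: coverage(neighborgram("a", data, labels, euclidean), labels["a"])
    for name, data in universes.items()
}
print(quality)
```

For centroid `a`, the first universe places both other `pos` objects closest, while the second universe places the `neg` objects closest, so only the first universe yields a useful local cluster around `a`.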

  9. Neighborgrams on KN-DB
     [Figure: annotated Neighborgram. Labels: centroid of Neighborgram; closest neighbor at distance d (same class as centroid); more neighbors of same class; the remainder of the 100 closest neighbors; best (fuzzy) cluster suggested by interactive clustering algorithm.]

     Neighborgrams on KN-DB
     [Figure: three universes. First universe (Image Based); second universe (Surface Based); third universe (Volume Based).]

  10. Neighborgrams on KN-DB [figure slides]

  11. Neighborgrams on KN-DB [figure slides]

  12. Summary Neighborgrams
      • Visualization tool for interactive exploration of clusters
      • Works well for small data sets or to model a minority class
      • Manual clustering
      • Semi-automatic clustering
        – Inspect proposed cluster
        – Discard, accept or fine-tune cluster
      • Fully automatic clustering
        – Sequential covering approach
        – Greedily identify the next best cluster, remove covered objects, restart

      Connection to LeGo
      • Output is a selected set of Neighborgram clusters, spread over different universes
      • Such clusters can be considered local patterns
      • Open problem: construction of a global model, as opposed to a simple aggregation of clusters
      • Special focus on identifying overlaps among universes (often of special interest)
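
The fully automatic mode (greedily identify the next best cluster, remove covered objects, restart) can be sketched as a sequential covering loop over all universes. This is an illustration under stated assumptions, not the author's algorithm: objects are one-dimensional for brevity, "best" is taken to mean the cluster covering the most uncovered target-class objects, and `best_cluster`, `sequential_covering`, `min_size`, and the toy data are hypothetical.

```python
def best_cluster(centroid, universe, labels, remaining):
    """Uncovered same-class objects closer to the centroid than the nearest other-class object."""
    target = labels[centroid]
    ranked = sorted(
        (abs(universe[centroid] - universe[obj]), obj)
        for obj in universe
        if obj != centroid
    )
    members = {centroid}
    for _, obj in ranked:
        if labels[obj] != target:
            break  # first conflicting neighbor ends the cluster
        members.add(obj)
    return members & remaining


def sequential_covering(universes, labels, target, min_size=2):
    """Greedy loop: pick the largest cluster in any universe, remove it, restart."""
    remaining = {obj for obj, label in labels.items() if label == target}
    model = []
    while remaining:
        candidates = [
            (len(best_cluster(c, data, labels, remaining)), name, c)
            for name, data in universes.items()
            for c in sorted(remaining)
        ]
        # Largest cluster first; ties broken by universe name, then centroid id.
        size, name, centroid = sorted(candidates, key=lambda t: (-t[0], t[1], t[2]))[0]
        if size < min_size:
            break  # nothing worthwhile left to cover
        members = best_cluster(centroid, universes[name], labels, remaining)
        model.append((name, centroid, sorted(members)))
        remaining -= members
    return model


# Hypothetical toy data: "pos" objects a, b separate well in u1; c, d only in u2.
labels = {"a": "pos", "b": "pos", "c": "pos", "d": "pos", "x": "neg", "y": "neg"}
universes = {
    "u1": {"a": 0.0, "b": 1.0, "x": 3.0, "c": 4.0, "y": 5.0, "d": 6.0},
    "u2": {"a": 4.0, "b": 6.0, "x": 5.0, "c": 0.0, "y": 3.0, "d": 1.0},
}

model = sequential_covering(universes, labels, target="pos")
print(model)
```

The resulting model is a set of clusters spread over different universes, each recorded together with the universe it was found in; combining them into a global model is the open problem named in the slide.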

  13. Summary
      • Learning in Parallel Universes as simultaneous analysis of multiple descriptor spaces
      • Encompasses identification of patterns that:
        – are specific to individual universes and
        – span multiple universes (not necessarily all)
      • Final model construction comprises all previously identified patterns

      Thanks!
