Multiple Nested Reductions of Single Data Modes as a Tool to Deal with Large Data Sets
Iven Van Mechelen and Katrijn Van Deun
K.U.Leuven, Psychology Department and Center for Computational Systems Biology
Invited IFCS session at COMPSTAT 2010

Overview:
• introduction
• principles
• example 1: existing model
• example 2: novel model
• discussion

Introduction
• in many research areas:
  - accessibility of novel measurement technologies
  - data tsunami: high-dimensional data sets
  - example: various types of ‘omics’ data
• concerted use of technologies in many settings
  - data sets with a large number of experimental units

Introduction (ctd)
• problems:
  - redundancies, dependencies, ill-conditioned optimization problems
  - computational bottlenecks
  - display of the output becomes prohibitive

Introduction (ctd)
• possible solution: classical reduction methods (categorical: clustering; continuous: dimension reduction)
• however: often breakdown of such methods …
• possible rescue missions: variable selection, sparseness penalty or constraints, …
• alternative solution: multiple nested reductions of single data modes (within the framework of a global model for the data, fitted with a simultaneous optimization procedure)

Principles
• data: I × J object by variable (e.g., tissue by gene) data matrix D
  - rows: object mode, indexed by i
  - columns: variable mode, indexed by j
  - entries: d_ij

Principles (ctd)
• (deterministic core of) generic decomposition model (Van Mechelen & Schepers, 2007):
  - reduction of object (tissue) mode by means of (binary or real-valued) I × P quantification matrix A
    examples:

    binary A, each tissue in exactly one cluster (partition):
    Tissue 1    1     0     0
    Tissue 2    1     0     0
    Tissue 3    0     0     1
    Tissue 4    0     0     1
    Tissue 5    0     1     0
    ...

    binary A, tissues may belong to several clusters (overlapping):
    Tissue 1    1     1     0
    Tissue 2    1     1     0
    Tissue 3    1     0     1
    Tissue 4    1     0     1
    Tissue 5    1     0     1
    ...

    real-valued A:
    Tissue 1    3.2   5.2   5.1
    Tissue 2    4.1  -6.7   3.4
    Tissue 3    5.8   3.9   1.9
    Tissue 4    1.0  -2.1   0.5
    Tissue 5   -2.3   8.0  -1.7
    ...

Principles (ctd)
• (deterministic core of) generic decomposition model (Van Mechelen & Schepers, 2007):
  - reduction of object (tissue) mode by means of (binary or real-valued) I × P quantification matrix A
  - reduction of variable (gene) mode by means of (binary or real-valued) J × Q quantification matrix B
  - P × Q core matrix W
  - decomposition operator f, which is such that:
    $D = f(A, B, W) + E$
    with $f(A, B, W)_{ij}$ only depending on $A_{i\cdot}$ and $B_{j\cdot}$

Principles (ctd)
$D = f(A, B, W) + E$
• special cases:
  - A and B binary, f additive operator:
    $f(A, B, W) = A W B^{t}$
    $f(A, B, W)_{ij} = \sum_{p=1}^{P} \sum_{q=1}^{Q} a_{ip} b_{jq} w_{pq}$
    (general additive two-mode clustering model)

Principles (ctd)
$f(A, B, W)_{ij} = \sum_{p=1}^{P} \sum_{q=1}^{Q} a_{ip} b_{jq} w_{pq}$
• example (objects O1 to O6, variables V1 to V7):

    data matrix and object cluster memberships A:
          V1  V2  V3  V4  V5  V6  V7      A·1  A·2
    O1     0   0   0   0   0   0   0       0    0
    O2     0   2   2   2   0   0   0       1    0
    O3     0   2   2   2   0   0   0       1    0
    O4     0   2   2   5   3   3   0       1    1
    O5     0   0   0   3   3   3   0       0    1
    O6     0   0   0   0   0   0   0       0    0

    variable cluster memberships B (shown transposed):
          V1  V2  V3  V4  V5  V6  V7
    B·1    0   1   1   1   0   0   0
    B·2    0   0   0   1   1   1   0

    core matrix W:
          B·1  B·2
    A·1    2    0
    A·2    0    3
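The additive operator can be checked numerically on the small 6 × 7 example. The sketch below is only an illustration: the values of A, B and W are those read off the example figure, while the function name `f_additive` and the use of NumPy are assumptions, not part of the slides.

```python
import numpy as np

# Object cluster memberships: rows O1..O6, columns = 2 object clusters.
A = np.array([[0, 0],
              [1, 0],
              [1, 0],
              [1, 1],
              [0, 1],
              [0, 0]])

# Variable cluster memberships: rows V1..V7, columns = 2 variable clusters.
B = np.array([[0, 0],
              [1, 0],
              [1, 0],
              [1, 1],
              [0, 1],
              [0, 1],
              [0, 0]])

# Core matrix: one value per (object cluster, variable cluster) pair.
W = np.array([[2, 0],
              [0, 3]])


def f_additive(A, B, W):
    """Additive operator: entry (i, j) is sum_p sum_q a_ip * b_jq * w_pq."""
    return A @ W @ B.T


D_hat = f_additive(A, B, W)
print(D_hat)
```

Object O4 belongs to both object clusters and variable V4 to both variable clusters, so the model value for that cell is the sum of the two core entries, 2 + 3 = 5.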

Principles (ctd)
$D = f(A, B, W) + E$
• special cases (ctd):
  - A and B real-valued, W identity matrix, f additive operator:
    $f(A, B, W) = A B^{t}$
    $f(A, B, W)_{ij} = \sum_{p=1}^{P} a_{ip} b_{jp}$
    (principal component analysis)
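One common way to obtain a least-squares fit of this bilinear special case is a truncated singular value decomposition. The sketch below assumes column-centered data, random toy values, and P = 2; none of these choices come from the slides, and other rotations or scalings of A and B give the same fitted values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy I x J data matrix (assumption: random values, purely for illustration).
D = rng.normal(size=(8, 5))
D = D - D.mean(axis=0)  # column-center, as is usual before PCA

P = 2  # number of components retained (assumption)

# Truncated SVD: D is approximated by U_P * diag(s_P) * Vt_P.
U, s, Vt = np.linalg.svd(D, full_matrices=False)
A = U[:, :P] * s[:P]  # I x P real-valued quantification matrix (scores)
B = Vt[:P].T          # J x P real-valued quantification matrix (loadings)

D_hat = A @ B.T       # f(A, B, W) with W the identity matrix
E = D - D_hat         # residuals

print("residual sum of squares:", (E ** 2).sum())
```

The residual sum of squares equals the sum of the squared discarded singular values, which is the smallest value attainable with a rank-P approximation.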

Principles (ctd)
$D = f(A, B, W) + E$
• special cases (ctd):
  - A and B real-valued, W identity matrix, f Euclidean distance-based operator:
    $f(A, B, W)_{ij} = \left[ \sum_{p=1}^{P} (a_{ip} - b_{jp})^{2} \right]^{1/2}$
    (multidimensional unfolding)
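The distance-based operator places objects and variables as points in a common P-dimensional space and takes their Euclidean distances as model values. A minimal sketch follows, with hypothetical coordinates chosen so that the distances are easy to verify by hand; the function name `f_unfolding` is an assumption.

```python
import numpy as np


def f_unfolding(A, B):
    """Entry (i, j) is the Euclidean distance between row i of A and row j of B."""
    diff = A[:, None, :] - B[None, :, :]  # shape I x J x P
    return np.sqrt((diff ** 2).sum(axis=2))


# Hypothetical coordinates in P = 2 dimensions.
A = np.array([[0.0, 0.0],
              [3.0, 4.0]])   # 2 objects
B = np.array([[0.0, 0.0],
              [3.0, 0.0],
              [0.0, 4.0]])   # 3 variables

D_hat = f_unfolding(A, B)
print(D_hat)
```

Each entry depends only on one row of A and one row of B, which is the property the generic model requires of its decomposition operator.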

Principles (ctd)
$D = f(A, B, W) + E$
• multiple nested reductions:
  - decomposition of core matrix W:
    $W = f^{*}(A^{*}, B^{*}, W^{*})$
    and therefore:
    $D = f(A, B, f^{*}(A^{*}, B^{*}, W^{*})) + E$
    with $A^{*}$ denoting a P × P* quantification matrix, $B^{*}$ a Q × Q* quantification matrix, $f^{*}$ a decomposition operator, and with $f^{*}(A^{*}, B^{*}, W^{*})_{pq}$ only depending on $A^{*}_{p\cdot}$ and $B^{*}_{q\cdot}$

Principles (ctd)
$D = f(A, B, f^{*}(A^{*}, B^{*}, W^{*})) + E$
• remarks:
  - each of the quantification matrices (A, A*, B, B*) can be an identity matrix (no reduction), a binary matrix (categorical, cluster-based reduction), or a real-valued matrix (continuous, dimension reduction)
  - model is to be estimated as a whole, making use of one overall objective or loss function (unlike in ‘tandem’ approaches)

Example 1: Existing model
$D = f(A, B, f^{*}(A^{*}, B^{*}, W^{*})) + E$
• two-mode unfolding clustering:
  - A and B binary partition matrices, f additive operator (i.e., outer model = two-mode partitioning)
  - A* and B* real-valued matrices, W* identity matrix, f* Euclidean distance-based operator (i.e., inner model = multidimensional unfolding)

$d_{ij} = \sum_{p=1}^{P} \sum_{q=1}^{Q} a_{ip} b_{jq} \left[ \sum_{p^{*}=1}^{P^{*}} (a^{*}_{pp^{*}} - b^{*}_{qp^{*}})^{2} \right]^{1/2} + e_{ij}$
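The nesting of the two operators can be sketched in a few lines: the inner unfolding model produces the P × Q core matrix, and the outer partitioning model expands it to the I × J data level. Everything concrete below is an assumption for illustration: the cluster labels, the coordinates, the function names, and in particular the least-squares form of the overall loss, which the slides do not specify.

```python
import numpy as np


def f_unfolding(A_star, B_star):
    """Inner model: Euclidean distances between cluster points (P x Q)."""
    diff = A_star[:, None, :] - B_star[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def f_additive(A, B, W):
    """Outer model: additive two-mode clustering, A W B^t (I x J)."""
    return A @ W @ B.T


def model(A, B, A_star, B_star):
    """Nested reduction: the core matrix W is itself decomposed."""
    W = f_unfolding(A_star, B_star)
    return f_additive(A, B, W)


def loss(D, A, B, A_star, B_star):
    """One overall loss over all parameters (assumption: least squares)."""
    return ((D - model(A, B, A_star, B_star)) ** 2).sum()


# Hypothetical partitions: I = 5 objects in P = 2 clusters,
# J = 4 variables in Q = 3 clusters.
labels_obj = np.array([0, 0, 1, 1, 1])
labels_var = np.array([0, 1, 1, 2])
A = np.eye(2)[labels_obj]   # I x P binary partition matrix
B = np.eye(3)[labels_var]   # J x Q binary partition matrix

# Hypothetical coordinates of the clusters in P* = 2 dimensions.
A_star = np.array([[0.0, 0.0], [3.0, 4.0]])              # P x P*
B_star = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])  # Q x P*

D_hat = model(A, B, A_star, B_star)
print(D_hat)
print("loss at the generating parameters:", loss(D_hat, A, B, A_star, B_star))
```

Because the loss is a function of A, B, A* and B* jointly, an estimation procedure built on it would update the partitions and the coordinates against the same criterion, in contrast to a tandem approach that first clusters and then unfolds the cluster centroids.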

Example 1: Existing model (ctd)
$d_{ij} = \sum_{p=1}^{P} \sum_{q=1}^{Q} a_{ip} b_{jq} \left[ \sum_{p^{*}=1}^{P^{*}} (a^{*}_{pp^{*}} - b^{*}_{qp^{*}})^{2} \right]^{1/2} + e_{ij}$
• two-mode unfolding clustering (ctd):
  - originally proposed (in deterministic form) by Van Mechelen & Schepers (2007)
  - stochastic variant (making use of a double mixture approach) proposed by Vera, Macías & Heiser (2009) under the name dual latent class unfolding
  - special case: A or B identity matrix (outer categorical reduction of one mode only): latent class unfolding as proposed by De Soete & Heiser (1993)

Example 1: Existing model (ctd)
• application (Vera et al.): respondent by statement data on internet use
