

SLIDE 1

23 November 2013

Learning Data Representations: Hierarchies and Invariance

Joachim M. Buhmann Computer Science Department, ETH Zurich

SLIDE 2

23 Nov 2013 Joachim M. Buhmann MIT Workshop 2

Value chain of IT: Personalized Medicine

my Data → my Information → my Knowledge → my Value → happy (alive) patients

Activation of the mTOR Signaling Pathway in Renal Clear Cell Carcinoma. Robb et al., J Urology 177:346 (2007)

SLIDE 3


Learning features and representations

§ What are representations good for?
§ Task-specific data reduction
§ Decision making
§ Efficient computation

§ Unfavorable properties of representations
§ Strongly statistically dependent features:

$$D_{\mathrm{KL}}\!\left( p(x_1, \dots, x_n) \,\Big\|\, \prod_i p(x_i) \right)$$

The joint $p(x_1, \dots, x_n)$ is difficult to estimate and hard to compute; the product of marginals $\prod_i p(x_i)$ is easy to estimate and simple to compute.
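The KL term on this slide — the divergence between a joint distribution and the product of its marginals, also known as the multi-information or total correlation — can be computed exactly for small discrete distributions. A minimal sketch, assuming the joint pmf is given as an n-dimensional NumPy array; the two example distributions are illustrative, not from the talk:

```python
import numpy as np

def total_correlation(p_joint):
    """KL divergence between a joint pmf (n-dim array) and the product
    of its marginals; zero iff the variables are independent."""
    p_joint = np.asarray(p_joint, dtype=float)
    n = p_joint.ndim
    # Build the product of marginals by summing out all other axes.
    prod = np.ones_like(p_joint)
    for axis in range(n):
        marg = p_joint.sum(axis=tuple(a for a in range(n) if a != axis))
        shape = [1] * n
        shape[axis] = -1
        prod = prod * marg.reshape(shape)
    mask = p_joint > 0  # convention: 0 * log(0/q) = 0
    return float(np.sum(p_joint[mask] * np.log(p_joint[mask] / prod[mask])))

# Two perfectly correlated fair bits: D_KL = log 2 (maximally dependent).
p_corr = np.array([[0.5, 0.0], [0.0, 0.5]])
# Two independent fair bits: D_KL = 0.
p_indep = np.full((2, 2), 0.25)
```

For the correlated pair the joint needs the full table while the marginal product does not, which is exactly the estimation gap the slide points at.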

SLIDE 4


Design principles for representations

§ Decoupling (statistical & computational)

§ Find epistemic atoms (symbols), e.g., grandmother cells.

Example: a chain of boolean variables $x_i \in \{0, 1\}$. Consider the Fourier coefficients

$$\xi_k = \sum_{i=1}^{n} (2 x_i - 1)\, \exp(\mathrm{i}\, 2\pi k i / n)$$
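The ξ_k construction above amounts to a discrete Fourier transform of the spin variables 2x_i − 1. The sketch below is a hedged reading of the slide's formula — the placement of the index i in the exponent and the lack of normalization are my assumptions, since the extraction garbled the original:

```python
import numpy as np

def boolean_fourier(x):
    """Map booleans {0,1} to spins {-1,+1} and return their (unnormalized)
    discrete Fourier coefficients xi_k = sum_i (2*x_i - 1) e^{i 2 pi k i / n}."""
    x = np.asarray(x)
    spins = 2 * x - 1                     # {0,1} -> {-1,+1}
    n = len(spins)
    k = np.arange(n)
    phases = np.exp(1j * 2 * np.pi * np.outer(k, np.arange(n)) / n)
    return phases @ spins

x = np.array([0, 1, 1, 0, 1, 0, 1, 1])
xi = boolean_fourier(x)
# xi[0] is simply the sum of the spins (the "magnetization" of the chain).
```

For a stationary chain the coefficients xi_k are approximately uncorrelated across k, which is the decoupling the slide is after.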

SLIDE 5


Design principles for representations (cont.)

§ Conditional decoupling

§ Infer tree structures § Modular structures

§ Latent variable discovery

K-means: sum of average cluster distortions = sum of average pairwise distances
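The K-means identity above — the distortion of a cluster around its centroid equals the average pairwise squared distance within the cluster, divided by two — can be verified numerically. A hedged sketch with random data (not from the talk):

```python
import numpy as np

# Identity: for a cluster C with centroid mu,
#   sum_{x in C} ||x - mu||^2  ==  (1 / (2|C|)) * sum_{x,y in C} ||x - y||^2
# so the k-means objective can be written purely in pairwise distances,
# without ever referencing a centroid.

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))          # one illustrative "cluster"

mu = X.mean(axis=0)
distortion = np.sum((X - mu) ** 2)    # centroid form

diffs = X[:, None, :] - X[None, :, :]  # all ordered pairs (x, y)
pairwise_form = np.sum(diffs ** 2) / (2 * len(X))
```

The pairwise form is what lets distortion-based clustering generalize to data where only a dissimilarity matrix (and no vector embedding) is available.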

SLIDE 6


Challenge for learning representations

§ Learning representations explores the space of structures
§ Combinatorial search in spaces with $\dim_{\mathrm{VC}} = \infty$
§ Data-adaptive coarsening is required, i.e., in the asymptotic limit we derive a distribution over structures and not a single best one. Current learning theory is insufficient to handle this constraint! ⇒ Information / rate-distortion theory

SLIDE 7


Goal: Theory for learning algorithms

§ Modeling in pattern recognition requires
§ quantization: given the data, identify a set of good hypotheses,
§ learning: find an algorithm A that specifies an informative set!

[Figure: items 1–12 grouped three different ways by algorithm A]

SLIDE 8


Low-Energy Computing

§ Novel low-power architectures operate near transistor threshold voltage (NTV)
§ e.g., Intel Claremont: 1.5 mW @ 10 MHz (x86)
§ NTV promises 10× more energy efficiency at 10× more parallelism!
§ ~10⁵ times more soft errors (bits flip stochastically)
§ Hard to correct in hardware → expose to programmer?
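Exposing soft errors to the programmer can be prototyped in software by fault injection: flip each bit of a word independently with some probability p. The sketch below is illustrative — the flip rate and the 32-bit word width are my assumptions, not Intel Claremont specifics:

```python
import random

def inject_soft_errors(value, p, bits=32, rng=None):
    """Flip each of `bits` bits of `value` independently with probability p,
    modeling stochastic soft errors in near-threshold-voltage hardware."""
    rng = rng or random.Random()
    for b in range(bits):
        if rng.random() < p:
            value ^= 1 << b
    return value

rng = random.Random(42)
faulty = inject_soft_errors(0, p=0.1, bits=32, rng=rng)
```

Running an algorithm against such injected faults is one way to test whether it degrades gracefully instead of relying on hardware-level correction.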

source: Intel

© Torsten Hoefler