From Yates Algorithm to Multidimensional Smoothing with Plan of - PowerPoint PPT Presentation

From Yates Algorithm to Multidimensional Smoothing with Plan of talk GLAM • Classical regression • Factorial designs and Yates algorithm Iain Currie • Yates algorithm as an array computation Heriot Watt University • Generalised linear models • Array regression with • Examples Themes James Kirkby, Maria Durban & Paul Eilers • Structure of data and models • Algorithms University of Edinburgh December, 2006 1 2 Classical regression Classical regression Structure Universal recipe • Data: a data vector y , n × 1 Fit <- lm(y ∼ X - 1) • Model: a model matrix X , n × c a parameter vector θ , c × 1 µ = E ( y ) = Xθ A model algebra • Error distribution: normal Fit <- lm(y ∼ x + x 2 ) Fit <- lm(y ∼ A + B + A : B) Algorithm Fit <- lm(y ∼ x + x 2 + A ∗ B) • Normal equations X ′ X ˆ θ = X ′ y 3 4

Factorial designs Analysis in R Example: 2 3 design, John & Quenoullie, p64 Fit <- lm(y ∼ N ∗ P ∗ K) X <- model.matrix(y ∼ N ∗ P ∗ K) Treatment Yield Effect Estimate*8 1 N P NP K NK PK NPK 1 94.1 1 867.6 1 -1 -1 1 -1 1 1 -1 n 107.5 N 68.8 1 1 -1 -1 -1 -1 1 1 1 -1 1 -1 -1 1 -1 1 p 97.0 P 24.8 1 1 1 1 -1 -1 -1 -1 np 113.5 NP -10.0 1 -1 -1 1 1 -1 -1 1 k 96.9 K 43.4 1 1 -1 -1 1 1 -1 -1 nk 122.9 NK 9.0 1 -1 1 -1 1 -1 1 -1 pk 111.4 PK 7.0 1 1 1 1 1 1 1 1 npk 124.3 NPK -16.2 X ′ X = 8 I 8 Model ⇒ ˆ 1 = 8 X ′ y θ y ijk = µ + α i + β j + γ k + ( αβ ) ij + ( αγ ) ik + ( βγ ) jk + ( αβγ ) ijk i, j, k = 1 , 2 5 6 Yates Algorithm Kronecker products 1 N P NP K NK PK NPK Yield (1) (2) (3) 1 94.1 201.6 412.1 867.6 1 1 -1 -1 1 -1 1 1 -1 n 107.5 210.5 455.5 68.8 N 1 1 -1 -1 -1 -1 1 1 p 1 -1 1 -1 -1 1 -1 1 97.0 219.8 29.9 24.8 P np 113.5 235.7 38.9 -10.0 NP 1 1 1 1 -1 -1 -1 -1 k 96.9 13.4 8.9 43.4 K 1 -1 -1 1 1 -1 -1 1 nk 122.9 16.5 15.9 9.0 NK 1 1 -1 -1 1 1 -1 -1 pk 111.4 26.0 3.1 7.0 PK 1 -1 1 -1 1 -1 1 -1 npk 124.3 12.9 -13.1 -16.2 NPK 1 1 1 1 1 1 1 1 7 8

Kronecker products Restatement of problem Evaluate X ′ y where    X 1 − X 1 X =  = X Z ⊗ Z ⊗ Z X 1 X 1   and  1 − 1  ⊗ X 1 ⇒ X =   1 1  1 − 1  . = Z 1 1    1 − 1  ⊗ X 2 X 1 = where Model structure: Kronecker product 1 1 Data structure: ??        1 − 1  1 − 1  1 − 1  ⊗  ⊗ = ⇒ X  1 1 1 1 1 1 9 10 Restatement of problem Evaluate X ′ y where pk npk 111.4 124.3 Data structure: Y , a 3- d array p np 113.5 97.0 Model structure: X = Z ⊗ Z ⊗ Z Evaluate X ′ y where Data structure: Y , n 1 × n 2 nk k 122.9 Model structure: X = X 2 ⊗ X 1 , X i , n i × c i 96.9 94.1 n 107.5 n 107.5 ( X 2 ⊗ X 1 ) ′ y ≡ X ′ 1 1 Y X 2 11 12

Efficient computation The rotated H -transform ( X 2 ⊗ X 1 ) ′ y ( X ′ 2 ( X ′ 1 Y ) ′ ) ′ ( X 2 ⊗ X 1 ) ′ y ≡ X ′ 1 Y X 2 = ( X ′ 2 ( X ′ 1 Y ) ′ ) ′ ≡ ( X 3 ⊗ X 2 ⊗ X 1 ) ′ y ( X ′ 3 ( X ′ 2 ( X ′ 1 Y ) ′ ) ′ ) ′ ≡ ??? Example: X 1 , X 2 , 10 3 × 10 2 ( X 3 ⊗ X 2 ⊗ X 1 ) ′ y ≡ ρ ( X 3 , ρ ( X 2 , ρ ( X 1 , Y ))) • No need to compute X • LHS: 10 10 multiplications Definition: X , n 1 × c 1 matrix; A , c 1 × c 2 × c 3 array. RHS: 10 8 multiplications ρ ( X , A ) → XA c 1 × c 2 c 3 = A n 1 × c 2 c 3 → A n 1 × c 2 × c 3 → A c 2 × c 3 × n 1 • Reformat RHS Example: 2 3 factorial design. Evaluate X ′ y , X = Z ⊗ Z ⊗ Z , Z 2 × 2 , Y 2 × 2 × 2 . ρ ( Z ′ , ρ ( Z ′ , ρ ( Z ′ , Y ))) = ˆ Θ 2 × 2 × 2 13 14 Yates Algorithm Yield (1) (2) (3) PK NPK 1 94.1 201.6 412.1 867.6 1 7.0 −16.2 n 107.5 210.5 455.5 68.8 N p 97.0 219.8 29.9 24.8 P P 24.8 NP −10 np 113.5 235.7 38.9 -10.0 NP k 96.9 13.4 8.9 43.4 K nk 122.9 16.5 15.9 9.0 NK pk 111.4 26.0 3.1 7.0 PK npk 124.3 12.9 -13.1 -16.2 NPK NK K 9.0 43.4 1 867.6 68.8 N 15 16

Computation Generalised linear models Structure X ′ ˜ X ′ ˜ X ˜ θ W δ ˜ z W δ X • Data: a data vector y In 1- d , let X = ( x 1 , . . . , x c ) , n × c . • Model: a model matrix X = X 3 ⊗ X 2 ⊗ X 1 n a parameter vector θ � ( X ′ W δ X ) jk = w i x ij x ik = ( x 1 j x 1 k , . . . , x nj x nk ) w = ( x j ∗ x k ) ′ w a link function i =1 µ = E ( y ) , g ( µ ) = Xθ Definition. The row tensor of X , n × c , is • Error distribution: normal, Poisson, binomial, . . . Algorithm G ( X ) = ( X ⊗ 1 ′ ) ∗ ( 1 ′ ⊗ X ) , n × c 2 • Scoring algorithm where 1 is vector of 1’s of length c . X ′ ˜ θ = X ′ ˜ W δ X ˆ W δ ˜ z X ′ W δ X , c × c ≡ G ( X ) ′ w , c 2 × 1 − 1 ( y − ˜ z = X ˜ θ + ˜ where ˜ W δ µ ) . 17 18 Computation of X ′ W δ X Standard errors of X ˆ θ X i , n i × c i , i = 1 , 2 , 3 . We need X = X 3 ⊗ X 2 ⊗ X 1 , n 1 n 2 n 3 × c 1 c 2 c 3 diag X ( X ′ W δ X ) − 1 X ′ = diag XS m X ′ W δ is diagonal, n 1 n 2 n 3 × n 1 n 2 n 3 where S m , c 1 c 2 c 3 × c 1 c 2 c 3 . W is the corresponding array, n 1 × n 2 × n 3 Let S , c 2 1 × c 2 2 × c 2 3 , be the array form of S m . X ′ W δ X , c 1 c 2 c 3 × c 1 c 2 c 3 diag XS m X ′ , n 1 n 2 n 3 × 1 ρ ( G ( X 3 ) ′ , ρ ( G ( X 2 ) ′ , ρ ( G ( X 1 ) ′ , W ))) , c 2 1 × c 2 2 × c 2 ≡ 3 ρ ( G ( X 3 ) , ρ ( G ( X 2 ) , ρ ( G ( X 1 ) , S ))) , n 1 × n 2 × n 3 . ≡ 19 20

Computation: a summary Generalised linear array models Structure Linear functions: X ′ y , c 1 c 2 c 3 × 1 • Data: a data array Y , n 1 × n 2 × n 3 ρ ( X ′ 3 , ρ ( X ′ 2 , ρ ( X ′ 1 , Y ))) , c 1 × c 2 × c 3 • Model: a model matrix X = X 3 ⊗ X 2 ⊗ X 1 , X i , n i × c i a parameter array Θ , c 1 × c 2 × c 3 Xθ , n 1 n 2 n 3 × 1 a link function ρ ( X 3 , ρ ( X 2 , ρ ( X 1 , Θ ))) , n 1 × n 2 × n 3 M = E ( Y ) , g ( M ) = ρ ( X 3 , ρ ( X 2 , ρ ( X 1 , Θ ))) . • Error distribution: normal, Poisson, binomial, . . . Inner products : X ′ W δ X , c 1 c 2 c 3 × c 1 c 2 c 3 Algorithm ρ ( G ( X 3 ) ′ , ρ ( G ( X 2 ) ′ , ρ ( G ( X 1 ) ′ , W ))) , c 2 1 × c 2 2 × c 2 3 • Efficient scoring algorithm Array computation of Diagonal function : diag XS m X ′ , n 1 n 2 n 3 × 1 X ′ ˜ θ = X ′ ˜ W δ X ˆ W δ ˜ z ρ ( G ( X 3 ) , ρ ( G ( X 2 ) , ρ ( G ( X 1 ) , S ))) , n 1 × n 2 × n 3 . − 1 ( y − ˜ z = X ˜ θ + ˜ where ˜ W δ µ ) . 21 22 LM, GLM & GLAM Examples • Factorial designs LM GLM GLAM • Contingency tables Y = Y d Data y y • Smooth models with P -splines µ = Xθ g ( µ ) = Xθ g ( M ) = ρ ( X d , . . . , ρ ( X 1 , Θ ) . . . ) Model Error N N, P, B,... N, P, B,... Algorithm N-equns Scoring Efficient Scoring 23 24

A single cubic B−spline 1−D B−spline basis 0.7 0.6 0.6 0.5 0.5 0.4 Basis functions 0.4 B−spline 0.3 0.3 0.2 0.2 0.1 0.1 0.0 0.0 20 40 60 80 20 40 60 80 25 26 Age Age P -splines in 2- d Structure • Data: deaths matrix, D , exposure matrix, E • Model: a model matrix B = B y ⊗ B a , B a , 80 × 20 , B y , 100 × 25 a parameter array Θ , 20 × 25 a link function 0.5 log( M ) = log E + B a Θ B ′ M = E ( D ) , y . 0.1 0.2 0.3 0.4 • Error distribution: Poisson. Algorithm • Efficient penalised scoring algorithm 0 100 Array computation of 9 9 78 9 1 6 ( B ′ ˜ θ = B ′ ˜ 8 9 W δ B + P )ˆ 56 1 W δ ˜ z Age 3 7 9 1 33 r a 0 e 6 Y − 1 ( d − ˜ 9 1 z = B ˜ θ + ˜ where ˜ W δ µ ) and P = P ( λ a , λ y ) is the penalty matrix. 11 7 4 9 1 27 28

Swedish females: ages 20 to 90 Smooth Lee-Carter models Structure 90 −2 • Data: deaths matrix, D , exposure matrix, E 80 • Model: D ij ∼ P ( µ ij ) , log µ ij = log E ij + α i + β i κ j −4 For given ˆ κ 70 Log(mortality) log( M ) = log E + B a Θ ˆ ′ M = E ( D ) , K 60 where α = B a a , β = B a b , Θ = [ a : b ] , ˆ K = [ 1 : ˆ κ ] . 50 −6 α and ˆ For given ˆ β 40 log( M ) = log E + log( 1 ′ ⊗ ˆ α ) + ˆ M = E ( D ) , βk ′ B ′ k 30 where κ = B k k . −8 20 Algorithm: Efficient penalised scoring algorithm 1900 1920 1940 1960 1980 2000 29 30 Year Microarray data Data are arranged on a 72 × 128 grid in 8 blocks of 2 rows and 4 columns. −1 2.2 −2 2.0 −3 1.8 ′ ′ E ( Y ) = B 3 A ˘ 3 + B 0 C ˘ −4 B B Alpha Beta 0 1.6 −5 1.4 −6 1.2 −7 1.0 −8 20 40 60 80 100 20 40 60 80 100 Age Age −3.6 0.1 −3.8 log(mortality) 0.0 −4.0 Kappa −0.1 −4.2 −0.2 −4.4 Age: 65 −0.3 31 32 1950 1960 1970 1980 1990 2000 1950 1960 1970 1980 1990 2000 Year Year

From Yates Algorithm to Multidimensional Smoothing with Plan of - PowerPoint PPT Presentation

From Yates Algorithm to Multidimensional Smoothing with Plan of talk GLAM Classical regression Factorial designs and Yates algorithm Iain Currie Yates algorithm as an array computation Heriot Watt University Generalised linear

Query Languages R B R Baeza Yates and B R. Baeza Baeza-Yates and B Yates and B Riberio

EE 355 Unit 5 Multidimensional Arrays Mark Redekopp 2 MULTIDIMENSIONAL ARRAYS 3

Jack Yates High School Houston ISD Assets of Yates High School Magnet Communication School

Variants Chris Yates UCL Cancer Institute c.yates@ucl.ac.uk Outline SAVs and Disease

Power Tuning Linux: A Case Study Alexandra Yates alexandra.yates@intel.com

The Evolution of Web Content and Search Engines Ricardo Baeza Baeza- -Yates Yates Ricardo

Event-consistent smoothing in generalized Introduction Conventional CRS stack high-density

Multidimensional SAR SAR Imaging Imaging: : Multidimensional Studies in the in the Framework

High Capability Multidimensional Data High Capability Multidimensional Data Compression on GPUs

Multidimensional Scaling Applied Multivariate Statistics Spring 2012 Outline Fundamental

Chapter 9 Multidimensional Arrays and the ArrayList Class Topics Declaring and

Multidimensional Scaling Applied Multivariate Statistics Spring 2013 Outline Fundamental

Hierarchical Multidimensional Modelling Hierarchical Multidimensional Modelling in the Concept-

Multidimensional Quasi-Cyclic and Convolutional Codes Buket Ozkaya joint work with Cem G

Two new approaches to smoothing over complex regions David Lawrence Miller Mathematical Sciences

Exponential smoothing and non-negative data Muhammad Akram Rob J Hyndman J Keith Ord Business

Marilyn Soucie University of Missouri Omaha, NE October, 2014 Overview of MU Programs

A Conditional-Gradient-Based Augmented Lagrangian Framework Alp Yurtsever alp.yurtsever@epfl.ch

The App Universe After the Big Bang Mike Lee @bmf bmf@le.mu.rs In the

Delphes fast simulation studies 1 P R E S E N T E D B Y S H E R E E N A LY FA C U LT Y O F S

Fine-grained Human Evaluation of Neural versus Phrase-based Machine Translation EAMT, Praha, 31st

Matrix algebra & R as a toy DSM laboratory Distributional Semantic Models Stefan Evert 1 &

Driving factors in benthic macroinvertebrate community composition within the Animas River Alex

WebPlotViz: Browser Visualization of High Dimensional Streaming Data with HTML5 STREAM2016

From Yates Algorithm to Multidimensional Smoothing with Plan of - PowerPoint PPT Presentation

From Yates Algorithm to Multidimensional Smoothing with Plan of talk GLAM Classical regression Factorial designs and Yates algorithm Iain Currie Yates algorithm as an array computation Heriot Watt University Generalised linear

Query Languages R B R Baeza Yates and B R. Baeza Baeza-Yates and B Yates and B Riberio

EE 355 Unit 5 Multidimensional Arrays Mark Redekopp 2 MULTIDIMENSIONAL ARRAYS 3

Jack Yates High School Houston ISD Assets of Yates High School Magnet Communication School

Variants Chris Yates UCL Cancer Institute c.yates@ucl.ac.uk Outline SAVs and Disease

Power Tuning Linux: A Case Study Alexandra Yates alexandra.yates@intel.com

The Evolution of Web Content and Search Engines Ricardo Baeza Baeza- -Yates Yates Ricardo

Event-consistent smoothing in generalized Introduction Conventional CRS stack high-density

Multidimensional SAR SAR Imaging Imaging: : Multidimensional Studies in the in the Framework

High Capability Multidimensional Data High Capability Multidimensional Data Compression on GPUs

Multidimensional Scaling Applied Multivariate Statistics Spring 2012 Outline Fundamental

Chapter 9 Multidimensional Arrays and the ArrayList Class Topics Declaring and

Multidimensional Scaling Applied Multivariate Statistics Spring 2013 Outline Fundamental

Hierarchical Multidimensional Modelling Hierarchical Multidimensional Modelling in the Concept-

Multidimensional Quasi-Cyclic and Convolutional Codes Buket Ozkaya joint work with Cem G

Two new approaches to smoothing over complex regions David Lawrence Miller Mathematical Sciences

Exponential smoothing and non-negative data Muhammad Akram Rob J Hyndman J Keith Ord Business

Marilyn Soucie University of Missouri Omaha, NE October, 2014 Overview of MU Programs

A Conditional-Gradient-Based Augmented Lagrangian Framework Alp Yurtsever alp.yurtsever@epfl.ch

The App Universe After the Big Bang Mike Lee @bmf bmf@le.mu.rs In the

Delphes fast simulation studies 1 P R E S E N T E D B Y S H E R E E N A LY FA C U LT Y O F S

Fine-grained Human Evaluation of Neural versus Phrase-based Machine Translation EAMT, Praha, 31st

Matrix algebra &amp; R as a toy DSM laboratory Distributional Semantic Models Stefan Evert 1 &amp;

Driving factors in benthic macroinvertebrate community composition within the Animas River Alex

WebPlotViz: Browser Visualization of High Dimensional Streaming Data with HTML5 STREAM2016

Matrix algebra & R as a toy DSM laboratory Distributional Semantic Models Stefan Evert 1 &