Practical problems in multi-way analysis

Practical problems in multi-way analysis
• Constraints
• Missing data
• Jackknifing and split-half analysis
Some examples:
• SLICING – recovering exponentials
• Fluorescence EEM data
• Chromatographic data
Rasmus Bro

Constraints in PARAFAC rb@life.ku.dk

Why use constraints
• Sometimes the parameters of the model can have a very direct physical meaning (spectra, concentrations, elution profiles)
• It is natural to require these parameters to "make sense"
• This may be accomplished by fitting a model under additional constraints
[Figure: panels "Data" and "Loadings PCA", both plotted against wavelength [nm], 800 to 1600]

Why constraints?
• Obtain sensible parameters
  Ex.: Require chromatographic profiles to have but one peak
• Obtain unique solution
  Ex.: Use selective channels in data to obtain uniqueness
• Avoid degeneracy and numerical problems
  Ex.: Enabling a PARAFAC model of data otherwise inappropriate for the model
• Speed up algorithms
  Ex.: Use truncated bases to re-express the problem as a smaller problem

Typical constraints
Spectroscopy, Chromatography, FIA, Kinetics, Auto- & cross-correlation, Uncertainty
Nonnegativity, Unimodality, Selectivity, Smoothness, Known spectra, …

PARAFAC - algorithm
• Algorithm: alternating least squares (ALS)
Ex.: Bilinear model, minimize $\|X - AB^T\|$
1. $B^T = (A^TA)^{-1}A^TX = A^+X$
2. $A^T = (B^TB)^{-1}B^TX^T = B^+X^T$
3. Go to 1 until convergence (small change in fit $\|X - AB^T\|$)
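The two alternating steps of the bilinear example can be sketched in NumPy as follows. This is a minimal illustration rather than the presenter's code: the name `bilinear_als`, the random start and the stopping tolerance are assumptions, and `np.linalg.pinv` stands in for the pseudoinverses $A^+$ and $B^+$.

```python
import numpy as np

def bilinear_als(X, F, n_iter=200, tol=1e-12, seed=0):
    """Fit X (I x J) ~ A @ B.T with F components by alternating least squares."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((X.shape[0], F))    # assumed random initialization
    prev = np.sum(X ** 2)
    for _ in range(n_iter):
        Bt = np.linalg.pinv(A) @ X              # step 1: B^T = A^+ X
        A = (np.linalg.pinv(Bt.T) @ X.T).T      # step 2: A^T = B^+ X^T
        sse = np.sum((X - A @ Bt) ** 2)         # squared fit ||X - A B^T||^2
        if abs(prev - sse) <= tol * max(prev, 1e-300):
            break                               # step 3: small change in fit
        prev = sse
    return A, Bt.T
```

Each half-step is an ordinary least squares problem, which is why constraints can later be introduced by swapping in a constrained solver for one of the two updates.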

PARAFAC - algorithm
1. Initialize $B$ and $C$
2. $A = \left(\sum_{k=1}^{K} X_k B D_k\right)\left(B'B * C'C\right)^{-1}$
3. $B = \left(\sum_{k=1}^{K} X_k' A D_k\right)\left(A'A * C'C\right)^{-1}$
4. $D_k = \mathrm{diag}\left(\left(B'B * A'A\right)^{-1}\,\mathrm{diag}(A'X_kB)\right)$, $k = 1, \ldots, K$
5. Go to step 2 until relative change in fit is small
($X_k$ is the $k$th slab of the data, $D_k$ is the diagonal matrix holding row $k$ of $C$, and $*$ is the element-wise product)
Why ALS?
• Simple
• Extends to N-way
• Handles missing data
• Handles ML fitting
• Constraints:
  • Nonnegativity
  • Unimodality
  • Orthogonality
  • Linear constraints
  • Fixed parameters
  • Smoothness
  • Functional
  • etc.
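The three-way ALS steps can be sketched in NumPy along the following lines. This is an illustrative sketch under assumptions: the data are held in an array of shape (I, J, K), the name `parafac_als`, the random start and the tolerance are hypothetical choices, and a pseudoinverse replaces the explicit inverse for numerical safety. The sums over slabs $X_k$ are written with `np.einsum`.

```python
import numpy as np

def parafac_als(X, F, n_iter=2000, tol=1e-12, seed=0):
    """Fit an F-component PARAFAC model to X of shape (I, J, K) by ALS."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    B = rng.standard_normal((J, F))             # step 1: initialize B and C
    C = rng.standard_normal((K, F))
    prev = np.sum(X ** 2)
    for _ in range(n_iter):
        # step 2: A = (sum_k X_k B D_k) (B'B * C'C)^-1
        A = np.einsum('ijk,jf,kf->if', X, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        # step 3: B = (sum_k X_k' A D_k) (A'A * C'C)^-1
        B = np.einsum('ijk,if,kf->jf', X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        # step 4: row k of C holds the diagonal of D_k
        C = np.einsum('ijk,if,jf->kf', X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
        sse = np.sum((X - np.einsum('if,jf,kf->ijk', A, B, C)) ** 2)
        if abs(prev - sse) <= tol * prev:       # step 5: relative change in fit
            break
        prev = sse
    return A, B, C
```

Because every update is a plain least squares step, any of the three can be replaced by a constrained version without touching the other two.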

FIA example
• Samples of 2-, 3-, 4-HBA (hydroxybenzaldehyde)
• UV-VIS FIA system with pH gradient imposed
• Spectrum is the sum of an acidic and a basic spectrum. Same for time profiles. Only sums are measured
• Model not important here. PARALIND: $X_k = AHD_kB'$ (Morteza Bahram will talk about this)
[Figure: "Model of a single sample with one analyte"; absorbance landscape over time and wavelength, with absorbance vs. time and absorbance vs. wavelength/nm profiles]

Effect of constraints
• Eq: Equality of summed profiles
• NNLS: Non-negativity of all parameters
• ULSR: Unimodality of FIAgrams/time profiles
• Fix: Fixing purely acidic/basic times to only reflect acidic/basic analytes

Concentrations
Constraints         2HBA     3HBA     4HBA
Eq                  0.9769   0.9837   0.9979
NNLS                0.9988   0.9787   0.9996
NNLS/Eq             0.9992   0.9987   0.9996
NNLS/ULSR/Eq        0.9992   0.9987   0.9996
NNLS/ULSR/Fix/Eq    0.9990   0.9987   0.9996

[Figure: "Non-negativity & equality constrained"; 4HBA (a.u.) vs. time, 0 to 100]

Effect of constraints
• Eq: Equality of summed profiles
• NNLS: Non-negativity of all parameters
• ULSR: Unimodality of FIAgrams/time profiles
• Fix: Fixing purely acidic/basic times to only reflect acidic/basic analytes

Spectra
Constraints         2HBA      2HBA      3HBA      3HBA      4HBA      4HBA
                    acidic    basic     acidic    basic     acidic    basic
Eq                  0.9893*   0.9871*   0.9689*   0.7647*   0.9106*   0.9211*
NNLS                0.9944    0.9117*   0.9952    0.9241    0.9974    0.9977
NNLS/Eq             0.9946    0.9312*   0.9953    0.9988    0.9965    0.9971
NNLS/ULSR/Eq        0.9946    0.9590*   0.9953    0.9989    0.9966    0.9943
NNLS/ULSR/Fix/Eq    0.9946    0.9989    0.9954    0.9986    0.9961    0.9977
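As an illustration of how a constraint such as NNLS enters an ALS algorithm, the sketch below replaces each unconstrained least squares update of a bilinear model with a non-negative one. It is a simplified stand-in, not the PARALIND model used for the FIA data: the function names are hypothetical, and `scipy.optimize.nnls` is assumed to be an acceptable non-negative least squares solver.

```python
import numpy as np
from scipy.optimize import nnls

def nonneg_ls(A, X):
    """Return Bt >= 0 minimizing ||X - A @ Bt||, solved one column of X at a time."""
    return np.column_stack([nnls(A, X[:, j])[0] for j in range(X.shape[1])])

def nonneg_bilinear_als(X, F, n_iter=200, seed=0):
    """ALS for X ~ A @ B.T with all elements of A and B constrained to be >= 0."""
    rng = np.random.default_rng(seed)
    A = rng.random((X.shape[0], F))         # assumed non-negative random start
    for _ in range(n_iter):
        Bt = nonneg_ls(A, X)                # update B under non-negativity
        A = nonneg_ls(Bt.T, X.T).T          # update A under non-negativity
    return A, Bt.T
```

Other constraints (unimodality, fixed parameters, equality) would follow the same pattern, with a different constrained regression in place of `nonneg_ls`.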

Data analysis requires good data – g.i.g.o.
Example: A fluorescence excitation-emission matrix contains chemical information that PARAFAC can handle and physical scattering signals that do not fit PARAFAC

Knowing your data
Chemical part

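Knowing your data
MILES – maximum likelihood
• Downweight areas of less importance
• Done by extending the least squares fit to weighted and off-diagonal-weighted least squares, i.e. minimizing $e^TW^TWe$
[Figure: weight matrix with diagonal elements $w_1^2, \ldots, w_J^2$ and off-diagonal elements such as $w_{14}^2$ and $w_{4J}^2$]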

MILES – maximum likelihood
• Algorithm MILES (Maximum likelihood via Iterative Least squares EStimation), based on majorization
• Enables weighted least squares and maximum likelihood fitting of any model which has a least squares algorithm
Given vectorized data $x$ and weights $W$:
1. Initialize model, $m_0$, with LS; set $c := 0$
2. $q = m_c + \frac{1}{\beta} W^TW(x - m_c)$, with $\beta$ the largest eigenvalue of $W^TW$
3. $m_{c+1} = \arg\min_m \|q - m\|_F^2$
4. $c := c+1$; go to step 2 until convergence
In short: calculate $q$, then fit the LS model to $q$ instead of to the data
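The MILES iteration can be sketched generically as below. Several details are assumptions made for illustration: the step size uses the largest eigenvalue of $W^TW$ (called `beta` here), the least squares model is supplied as a hypothetical callable `fit_ls` that maps a data vector to its LS model estimate, and `rank1_ls` is a toy LS model (best rank-one approximation of the reshaped vector) chosen only to make the example self-contained.

```python
import numpy as np

def miles(x, W, fit_ls, n_iter=500, tol=1e-12):
    """Minimize (x - m)' W'W (x - m) over models m using only an LS fitting routine."""
    WtW = W.T @ W
    beta = np.linalg.eigvalsh(WtW).max()        # assumed majorization constant
    loss = lambda m: float((x - m) @ WtW @ (x - m))
    m = fit_ls(x)                               # step 1: ordinary LS start
    prev = loss(m)
    for _ in range(n_iter):
        q = m + WtW @ (x - m) / beta            # step 2: transformed data q
        m = fit_ls(q)                           # step 3: LS fit to q, not to x
        cur = loss(m)
        if prev - cur <= tol * max(prev, 1e-300):
            break                               # step 4: converged
        prev = cur
    return m

def rank1_ls(shape):
    """Toy LS model: best rank-one approximation of the vector reshaped to `shape`."""
    def fit(v):
        U, s, Vt = np.linalg.svd(v.reshape(shape), full_matrices=False)
        return (s[0] * np.outer(U[:, 0], Vt[0])).ravel()
    return fit
```

A call such as `miles(X.ravel(), np.diag(w), rank1_ls(X.shape))` would then give a weighted rank-one fit; a PARAFAC routine could be passed as `fit_ls` in the same way.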

21 samples containing L-phenylalanine, L-3,4-dihydroxyphenylalanine (DOPA), 1,4-dihydroxybenzene & L-tryptophan
• Three types of unwanted variation
  • Measurement error (~ iid Gaussian)
  • Rayleigh and Raman scatter
  • Non-chemical area
Baunsgaard D, Factors affecting 3-way modelling (PARAFAC) of fluorescence landscapes, The Royal Veterinary & Agricultural University, 1999

PARAFAC results
[Figure: panels "RAW DATA" and "Least squares PARAFAC"; an artifact is marked]

PARAFAC results
[Figure: panels "RAW DATA", "Least squares PARAFAC", "MILES interpretation of data" and "MILES PARAFAC"; an artifact is marked]

Bootstrapping a bit
Emission spectra from 100 resamplings
[Figure: two panels of loadings vs. emission /nm, 260 to 420]
R. Bro, N. D. Sidiropoulos, and A. K. Smilde. Maximum likelihood fitting using simple least squares algorithms. J. Chemometrics, 2002

Missing data

Missing data
No missing
Ex.: standard PCA loss function
$\|X - TP'\|^2 = \sum_{i=1}^{I}\sum_{j=1}^{J}\left(x_{ij} - \sum_{f=1}^{F} t_{if}p_{jf}\right)^2$
I.e., a summation of squared errors over all elements of $X$
If missing
Only fit the model to the data that exist, i.e., fit to the loss function
$\sum_{i=1}^{I}\sum_{j=1}^{J}\left(w_{ij}\left(x_{ij} - \sum_{f=1}^{F} t_{if}p_{jf}\right)\right)^2$
where $w_{ij}$ is zero if $x_{ij}$ is missing and one otherwise
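The loss with missing elements is straightforward to evaluate. The sketch below assumes, purely for illustration, that missing elements are coded as NaN and that the function is called `masked_pca_loss`; the weights $w_{ij}$ are then simply the non-NaN indicator.

```python
import numpy as np

def masked_pca_loss(X, T, P):
    """Sum of squared residuals of X - T P' over the observed elements only."""
    W = ~np.isnan(X)                            # w_ij = 1 if observed, 0 if missing
    R = np.where(W, X - T @ P.T, 0.0)           # residuals; missing cells contribute 0
    return float(np.sum(R ** 2))
```

With no NaN entries the function reduces to the ordinary PCA loss.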

Missing data
How can that loss function be optimized?
Method 1: use weighted least squares regression
Method 2: use imputation
1. Put numbers in missing elements
2. Fit model to these 'wrong' data (Ex.: $M = TP'$ in PCA)
3. Replace missing elements with model guess (Ex.: $x_{ij} = M_{ij}$ in PCA)
4. Go to step 2 until convergence
Both methods give the same result. Method 2 is easy to implement; Method 1 is sometimes faster, but more memory-demanding
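Method 2 can be sketched for PCA as follows. The choices here are assumptions for the sake of a runnable example: missing elements are coded as NaN, the initial fill is the mean of the observed values, the PCA model is an uncentered truncated SVD, and the name `pca_impute` and the tolerance are hypothetical.

```python
import numpy as np

def pca_impute(X, F, n_iter=5000, tol=1e-16):
    """Fit an F-component PCA model to X with NaN-coded missing data by imputation."""
    mask = np.isnan(X)
    Xf = np.where(mask, np.nanmean(X), X)       # step 1: put numbers in missing cells
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Xf, full_matrices=False)
        M = (U[:, :F] * s[:F]) @ Vt[:F]         # step 2: model M = T P' of the filled data
        change = np.sum((Xf[mask] - M[mask]) ** 2)
        Xf[mask] = M[mask]                      # step 3: replace missing with model guess
        if change <= tol * np.sum(Xf ** 2):
            break                               # step 4: stop when imputations settle
    return M, Xf
```

Observed elements are never altered; only the imputed cells change between iterations, so at convergence they contribute zero residual and the fit corresponds to the loss over observed data only.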

Missing data
Ex.: Fluorescence data
15% missing data
Three PCA components should be sufficient
Ad hoc approaches such as NIPALS do not work (too many significant components)

Jackknifing and split-half analysis

Jackknifing and split-half analysis
What is the problem?
• Detect outliers
• Uncertainty measures are difficult to define
• And assumptions are hardly ever met anyway
Jack-knifing* is a solution
• Based on resampling (cross-validation)
• Works regardless of model structure
• No distributional assumptions except that resampled objects are independent
*J. W. Tukey, Annals of Mathematical Statistics 1958, 29, 614.
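A generic leave-one-out jackknife can be sketched as below. The function name `jackknife` and the `estimator` callable are hypothetical, and the standard error uses the usual jackknife formula $\sqrt{\frac{n-1}{n}\sum_i(\hat\theta_{(i)} - \bar\theta)^2}$; for a PARAFAC model the estimator would return loadings, which would first have to be matched for order and scaling across resamplings.

```python
import numpy as np

def jackknife(data, estimator):
    """Leave-one-out estimates and jackknife standard error of `estimator`."""
    data = np.asarray(data)
    n = data.shape[0]
    # Re-estimate with each object (row) left out in turn
    reps = np.array([estimator(np.delete(data, i, axis=0)) for i in range(n)])
    center = reps.mean(axis=0)
    se = np.sqrt((n - 1) / n * np.sum((reps - center) ** 2, axis=0))
    return reps, se
```

No distribution is assumed anywhere; the only requirement is that the left-out objects are independent.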

Jack-knifed PARAFAC

Preliminary model
[Figure: "Emission spectral profiles" vs. wavelength, 250 to 500, in four panels; curves labelled 1st, 2nd, 3rd and 5th sample]

Preliminary model
Detection of outliers: loadings
RIP plot (resampled influence plot)
$\sum_{f=1}^{F}\sum_{j=1}^{J}\left(b_{\text{overall},jf} - b_{-m,jf}\right)^2$
[Figure: RIP plots of the squared b-loadings difference vs. sum of squared residuals, points labelled by sample number 1 to 27]
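The loading-difference axis of a RIP plot can be computed as sketched below. The sketch assumes that the jackknife loadings have already been matched to the overall model for component order, sign and scaling, and the name `rip_loading_difference` and the array layout are hypothetical.

```python
import numpy as np

def rip_loading_difference(B_overall, B_jack):
    """Squared loading difference per left-out sample.

    B_overall: (J, F) loadings of the overall model.
    B_jack:    (M, J, F) loadings with sample m left out, already matched
               to B_overall in component order and scaling (assumption).
    """
    diff = B_jack - B_overall[None, :, :]
    return np.sum(diff ** 2, axis=(1, 2))       # one value per left-out sample m
```

Plotting these values against each sample's sum of squared residuals would give one point per sample; samples far from the bulk in either direction are candidate outliers.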
