

  1. Tensor Decomposition for Healthcare Analytics
     Matteo Ruffini
     Laboratory for Relational Algorithmics, Complexity and Learning (UPC)
     matteo.ruffini@estudiant.upc.edu
     November 5, 2017

  2. Overview
     1 Clustering
     2 Mixture Model Clustering
       - Tensor Decomposition
       - Mixture of independent Bernoulli
     3 Applications to Healthcare Analytics
       - Data and objectives
       - Results

  3. Overview
     Task: segment patients into groups with similar clinical profiles.
     1 Similar patients → similar care.
     2 Find recurrent comorbidities.
     3 Assign and plan resources: drugs and doctors.
     Data: Electronic Health Records (EHR).
     Objective: use these data to create clusters of patients.

  5. Example: ICD-9 EHR
     In the ICD code, each disease is associated with a number:
     278 → Obesity, 401 → Hypertension
     Records: a list of patients with their diseases → a patient-disease matrix.

     Records                        Diseases    820  401  278  560
     Patient 1: 820, 401            Patient 1    1    1    0    0
     Patient 2: 401, 278            Patient 2    0    1    1    0
     Patient 3: 560, 820, 278       Patient 3    1    0    1    1

     Objective: cluster the rows of the patient-disease matrix.
     Sparse and high-dimensional data.
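The records-to-matrix step above can be sketched in a few lines of Python (the variable names and the small example records are illustrative, not from the presentation):

```python
import numpy as np

# Each record lists the ICD-9 codes observed for one patient.
records = {
    "Patient 1": [820, 401],
    "Patient 2": [401, 278],
    "Patient 3": [560, 820, 278],
}

# Fix a column order over all codes appearing in the records.
codes = sorted({c for cs in records.values() for c in cs})

# Binary patient-disease matrix: entry (i, j) = 1 iff patient i has code j.
X = np.array([[int(c in pcodes) for c in codes]
              for pcodes in records.values()])
```

For real EHR data the number of distinct codes is large while each patient has few of them, which is exactly the sparse, high-dimensional setting the slide mentions.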

  7. Clustering
     Clustering: one of the fundamental tasks of Machine Learning.
     Objective: partition a dataset of N samples into coherent subsets.
     Dataset: a matrix X ∈ R^{N×n} with rows X^(i) = (x_1^(i), ..., x_n^(i)).
     Group together similar rows.
     Standard methods: k-means, k-medoids, single linkage, ...
     Distance-based: poor performance on high-dimensional sparse data.
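A minimal distance-based baseline of the kind listed above can be sketched in pure numpy (an illustration, not the presentation's method; the deterministic farthest-point initialization is a choice made here, plain k-means usually initializes randomly):

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain Lloyd's algorithm: assign each row to its nearest centroid,
    then recompute each centroid as the mean of its assigned rows."""
    X = X.astype(float)
    # Farthest-point initialization: start from row 0, then repeatedly
    # add the row farthest from the centroids chosen so far.
    centroids = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Distance of every row to every centroid, then nearest assignment.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Two well-separated groups of points: easy for k-means. On sparse,
# high-dimensional binary matrices such distances are far less informative.
X = np.vstack([np.zeros((5, 2)), 10 * np.ones((5, 2))])
labels, _ = kmeans(X, k=2)
```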

  8. Mixture Models
     Definition (Mixture Model)
     Y ∈ {1, ..., k}: a latent discrete variable.
     X = (x_1, ..., x_n): observable, depends on Y.
     P(X) = Σ_{i=1}^k P(Y = i) P(X | Y = i)
     The x_i are called features.
     Generative process for one sample:
     1 Draw Y, obtaining Y = i ∈ {1, ..., k}.
     2 Draw X ∈ R^n ∼ P(X | Y = i).
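The two-step generative process can be sketched numerically as follows (a sketch with made-up parameter values; independent Bernoulli conditionals are chosen here purely for illustration, anticipating the Bernoulli mixture of a later slide):

```python
import numpy as np

rng = np.random.default_rng(0)

k, n, N = 3, 5, 1000
omega = np.array([0.5, 0.3, 0.2])   # mixing weights P(Y = i)
M = rng.uniform(size=(n, k))        # column j holds E[X | Y = j]

# Step 1: draw the latent component Y for every sample.
Y = rng.choice(k, size=N, p=omega)

# Step 2: draw X from the conditional distribution P(X | Y).
# Here each feature is an independent Bernoulli(mu_{i,Y}).
X = (rng.uniform(size=(N, n)) < M[:, Y].T).astype(int)
```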

  10. Mixture Model Clustering
     Clustering: from an outcome of X (observed), infer the outcome of Y (unknown): k clusters.
     Parameters characterizing a mixture model:
     ω_h := P(Y = h),   ω := (ω_1, ..., ω_k)^⊤,   Ω := diag(ω)
     M = (μ_{i,j})_{i,j} = [μ_1 | ... | μ_k] ∈ R^{n×k},   μ_{i,j} = E(x_i | Y = j)
     If the conditional distributions and the model parameters are known:
     P(Y = j | X, M, ω) ∝ P(X | Y = j, M) ω_j
     Cluster(X) = argmax_{j=1,...,k} P(Y = j | X, M, ω)
     It is crucial to know the parameters of the model (M, ω).

  12. Mixture of Independent Bernoulli
     Observables are binary and conditionally independent: x_i ∈ {0, 1}.
     The expectations coincide with the probability of a positive outcome:
     μ_{i,j} = P(x_i = 1 | Y = j)
     P(Y = j | X) ∝ ω_j ∏_{i=1}^n μ_{i,j}^{x_i} (1 − μ_{i,j})^{1−x_i}
     Clustering rule:
     Cluster(X) = argmax_{j=1,...,k} ω_j ∏_{i=1}^n μ_{i,j}^{x_i} (1 − μ_{i,j})^{1−x_i}
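The clustering rule above can be implemented directly; working in log space avoids underflow when n is large (a sketch with made-up toy parameters):

```python
import numpy as np

def cluster(X, M, omega, eps=1e-12):
    """Assign each binary row of X to
    argmax_j omega_j * prod_i mu_ij^x_i (1 - mu_ij)^(1 - x_i),
    computed via logarithms for numerical stability.
    M has shape (n, k) with M[i, j] = P(x_i = 1 | Y = j)."""
    logpost = (X @ np.log(M + eps)
               + (1 - X) @ np.log(1 - M + eps)
               + np.log(omega + eps))          # (N, k) up to a constant
    return logpost.argmax(axis=1)

# Toy example: component 0 favours the first two features,
# component 1 the last two.
M = np.array([[0.9, 0.1],
              [0.9, 0.1],
              [0.1, 0.9],
              [0.1, 0.9]])
omega = np.array([0.5, 0.5])
X = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1]])
print(cluster(X, M, omega))  # → [0 1]
```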

  13. Mixture Model Clustering: sum up
     Advantages:
     - Robust to irrelevant features: P(x_i) = P(x_i | Y = j).
     - Algorithms with provable guarantees of optimality.
     Disadvantages:
     - Relies on a model assumption about reality.
     To sum up, two steps:
     1 Estimate the parameters of the mixture.
     2 Group similar elements together, using Bayes' theorem.

  16. Learning mixture parameters

  17. Maximum Likelihood Estimate
     Standard method: Maximum Likelihood.
     Find parameters Θ = (M, ω) maximizing the likelihood on X ∈ R^{N×n}:
     max_Θ P(X | Θ) = max_Θ ∏_{i=1}^N Σ_{j=1}^k P(X^(i) | Y = j, M) ω_j
     Maximizing this is hard: in general there are no closed-form solutions.
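For the Bernoulli mixture of the earlier slide, the objective above can at least be evaluated, even though maximizing it is hard. A sketch (the stable log-sum-exp trick over components is a standard implementation choice, not from the slides):

```python
import numpy as np

def log_likelihood(X, M, omega, eps=1e-12):
    """log prod_i sum_j P(X^(i) | Y = j, M) omega_j for a Bernoulli mixture.
    M has shape (n, k); X has binary rows of length n."""
    # Per-sample, per-component log joint probability: shape (N, k).
    logj = (X @ np.log(M + eps)
            + (1 - X) @ np.log(1 - M + eps)
            + np.log(omega + eps))
    # Stable log-sum-exp over components, then sum over samples.
    m = logj.max(axis=1, keepdims=True)
    return float((m[:, 0] + np.log(np.exp(logj - m).sum(axis=1))).sum())

# Tiny check by hand: P(X) = 0.5*0.9*0.9 + 0.5*0.1*0.1 = 0.41
M = np.array([[0.9, 0.1],
              [0.1, 0.9]])
omega = np.array([0.5, 0.5])
X = np.array([[1, 0]])
```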

  18. Expectation Maximization (EM)
     Iterative algorithm from [Dempster et al. (1977)]:
     1 Randomly initialize (M, ω).
     2 Cluster the samples.
     3 Use the clusters to recalculate (M, ω).
     4 Iterate steps 2 and 3 until convergence.
     Pros and cons:
     - Iteratively increases the likelihood.
     - No guarantee of reaching the global optimum.
     - EM is slow.
     - The quality of the results depends on the initialization:
       good starting points → good outputs.
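Steps 1-4 can be sketched for the Bernoulli mixture as follows (a sketch, not the presentation's code; the E-step uses soft responsibilities rather than the hard clustering of step 2, which is the standard EM variant):

```python
import numpy as np

def em_bernoulli(X, k, iters=100, seed=0, eps=1e-9):
    """EM for a mixture of independent Bernoullis with soft assignments."""
    rng = np.random.default_rng(seed)
    N, n = X.shape
    # Step 1: random initialization of (M, omega).
    M = rng.uniform(0.25, 0.75, size=(n, k))
    omega = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step (step 2): responsibilities r[i, j] = P(Y = j | X^(i)).
        logr = (X @ np.log(M + eps)
                + (1 - X) @ np.log(1 - M + eps)
                + np.log(omega + eps))
        logr -= logr.max(axis=1, keepdims=True)
        r = np.exp(logr)
        r /= r.sum(axis=1, keepdims=True)
        # M-step (step 3): recompute (M, omega) from the responsibilities.
        Nk = r.sum(axis=0)
        M = (X.T @ r) / (Nk + eps)
        omega = Nk / N
    return M, omega

# Toy data with two clear-cut binary profiles.
X = np.array([[1, 1, 0, 0]] * 20 + [[0, 0, 1, 1]] * 20)
M, omega = em_bernoulli(X, k=2)
```

The dependence on initialization mentioned above is visible in practice: different seeds can converge to different local optima of the likelihood.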

  20. Alternative Approach: Tensor Decomposition
     A general approach, outlined in [Anandkumar et al., 2014].
     1 Estimate (recall: M = [μ_1 | ... | μ_k], μ_i = E[X | Y = i] ∈ R^n):
       M_1 := Mω ∈ R^n
       M_2 := M diag(ω) M^⊤ ∈ R^{n×n}
       M_3 := Σ_{i=1}^k ω_i μ_i ⊗ μ_i ⊗ μ_i ∈ R^{n×n×n}
     2 Retrieve (M, ω) with a tensor decomposition algorithm A:
       A(M_1, M_2, M_3) → (M, ω)
     Step 1 depends on the specific properties of the mixture.
     Step 2 is general (it needs assumptions on M).
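The three moments of step 1 can be written down directly from (M, ω) (a numpy sketch with made-up parameters; in practice M_1, M_2, M_3 are of course estimated empirically from the data, which is the mixture-specific part):

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 3, 5
omega = np.array([0.5, 0.3, 0.2])
M = rng.uniform(size=(n, k))       # columns mu_1, ..., mu_k

M1 = M @ omega                     # vector in R^n
M2 = M @ np.diag(omega) @ M.T      # matrix in R^{n x n}
# M3 = sum_i omega_i mu_i (x) mu_i (x) mu_i : an n x n x n tensor.
M3 = np.einsum('i,ai,bi,ci->abc', omega, M, M, M)
```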

  24. Example: Mixture of Independent Gaussians
     Dataset X ∈ R^{N×n} with i.i.d. rows X^(i) = (x_1^(i), ..., x_n^(i)).
     Model settings:
     - x_h^(i) and x_l^(i) are conditionally independent ∀ h ≠ l.
     - x_h^(i) conditioned on Y is Gaussian, with known stdev σ:
       P(x_h | Y = i) ∼ N(μ_{h,i}, σ)
