Linear Models are Most Favorable among Generalized Linear Models


1. Linear Models are Most Favorable among Generalized Linear Models
Kuan-Yun Lee and Thomas A. Courtade
Electrical Engineering and Computer Sciences, University of California, Berkeley
ISIT 2020
K.-Y. Lee and T. A. Courtade (Berkeley) · New Minimax Bound for the GLM · ISIT 2020 · 1 / 20

2. Overview
   1 Introduction and Main Results
   2 Key Points of Proof

3. Overview
   1 Introduction and Main Results
   2 Key Points of Proof

4. Introduction
Given X := (X₁, …, Xₙ) ∼ f(·; θ):
   - Linear regression: X = Mθ + Z
   - Phase retrieval: Xᵢ = ⟨mᵢ, θ⟩² + Zᵢ
   - Group testing: Xᵢ = δ(⟨mᵢ, θ⟩)
   - Matrix retrieval: Xᵢ = Tr(Mᵢ⊤ θ) when θ is a matrix
   - …and many other settings with sparsity, structural assumptions on M, etc.

5. Introduction
Given X := (X₁, …, Xₙ) ∼ f(·; θ):
   - Linear regression: X = Mθ + Z
   - Phase retrieval: Xᵢ = ⟨mᵢ, θ⟩² + Zᵢ
   - Group testing: Xᵢ = δ(⟨mᵢ, θ⟩)
   - Matrix retrieval: Xᵢ = Tr(Mᵢ⊤ θ) when θ is a matrix
   - …and many other settings with sparsity, structural assumptions on M, etc.
Key Question: How well can we estimate θ from observations X ∼ f(·; θ)?

6. Introduction
Consider the classical linear model X = Mθ + Z under the constraint θ ∈ Θ.
Fundamental Question: Given a loss function L(·, ·), what is inf_{θ̂} sup_{θ∈Θ} E L(θ, θ̂)?

7. Introduction
Consider the classical linear model X = Mθ + Z under the constraint θ ∈ Θ.
Fundamental Question: Given a loss function L(·, ·), what is inf_{θ̂} sup_{θ∈Θ} E L(θ, θ̂)?
Loss functions L(θ, θ̂):
   - ‖θ − θ̂‖₂² (estimation error)
   - ‖Mθ − Mθ̂‖₂² (prediction error)
   - 𝟙(supp(θ) = supp(θ̂))
Constraints on Θ:
   - Θ is an Lₚ ball
   - Θ is a matrix space with rank constraints

8. Introduction
In this talk, we focus on the estimation error L(θ, θ̂) := ‖θ − θ̂‖₂².

9. Introduction
In this talk, we focus on the estimation error L(θ, θ̂) := ‖θ − θ̂‖₂².
Consider X = Mθ + Z ∈ ℝⁿ with fixed design matrix M ∈ ℝ^{n×d}, Θ = ℝᵈ, and Z ∼ N(0, σ²·Iₙ). Suppose M has full column rank. Then the maximum-likelihood estimator
   θ̂_MLE := (M⊤M)⁻¹ M⊤X
achieves the minimax error inf_{θ̂} sup_{θ∈ℝᵈ} E‖θ − θ̂‖₂², and
   E‖θ − θ̂_MLE‖₂² = E‖(M⊤M)⁻¹ M⊤Z‖₂² = σ²·Tr((M⊤M)⁻¹).

10. Introduction
In this talk, we focus on the estimation error L(θ, θ̂) := ‖θ − θ̂‖₂².
Consider X = Mθ + Z ∈ ℝⁿ with fixed design matrix M ∈ ℝ^{n×d}, Θ = ℝᵈ, and Z ∼ N(0, σ²·Iₙ). Suppose M has full column rank. Then the maximum-likelihood estimator
   θ̂_MLE := (M⊤M)⁻¹ M⊤X
achieves the minimax error inf_{θ̂} sup_{θ∈ℝᵈ} E‖θ − θ̂‖₂², and
   E‖θ − θ̂_MLE‖₂² = E‖(M⊤M)⁻¹ M⊤Z‖₂² = σ²·Tr((M⊤M)⁻¹).
Follow-up question: Can we generalize this?
   - The Gaussian distribution falls into the exponential family.
   - The linear model falls into the family of generalized linear models.
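The closed-form risk on this slide is easy to verify by simulation. The sketch below (my own illustration, not from the talk; n, d, σ and the random design are arbitrary choices) estimates E‖θ − θ̂_MLE‖₂² by Monte Carlo and compares it to σ²·Tr((M⊤M)⁻¹):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma = 50, 3, 0.5
M = rng.standard_normal((n, d))          # fixed design; full column rank w.h.p.
theta = rng.standard_normal(d)           # any fixed parameter
pinv = np.linalg.inv(M.T @ M) @ M.T      # (M^T M)^{-1} M^T

# Monte Carlo estimate of E||theta - theta_hat||_2^2
trials = 20000
err = 0.0
for _ in range(trials):
    X = M @ theta + sigma * rng.standard_normal(n)
    theta_hat = pinv @ X                 # least-squares / MLE estimator
    err += np.sum((theta - theta_hat) ** 2)
err /= trials

theory = sigma**2 * np.trace(np.linalg.inv(M.T @ M))
print(err, theory)                       # the two values should agree closely
```

Because the risk does not depend on θ here, any fixed θ gives the same answer, which is why the estimator is minimax over Θ = ℝᵈ.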

11. Exponential Family
Density of X ∈ ℝ given natural parameter η ∈ ℝ:
   f(x; η) = h(x) exp( (ηx − Φ(η)) / s(σ) )
   - h : 𝒳 ⊆ ℝ → [0, ∞) (the base measure)
   - Φ : ℝ → ℝ (the cumulant function)
   - s(σ) > 0: scale parameter
Examples:
   - Bernoulli(1/(1 + e^{−η})): h(x) = 1, Φ(t) = log(1 + eᵗ), and s(σ) = 1
   - Gaussian(η, 1): h(x) = (1/√(2π)) e^{−x²/2}, Φ(t) = t²/2, and s(σ) = 1
   - Exponential(η): h(x) = 1, Φ(t) = −log t, and s(σ) = 1
   - Poisson(e^η): h(x) = 1/x!, Φ(t) = eᵗ, and s(σ) = 1
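A standard fact behind these examples (used again on slide 14) is that the mean and variance of X are Φ′(η) and s(σ)·Φ″(η). A small numeric check for the Bernoulli member via finite differences (my own illustration; the step size h_ is an arbitrary choice):

```python
import numpy as np

# Bernoulli member: h(x) = 1, Phi(t) = log(1 + e^t), s(sigma) = 1, support {0, 1}
eta = 0.7
p = 1.0 / (1.0 + np.exp(-eta))          # success probability implied by eta

def Phi(t):
    return np.log1p(np.exp(t))          # cumulant function log(1 + e^t)

# Approximate Phi'(eta) and Phi''(eta) by central finite differences
h_ = 1e-4
dPhi = (Phi(eta + h_) - Phi(eta - h_)) / (2 * h_)
d2Phi = (Phi(eta + h_) - 2 * Phi(eta) + Phi(eta - h_)) / h_**2

print(dPhi, p)              # mean of Bernoulli(p) is p = Phi'(eta)
print(d2Phi, p * (1 - p))   # variance is p(1 - p) = Phi''(eta)
```

The same check works for the Gaussian and Poisson members with their Φ from the list above.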

12. Generalized Linear Models
Density of X ∈ ℝⁿ given parameter Mθ ∈ ℝⁿ:
   f(x; M, θ) = ∏_{i=1}^{n} h(xᵢ) exp( (⟨mᵢ, θ⟩ xᵢ − Φ(⟨mᵢ, θ⟩)) / s(σ) )
   - h : 𝒳 ⊆ ℝ → [0, ∞) (the base measure)
   - Φ : ℝ → ℝ (the cumulant function)
   - s(σ) > 0: scale parameter
Examples:
   - Bernoulli(1/(1 + e^{−⟨mᵢ, θ⟩})): h(x) = 1, Φ(t) = log(1 + eᵗ), and s(σ) = 1
   - Gaussian(⟨mᵢ, θ⟩, 1): h(x) = (1/√(2π)) e^{−x²/2}, Φ(t) = t²/2, and s(σ) = 1
   - Exponential(⟨mᵢ, θ⟩): h(x) = 1, Φ(t) = −log t, and s(σ) = 1
   - Poisson(e^{⟨mᵢ, θ⟩}): h(x) = 1/x!, Φ(t) = eᵗ, and s(σ) = 1

13. Generalized Linear Models
Density of X ∈ ℝⁿ given parameter Mθ ∈ ℝⁿ:
   f(x; M, θ) = ∏_{i=1}^{n} h(xᵢ) exp( (⟨mᵢ, θ⟩ xᵢ − Φ(⟨mᵢ, θ⟩)) / s(σ) )
We make one common assumption: Φ″ ≤ L.

14. Generalized Linear Models
Density of X ∈ ℝⁿ given parameter Mθ ∈ ℝⁿ:
   f(x; M, θ) = ∏_{i=1}^{n} h(xᵢ) exp( (⟨mᵢ, θ⟩ xᵢ − Φ(⟨mᵢ, θ⟩)) / s(σ) )
We make one common assumption: Φ″ ≤ L.
The variance of Xᵢ is s(σ)·Φ″(⟨mᵢ, θ⟩).

15. Generalized Linear Models
Density of X ∈ ℝⁿ given parameter Mθ ∈ ℝⁿ:
   f(x; M, θ) = ∏_{i=1}^{n} h(xᵢ) exp( (⟨mᵢ, θ⟩ xᵢ − Φ(⟨mᵢ, θ⟩)) / s(σ) )
We make one common assumption: Φ″ ≤ L.
The variance of Xᵢ is s(σ)·Φ″(⟨mᵢ, θ⟩).
   - In the Gaussian case, Φ″(t) = 1
   - In the Bernoulli case, Φ″(t) = eᵗ/(1 + eᵗ)² ≤ 1
   - In the Poisson case, Φ″(t) = eᵗ
   - In the Exponential case, Φ″(t) = 1/t²
In the last two cases Φ″ is unbounded, so the assumption Φ″ ≤ L corresponds to structural assumptions on M and Θ.
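The variance identity Var(Xᵢ) = s(σ)·Φ″(⟨mᵢ, θ⟩) can be sanity-checked by simulation. A sketch for the Poisson member, with an arbitrary choice of mᵢ and θ (my own illustration, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(1)
m_i = np.array([0.3, -0.2, 0.5])         # one row of the design matrix M
theta = np.array([1.0, 0.5, 0.2])        # parameter vector
t = float(m_i @ theta)                   # natural parameter <m_i, theta>

# Poisson member: Phi(t) = e^t and s(sigma) = 1, so Var(X_i) = Phi''(t) = e^t
samples = rng.poisson(np.exp(t), size=200_000)
print(samples.var(), np.exp(t))          # empirical variance vs. s(sigma)*Phi''(t)
```

Replacing the sampler and Φ″ with the Bernoulli or Gaussian versions from the list above checks those cases the same way.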

16. Generalized Linear Models
Theorem. Given observations X ∈ ℝⁿ generated from the GLM with fixed M ∈ ℝ^{n×d},
   f(x; M, θ) = ∏_{i=1}^{n} h(xᵢ) exp( (⟨mᵢ, θ⟩ xᵢ − Φ(⟨mᵢ, θ⟩)) / s(σ) ),
with Θ := B₂ᵈ(1) := {θ ∈ ℝᵈ : ‖θ‖₂ ≤ 1} and Φ″ ≤ L,
   inf_{θ̂} sup_{θ∈Θ} E‖θ − θ̂‖₂² ≳ min{ 1, (s(σ)/L)·Tr((M⊤M)⁻¹) }.

17. Generalized Linear Models
Theorem. Given observations X ∈ ℝⁿ generated from the GLM with fixed M ∈ ℝ^{n×d},
   f(x; M, θ) = ∏_{i=1}^{n} h(xᵢ) exp( (⟨mᵢ, θ⟩ xᵢ − Φ(⟨mᵢ, θ⟩)) / s(σ) ),
with Θ := B₂ᵈ(1) := {θ ∈ ℝᵈ : ‖θ‖₂ ≤ 1} and Φ″ ≤ L,
   inf_{θ̂} sup_{θ∈Θ} E‖θ − θ̂‖₂² ≳ min{ 1, (s(σ)/L)·Tr((M⊤M)⁻¹) }.
When M⊤M has a zero eigenvalue, we adopt the convention Tr((M⊤M)⁻¹) := +∞.

18. Generalized Linear Models
Theorem. Given observations X ∈ ℝⁿ generated from the GLM with fixed M ∈ ℝ^{n×d},
   f(x; M, θ) = ∏_{i=1}^{n} h(xᵢ) exp( (⟨mᵢ, θ⟩ xᵢ − Φ(⟨mᵢ, θ⟩)) / s(σ) ),
with Θ := B₂ᵈ(1) := {θ ∈ ℝᵈ : ‖θ‖₂ ≤ 1} and Φ″ ≤ L,
   inf_{θ̂} sup_{θ∈Θ} E‖θ − θ̂‖₂² ≳ min{ 1, (s(σ)/L)·Tr((M⊤M)⁻¹) }.
When M⊤M has a zero eigenvalue, we adopt the convention Tr((M⊤M)⁻¹) := +∞.
The Gaussian linear model with X = LMθ + Z matches this bound with equality, and is therefore extremal in this family of GLMs.
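To make the extremality claim concrete: in the Gaussian member with Φ(t) = Lt²/2 (so Φ″ = L), each Xᵢ ∼ N(L⟨mᵢ, θ⟩, s(σ)·L), and the least-squares estimator θ̂ = (1/L)(M⊤M)⁻¹M⊤X has risk (s(σ)/L)·Tr((M⊤M)⁻¹), the quantity in the lower bound. A Monte Carlo sketch under these assumptions (my own illustration; the constants n, d, L, s are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, L, s = 40, 3, 2.0, 0.25            # s plays the role of s(sigma)
M = rng.standard_normal((n, d))
theta = np.array([0.3, -0.2, 0.1])       # a point inside the unit ball B_2^d(1)

# Gaussian GLM member: Phi(t) = L t^2 / 2, so X_i ~ N(L<m_i, theta>, s*L)
pinv = np.linalg.inv(M.T @ M) @ M.T
trials = 20000
err = 0.0
for _ in range(trials):
    X = L * (M @ theta) + np.sqrt(s * L) * rng.standard_normal(n)
    theta_hat = pinv @ X / L             # MLE / least squares for this model
    err += np.sum((theta - theta_hat) ** 2)
err /= trials

bound = (s / L) * np.trace(np.linalg.inv(M.T @ M))
print(err, bound)                        # risk matches (s/L)*Tr((M^T M)^{-1})
```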

19. Overview
   1 Introduction and Main Results
   2 Key Points of Proof

20. Upper Bound on Mutual Information
Consider X ∼ f(·; θ). The Fisher information I_X(θ) is defined as
   I_X(θ) := E_X[ |∇_θ f(X; θ)|² / f²(X; θ) ].

21. Upper Bound on Mutual Information
Consider X ∼ f(·; θ). The Fisher information I_X(θ) is defined as
   I_X(θ) := E_X[ |∇_θ f(X; θ)|² / f²(X; θ) ].
Regularity assumption: ∫_𝒳 ∇_θ f(x; θ) dλ(x) = 0 for almost every θ, and θ ↦ f(x; θ) is (weakly) differentiable for λ-a.e. x.
Theorem (Aras, Lee, Pananjady, Courtade, 2019). Let θ ∼ π, where π is log-concave on ℝᵈ, and let X ∼ f(·; θ). If the regularity condition is satisfied, then
   I(θ; X) ≤ d · φ( Tr(Cov(θ)) · E_θ[I_X(θ)] / d² ),
where φ(x) := √x if 0 ≤ x < 1, and φ(x) := 1 + (1/2)·log x if x ≥ 1.
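In the Gaussian location model θ ∼ N(0, τ²·I_d), X = θ + Z with Z ∼ N(0, σ²·I_d), both sides of the theorem are explicit: I_X(θ) = d/σ², Tr(Cov(θ)) = d·τ², and I(θ; X) = (d/2)·log(1 + τ²/σ²), so the bound reduces to (d/2)·log(1 + x) ≤ d·φ(x) with x = τ²/σ². A quick numeric check (my own illustration; d and the ratios are arbitrary):

```python
import numpy as np

def phi(x):
    # phi from the theorem: sqrt(x) for 0 <= x < 1, 1 + log(x)/2 for x >= 1
    return np.sqrt(x) if x < 1 else 1 + 0.5 * np.log(x)

d = 5
for ratio in [0.1, 0.5, 1.0, 4.0, 100.0]:   # ratio = tau^2 / sigma^2
    exact_mi = 0.5 * d * np.log(1 + ratio)  # exact I(theta; X), in nats
    bound = d * phi(ratio)                  # d * phi(Tr(Cov)*E[I_X] / d^2)
    print(ratio, exact_mi, bound)
    assert exact_mi <= bound                # the upper bound must hold
```

The bound is loose at small ratios but tracks the logarithmic growth of the exact mutual information for large ones.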
