Conjugate prior summary
Lecture 7 slides, S. Cheng (OU-Tulsa), October 3, 2017
1. Conjugate prior summary

| Distribution | Likelihood $p(x \mid \theta)$ | Prior $p(\theta)$ | Conjugate prior |
| --- | --- | --- | --- |
| Bernoulli | $(1-\theta)^{1-x}\,\theta^{x}$ | $\propto (1-\theta)^{a-1}\,\theta^{b-1}$ | Beta |
| Binomial | $\propto (1-\theta)^{N-x}\,\theta^{x}$ | $\propto (1-\theta)^{a-1}\,\theta^{b-1}$ | Beta |
| Multinomial | $\propto \theta_1^{x_1}\theta_2^{x_2}\theta_3^{x_3}$ | $\propto \theta_1^{\alpha_1-1}\theta_2^{\alpha_2-1}\theta_3^{\alpha_3-1}$ | Dirichlet |
| Normal (fixed $\sigma^2$) | $\propto \exp\!\left(\frac{-(x-\theta)^2}{2\sigma^2}\right)$ | $\propto \exp\!\left(\frac{-(\theta-\mu_0)^2}{2\sigma_0^2}\right)$ | Normal |
| Normal (fixed $\mu$) | $\propto \sqrt{\theta}\,\exp\!\left(\frac{-\theta(x-\mu)^2}{2}\right)$ | $\propto \theta^{a-1}\exp(-b\theta)$ | Gamma |
| Poisson | $\propto \theta^x \exp(-\theta)$ | $\propto \theta^{a-1}\exp(-b\theta)$ | Gamma |
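As a concrete illustration of the first row, here is a minimal Python sketch of the conjugate Beta-Bernoulli update. The function name, the flat Beta(1, 1) prior, and the simulated data are our choices for illustration, not from the slides; note the table writes the Beta prior as $(1-\theta)^{a-1}\theta^{b-1}$, so $a$ accumulates failures and $b$ accumulates successes.

```python
import numpy as np

# Conjugate Beta-Bernoulli update, following the table's convention
# p(theta) ∝ (1 - theta)^(a-1) * theta^(b-1):
# `a` counts failures (0s) and `b` counts successes (1s).
def beta_bernoulli_update(a, b, data):
    """Return the posterior Beta parameters after observing 0/1 data."""
    data = np.asarray(data)
    successes = int(data.sum())
    failures = int(data.size - successes)
    return a + failures, b + successes

rng = np.random.default_rng(0)
data = rng.binomial(1, 0.7, size=50)                # 50 Bernoulli(0.7) draws
a_post, b_post = beta_bernoulli_update(1, 1, data)  # flat Beta(1, 1) prior
print("posterior Beta(a=%d, b=%d)" % (a_post, b_post))
print("posterior mean of theta:", b_post / (a_post + b_post))
```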

2. Lecture 7: Constrained optimization. An example

Simple economy: $m$ prosumers, $n$ different goods.¹ Each individual $i$ has a production vector $p_i \in \mathbb{R}^n$ and a consumption vector $c_i \in \mathbb{R}^n$; the expense of producing $p$ for agent $i$ is $e_i(p)$, and the utility (happiness) of consuming $c$ units is $u_i(c)$. Maximize total happiness:

$$\max_{\{p_i, c_i\}} \sum_i \big(u_i(c_i) - e_i(p_i)\big) \quad \text{s.t.} \quad \sum_i c_i = \sum_i p_i$$

¹Example borrowed from the first lecture of Prof. Gordon's CMU CS 10-725.

3. Lecture 7: Constrained optimization. Walrasian equilibrium

$$\max_{\{p_i, c_i\}} \sum_i \big(u_i(c_i) - e_i(p_i)\big) \quad \text{s.t.} \quad \sum_i c_i = \sum_i p_i$$

Idea: introduce a price $\lambda_j$ for each good $j$ and let the market decide. When the price $\lambda_j$ rises, consumption of good $j$ falls and its production rises; when $\lambda_j$ falls, consumption rises and production falls. We can therefore adjust prices until consumption equals production for every good.

4. Lecture 7: Constrained optimization. Algorithm: tâtonnement

Assuming the appropriate prices have been found, we can ignore the equality constraint, and the problem becomes

$$\max_{\{p_i, c_i\}} \sum_i \big(u_i(c_i) - e_i(p_i)\big) \;\Rightarrow\; \sum_i \max_{p_i, c_i} \big(u_i(c_i) - e_i(p_i)\big),$$

so we can simply optimize the production and consumption of each individual independently. A runnable sketch of the price-update loop follows this slide.

Algorithm 1: tâtonnement
  1: procedure FindBestPrices
  2:   λ ← [0, 0, ..., 0]
  3:   for k = 1, 2, ... do
  4:     each individual solves for its c_i and p_i given λ
  5:     λ ← λ + δ_k Σ_i (c_i − p_i)
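Below is a minimal Python sketch of this loop under assumed quadratic agents: utility $u_i(c) = a_i c - c^2/2$ and expense $e_i(p) = b_i p^2/2$ for a single good, with diminishing step $\delta_k = \delta/k$. The functional forms, parameters, and function name are ours, chosen so each agent's subproblem has a closed form; they are not from the slides.

```python
import numpy as np

def tatonnement(a, b, steps=2000, delta=0.2):
    """Find a market-clearing price for one good among m prosumers."""
    lam = 0.0                                # price, initialized to 0
    for k in range(1, steps + 1):
        # Given lam, agent i independently maximizes
        # u_i(c_i) - e_i(p_i) - lam * (c_i - p_i); with the assumed
        # quadratic u_i and e_i, both pieces have closed forms.
        c = np.maximum(a - lam, 0.0)         # consumption: u_i'(c_i) = lam
        p = lam / b                          # production:  e_i'(p_i) = lam
        excess = np.sum(c - p)               # aggregate excess demand
        lam += (delta / k) * excess          # raise price if demand > supply
    return lam, c, p

a = np.array([2.0, 3.0, 4.0])  # marginal utility of agent i at c = 0
b = np.array([1.0, 2.0, 4.0])  # marginal-cost slopes
lam, c, p = tatonnement(a, b)
print(f"price {lam:.4f}: demand {c.sum():.4f} vs supply {p.sum():.4f}")
```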

5. Lecture 7: Constrained optimization. Lagrange multiplier

Problem:

$$\max_x f(x) \quad \text{s.t.} \quad g(x) = 0$$

Consider $L(x, \lambda) = f(x) - \lambda g(x)$ and let $\tilde{f}(x) = \min_\lambda L(x, \lambda)$. Note that

$$\tilde{f}(x) = \begin{cases} f(x) & \text{if } g(x) = 0, \\ -\infty & \text{otherwise,} \end{cases}$$

since whenever $g(x) \neq 0$ we can send $L$ to $-\infty$ by taking $\lambda \to \pm\infty$. Therefore the problem is identical to $\max_x \tilde{f}(x)$, or

$$\max_x \min_\lambda \big(f(x) - \lambda g(x)\big),$$

where $\lambda$ is known as the Lagrange multiplier.

6. Lecture 7: Constrained optimization. Lagrange multiplier (con't)

Assume the optimum is a saddle point, so that

$$\max_x \min_\lambda \big(f(x) - \lambda g(x)\big) = \min_\lambda \max_x \big(f(x) - \lambda g(x)\big).$$

The inner maximization on the R.H.S. is unconstrained, so at the optimum

$$\nabla f(x) = \lambda \nabla g(x).$$
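To make the stationarity condition concrete, here is a small SymPy sketch on a hypothetical problem of our choosing (not from the slides): maximize $f(x, y) = x + y$ on the unit circle $g(x, y) = x^2 + y^2 - 1 = 0$.

```python
import sympy as sp

# Solve grad f = lambda * grad g together with the constraint g = 0.
x, y, lam = sp.symbols("x y lam", real=True)
f = x + y
g = x**2 + y**2 - 1

stationarity = [sp.Eq(sp.diff(f, v), lam * sp.diff(g, v)) for v in (x, y)]
solutions = sp.solve(stationarity + [sp.Eq(g, 0)], [x, y, lam], dict=True)
for s in solutions:
    print(s, "f =", f.subs(s))
# Of the two stationary points, x = y = 1/sqrt(2) attains the
# maximum f = sqrt(2); the other is the minimum.
```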

7. Lecture 7: Constrained optimization. Inequality constraint

Problem:

$$\max_x f(x) \quad \text{s.t.} \quad g(x) \leq 0$$

Consider $\tilde{f}(x) = \min_{\lambda \geq 0} \big(f(x) - \lambda g(x)\big)$, and note that

$$\tilde{f}(x) = \begin{cases} f(x) & \text{if } g(x) \leq 0, \\ -\infty & \text{otherwise.} \end{cases}$$

Therefore, we can rewrite the problem as

$$\max_x \min_{\lambda \geq 0} \big(f(x) - \lambda g(x)\big).$$

8. Lecture 7: Constrained optimization. Inequality constraint (con't)

Assume

$$\max_x \min_{\lambda \geq 0} \big(f(x) - \lambda g(x)\big) = \min_{\lambda \geq 0} \max_x \big(f(x) - \lambda g(x)\big).$$

The R.H.S. implies $\nabla f(x) = \lambda \nabla g(x)$. Moreover, at the optimum point $(x^*, \lambda^*)$ we should have the so-called "complementary slackness" condition

$$\lambda^* g(x^*) = 0,$$

since

$$\max_{x:\, g(x) \leq 0} f(x) \equiv \max_x \min_{\lambda \geq 0} \big(f(x) - \lambda g(x)\big),$$

and equality of the two sides at $x^*$ forces the term $\lambda^* g(x^*)$ to vanish.
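A small worked example of complementary slackness (ours, not from the slides): maximize $f(x) = -(x-2)^2$ subject to $g(x) = x - 1 \leq 0$. If the constraint were inactive ($\lambda^* = 0$), stationarity would give the unconstrained maximizer $x = 2$, but $g(2) = 1 > 0$ is infeasible. So the constraint must be active: $g(x^*) = 0$ gives $x^* = 1$, and $\nabla f(x^*) = \lambda^* \nabla g(x^*)$ gives $-2(1-2) = \lambda^*$, i.e. $\lambda^* = 2 \geq 0$. In each case one of $\lambda^*$ and $g(x^*)$ is zero, exactly as $\lambda^* g(x^*) = 0$ requires.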

9. Lecture 7: Constrained optimization. Karush-Kuhn-Tucker conditions

Problem:

$$\max_x f(x) \quad \text{s.t.} \quad g(x) \leq 0, \; h(x) = 0$$

Conditions:

$$\nabla f(x^*) - \mu^* \nabla g(x^*) - \lambda^* \nabla h(x^*) = 0$$
$$g(x^*) \leq 0, \quad h(x^*) = 0$$
$$\mu^* \geq 0, \quad \mu^* g(x^*) = 0$$
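As a sanity check, the sketch below solves a hypothetical problem of our choosing with SciPy's SLSQP solver and verifies the KKT conditions numerically at the solution; the problem, starting point, and variable names are illustrative assumptions, not from the slides.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical example: maximize f(x, y) = x*y
#   s.t.  h(x, y) = x + y - 2 = 0   (equality)
#         g(x, y) = x - 1.5 <= 0    (inequality)
f = lambda v: v[0] * v[1]
grad_f = lambda v: np.array([v[1], v[0]])
g = lambda v: v[0] - 1.5
h = lambda v: v[0] + v[1] - 2.0
grad_g = np.array([1.0, 0.0])
grad_h = np.array([1.0, 1.0])

# SciPy minimizes, and its "ineq" convention is fun(v) >= 0,
# so we minimize -f and pass -g as the inequality.
res = minimize(lambda v: -f(v), x0=[0.0, 2.0], method="SLSQP",
               constraints=[{"type": "eq", "fun": h},
                            {"type": "ineq", "fun": lambda v: -g(v)}])
x_star = res.x
print("x* =", x_star)  # expect roughly (1, 1)

# g(x*) = -0.5 < 0 (inactive), so complementary slackness forces mu* = 0,
# and stationarity grad f = mu* grad g + lambda* grad h gives lambda* = y*.
mu_star = 0.0
lam_star = grad_f(x_star)[0] / grad_h[0]
residual = grad_f(x_star) - mu_star * grad_g - lam_star * grad_h
print("stationarity residual:", residual)            # expect ~ (0, 0)
print("g <= 0:", g(x_star) <= 1e-8, "| h = 0:", abs(h(x_star)) < 1e-8)
print("mu* >= 0, mu*g(x*) = 0:", mu_star >= 0 and mu_star * g(x_star) == 0)
```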

10. Lecture 7: Overview of source coding

The objective of "source coding" is to compress some source. We can think of compression as "coding": we replace each input by a corresponding coded sequence, so encoding is just a mapping. Without loss of generality, we can use the binary domain for the coded sequences, so each input message is converted to a sequence of 1s and 0s. Consider encoding (compressing) a sequence $x_1, x_2, \ldots$ one symbol at a time, resulting in $c(x_1), c(x_2), \ldots$. Denoting the lengths of the codewords $c(x_1), c(x_2), \ldots$ as $l(x_1), l(x_2), \ldots$, one of the major goals is to make the expected length $E[l(X)]$ as small as possible. However, we also want to make sure that we can decode the message losslessly!
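Here is a tiny Python sketch of computing $E[l(X)]$ for a hypothetical 4-symbol source and code of our own choosing (not from the slides), compared against the source entropy:

```python
import math

# Expected codeword length E[l(X)] for an assumed source and code.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = {"a": "0", "b": "10", "c": "110", "d": "111"}  # prefix-free code

expected_length = sum(p * len(code[s]) for s, p in probs.items())
entropy = -sum(p * math.log2(p) for p in probs.values())
print("E[l(X)] =", expected_length, "bits")  # 1.75
print("H(X)    =", entropy, "bits")          # 1.75: this code is optimal
```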

11. Lecture 7: Overview of source coding. Uniquely decodable code

To ensure that we can recover the message without loss, we must make sure that no two messages share the same codeword. We say a code is "singular" (broken) if $c(x_1) = c(x_2)$ for some $x_1 \neq x_2$.
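A minimal check for singularity, using an assumed dict representation of the code (our convention, not from the slides):

```python
# A code is singular if two different symbols map to the same codeword.
def is_singular(code):
    """code: dict mapping source symbols to codeword strings."""
    return len(set(code.values())) < len(code)

print(is_singular({"a": "0", "b": "10", "c": "11"}))  # False: all distinct
print(is_singular({"a": "0", "b": "10", "c": "0"}))   # True: c(a) == c(c)
```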
