
Conjugate prior summary

Distribution         Likelihood p(x|θ)                Prior p(θ)                                   Prior distribution
Bernoulli            (1−θ)^(1−x) θ^x                  ∝ (1−θ)^(a−1) θ^(b−1)                        Beta
Binomial             ∝ (1−θ)^(N−x) θ^x                ∝ (1−θ)^(a−1) θ^(b−1)                        Beta
Multinomial          ∝ θ1^x1 θ2^x2 θ3^x3              ∝ θ1^(α1−1) θ2^(α2−1) θ3^(α3−1)              Dirichlet
Normal (fixed σ²)    ∝ exp(−(x−θ)²/(2σ²))             ∝ exp(−(θ−µ0)²/(2σ0²))                       Normal
Normal (fixed µ)     ∝ √θ exp(−θ(x−µ)²/2)             ∝ θ^(a−1) exp(−bθ)                           Gamma
Poisson              ∝ θ^x exp(−θ)                    ∝ θ^(a−1) exp(−bθ)                           Gamma

(In the "fixed σ²" row, θ is the mean; in the "fixed µ" row, θ is the precision 1/σ².)

S. Cheng (OU-Tulsa), October 3, 2017
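As a minimal sketch of the Bernoulli/Beta row of the table: with prior p(θ) ∝ (1−θ)^(a−1) θ^(b−1) (the parameterization used above) and Bernoulli likelihood (1−θ)^(1−x) θ^x, the posterior has the same Beta form with a ← a + (1 − x) and b ← b + x. The function name and the sample data below are illustrative, not from the slides.

```python
# Conjugate update for the Bernoulli/Beta pair, in the table's parameterization:
# posterior ∝ (1−θ)^((a−1)+(1−x)) θ^((b−1)+x), i.e. the same family with shifted parameters.
def beta_bernoulli_update(a, b, x):
    return a + (1 - x), b + x

a, b = 1, 1                      # flat prior
for x in [1, 1, 0, 1]:           # observe three 1s and one 0
    a, b = beta_bernoulli_update(a, b, x)
print((a, b))                    # (2, 4)
```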


Lecture 7 Constraint optimization

An example

Simple economy: m prosumers, n different goods¹
Each individual i: production pi ∈ Rⁿ, consumption ci ∈ Rⁿ
Expense of producing "p" for agent i: e_i(p)
Utility (happiness) of consuming "c" units for agent i: u_i(c)
Maximize happiness:

max_{p_i, c_i} Σ_i (u_i(c_i) − e_i(p_i))   s.t.   Σ_i c_i = Σ_i p_i

¹ Example borrowed from the first lecture of Prof Gordon's CMU CS 10-725



Walrasian equilibrium

max_{p_i, c_i} Σ_i (u_i(c_i) − e_i(p_i))   s.t.   Σ_i c_i = Σ_i p_i

Idea: introduce a price λ_j for each good j and let the market decide:
Price λ_j ↑ : consumption of good j ↓, production of good j ↑
Price λ_j ↓ : consumption of good j ↑, production of good j ↓
We can adjust prices until consumption = production for each good



Algorithm: tâtonnement

Assume the appropriate prices have been found; then we can ignore the equality constraint and the problem becomes

max_{p_i, c_i} Σ_i (u_i(c_i) − e_i(p_i)) ⇒ Σ_i max_{p_i, c_i} (u_i(c_i) − e_i(p_i)),

so we can simply optimize the production and consumption of each individual independently.

Algorithm 1: tâtonnement
1: procedure FindBestPrices
2:   λ ← [0, 0, ⋯, 0]
3:   for k = 1, 2, ⋯ do
4:     each individual solves for its c_i and p_i for the given λ
5:     λ ← λ + δ_k Σ_i (c_i − p_i)
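As a toy illustration of the price-update loop, here is a minimal sketch with one good and hypothetical quadratic utilities u_i(c) = a_i c − c²/2 and expenses e_i(p) = p²/2 (these functional forms and the function name are my assumptions, not from the slides). Given λ, each agent's first-order conditions give c_i = a_i − λ and p_i = λ, and the price is nudged toward market clearing:

```python
# Toy tatonnement: raise the price while total demand exceeds total supply.
def tatonnement(a, delta=0.1, iters=2000):
    lam = 0.0
    for _ in range(iters):
        c = [ai - lam for ai in a]           # agent i: u_i'(c) = lam  =>  c = a_i - lam
        p = [lam for _ in a]                 # agent i: e_i'(p) = lam  =>  p = lam
        lam += delta * (sum(c) - sum(p))     # step 5 of the algorithm above
    return lam

# With a = [2, 4] the clearing price is sum(a) / (2m) = 6/4 = 1.5
print(tatonnement([2, 4]))
```

For this linear-quadratic setup the update is a contraction whenever δ < 1/m, so the loop converges to the market-clearing price.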

Lagrange multiplier

Problem:

max_x f(x)   s.t.   g(x) = 0

Consider L(x, λ) = f(x) − λ g(x) and let f̃(x) = min_λ L(x, λ). Note that

f̃(x) = f(x) if g(x) = 0, and −∞ otherwise.

Therefore, the problem is identical to max_x f̃(x), or

max_x min_λ (f(x) − λ g(x)),

where λ is known as the Lagrange multiplier.

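For a concrete instance (my example, not from the slides): maximize f(x, y) = x + y subject to g(x, y) = x² + y² − 1 = 0. Stationarity ∇f = λ∇g gives 1 = 2λx and 1 = 2λy, so x = y = 1/√2 with λ = 1/√2, which we can check numerically:

```python
import math

# Check the stationarity condition grad f = lam * grad g at the candidate optimum
# of max x + y subject to x^2 + y^2 = 1 (illustrative example).
x = y = 1 / math.sqrt(2)          # candidate maximizer on the unit circle
lam = 1 / math.sqrt(2)            # Lagrange multiplier
grad_f = (1.0, 1.0)               # gradient of f(x, y) = x + y
grad_g = (2 * x, 2 * y)           # gradient of g(x, y) = x^2 + y^2 - 1
ok = all(abs(gf - lam * gg) < 1e-12 for gf, gg in zip(grad_f, grad_g))
print(ok)
```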


Lagrange multiplier (con’t)

Assuming the optimum is a saddle point,

max_x min_λ (f(x) − λ g(x)) = min_λ max_x (f(x) − λ g(x)),

the R.H.S. implies ∇f(x) = λ ∇g(x)



Inequality constraint

Problem:

max_x f(x)   s.t.   g(x) ≤ 0

Consider f̃(x) = min_{λ≥0} (f(x) − λ g(x)); note that

f̃(x) = f(x) if g(x) ≤ 0, and −∞ otherwise.

Therefore, we can rewrite the problem as

max_x min_{λ≥0} (f(x) − λ g(x))



Inequality constraint (con’t)

Assume

max_x min_{λ≥0} (f(x) − λ g(x)) = min_{λ≥0} max_x (f(x) − λ g(x)).

The R.H.S. implies ∇f(x) = λ ∇g(x). Moreover, at the optimum point (x*, λ*), we should have the so-called "complementary slackness" condition λ* g(x*) = 0, since

max_{x : g(x)≤0} f(x) ≡ max_x min_{λ≥0} (f(x) − λ g(x))



Karush-Kuhn-Tucker conditions

Problem:

max_x f(x)   s.t.   g(x) ≤ 0, h(x) = 0

Conditions:

∇f(x*) − µ* ∇g(x*) − λ* ∇h(x*) = 0
g(x*) ≤ 0,  h(x*) = 0
µ* ≥ 0
µ* g(x*) = 0

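To make the conditions concrete, here is a small check on a hypothetical problem of mine (not from the slides): maximize f(x) = −(x − 2)² subject to g(x) = x − 1 ≤ 0, with no equality constraint. The optimum is x* = 1 with µ* = 2:

```python
# KKT check for max -(x-2)^2 subject to x - 1 <= 0 (illustrative example).
x_star, mu_star = 1.0, 2.0
grad_f = -2 * (x_star - 2)        # derivative of -(x-2)^2, equals 2 at x* = 1
grad_g = 1.0                      # derivative of g(x) = x - 1
stationarity = abs(grad_f - mu_star * grad_g) < 1e-12   # grad f = mu * grad g
feasible = (x_star - 1) <= 0                            # primal feasibility
comp_slack = abs(mu_star * (x_star - 1)) < 1e-12        # complementary slackness
print(stationarity and feasible and mu_star >= 0 and comp_slack)
```

Note the constraint is active here (g(x*) = 0), which is why µ* > 0 is allowed; at an inactive constraint, complementary slackness would force µ* = 0.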

Lecture 7 Overview of source coding

Overview of source coding

The objective of "source coding" is to compress some source.
We can think of compression as "coding": we replace each input by a corresponding coded sequence, so encoding is just a mapping.
Without loss of generality, we can use the binary domain for the coded sequences, so each input message is converted to a sequence of 1s and 0s.
Consider encoding (compressing) a sequence x1, x2, ⋯ one symbol at a time, resulting in c(x1), c(x2), ⋯
Denote the lengths of c(x1), c(x2), ⋯ as l(x1), l(x2), ⋯; one of the major goals is to make E[l(X)] as small as possible.
However, we also want to make sure that we can losslessly decode the message!



Uniquely decodable code

To ensure that we can recover the message without loss, we must make sure that no two messages share the same codeword.
We say a code is "singular" (broken) if c(x1) = c(x2) for some x1 ≠ x2.
Even when a code is not "singular", we still cannot guarantee that we can always recover the original message losslessly. Consider 4 possible input symbols a, b, c, d and the encoding map c(·): a → 0, b → 1, c → 10, d → 11. What should the message for 1110 be? dba? Or bbba?
So it is not sufficient for c(·) to map each input to a different output. Let's overload the notation c(·) a little bit: for any message sequence x = x1, x2, ⋯, xn, encode x1, x2, ⋯, xn as c(x) = c(x1, x2, ⋯, xn) = c(x1) c(x2) ⋯ c(xn).
We say the code is uniquely decodable if all input sequences map to different outputs.

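The ambiguity of the map above can be verified by brute force; this small sketch (my code) enumerates every way of parsing the bit string 1110 under the codebook:

```python
# Enumerate all parses of a bit string under the non-singular (but not uniquely
# decodable) codebook a -> 0, b -> 1, c -> 10, d -> 11 from the slide.
code = {'a': '0', 'b': '1', 'c': '10', 'd': '11'}

def parses(bits, prefix=()):
    if not bits:
        yield prefix                       # consumed everything: a complete parse
    for sym, cw in code.items():
        if bits.startswith(cw):
            yield from parses(bits[len(cw):], prefix + (sym,))

print(sorted(''.join(p) for p in parses('1110')))
# ['bbba', 'bbc', 'bda', 'dba', 'dc']  (five distinct messages encode to 1110)
```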


Prefix-free code

For practical purposes, we would like to be able to decode a symbol "once it is available". Consider a code with map a → 10, b → 00, c → 11, d → 110. One can show that it is uniquely decodable. However, consider the input sequence cbbb → 11000000. When the decoder reads the first 3 bits, it is not able to determine whether the first input symbol is c or d. In fact, only after reading the last bit can the decoder confirm that the first input symbol is c, which is definitely not desirable.
Instead, for the mapping a → 1, b → 01, c → 001, d → 0001, we can always decode a symbol "once it is available".
The catch is that no codeword is a "prefix" of another codeword. We call such a code a prefix-free code, or an instantaneous code.

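The prefix-free property is easy to test; a minimal sketch (my code) that relies on the fact that, after lexicographic sorting, any codeword that is a prefix of another must sit immediately before some codeword it prefixes:

```python
# A code is prefix-free iff no codeword is a prefix of another; checking
# adjacent entries of the sorted codebook suffices.
def is_prefix_free(codewords):
    cws = sorted(codewords)
    return not any(b.startswith(a) for a, b in zip(cws, cws[1:]))

print(is_prefix_free(['10', '00', '11', '110']))   # uniquely decodable but not prefix-free
print(is_prefix_free(['1', '01', '001', '0001']))  # the instantaneous code from the slide
```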

Lecture 7 Kraft’s Inequality

Kraft’s Inequality

Let l1, l2, ⋯, lK satisfy Σ_{k=1}^{K} 2^(−l_k) ≤ 1. Then there exists a uniquely decodable code for symbols x1, x2, ⋯, xK such that l(x1) = l1, l(x2) = l2, ⋯, l(xK) = lK.

Intuition: consider the number of "descendants" of each codeword at the l_max level of the code tree. For a prefix-free code these descendant sets are disjoint, so

Σ_{k=1}^{K} 2^(l_max − l_k) ≤ 2^(l_max) ⇒ Σ_{k=1}^{K} 2^(−l_k) ≤ 1



Forward Proof

Given l1, l2, ⋯, lK satisfying Σ_{k=1}^{K} 2^(−l_k) ≤ 1, we can assign nodes on a tree as on the previous slide. More precisely:
Assign the i-th codeword to a node at level l_i, then cross out all its descendants
Repeat the procedure for i from 1 to K
We know that there are sufficient tree nodes to assign since Kraft's inequality is satisfied. The corresponding code is apparently prefix-free and thus uniquely decodable.

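The assignment described above can be sketched in code (my function; it processes lengths in nondecreasing order, which makes the greedy left-to-right assignment on the binary tree always succeed when Kraft's inequality holds):

```python
# Build a prefix-free code with the prescribed codeword lengths.
def kraft_code(lengths):
    assert sum(2.0 ** -l for l in lengths) <= 1.0, "Kraft's inequality violated"
    codewords, next_val, prev_len = [], 0, 0
    for l in sorted(lengths):                # nondecreasing lengths
        next_val <<= (l - prev_len)          # descend to level l in the tree
        codewords.append(format(next_val, '0{}b'.format(l)))
        next_val += 1                        # "cross out" this node's descendants
        prev_len = l
    return codewords

print(kraft_code([1, 2, 3, 3]))   # ['0', '10', '110', '111']
```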


Converse Proof

Consider messages formed by coding k symbols x = x1, x2, ⋯, xk:

(Σ_{x∈𝒳} 2^(−l(x)))^k = (Σ_{x1∈𝒳} 2^(−l(x1))) (Σ_{x2∈𝒳} 2^(−l(x2))) ⋯ (Σ_{xk∈𝒳} 2^(−l(xk)))
= Σ_{x1,x2,⋯,xk ∈ 𝒳^k} 2^(−(l(x1)+l(x2)+⋯+l(xk))) = Σ_{x∈𝒳^k} 2^(−l(x)) = Σ_{m=1}^{k·l_max} a(m) 2^(−m),

where a(m) is the number of codewords with length m. However, for the code to be uniquely decodable, a(m) ≤ 2^m, where 2^m is the number of available codewords with length m. Therefore,

Σ_{x∈𝒳} 2^(−l(x)) ≤ (k·l_max)^(1/k) → 1 as k → ∞


Lecture 7 Converse proof of Source Coding Theorem

Minimum rate required to compress a source

min_{l1,⋯,lK} Σ_{k=1}^{K} p_k l_k   s.t.   Σ_{k=1}^{K} 2^(−l_k) ≤ 1 and l1, ⋯, lK ≥ 0

≡ max_{l1,⋯,lK} −Σ_{k=1}^{K} p_k l_k   s.t.   Σ_{k=1}^{K} 2^(−l_k) − 1 ≤ 0 and −l1, ⋯, −lK ≤ 0

KKT conditions:

−∇(Σ_{k=1}^{K} p_k l_k) − µ0 ∇(Σ_{k=1}^{K} 2^(−l_k) − 1) + Σ_{k=1}^{K} µ_k ∇l_k = 0
Σ_{k=1}^{K} 2^(−l_k) − 1 ≤ 0,  l1, ⋯, lK ≥ 0,  µ0, µ1, ⋯, µK ≥ 0
µ0 (Σ_{k=1}^{K} 2^(−l_k) − 1) = 0,  µ_k l_k = 0



Minimum rate required to compress a source

Since we expect l_k > 0, complementary slackness gives µ_k = 0. Expanding the first equation, we get

−p_j + µ0 2^(−l_j) log 2 = 0 ⇒ 2^(−l_j) = p_j / (µ0 log 2)

And by Σ_{k=1}^{K} 2^(−l_k) ≤ 1, we have

Σ_{k=1}^{K} p_k / (µ0 log 2) = 1/(µ0 log 2) ≤ 1 ⇒ µ0 ≥ 1/log 2

Note that as µ0 ↓, p_j/(µ0 log 2) ↑ and l_j ↓. Therefore, if we want to decrease the code rate, we should reduce µ0 as much as possible: take µ0 = 1/log 2. Then 2^(−l_j) = p_j ⇒ l_j = −log2 p_j, and the minimum rate becomes

Σ_{k=1}^{K} p_k l_k = −Σ_{k=1}^{K} p_k log2 p_k ≜ H(p1, ⋯, pK)

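As a numeric sanity check (my code, using the example distribution from the SFE slides below): rounding the optimal lengths up to integers, l_j = ⌈−log2 p_j⌉, still satisfies Kraft's inequality and costs less than one extra bit over the entropy:

```python
import math

# Integer "Shannon" lengths ceil(-log2 p) satisfy Kraft and give H <= rate < H + 1.
p = [0.25, 0.25, 0.2, 0.15, 0.15]
lengths = [math.ceil(-math.log2(pj)) for pj in p]          # [2, 2, 3, 3, 3]
entropy = -sum(pj * math.log2(pj) for pj in p)
rate = sum(pj * lj for pj, lj in zip(p, lengths))
print(sum(2.0 ** -l for l in lengths) <= 1.0)              # Kraft holds
print(entropy <= rate < entropy + 1)                       # within one bit of H
```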

Lecture 7 SFE code

Shannon-Fano-Elias code

Key idea: each codeword corresponds to an interval of [0, 1]. Examples:
110 corresponds to the binary interval [0.110, 0.111) = [0.75, 0.875)
011 corresponds to the binary interval [0.011, 0.100) = [0.375, 0.5)



Example

Consider a source with p(x1) = 0.25, p(x2) = 0.25, p(x3) = 0.2, p(x4) = 0.15, p(x5) = 0.15

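A sketch of the construction applied to this example, following the standard Shannon-Fano-Elias recipe (the codeword for x is the first ⌈log2 1/p(x)⌉ + 1 bits of the binary expansion of the midpoint F̄(x) of x's probability interval; the helper name is mine):

```python
import math

# SFE encoding: truncate the binary expansion of each symbol's interval midpoint.
p = {'x1': 0.25, 'x2': 0.25, 'x3': 0.2, 'x4': 0.15, 'x5': 0.15}

def sfe_code(p):
    codes, F = {}, 0.0
    for sym, px in p.items():
        mid = F + px / 2                       # midpoint Fbar(x) of [F, F + p(x))
        l = math.ceil(math.log2(1 / px)) + 1   # codeword length from the next slide
        bits = ''
        for _ in range(l):                     # first l bits of mid's binary expansion
            mid *= 2
            bits += str(int(mid))
            mid -= int(mid)
        codes[sym] = bits
        F += px
    return codes

print(sfe_code(p))
# {'x1': '001', 'x2': '011', 'x3': '1001', 'x4': '1100', 'x5': '1110'}
```

The resulting codewords are prefix-free, as the next slide argues they must be.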


Property

The length of the codeword of x is ⌈log2 (1/p(x))⌉ + 1. This ensures that the "code intervals" of the codewords do not overlap.

SFE code is prefix-free → uniquely decodable:
If a codeword were a prefix of another (say 10 and 101), the corresponding intervals would overlap (consider [0.10, 0.11) and [0.101, 0.11))
Since no code intervals in SFE overlap, no codeword can be a prefix of another

Average code rate is upper bounded by H(X) + 2:

Σ_{x∈𝒳} p(x) l(x) = Σ_{x∈𝒳} p(x) (⌈log2 (1/p(x))⌉ + 1) ≤ Σ_{x∈𝒳} p(x) (log2 (1/p(x)) + 2) = H(X) + 2

Lecture 7 Forward proof of Source Coding Theorem

“Symbol grouping” trick

Let's consider two symbols as a super-symbol and compress each pair with the SFE code. The code rate per pair is bounded by H(X_S) + 2, where, for an i.i.d. source,

H(X_S) = −Σ_{x1,x2∈𝒳²} p(x1, x2) log2 p(x1, x2)
= −Σ_{x1,x2∈𝒳²} p(x1, x2) log2 (p(x1) p(x2))
= −Σ_{x1,x2∈𝒳²} p(x1, x2) log2 p(x1) − Σ_{x1,x2∈𝒳²} p(x1, x2) log2 p(x2)
= −Σ_{x1∈𝒳} p(x1) log2 p(x1) − Σ_{x2∈𝒳} p(x2) log2 p(x2) = 2 H(X)

Therefore, the code rate per original symbol is upper bounded by (1/2)(H(X_S) + 2) = H(X) + 1

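A quick numeric check of the identity above for an i.i.d. pair (my code, reusing the earlier example distribution):

```python
import math

# For an i.i.d. source, the entropy of a super-symbol (pair) is exactly 2 H(X).
p = [0.25, 0.25, 0.2, 0.15, 0.15]
H = -sum(pi * math.log2(pi) for pi in p)
H_pair = -sum(pi * pj * math.log2(pi * pj) for pi in p for pj in p)
print(abs(H_pair - 2 * H) < 1e-12)
```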


Forward proof of Source Coding Theorem

In theory, we can group as many symbols as we want (though we would not do it in practice; why?). Say we group N symbols at a time and compress them using the SFE code. The code rate per original symbol is then upper bounded by (1/N)(H(X_S) + 2) = (1/N)(N·H(X) + 2) = H(X) + 2/N. Therefore, for any given rate R > H(X), we can always find a large enough N such that the code rate using the "grouping trick" and the SFE code is below R. This concludes the forward proof.
