
Maximum Entropy Beyond Selecting Probability Distributions



Maximum Entropy Beyond Selecting Probability Distributions

Thach N. Nguyen 1, Olga Kosheleva 2, and Vladik Kreinovich 2

1 Banking University of Ho Chi Minh City, Vietnam, Thachnn@buh.edu.vn
2 University of Texas at El Paso, El Paso, Texas 79968, USA, vladik@utep.edu, olgak@utep.edu

1. Need to Select a Distribution: Formulation of a Problem

• Many data processing techniques assume that we know the probability distribution, e.g.:
  – the probability distributions of measurement errors, and/or
  – the probability distributions of the signals, etc.
• Often, however, we have only partial information about a probability distribution.
• Then, several probability distributions are consistent with the available knowledge.
• We want to apply, to this situation:
  – a data processing algorithm
  – which is based on the assumption that the probability distribution is known.

2. Need to Select a Distribution (cont-d)

• We want to apply, to this situation:
  – a data processing algorithm
  – which is based on the assumption that the probability distribution is known.
• For this, we must select a single probability distribution out of all possible distributions.
• How can we select such a distribution?

3. Maximum Entropy Approach

• By selecting a single distribution out of several, we inevitably decrease uncertainty.
• It is reasonable to select a distribution for which this decrease in uncertainty is as small as possible.
• How can we describe this idea as a precise optimization problem?
• A natural way to measure uncertainty is by the average number of binary ("yes"-"no") questions that we need to ask to uniquely determine the corresponding random value (a small sketch after this slide makes this question-counting idea concrete).
• In the case of continuous variables, we need to determine the random value with a given accuracy ε.
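To make the question-counting idea concrete, here is a minimal Python sketch (ours, not from the slides): for a discrete distribution, the average number of binary questions needed to identify the outcome, i.e., the average codeword length of an optimal binary (Huffman) code, essentially equals the Shannon entropy. The distribution used below is an arbitrary illustrative choice.

```python
import heapq
import math

def entropy_bits(probs):
    """Shannon entropy in bits: -sum p * log2(p)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def average_huffman_length(probs):
    """Average codeword length of an optimal binary (Huffman) code.

    Each merge of two subtrees adds one extra bit to every outcome below
    the merged node, so the expected length is the sum of the merged
    probabilities over all merges.
    """
    heap = [(p, i) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    next_id = len(probs)
    expected_length = 0.0
    while len(heap) > 1:
        p1, _ = heapq.heappop(heap)
        p2, _ = heapq.heappop(heap)
        expected_length += p1 + p2
        heapq.heappush(heap, (p1 + p2, next_id))
        next_id += 1
    return expected_length

probs = [0.5, 0.25, 0.125, 0.125]     # illustrative example
print(entropy_bits(probs))            # 1.75 bits of uncertainty
print(average_huffman_length(probs))  # 1.75 yes/no questions on average
```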

4. Maximum Entropy Approach (cont-d)

• One can show that this average number is asymptotically (when ε → 0) proportional to the entropy
  $$S(\rho) \stackrel{\text{def}}{=} -\int \rho(x)\cdot\ln(\rho(x))\,dx.$$
• For a class F of distributions, the average number of binary questions is asymptotically proportional to $\max\limits_{\rho\in F} S(\rho)$.
• If we select a distribution, uncertainty decreases.
• We want to select a distribution $\rho_0$ for which the decrease in uncertainty is the smallest.
• We thus select a distribution $\rho_0$ for which the entropy is the largest possible: $S(\rho_0) = \max\limits_{\rho\in F} S(\rho)$ (a numerical illustration follows this slide).
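As a small numerical illustration (again ours, with an arbitrary grid and example densities), the entropy $S(\rho)$ can be approximated by a Riemann sum; on [0, 1], the uniform density attains the largest value (zero), while a non-uniform density has a strictly smaller one.

```python
import numpy as np

def differential_entropy(rho, dx):
    """Riemann-sum approximation of S(rho) = -int rho(x) ln(rho(x)) dx."""
    rho = np.asarray(rho, dtype=float)
    terms = np.zeros_like(rho)
    positive = rho > 0                       # treat 0 * ln(0) as 0
    terms[positive] = rho[positive] * np.log(rho[positive])
    return -np.sum(terms) * dx

n = 10_000
dx = 1.0 / n
x = (np.arange(n) + 0.5) * dx                # midpoints of a grid on [0, 1]

uniform = np.ones(n)                         # rho(x) = 1, integrates to 1
tilted = 2.0 * x                             # rho(x) = 2x, also integrates to 1

print(differential_entropy(uniform, dx))     # ~0.0, the maximum on [0, 1]
print(differential_entropy(tilted, dx))      # about -0.19, i.e., smaller
```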

5. Simple Examples of Using the Maximum Entropy Techniques

• In some cases, all we know is that the random variable is located somewhere on a given interval [a, b].
• We then maximize $-\int_a^b \rho(x)\cdot\ln(\rho(x))\,dx$ under the condition that $\int_a^b \rho(x)\,dx = 1$.
• Thus, we get a constrained optimization problem: optimize the entropy under the constraint $\int_a^b \rho(x)\,dx = 1$.
• To solve this constrained optimization problem, we can use the Lagrange multiplier method.
• This method reduces our problem to the following unconstrained optimization problem:
  $$-\int_a^b \rho(x)\cdot\ln(\rho(x))\,dx + \lambda\cdot\left(\int_a^b \rho(x)\,dx - 1\right).$$
• Here, λ is the Lagrange multiplier.

6. Simple Examples of Using the Maximum Entropy Techniques (cont-d)

• The value λ needs to be determined so that the original constraint will be satisfied.
• We want to find the function ρ, i.e., we want to find the values ρ(x) corresponding to different inputs x.
• Thus, the unknowns in this optimization problem are the values ρ(x) corresponding to different inputs x.
• To solve the resulting unconstrained optimization problem, we can simply:
  – differentiate the above expression with respect to each of the unknowns ρ(x), and
  – equate the resulting derivative to 0.
• As a result, we conclude that $-\ln(\rho(x)) - 1 + \lambda = 0$, hence ln(ρ(x)) is a constant not depending on x (a numerical check of this conclusion follows this slide).
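The same conclusion can be checked numerically; the sketch below is our own (the grid size, the interval [2, 5], and the use of scipy's SLSQP solver are arbitrary choices, not part of the slides). It maximizes the discretized entropy subject only to the normalization constraint and, as expected, returns an (almost) constant density 1/(b - a).

```python
import numpy as np
from scipy.optimize import minimize

a, b, n = 2.0, 5.0, 30
dx = (b - a) / n
x = a + (np.arange(n) + 0.5) * dx            # grid midpoints on [a, b]

def neg_entropy(rho):
    # scipy minimizes, so we use the negative of the discretized entropy;
    # the small epsilon keeps log() finite near zero.
    return np.sum(rho * np.log(rho + 1e-12)) * dx

constraints = [{"type": "eq", "fun": lambda rho: np.sum(rho) * dx - 1.0}]
bounds = [(1e-9, None)] * n                  # densities must stay nonnegative
rho0 = np.random.default_rng(0).uniform(0.1, 1.0, size=n)   # arbitrary start

result = minimize(neg_entropy, rho0, bounds=bounds, constraints=constraints)
print(result.x)          # all entries close to 1 / (b - a) = 1/3
print(1.0 / (b - a))
```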

7. Simple Examples of Using the Maximum Entropy Techniques (cont-d)

• Therefore, ρ(x) is a constant.
• Thus, in this case, the Maximum Entropy technique leads to a uniform distribution on the interval [a, b].
• This conclusion makes perfect sense:
  – we have no information about which values from the interval [a, b] are more probable;
  – it is thus reasonable to conclude that all these values are equally probable, i.e., that ρ(x) = const.
• This idea goes back to Laplace and is known as the Laplace Indeterminacy Principle.
• In other situations, the only information that we have about ρ(x) is the first two moments:
  $$\int x\cdot\rho(x)\,dx = \mu, \qquad \int (x-\mu)^2\cdot\rho(x)\,dx = \sigma^2.$$

8. Simple Examples of Using the Maximum Entropy Techniques (cont-d)

• Then, we select the ρ(x) for which S(ρ) is the largest under these two constraints and $\int \rho(x)\,dx = 1$.
• For this problem, the Lagrange multiplier method leads to maximizing
  $$-\int \rho(x)\cdot\ln(\rho(x))\,dx + \lambda_1\cdot\left(\int x\cdot\rho(x)\,dx - \mu\right) + \lambda_2\cdot\left(\int (x-\mu)^2\cdot\rho(x)\,dx - \sigma^2\right) + \lambda_3\cdot\left(\int \rho(x)\,dx - 1\right).$$
• Differentiating with respect to ρ(x) and equating the derivative to 0, we conclude that
  $$-\ln(\rho(x)) - 1 + \lambda_1\cdot x + \lambda_2\cdot(x-\mu)^2 + \lambda_3 = 0.$$
• So, ln(ρ(x)) is a quadratic function of x, and thus $\rho(x) = \exp(\ln(\rho(x)))$ is a Gaussian distribution (a numerical check follows this slide).
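The Gaussian conclusion can be verified numerically as well; the following sketch is our own (the grid, the truncation to [-6, 6], and the solver settings are assumptions). Maximizing the discretized entropy under the normalization, mean, and variance constraints yields a density close to the Gaussian with the same μ and σ.

```python
import numpy as np
from scipy.optimize import minimize

mu, sigma = 0.0, 1.0
n = 60
dx = 12.0 / n                                # grid covering [-6, 6]
x = -6.0 + (np.arange(n) + 0.5) * dx

def neg_entropy(rho):
    # negative discretized entropy (scipy minimizes); epsilon avoids log(0)
    return np.sum(rho * np.log(rho + 1e-12)) * dx

constraints = [
    {"type": "eq", "fun": lambda r: np.sum(r) * dx - 1.0},                         # normalization
    {"type": "eq", "fun": lambda r: np.sum(x * r) * dx - mu},                      # mean
    {"type": "eq", "fun": lambda r: np.sum((x - mu) ** 2 * r) * dx - sigma ** 2},  # variance
]
bounds = [(1e-12, None)] * n
rho0 = np.full(n, 1.0 / 12.0)                # start from the uniform density on [-6, 6]

result = minimize(neg_entropy, rho0, bounds=bounds, constraints=constraints,
                  options={"maxiter": 1000})
gaussian = np.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))
print(np.max(np.abs(result.x - gaussian)))   # small, up to discretization error
```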

9. Simple Examples of Using the Maximum Entropy Techniques (final)

• This conclusion is also in good accordance with common sense; indeed:
  – in many cases, e.g., the measurement error results from many independent small effects, and
  – according to the Central Limit Theorem, the distribution of such a sum is close to Gaussian.
• There are many other examples of a successful use of the Maximum Entropy technique.

10. A Natural Question

• The Maximum Entropy technique works well for selecting a distribution.
• Can we extend it to solving other problems?
• In this talk, we show, on several examples, that such an extension is indeed possible.
• We will show it on case studies that cover all three types of possible problems:
  – explaining a fact,
  – finding a number, and
  – finding a functional dependence.
