SLIDE 1

Information Theory on Convex sets

In celebration of Prof. Shun’ichi Amari’s 80th birthday

Peter Harremoës

Copenhagen Business College

June 2016

Peter Harremoës (Copenhagen Business College) Information Theory on Convex sets June 2016 1 / 32

SLIDE 2

Outline

Introduction.
Convex sets and decompositions into extreme points.
Spectral convex sets.
Bregman divergences for convex optimization.
Sufficiency and locality.
Reversibility.

SLIDE 3

Some major questions

Is information theory mainly a theory about sequences?
Is it possible to apply thermodynamic ideas to systems without conservation of energy?
Why do information theoretic concepts appear in statistics, physics and finance?
How important is the notion of reversibility to our theories?
Why are complex Hilbert spaces so useful for representations of quantum systems?

SLIDE 4

Color diagram

Nice but wrong!

SLIDE 5

Color vision

The human eye senses color using the cones. Rods are not used for color but for peripheral vision and night vision. Primates have three color receptors. Most mammals have two color receptors. Birds and reptiles have four color receptors.

SLIDE 6

Example of state space: Chromaticity diagram

SLIDE 7

Black body radiation

SLIDE 8

VGA screen

SLIDE 9

The state space

Before we do anything we prepare our system. Let P denote the set of preparations. Let p0 and p1 denote two preparations. For t ∈ [0, 1] we define (1 − t) · p0 + t · p1 as the preparation obtained by preparing p0 with probability 1 − t and p1 with probability t. A measurement m is defined as an affine mapping of the set of preparations into a set of probability measures on a measurable space. Let M denote a set of feasible measurements. The state space S is defined as the set of preparations modulo measurements: if p1 and p2 are preparations, then they represent the same state if m(p1) = m(p2) for all m ∈ M.

SLIDE 10

The state space

Often the state space equals the set of preparations and has the shape of a simplex. In quantum theory the state space has the shape of the density matrices on a complex Hilbert space.

SLIDE 11

Example: Bloch sphere

A qubit can be described by a density matrix of the form

ρ = ( 1/2 + x    y − iz )
    ( y + iz    1/2 − x )

where x² + y² + z² ≤ 1/4.

The pure states are the states on the boundary of the ball (x² + y² + z² = 1/4). The mixed states are the interior points of the ball.
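As a quick numerical sanity check (NumPy sketch, not part of the slides; the helper name `qubit_state` is made up), this parametrization always yields a valid density matrix, and boundary points are rank-one, i.e. pure:

```python
import numpy as np

def qubit_state(x, y, z):
    """Density matrix with Bloch coordinates (x, y, z) in the ball of radius 1/2."""
    assert x**2 + y**2 + z**2 <= 0.25 + 1e-12
    return np.array([[0.5 + x, y - 1j * z],
                     [y + 1j * z, 0.5 - x]])

rho = qubit_state(0.1, 0.2, 0.3)
assert abs(np.trace(rho) - 1) < 1e-12             # unit trace
assert np.allclose(rho, rho.conj().T)             # Hermitian
assert np.all(np.linalg.eigvalsh(rho) >= -1e-12)  # positive semidefinite

# A state on the boundary (x^2 + y^2 + z^2 = 1/4) is pure: rank one.
pure = qubit_state(0.5, 0.0, 0.0)
assert np.allclose(np.linalg.eigvalsh(pure), [0.0, 1.0])
```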

SLIDE 12

Orthogonal states

We say that two states s0 and s1 are mutually singular if there exists a measurement m with values in [0, 1] such that m (s0) = 0 and m (s1) = 1. We say that s0 and s1 are orthogonal if there exists a face F ⊆ S such that s0 and s1 are mutually singular as elements of F.

Lemma
Any state that is algebraically interior in the state space can be written as a mixture of two mutually singular states.

Proof
Use the Borsuk–Ulam theorem from topology.

Improved Carathéodory Theorem
In a state space of dimension d any state can be written as a mixture of at most d + 1 orthogonal states.

SLIDE 13

Entropy of a state

Let s denote a state. Then the entropy of s can be defined as

H(s) = inf −Σᵢ pi · ln(pi)

where the infimum is taken over all probability vectors (p1, p2, . . . ) for which there exist extreme points s1, s2, . . . such that s = Σᵢ pi · si. According to Carathéodory’s theorem, H(s) ≤ ln(d + 1) when the state space has dimension d. We define the entropy of a state space S as H(S) = sup_{s∈S} H(s), where the supremum is taken over all states in the state space. We define the spectral dimension of the state space S as exp(H(S)) − 1.
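A small numerical illustration (Python sketch, not from the slides): for the uniform mixture of the d + 1 extreme points of a d-dimensional simplex the entropy attains the Carathéodory bound ln(d + 1), so the spectral dimension exp(H) − 1 recovers the geometric dimension d:

```python
import math

def entropy(p):
    """Shannon entropy (natural logarithm) of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0)

# A d-dimensional simplex has d + 1 extreme points; the uniform mixture
# attains the bound H(s) <= ln(d + 1).
d = 3
uniform = [1.0 / (d + 1)] * (d + 1)
H = entropy(uniform)
spectral_dim = math.exp(H) - 1
assert abs(H - math.log(d + 1)) < 1e-12
assert abs(spectral_dim - d) < 1e-9
```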

SLIDE 14

Entropic proof

H(s) = −Σ_{i=0}^{d} pi · ln(pi)
     = −(p0 + p1) · ( p0/(p0 + p1) · ln( p0/(p0 + p1) ) + p1/(p0 + p1) · ln( p1/(p0 + p1) ) )
       − (p0 + p1) · ln(p0 + p1) − Σ_{i=2}^{d} pi · ln(pi)

and

s = Σ_{i=0}^{d} pi · si = (p0 + p1) · ( p0/(p0 + p1) · s0 + p1/(p0 + p1) · s1 ) + Σ_{i=2}^{d} pi · si .
SLIDE 15

Spectral sets

Definition

If p0 ≤ p1 ≤ · · · ≤ pd and s = Σ_{i=0}^{d} pi · si where the si are orthogonal, we say that the vector (pi)_{i=0}^{d} is a spectrum of s. We say that s is a spectral state if s has a unique spectrum. We say that the convex compact set C is spectral if all states in C are spectral.

Theorem

For a spectral set the entropic dimension equals the maximal number of orthogonal states minus one.

Proof.

Assume that the maximal number of orthogonal states is n. Any state can be written as a mixture of n states, and a mixture of at most n states has entropy at most ln(n). The uniform distribution on n states has entropy ln(n).
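For density matrices this definition matches the eigendecomposition: the eigenvalues form the spectrum (NumPy's `eigh` even returns them in the ascending order p0 ≤ p1 ≤ · · ·) and the eigenprojectors are orthogonal pure states. A short sketch, not from the slides:

```python
import numpy as np

rho = np.array([[0.7, 0.1], [0.1, 0.3]])   # a 2x2 density matrix
p, U = np.linalg.eigh(rho)                 # spectrum, ascending: p0 <= p1
states = [np.outer(U[:, i], U[:, i]) for i in range(2)]  # orthogonal pure states

assert np.allclose(sum(p[i] * states[i] for i in range(2)), rho)  # rho = sum p_i s_i
assert abs(np.trace(states[0] @ states[1])) < 1e-12               # orthogonality
H = -sum(x * np.log(x) for x in p if x > 0)
assert H <= np.log(2) + 1e-12              # entropy at most ln(n) with n = 2
```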

SLIDE 16

Examples of spectral sets

A simplex. A d-dimensional ball. Density matrices over the real numbers. Density matrices over the complex numbers. Density matrices over the quaternions. Density matrices in von Neumann algebras.

SLIDE 17

Actions

Let A denote a subset of the feasible measurements M such that each a ∈ A maps S into a distribution on R, i.e. a random variable. The elements of A should represent actions like:
* The score of a statistical decision.
* The energy extracted by a certain interaction with the system.
* (Minus) the length of a codeword of the next encoded input letter using a specific code book.
* The revenue of using a certain portfolio.

SLIDE 18

Optimization

For each s ∈ S we define ⟨a, s⟩ = E[a(s)] and

F(s) = sup_{a∈A} ⟨a, s⟩.

Without loss of generality we may assume that the set of actions A is closed, so that there exists a ∈ A with F(s) = ⟨a, s⟩; in this case we say that a is optimal for s. We note that F is convex, but F need not be strictly convex.

SLIDE 19

Regret

Definition

If F(s) is finite the regret of the action a is defined by

DF(s, a) = F(s) − ⟨a, s⟩.

The regret DF has the following properties:

DF(s, a) ≥ 0 with equality if a is optimal for s.

If ā is optimal for the state s̄ = Σ ti · si, where (t1, t2, . . . , tℓ) is a probability vector, then

Σ ti · DF(si, a) = Σ ti · DF(si, ā) + DF(s̄, a).

Σ ti · DF(si, a) is minimal if a is optimal for s̄ = Σ ti · si.

SLIDE 20

Bregman divergence

Definition

If F(s1) is finite the regret of the state s2 is defined as

DF(s1, s2) = inf_a DF(s1, a)    (1)

where the infimum is taken over actions a that are optimal for s2. If the state s2 has the unique optimal action a2, then F(s1) = DF(s1, s2) + ⟨a2, s1⟩, so the function F can be reconstructed from DF up to an affine function of s1. The closure of the convex hull of the set of functions s → ⟨a, s⟩ is uniquely determined by the convex function F. The regret is called a Bregman divergence if it can be written in the following form:

DF(s1, s2) = F(s1) − ( F(s2) + (s1 − s2) · ∇F(s2) ).

SLIDE 21

Properties of Bregman divergences

The Bregman divergence has the following properties:

d(s1, s2) ≥ 0.

d(s1, s2) = a2(s1) − a2(s2), where a2 denotes the action for which F(s2) = a2(s2).

Σ ti · d(si, s̃) = Σ ti · d(si, ŝ) + d(ŝ, s̃), where ŝ = Σ ti · si.

Σ ti · d(si, s̃) is minimal when s̃ = ŝ = Σ ti · si.
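The compensation identity can be checked numerically for information divergence, the Bregman divergence generated by minus entropy (Python sketch with made-up numbers, not from the slides):

```python
import math

def kl(p, q):
    """Information divergence: Bregman divergence of minus entropy."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

s = [[0.7, 0.3], [0.2, 0.8], [0.5, 0.5]]   # states s_i
t = [0.5, 0.3, 0.2]                        # mixing weights t_i
shat = [sum(ti * si[k] for ti, si in zip(t, s)) for k in range(2)]  # barycenter
stilde = [0.4, 0.6]                        # an arbitrary reference state

# sum t_i d(s_i, stilde) = sum t_i d(s_i, shat) + d(shat, stilde)
lhs = sum(ti * kl(si, stilde) for ti, si in zip(t, s))
rhs = sum(ti * kl(si, shat) for ti, si in zip(t, s)) + kl(shat, stilde)
assert abs(lhs - rhs) < 1e-12
```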

SLIDE 22

Sufficiency

Let (Pθ) denote a family of probability measures or a set of quantum states. A transformation Φ is said to be sufficient for the family (Pθ) if there exists a transformation Ψ such that Ψ(Φ(Pθ)) = Pθ. For probability measures the transformations should be given by Markov kernels. A divergence d satisfies the sufficiency condition if d(Φ(P1), Φ(P2)) = d(P1, P2) when Φ is sufficient for P1, P2. f-divergences are the typical examples of divergences that satisfy the sufficiency condition. A Bregman divergence that satisfies sufficiency is proportional to information divergence (Jiao et al. 2014).
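A small illustration (Python sketch; the kernels are made up): a permutation kernel is invertible and hence sufficient for any pair, so information divergence is preserved, while a lossy kernel in general only satisfies the data-processing inequality:

```python
import math

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def push(p, K):
    """Apply a Markov kernel K (one row per input letter) to a distribution p."""
    return [sum(p[i] * K[i][j] for i in range(len(p))) for j in range(len(K[0]))]

p1, p2 = [0.6, 0.3, 0.1], [0.2, 0.5, 0.3]

# A permutation kernel is invertible, hence sufficient: KL is preserved.
perm = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
assert abs(kl(push(p1, perm), push(p2, perm)) - kl(p1, p2)) < 1e-12

# A lossy kernel (merging the last two letters) need not be sufficient:
# data processing only gives KL after <= KL before.
merge = [[1, 0], [0, 1], [0, 1]]
assert kl(push(p1, merge), push(p2, merge)) <= kl(p1, p2) + 1e-12
```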

SLIDE 23

Locality

A Bregman divergence on a convex set is said to be local if the following condition is fulfilled: for any three states s0, s1 and s2 such that s0 is mutually singular with both s1 and s2, and for any t ∈ [0, 1[, we have

d(s0, (1 − t) · s0 + t · s1) = d(s0, (1 − t) · s0 + t · s2).

Sufficiency on a set of probability measures implies locality.

SLIDE 24

Locality (example)

Sunny weather is predicted with probability p0. Cloudy weather is predicted with probability p1. Rain is predicted with probability p2. Then the weather becomes sunny. The score should only depend on p0, not on p1 and p2.
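This can be made concrete with scoring rules (Python sketch; the forecasts are made up): the logarithmic score depends only on the probability assigned to the realized outcome, while the Brier score does not:

```python
import math

# Two forecasts over (sunny, cloudy, rain) that agree on p0 = P(sunny).
f1 = [0.5, 0.3, 0.2]
f2 = [0.5, 0.1, 0.4]

def log_score(f, outcome):
    """Logarithmic loss: depends only on f[outcome] (local)."""
    return -math.log(f[outcome])

def brier(f, outcome):
    """Brier score: depends on the whole forecast vector (not local)."""
    return sum((f[k] - (k == outcome)) ** 2 for k in range(len(f)))

assert log_score(f1, 0) == log_score(f2, 0)   # same score when it is sunny
assert brier(f1, 0) != brier(f2, 0)           # Brier score differs
```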

SLIDE 25

Bregman divergence on spectral sets

Theorem

Let C denote a spectral convex set. If the entropy function has gradients parallel to convex hulls of embedded simplices, then the Bregman divergence generated by (minus) the entropy is local.

Proof.

Assume that s = (1 − p) · s0 + p · s1 where s0 and s1 are orthogonal. Then one can make orthogonal decompositions

s0 = Σᵢ p0i · s0i and s1 = Σⱼ p1j · s1j.

Then

dH(s0, s) = Σᵢ p0i · ln( p0i / ((1 − p) · p0i) ) = Σᵢ p0i · ln( 1/(1 − p) ) = ln( 1/(1 − p) ).
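This computation can be verified numerically for information divergence, with orthogonality modeled as disjoint supports (Python sketch with made-up distributions): the divergence from s0 to the mixture depends only on p, not on the inner structure of s0 or s1.

```python
import math

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# s0 and s1 are orthogonal (disjoint supports); mix s = (1-p) s0 + p s1.
s0 = [0.4, 0.6, 0.0, 0.0]
s1 = [0.0, 0.0, 0.7, 0.3]
p = 0.25
s = [(1 - p) * a + p * b for a, b in zip(s0, s1)]

# On the support of s0 the mixture equals (1-p) * s0, so
# dH(s0, s) = ln(1/(1-p)) exactly as in the proof.
assert abs(kl(s0, s) - math.log(1 / (1 - p))) < 1e-12
```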

SLIDE 26

Entropic dimension 1

Theorem

Let C denote a spectral convex set where any state can be decomposed into two orthogonal states. Then the convex set is a balanced set without one-dimensional faces, and any Bregman divergence is local.

SLIDE 27

Locality on spectral sets

Theorem

Let C be a spectral convex set with at least three orthogonal states. If a Bregman divergence d defined on C is local, then the Bregman divergence is generated by the entropy times some constant.

Proof
Assume that the Bregman divergence is generated by the convex function f : C → R. Let K denote the convex hull of a set s0, s1, . . . , sn of mutually singular states. For each si there exists a simple measurement ψi on C such that ψi(sj) = δi,j. For Q ∈ K weak sufficiency implies that

d(si, Q) = d(si, ψi(Q) · si + (1 − ψi(Q)) · si+1).

SLIDE 28

Proof cont.

Let fi denote the function fi(x) = d(si, x · si + (1 − x) · si+1), so that d(si, Q) = fi(ψi(Q)). Let P = Σ pi · si and Q = Σ qi · si. Then

d(P, Q) = Σ pi · d(si, Q) − Σ pi · d(si, P)
        = Σ pi · fi(qi) − Σ pi · fi(pi).

As a function of Q it has a minimum when Q = P. Assume that fi is differentiable. Then

∂/∂qi d(P, Q) = pi · fi′(qi)

and

∂/∂qi d(P, Q) |_{Q=P} = pi · fi′(pi).

Using Lagrange multipliers we get that there exists a constant cK such that pi · fi′(pi) = cK.

SLIDE 29

Proof cont.

Hence fi′(pi) = cK/pi, so that fi(pi) = cK · ln(pi) + mi for some constant mi. Therefore

d(P, Q) = Σ pi · (fi(qi) − fi(pi))
        = Σ pi · ( (cK · ln(qi) + mi) − (cK · ln(pi) + mi) )
        = −cK · Σ pi · ln(pi/qi)
        = −cK · dH(P, Q).
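A numerical check of the final step (Python sketch; cK and the constants mi are arbitrary, with cK negative so that d ≥ 0): with fi(x) = cK · ln(x) + mi, the divergence collapses to −cK times information divergence, and the constants mi cancel.

```python
import math

cK = -2.0                       # cK must be negative for d >= 0
m = [0.3, -1.0, 0.5]            # arbitrary constants m_i (they cancel)
f = [lambda x, mi=mi: cK * math.log(x) + mi for mi in m]

P = [0.5, 0.3, 0.2]
Q = [0.2, 0.4, 0.4]
d = sum(p * (f[i](q) - f[i](p)) for i, (p, q) in enumerate(zip(P, Q)))
info_div = sum(p * math.log(p / q) for p, q in zip(P, Q))

assert abs(d - (-cK) * info_div) < 1e-12   # d(P, Q) = -cK * dH(P, Q)
assert d >= 0
```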

SLIDE 30

Faces of entropic dimension 1

Theorem

Assume that a spectral set has entropic dimension at least 2 and has a local Bregman divergence. Then any face of entropic dimension 1 is isometric to a ball.

Proof.

The Bregman divergence restricted to the face is given by the entropy of the orthogonal decomposition. The gradient is only radial if the face is a ball.

SLIDE 31

Some applications

In portfolio theory we want to maximize the revenue. The corresponding Bregman divergence is local if and only if all portfolios are dominated by portfolios corresponding to gambling in the sense of Kelly. In thermodynamics the locality condition is satisfied near thermodynamic equilibrium, and the amount of extractable energy equals kT · D(P‖Peq), where Peq is the corresponding equilibrium state.

SLIDE 32

Conclusion

Carathéodory’s theorem can be improved. Information divergence is the only local Bregman divergence on spectral sets. Information theory only works for spectral sets. A complete classification of spectral sets is needed.
