

SLIDE 1

Elements of a Nonstochastic Information Theory

Girish Nair
Dept. Electrical & Electronic Engineering, University of Melbourne

LCCC Workshop on Information and Control in Networks
Lund, Sweden, 17 October 2012

SLIDE 2

Random Variables in Communications

In communications, unknown quantities/signals are usually modelled as random variables (rv's) & random processes, for good reasons:

  • Physical laws governing electronic/photonic circuit noise give rise to well-defined distributions & random models, e.g. Gaussian thermal electronic noise, binary symmetric channels, Rayleigh fading, etc.

  • Telecommunication systems are usually designed to be used many times, & each individual phone call/email/download may not be critically important... The system designer need only seek good performance in an average or expected sense, e.g. bit error rate, signal-to-noise ratio, outage probability.


SLIDE 3

Nonrandom Variables in Control

In contrast, unknowns in control are often treated as nonstochastic variables or signals:

  • Dominant disturbances are not necessarily electronic/photonic circuit noise, & may not follow well-defined probability distributions.

  • Safety- & mission-criticality: performance guarantees are needed every time the plant is used, not just on average.

SLIDE 4

Networked Control

Networked control combines both communications and control theories!

How may nonstochastic analogues of key probabilistic concepts, such as independence, Markovness and information, be usefully defined?

SLIDE 5

Another Motivation: Channel Capacity

The ordinary capacity C of a channel is defined as the highest block-code bit-rate that permits an arbitrarily small probability of decoding error.

I.e.:

$$C := \lim_{\varepsilon \to 0}\, \limsup_{t \to \infty}\, \sup_{\mathcal{F}_t} \frac{\log_2 |\mathcal{F}_t|}{t+1} \overset{\text{(subadditivity)}}{=} \lim_{\varepsilon \to 0}\, \lim_{t \to \infty}\, \sup_{\mathcal{F}_t} \frac{\log_2 |\mathcal{F}_t|}{t+1},$$

where $\mathcal{F}_t$ := a finite set of input words of length $t+1$, & the inner supremums are over all $\mathcal{F}_t$ s.t. $\forall x(0:t) \in \mathcal{F}_t$, the corresponding random channel output word $Y(0:t)$ can be mapped to an estimate $\hat{X}(0:t)$ with $\Pr[\hat{X}(0:t) \neq x(0:t)] \leq \varepsilon$.

SLIDE 6

Information Capacity

Shannon's Channel Coding Theorem essentially gives an information-theoretic characterization of C for stationary memoryless stochastic channels:

$$C = \limsup_{t \to \infty}\, \sup \frac{\mathrm{I}[X(0:t);\, Y(0:t)]}{t+1} = \sup_{t \geq 0}\, \sup \frac{\mathrm{I}[X(0:t);\, Y(0:t)]}{t+1} = \sup \mathrm{I}[X(0);\, Y(0)],$$

where I[·;·] := Shannon's mutual information functional, and the inner supremums are over all random input sequences X(0:t).
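For instance (an illustrative computation, not from the slides), for a binary symmetric channel with crossover probability p, the single-letter supremum above evaluates to the familiar 1 − h(p):

```python
from math import log2

def h(p):
    """Binary (Shannon) entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """C = sup I[X(0); Y(0)] = 1 - h(p) for a binary symmetric channel."""
    return 1.0 - h(p)

print(bsc_capacity(0.1))  # ~0.531 bits/use, achieved by uniform inputs
# For 0 < p < 1 the same channel's zero-error capacity C0 is 0:
# every pair of inputs can produce the same output.
```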

SLIDE 7

Zero-Error Capacity

In 1956, Shannon also introduced the stricter notion of zero-error capacity C0, the highest block-coded bit-rate that permits a probability of decoding error exactly equal to 0. I.e.:

$$C_0 := \limsup_{t \to \infty}\, \sup_{\mathcal{F}_t} \frac{\log_2 |\mathcal{F}_t|}{t+1} = \lim_{t \to \infty}\, \sup_{\mathcal{F}_t} \frac{\log_2 |\mathcal{F}_t|}{t+1},$$

where $\mathcal{F}_t$ = a finite set of input words of length $t+1$, & the inner supremums are over all $\mathcal{F}_t$ s.t. $\forall x(0:t) \in \mathcal{F}_t$, the corresponding channel output word $Y(0:t)$ can be mapped to an estimate $\hat{X}(0:t)$ with $\Pr[\hat{X}(0:t) \neq x(0:t)] = 0$.

Clearly, C0 is (usually strictly) smaller than C.

SLIDE 8

C0 as an “Information” Capacity?

Fact: C0 does not depend on the nonzero transition probabilities of the channel, and can be defined without any probability theory, in terms of the input-output graph that describes permitted channel transitions.

Q: Can we express C0 as the maximum rate of some nonstochastic information functional?

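To make the graph-based nature of C0 concrete, here is a small Python sketch (illustrative code, not from the slides; the channel and function names are invented). It computes the single-use lower bound on C0, namely log2 of the size of the largest set of pairwise non-confusable inputs, for Shannon's "pentagon" channel:

```python
from itertools import combinations
from math import log2

# Permitted transitions of Shannon's pentagon channel:
# input i may be received as i or i+1 (mod 5).
transitions = {i: {i, (i + 1) % 5} for i in range(5)}

def confusable(a, b):
    """Inputs a, b are confusable if some output can arise from both."""
    return bool(transitions[a] & transitions[b])

def max_nonconfusable(inputs):
    """Brute-force the largest set of pairwise non-confusable inputs."""
    for r in range(len(inputs), 0, -1):
        for subset in combinations(inputs, r):
            if all(not confusable(a, b) for a, b in combinations(subset, 2)):
                return set(subset)
    return set()

F = max_nonconfusable(list(transitions))
print(F, log2(len(F)))  # e.g. {0, 2} -> 1.0 bit: a single-use lower bound on C0
```

(Coding over longer blocks can beat the single-use bound; for the pentagon it is known that C0 = ½ log2 5 ≈ 1.16 bits/use.)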

SLIDE 9

Outline

  • (Motivation)
  • Uncertain Variables
  • Taxicab Partitions & Maximin Information
  • C0 via Maximin Information
  • Uniform LTI State Estimation over Erroneous Channels
  • Conclusion, Extensions & Future Work

SLIDE 10

The Uncertain Variable Framework

  • Similar to probability theory, let an uncertain variable (uv) be a mapping X from some sample space Ω to a space 𝒳.

  • E.g., each ω ∈ Ω may represent a particular combination of disturbances & inputs entering a system, & X may represent an output/state variable.

  • For any particular ω, the value x = X(ω) is realised.

[Diagram: a point ω in the sample space Ω is mapped by X to a realisation x in 𝒳.]

  • Unlike prob. theory, no σ-algebra or measure is assumed on Ω.
SLIDE 11

Ranges

As in prob. theory, the ω argument will often be omitted:

  • Marginal range: [[X]] := {X(ω) : ω ∈ Ω} ⊆ 𝒳.
  • Joint range: [[X, Y]] := {(X(ω), Y(ω)) : ω ∈ Ω} ⊆ 𝒳 × 𝒴.
  • Conditional range: [[X|y]] := {X(ω) : Y(ω) = y, ω ∈ Ω}.

In the absence of statistical structure, the joint range completely characterises the relationship between the uv's X & Y.

As [[X, Y]] = ∪_{y ∈ [[Y]]} [[X|y]] × {y}, the joint range can be determined from the conditional & marginal ranges, similar to the relationship between joint, conditional & marginal probability distributions.
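On a finite sample space these ranges can be computed directly as sets. A minimal sketch (the sample space and the uv's X, Y below are invented for illustration):

```python
# Minimal sketch of uv ranges on a finite sample space.
OMEGA = range(8)                # sample space
X = lambda w: w % 4             # uv X: Omega -> {0,1,2,3}
Y = lambda w: w // 4            # uv Y: Omega -> {0,1}

marginal_X = {X(w) for w in OMEGA}            # [[X]]
joint_XY = {(X(w), Y(w)) for w in OMEGA}      # [[X,Y]]

def conditional(y):
    """Conditional range [[X|y]] = {X(w) : Y(w) = y}."""
    return {X(w) for w in OMEGA if Y(w) == y}

# The joint range is the union of [[X|y]] x {y} over y in [[Y]]:
reconstructed = {(x, y) for y in {Y(w) for w in OMEGA} for x in conditional(y)}
assert reconstructed == joint_XY
```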

SLIDE 12

Unrelatedness

  • X, Y are called unrelated if [[X, Y]] = [[X]] × [[Y]], or equivalently if [[X|y]] = [[X]], ∀y ∈ [[Y]].

  • Parallels the definition of mutual independence for rv's.

  • X, Y are called related if [[X, Y]] ⊆ [[X]] × [[Y]] without equality.
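Continuing the illustrative finite-sample-space sketch above, unrelatedness is just a set comparison:

```python
# Illustrative check of unrelatedness: [[X,Y]] == [[X]] x [[Y]].
from itertools import product

OMEGA = range(8)
X = lambda w: w % 4
Y = lambda w: w // 4

joint = {(X(w), Y(w)) for w in OMEGA}
rectangle = set(product({X(w) for w in OMEGA}, {Y(w) for w in OMEGA}))
print("unrelated" if joint == rectangle else "related")   # -> unrelated

Z = lambda w: (X(w) + Y(w)) % 4        # a uv functionally tied to X & Y
joint_XZ = {(X(w), Z(w)) for w in OMEGA}
rect_XZ = set(product({X(w) for w in OMEGA}, {Z(w) for w in OMEGA}))
print("unrelated" if joint_XZ == rect_XZ else "related")  # -> related
```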

SLIDE 13

[Figure: joint ranges [[X, Y]] drawn in the (x, y) plane, with conditional ranges [[Y|x']] and [[X|y']] marked. a) X, Y related: the joint range is a proper subset of [[X]] × [[Y]]. b) X, Y unrelated: the joint range is the full rectangle [[X]] × [[Y]].]
SLIDE 14

Nonstochastic Entropy

The a priori uncertainty associated with a uv X is captured by the Hartley entropy

$$H_0[X] := \log_2 |[[X]]| \in [0, \infty].$$

  • Continuous-valued uv's yield H₀[X] = ∞. For uv's with Lebesgue-measurable range in ℝⁿ, the 0-th order Rényi differential entropy

$$h_0[X] := \log_2 \mu([[X]])$$

(with μ = Lebesgue measure) is more useful.
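A one-line illustration on the invented finite example from before:

```python
from math import log2

OMEGA = range(8)
X = lambda w: w % 4

def hartley_entropy(U, omega=OMEGA):
    """H0[U] = log2 |[[U]]| for a uv U on a finite sample space."""
    return log2(len({U(w) for w in omega}))

print(hartley_entropy(X))  # -> 2.0 bits: X takes 4 distinct values
```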

SLIDE 15

Nonstochastic Information – Previous Definitions

  • H. Shingin & Y. Ohta, NecSys09 (expressed in the uv framework here):

$$\mathrm{I}[X; Y] := \begin{cases} \inf_{y \in [[Y]]} \log_2 \dfrac{|[[X]]|}{|[[X|y]]|}, & X \text{ discrete-valued}, \\[6pt] \inf_{y \in [[Y]]} \log_2 \dfrac{\mu([[X]])}{\mu([[X|y]])}, & X \text{ continuous-valued}. \end{cases}$$

  • G. Klir, 2006:

$$\mathrm{T}[X; Y] := \begin{cases} H_0[X] + H_0[Y] - H_0[X, Y], & X, Y \text{ finite-valued}, \\ \text{something more complex}, & (X, Y) \text{ cont.-valued w. convex range in } \mathbb{R}^n. \end{cases}$$

SLIDE 16

Comments on Previous Definitions

  • Each gives different treatments of continuous- & discrete-valued variables.

  • Klir's information has natural properties, but is purely axiomatic: no demonstrated relevance to problems in communications or control.

  • Shingin & Ohta's information: inherently asymmetric, but shown to be useful for studying control over errorless digital channels.

SLIDE 17

Taxicab Connectivity

  • A pair of points (x, y), (x', y') ∈ [[X, Y]] is called taxicab connected, denoted (x, y) ↔ (x', y'), if there is a finite sequence (x_i, y_i), i = 1, …, n, in [[X, Y]]:

    i) beginning from (x_1, y_1) = (x, y),
    ii) ending in (x_n, y_n) = (x', y'),
    iii) with each point in the sequence differing in at most one coordinate from its predecessor.

  • If z = f(x) = g(y) for some functions f & g (a common variable; see the interpretation below), then every point in such a sequence must yield the same value z as its predecessor, since it shares either its x- or its y-coordinate. By induction, (x, y) & (x', y') yield the same z-value.

SLIDE 18

Taxicab Connectedness Examples

([[X, Y]] = shaded area)

[Figure: three joint ranges in the (x, y) plane. In the first two, (x, y) ↔ (x', y') even though each set is disconnected in the usual sense. In the third, (x, y) and (x', y') are not taxicab connected, despite the set being connected in the usual sense.]

SLIDE 19

Taxicab Partition and Nonstochastic Information

Thm: There is a unique partition T(X; Y) of [[X, Y]] in which:

  a) every pair of points in the same partition set is taxicab connected, but
  b) no pair of points in different partition sets is taxicab connected.

It can be established that T(X; Y) defines the most refined shared data Z that can be unambiguously determined from X or Y alone.

Define maximin information: I*[X; Y] := log₂ |T(X; Y)|.
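For finite joint ranges, T(X; Y) can be computed as the connected components of a graph that links points sharing an x- or y-coordinate. A small sketch (illustrative, not the paper's code):

```python
from math import log2

def taxicab_partition(joint_range):
    """Split a finite joint range [[X,Y]] into taxicab-connected components."""
    points = list(joint_range)
    parent = list(range(len(points)))    # union-find forest over the points

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link points that agree in the x- or the y-coordinate.
    for i, (x1, y1) in enumerate(points):
        for j, (x2, y2) in enumerate(points[:i]):
            if x1 == x2 or y1 == y2:
                parent[find(i)] = find(j)

    blocks = {}
    for i, p in enumerate(points):
        blocks.setdefault(find(i), set()).add(p)
    return list(blocks.values())

def maximin_info(joint_range):
    """I*[X;Y] = log2 |T(X;Y)|."""
    return log2(len(taxicab_partition(joint_range)))

# Two taxicab-connected blocks -> I* = 1 bit.
jr = {(0, 0), (0, 1), (1, 1), (2, 2), (3, 2)}
print(taxicab_partition(jr), maximin_info(jr))
```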

SLIDE 20

Interpretation as a Common/Shared Variable

Suppose X & Y are separately observed by two agents. Let the agents have functions f & g respectively s.t. f(X) = g(Y) =: Z.

  • The agents can unambiguously agree on the value of the common variable Z.

  • The more distinct values Z can take, the more refined is this shared knowledge.

  • The values of Z induce a partition of the joint range [[X, Y]]. Taxicab partition = the [[X, Y]]-partition induced by the most refined common variable Z.
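A tiny illustration (invented example; the block labels are arbitrary): on a joint range with two taxicab-connected blocks, each agent can compute the block containing its own observation, so Z = f(X) = g(Y) is agreed without any communication.

```python
# Joint range with two taxicab-connected blocks: {"low", "high"}.
joint = {(0, 0), (0, 1), (1, 1), (2, 2), (3, 2)}

f = {0: "low", 1: "low", 2: "high", 3: "high"}   # agent observing x
g = {0: "low", 1: "low", 2: "high"}              # agent observing y

# Z := f(X) = g(Y) holds at every point of the joint range.
assert all(f[x] == g[y] for (x, y) in joint)
```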

SLIDE 21

Examples

([[X, Y]] = shaded area)

[Figure: two joint ranges in the (x, y) plane, with regions labelled z = 0 and z = 1. Left: |T(X; Y)| = 2, i.e. a maximum of 2 distinct values can always be agreed on from separate observations of X & Y. Right: |T(X; Y)| = 1, i.e. only 1 value can always be agreed on from separate observations of X & Y.]

SLIDE 22

Some Key Properties of I*

  • Symmetry: I*[X; Y] = I*[Y; X].

  • More Data Can't Hurt: I*[X; Y] ≤ I*[X; Y, W].

  • "Data Processing": If W ↔ X ↔ Y is a Markov uncertainty chain, then I*[W; Y] ≤ I*[W; X].

SLIDE 23

Uncertain Signals & Stationary Memoryless Channels

Def: An uncertain signal X is a mapping from Ω to the space 𝒳^∞ of discrete-time signals x : ℤ₊ → 𝒳.

Def: A stationary memoryless uncertain channel consists of a set-valued transition function T : 𝒳 → 2^𝒴, and the family of all uncertain input-output signal pairs (X, Y) s.t.

$$[[\,Y(k) \mid x(0:k),\, y(0:k-1)\,]] = [[\,Y(k) \mid x(k)\,]] = \mathrm{T}(x(k)), \quad \forall (x, y) \in [[X, Y]],\ k \geq 0.$$

SLIDE 24

Channel Coding Theorem for Zero-Error Communication

Thm: The zero-error capacity C0 of a stationary memoryless uncertain channel coincides with the highest average rate of maximin information possible across it, i.e.

$$C_0 = \sup_{X}\, \limsup_{t \to \infty} \frac{\mathrm{I}^*[X(0:t);\, Y(0:t)]}{t+1} = \lim_{t \to \infty}\, \sup_{X} \frac{\mathrm{I}^*[X(0:t);\, Y(0:t)]}{t+1},$$

where the supremums are over the uncertain input signals X.

Note: C0 is defined operationally, as the largest rate over all block codes that permit unambiguous recovery of the input sequence. This result gives an intrinsic characterization.
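The single-use case can be checked by brute force: maximising I*[X; Y] over input ranges of the pentagon channel from earlier recovers its 1-bit single-use zero-error bound. A sketch under the same illustrative setup:

```python
from itertools import combinations
from math import log2

# Pentagon channel: input i may be received as i or i+1 (mod 5).
T = {i: {i, (i + 1) % 5} for i in range(5)}

def n_taxicab_blocks(F):
    """# of taxicab-connected blocks of [[X,Y]] = {(x,y) : x in F, y in T[x]}."""
    pts = [(x, y) for x in F for y in T[x]]
    parent = {p: p for p in pts}
    def find(p):
        while parent[p] != p:
            p = parent[p]
        return p
    for i, (x1, y1) in enumerate(pts):
        for x2, y2 in pts[:i]:
            if x1 == x2 or y1 == y2:
                parent[find((x1, y1))] = find((x2, y2))
    return len({find(p) for p in pts})

# Maximise I*[X;Y] over all nonempty single-use input ranges F:
best = max(n_taxicab_blocks(F)
           for r in range(1, 6) for F in combinations(range(5), r))
print(best, log2(best))  # -> 2 blocks, i.e. I* = 1.0 bit: the single-use bound on C0
```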

SLIDE 25

Remarks

  • The idea of a common (random) variable Z comes from cryptography [Wolf & Wullschleger, ITW 2004]. There, Z is formally defined by the connected components of the discrete bipartite graph describing (x, y) pairs having joint prob. > 0.

  • Taxicab connectedness generalises this to continuous-valued and mixed pairs of variables, not representable by discrete graphs.

  • C0 was shown by Wolf & Wullschleger to coincide with the maximum Shannon entropy rate over all common rv's Z. However, this is still a probabilistic characterisation.

  • Maximin information coincides with the Hartley entropy of the maximal common rv Z.

SLIDE 26

State Estimation of Disturbance-Free LTI Systems

Plant: X(t+1) = AX(t), Y(t) = GX(t), with X(0) a uv.

Coder: the measurements Y(0:t) are mapped into channel input symbols Q(t); these pass through an erroneous channel whose outputs S(t) feed an estimator producing X̂(t+1) from S(0:t). No channel feedback.

Given the parameters l, ρ > 0, the objectives are:

  I) exponentially uniformly bounded estimation errors: for any uv X(0) s.t. ||X(0)|| ≤ l, limsup_{t→∞} sup_{ω∈Ω} ρ⁻ᵗ ||X(t) − X̂(t)|| < ∞;

  II) exponential uniform convergence: for any uv X(0) s.t. ||X(0)|| ≤ l, sup_{ω∈Ω} ρ⁻ᵗ ||X(t) − X̂(t)|| → 0 as t → ∞.

SLIDE 27

Assumptions

  • DF1: (G, A_ρ) is observable, where A_ρ := A restricted to the invariant subspace governed by the |eigenvalues| ≥ ρ.

  • DF2: A has one or more |eigenvalues| > ρ.

  • DF3: The channel does not depend on the plant initial state, i.e. the output sequence S(0:t) is conditionally unrelated to X(0), given the channel input sequence Q(0:t): X(0) ↔ Q(0:t) ↔ S(0:t).

SLIDE 28

Criterion without Disturbances

If exponentially uniformly bounded estimation errors are achieved for some coder & estimator, then

$$C_0 \geq \sum_{|\lambda_i| \geq \rho} \log_2 \frac{|\lambda_i|}{\rho} =: H_\rho. \qquad (*)$$

Conversely, if (*) holds strictly, then for any l, a coder & estimator that achieves exponential uniform convergence can be constructed.

Proof of first part: maximin information theory. Proof of second part: constructive.
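To make (*) easy to check numerically, here is a small helper (illustrative; numpy assumed, function name invented) that evaluates the threshold H_ρ from A and ρ:

```python
import numpy as np

def required_rate(A, rho=1.0):
    """H_rho = sum of log2(|lambda_i| / rho) over eigenvalues with |lambda_i| >= rho.

    Setting rho = 1 gives sum(log2 |lambda_i|) over the unstable eigenvalues,
    the threshold H reappearing in the criterion with plant disturbances."""
    eigs = np.linalg.eigvals(np.asarray(A, dtype=float))
    return sum(np.log2(abs(lam) / rho) for lam in eigs if abs(lam) >= rho)

A = [[2.0, 1.0],
     [0.0, 0.5]]
print(required_rate(A, rho=1.0))  # -> 1.0 bit/sample (only the eigenvalue 2 counts)
```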

SLIDE 29

LTI State Estimation With Plant Disturbances

Assumptions:

  • D0: X(t+1) = AX(t) + V(t), Y(t) = GX(t) + W(t), and (G, A) is detectable.

  • D1: A has one or more |eigenvalues| > 1.

  • D2: Realisations of V & W are uniformly bounded in norm.

  • D3: The null signals v = w = 0 are valid disturbance realisations.

  • D4: X(0), V & W are mutually unrelated.

  • D5: The channel does not depend on the plant states and disturbances, i.e. the channel output S(0:t) is conditionally unrelated with (X(0), V(0:t−1), W(0:t)), given the channel input Q(0:t): (X(0), V(0:t−1), W(0:t)) ↔ Q(0:t) ↔ S(0:t).

SLIDE 30

Criterion with Disturbances

If uniformly bounded estimation errors are achieved for some coder & estimator, then

$$C_0 \geq \sum_{|\lambda_i| \geq 1} \log_2 |\lambda_i| =: H. \qquad (**)$$

Conversely, if (**) holds strictly, then for any l, a coder & estimator that achieves uniformly bounded estimation errors can be constructed.

SLIDE 31

Remarks

  • In a stochastic setting (i.e. random channel and X(0)) with no plant noise, it is known that almost-sure asymptotic convergence is possible iff the ordinary capacity C > H (Matveev & Savkin 2007). The criterion here is stricter, because a law of large numbers cannot be used to average out decoding errors.

  • If bounded, nonstochastic disturbances are present, they showed that a.s. uniformly bounded errors are possible iff C0 > H. Their proof used no information theory.

SLIDE 32

Conclusion

  • Formulated a framework for modelling unknown variables without assuming the existence of distributions.

  • Defined nonprobabilistic analogues of independence & Markovness.

  • Proposed maximin information as a nonstochastic index of the most refined knowledge that can be agreed on from separate observations of two variables.

  • Showed that zero-error capacity coincides with the highest maximin info rate possible across the channel.

  • Used maximin info theory to derive tight conditions for uniform state estimation of LTI plants.

SLIDE 33

Future Work

  • Channels with input or memory constraints
  • Network maximin information theory
  • Systems with feedback – preliminary results to appear in CDC 2012

SLIDE 34

Extension: Zero-Error Feedback Capacity

Theorem (GN, to appear in CDC '12): The operational zero-error feedback capacity C_{0F} of a stationary memoryless uncertain channel can be expressed in terms of directed maximin information:

$$C_{0F} = \lim_{t \to \infty}\, \sup_{X(0:t),\, Y(0:t)} \frac{1}{t+1} \sum_{k=0}^{t} \mathrm{I}^*[X(k);\, Y(k) \mid Y(0:k-1)],$$

where $\mathrm{I}^*[X; Y \mid Z] := \min_{z \in [[Z]]} \log_2 |\mathrm{T}(X; Y \mid z)|$ is conditional maximin information.
SLIDE 35

Thank You!

References

  • GN, "A nonstochastic information theory for communication and state estimation", http://arxiv.org/abs/1112.3471. (Provisionally accepted by IEEE Trans. Auto. Contr.; short version in Proc. 9th IEEE Int. Conf. Control & Automation, Santiago, Chile, Dec. 2011.)
  • GN, "A nonstochastic information theory for feedback", to appear in Proc. IEEE CDC, Dec. 2012.
  • G. Klir, Uncertainty and Information: Foundations of Generalized Information Theory, Wiley, 2006, ch. 2.
  • H. Shingin and Y. Ohta, "Disturbance rejection with information constraints: Performance limitations of a scalar system for bounded and Gaussian disturbances", Automatica, 2012.
  • S. Wolf and J. Wullschleger, "Zero-error information and applications in cryptography", Info. Theory Workshop, San Antonio, USA, 2004.
  • C.E. Shannon, "The zero-error capacity of a noisy channel", IRE Trans. Info. Theory, vol. 2, 1956.
  • S. Tatikonda and S. Mitter, "Control under communication constraints", IEEE Trans. Auto. Contr., 2004.
  • A.S. Matveev and A.V. Savkin, "Shannon zero error capacity in the problems of state estimation and stabilization via noisy communication channels", Int. Jour. Contr., 2007.
  • A.S. Matveev and A.V. Savkin, "An analogue of Shannon information theory for detection and stabilization via noisy discrete communication channels", SIAM J. Contr. Optim., 2007.
  • J. Massey, "Causality, feedback and directed information", in Int. Symp. Inf. Theory App., 1990.