Information Complexity and Applications
Mark Braverman, Princeton University and IAS
FoCM'17, July 17, 2017
Coding vs complexity: a tale of two theories
Coding: goal is data transmission; studies different channels; the "big" questions are answered with theorems ("BSC_{1/3} can transmit ≈ 0.052 trits per application").
Computational complexity: goal is computation; studies different models of computation; the "big" questions are conjectures ("One day, we'll prove EXP requires > n^3 NAND gates").
A key difference
- Information theory is a very effective language: it fits many coding situations perfectly.
- Shannon's channel coding theory is "continuous":
  – Turn the channel into a continuous resource;
  – Separate the communication channel from how it is used.
Theory of computation is “discrete”
- Von Neumann (~1948):
“…Thus formal logic is, by the nature of its approach, cut off from the best cultivated portions of mathematics, and forced onto the most difficult part of the mathematical terrain, into combinatorics. The theory of automata, … will have to share this unattractive property of formal logic. It will have to be, from the mathematical point of view, combinatorial rather than analytical.”
Overview
- Today: we will discuss extending the language of information theory to apply to problems in complexity theory.
Background: Shannon’s entropy
- Assume a lossless binary channel.
- A message X is distributed according to some prior μ.
- The inherent number of bits it takes to transmit X is given by its entropy H(X) = Σ_x μ[X = x] · log2(1/μ[X = x]).
[Diagram: Alice sends a message X ∼ μ to Bob over a lossless communication channel.]
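As a quick illustration (my own sketch, not from the talk), here is the entropy formula in Python; the distributions are made-up examples:

```python
import math

def entropy(mu):
    """Shannon entropy H(X) = sum_x mu[x] * log2(1/mu[x]), in bits."""
    return sum(p * math.log2(1 / p) for p in mu.values() if p > 0)

# A fair coin carries a full bit of uncertainty; a biased coin carries less.
print(entropy({"heads": 0.5, "tails": 0.5}))  # 1.0
print(entropy({"heads": 0.9, "tails": 0.1}))  # ~0.469
```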
Shannon’s Noiseless Coding Theorem
- The cost of communicating many copies of X scales as H(X).
- Shannon's source coding theorem:
  – Let C_n(X) be the cost of transmitting n independent copies of X. Then the amortized transmission cost is lim_{n→∞} C_n(X)/n = H(X).
- This operationalizes H(X).
- Example: sending a uniform trit U in {1,2,3}.
- Using the prefix-free encoding {0, 10, 11}, sending one trit U_1 costs C_1 = 5/3 ≈ 1.667 bits.
- Sending two trits (U_1 U_2) costs C_2 = 29/9 bits, using the encoding {000, 001, 010, 011, 100, 101, 110, 1110, 1111}. The cost per trit is 29/18 ≈ 1.611 < C_1.
- So C_1 + C_1 ≠ C_2.
H(X) is nicer than C_n(X)
- C_1 = 15/9, C_2 = 29/9.
- C_1 + C_1 ≠ C_2.
- The entropy H(U) = log2 3 ≈ 1.585.
- We have H(U_1 U_2) = log2 9 = H(U_1) + H(U_2).
- H(U) is additive over independent variables.
- C_n = n · log2 3 + o(n). (A quick numeric check follows.)
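A sketch checking the numbers above against the entropy bound, using the two encodings from the example:

```python
import math

# One uniform trit with the prefix-free code {0, 10, 11}:
cost_one = (1 + 2 + 2) / 3          # C_1 = 5/3 ~ 1.667 bits per trit

# Two trits with seven 3-bit codewords and two 4-bit codewords:
cost_two = (7 * 3 + 2 * 4) / 9      # C_2 = 29/9 bits per pair of trits

print(cost_one, cost_two / 2)       # 1.667 vs ~1.611 bits per trit
print(math.log2(3))                 # the entropy limit: ~1.585 bits per trit
```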
Today
- We will discuss generalizing information and coding theory to interactive computation scenarios: "using interaction over a channel to solve a computational problem".
- In computer science, the amount of communication needed to solve a problem is studied by the area of communication complexity.
Communication complexity [Yao’79]
- Considers functionalities requiring interactive computation.
- Focus on the two-party setting first.
[Diagram: Alice holds X, Bob holds Y; A & B implement a functionality F(X,Y), e.g. F(X,Y) = "X=Y?".]
Communication complexity
[Diagram: Alice holds X, Bob holds Y, and they have shared randomness R.]
Goal: implement a functionality F(X,Y). A protocol π(X,Y) computing F(X,Y) exchanges messages m1(X,R), m2(Y,m1,R), m3(X,m1,m2,R), …
Communication cost CC(π) = # of bits exchanged.
Communication complexity
- (Distributional) communication complexity with input distribution μ and error ε (error ≤ ε w.r.t. μ):
  CC(F, μ, ε) := min_{π : Pr_μ[π(X,Y) ≠ F(X,Y)] ≤ ε} CC(π).
- (Randomized/worst-case) communication complexity: CC(F, ε), with error ≤ ε on all inputs.
- Yao's minimax: CC(F, ε) = max_μ CC(F, μ, ε).
A tool for unconditional lower bounds on computation
- Streaming;
- Data structures;
- Distributed computing;
- VLSI design lower bounds;
- Circuit complexity;
- One of two main tools for unconditional lower bounds.
- Connections to other problems in complexity theory (e.g. hardness amplification).
Set disjointness and intersection
Alice and Bob are each given a set X ⊆ {1,…,n}, Y ⊆ {1,…,n} (these can be viewed as vectors in {0,1}^n).
- Intersection: Int_n(X,Y) = X ∩ Y.
- Disjointness: Disj_n(X,Y) = 1 if X ∩ Y = ∅, and 0 otherwise.
- A non-trivial theorem [Kalyanasundaram-Schnitger'87, Razborov'92]: CC(Disj_n, 1/4) = Ω(n).
- Exercise: solve Disj_n with error → 0 (say, 1/n) in 0.9n bits of communication. Can you do 0.6n? 0.4n?
Direct sum
- Int_n is just n copies of the 2-bit AND.
- ¬Disj_n is a disjunction of 2-bit ANDs.
- What is the connection between the communication cost of one AND and the communication cost of n ANDs?
- Understanding the connection between the hardness of a problem and the hardness of its pieces.
- A natural approach to lower bounds.
How does CC scale with copies?
- CC(F^n, μ^n, ε)/n → ?
- Recall: lim_{n→∞} C_n(X)/n = H(X).
- Information complexity is the corresponding scaling limit for CC(F^n, μ^n, ε)/n.
- Helps understand problems composed of smaller problems.
Interactive information complexity
- Information complexity : communication complexity :: Shannon's entropy : transmission cost.
Information theory in two slides
- For two (potentially correlated) variables X, Y, the conditional entropy of X given Y is the amount of uncertainty left in X after seeing Y: H(X|Y) := E_{y∼Y} H(X | Y = y).
- One can show H(XY) = H(Y) + H(X|Y).
- This important fact is known as the chain rule.
- If X ⊥ Y, then H(XY) = H(X) + H(Y|X) = H(X) + H(Y).
Mutual information
- The mutual information is defined as I(X;Y) = H(X) − H(X|Y) = H(Y) − H(Y|X).
- "How much does knowing X reduce the uncertainty of Y?"
- Conditional mutual information: I(X;Y|Z) := H(X|Z) − H(X|YZ).
- It has a simple, intuitive interpretation (see the numeric sketch below).
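As an illustration (my own toy example, not from the talk), the sketch below computes these quantities from a small joint distribution table, using the chain rule:

```python
import math
from collections import defaultdict

def H(dist):
    """Entropy of a distribution given as {outcome: probability}."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

# An arbitrary example joint distribution over (X, Y).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
mx, my = defaultdict(float), defaultdict(float)
for (x, y), p in joint.items():
    mx[x] += p
    my[y] += p

h_x_given_y = H(joint) - H(my)      # chain rule: H(X|Y) = H(XY) - H(Y)
print(H(mx) - h_x_given_y)          # I(X;Y) = H(X) - H(X|Y) ~ 0.278 bits
```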
The information cost of a protocol
- Prior distribution: (X, Y) ∼ μ.
[Diagram: Alice holds X, Bob holds Y; they run protocol π, producing transcript Π.]
IC(π, μ) = I(Π; Y|X) + I(Π; X|Y)
= (what Alice learns about Y) + (what Bob learns about X).
- Depends on both the protocol π and the prior μ.
Example
- F is "X = Y?".
- μ is a distribution where X = Y w.p. ½ and (X, Y) are random w.p. ½.
- Protocol: Alice sends SHA-256(X) [256 bits]; Bob replies with "X=Y?" [1 bit].
IC(π, μ) = I(Π; Y|X) + I(Π; X|Y) ≈ 1 + 129 = 130 bits
= (what Alice learns about Y) + (what Bob learns about X).
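A minimal sketch of this protocol (assuming byte-string inputs; any 256-bit hash would do):

```python
import hashlib

def alice_message(x: bytes) -> bytes:
    # m1: Alice sends a 256-bit digest of her input.
    return hashlib.sha256(x).digest()

def bob_answer(y: bytes, m1: bytes) -> bool:
    # m2: Bob replies with one bit -- does the digest match his input?
    return hashlib.sha256(y).digest() == m1

print(bob_answer(b"foo", alice_message(b"foo")))  # True
print(bob_answer(b"bar", alice_message(b"foo")))  # False (barring a collision)
```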
The information complexity of a problem
- Communication complexity:
  CC(F, μ, ε) := min_{π computes F with error ≤ ε} CC(π).
- Analogously:
  IC(F, μ, ε) := inf_{π computes F with error ≤ ε} IC(π, μ).
- (Easy) fact: IC(F, μ, ε) ≤ CC(F, μ, ε).
(The "inf" in the definition of IC is needed: unlike the min for CC, the infimum need not be attained by any single protocol.)
Information = amortized communication
- Recall: lim_{n→∞} C_n(X)/n = H(X).
- Theorem [B.-Rao'11]: lim_{n→∞} CC(F^n, μ^n, ε)/n = IC(F, μ, ε).
- Corollary: lim_{n→∞} CC(Int_n, 0^+)/n = IC(AND, 0).
The two-bit AND
- Alice and Bob each have a bit: X, Y ∈ {0,1}, distributed according to some μ on {0,1}^2.
- They want to compute X ∧ Y while revealing as little as possible about their inputs to each other (w.r.t. the worst μ).
- The answer, IC(AND, 0), is a number between 1 and 2.
The two-bit AND
Results [B.-Garg-Pankratov-Weinstein'13]:
- IC(AND, 0) ≈ 1.4922 bits.
- Find the value of IC(AND, μ, 0) for all priors μ, and exhibit the information-theoretically optimal protocol for computing the AND of two bits.
- Studying IC(AND, μ, 0) as a function ℝ_+^4 / ℝ_+ → ℝ_+ is a functional-minimization problem subject to a family of constraints (cf. the construction of harmonic functions).
The two-bit AND
- Studying IC(AND, μ, 0) as a function ℝ_+^4 / ℝ_+ → ℝ_+ is a functional-minimization problem subject to a family of constraints (cf. the construction of harmonic functions).
- We adopt a "guess and verify" strategy, although the general question of computing the information complexity of a function from its truth table is a very interesting one.
The optimal protocol for AND
- Alice holds X ∈ {0,1}; Bob holds Y ∈ {0,1}.
- Alice samples A: if X=1, A=1; if X=0, A ∼ U[0,1].
- Bob samples B: if Y=1, B=1; if Y=0, B ∼ U[0,1].
The optimal protocol for AND
A B If X=1, A=1 If X=0, A=U[0,1] If Y=1, B=1 If Y=0, B=U[0,1]
1
“Raise your hand when your number is reached” 𝑌 ∈ {0,1} 𝑍 ∈ {0,1}
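A sketch simulating the protocol's outcome (communication is abstracted away; the real protocol transmits the rising counter implicitly, and its information cost requires the full analysis):

```python
import random

def and_protocol(x: int, y: int) -> int:
    """Outcome of the 'raise your hand' protocol for the two-bit AND."""
    a = 1.0 if x == 1 else random.random()  # Alice's hand-raising time
    b = 1.0 if y == 1 else random.random()  # Bob's hand-raising time
    # A counter rises from 0 to 1; the first raised hand ends the protocol.
    # A hand raised before 1 reveals that party's input is 0, so AND = 0.
    return 1 if min(a, b) >= 1.0 else 0

print(and_protocol(1, 1))  # always 1
print(and_protocol(0, 1))  # always 0: Alice's hand rises at some a < 1
```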
Corollary: communication complexity of intersection
- Corollary: lim_{ε→0} CC(Int_n, ε) ≈ 1.4922 · n + o(n).
- Specifically, e.g. CC(Int_n, 1/n) ≈ 1.4922 · n + o(n).
- Note: this requires ω(1) rounds of interaction. Using r rounds results in an extra Θ(n/r^2) cost!
Communication complexity of Disjointness
- With some additional work, we obtain a tight bound on the communication complexity of Disj_n with tiny error:
  lim_{ε→0} CC(Disj_n, ε) = C_DISJ · n + o(n),
  where C_DISJ := max_{μ: μ(1,1)=0} IC(AND, μ, 0) ≈ 0.4827…
- Intuition: Disj_n is an n-wise repetition of AND where the probability of a (1,1) is very low (≪ 1).
Beyond two parties
- Disjointness in the coordinator model [B.-Ellen-Oshman-Pitassi-Vaikuntanathan'13].
- k players, each player p_i holding a subset S_i ⊆ {1,…,n}.
- Want to decide whether the intersection ∩_i S_i is empty.
Disj in the coordinator model
- k players, input length n.
- Naïve protocol: O(n · k) communication.
- This turns out to be asymptotically optimal!
- The argument uses information complexity.
  – The hard part is to design the hard distribution and the "right" information cost measure.
The “hard” distribution
- S_i ⊆ {1,…,n}; want to decide whether the intersection ∩_i S_i is empty.
- The distribution should have very few (close to 0) intersections.
- Attempt #1:
  – Plant many 0's (e.g. 50%).
[Diagram: a k × n input matrix, mostly 1's, with many 0's planted throughout.]
The “hard” distribution
The coordinator keeps querying players until she finds a 0: ~O(n) communication.
The “hard” distribution
- Attempt #2:
  – Plant one 0 in each coordinate.
[Diagram: a k × n input matrix, all 1's except a single 0 per column.]
Each player sends its 0's: still O(n log n) communication.
The "hard" distribution
- "Mix" the two attempts:
  – Each coordinate i has an RV M_i ∼ B_{1/3}.
  – If M_i = 0, plant many 0's (e.g. 50%) in column i.
  – If M_i = 1, plant a single 0 in column i.
[Diagram: a k × n input matrix mixing dense-0 columns (M_i = 0) and single-0 columns (M_i = 1); a sampler sketch follows.]
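A sampler sketch for this mixed distribution (the parameter choices are illustrative; the actual hard distribution also controls the planted intersections more carefully):

```python
import random

def sample_hard_inputs(n: int, k: int):
    """Sample a k x n 0/1 input matrix from the mixed 'hard' distribution."""
    X = [[1] * n for _ in range(k)]
    for i in range(n):
        if random.random() < 1 / 3:   # M_i = 1: plant a single 0 in column i
            X[random.randrange(k)][i] = 0
        else:                          # M_i = 0: plant many 0's (each w.p. 1/2)
            for j in range(k):
                X[j][i] = random.randint(0, 1)
    return X

X = sample_hard_inputs(n=8, k=4)       # player j's set is {i : X[j][i] == 1}
```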
The information cost notion
- Assume the coordinator knows the M_i's, and player j knows the X_ij's.
- The information cost:
  – what the coordinator learns about the X_ij's, plus
  – what each player learns about the M_i's.
- Proving that the sum total is Ω(n · k) requires some work, but the hardest part is the definitions.
Intuition for hardness
- Focus on a single (i, j) pair: the i'th coordinate, the j'th player.
- (M_i, X_ij) is equally likely to be (0,0), (0,1), or (1,1).
- If M_i = 1, then the coordinator needs to know X_ij (which is almost certainly 1 in this case).
- So either player j will learn about M_i, or the protocol will reveal too much about X_ij when M_i = 0.
Multiparty information complexity
- We don't have a multiparty information complexity theory for general distributions.
- There is a fair chance that the difficulty is conceptual.
- One key difference between 2 players and 3+ players is the existence of secure multiparty computation.
Beyond communication
- Many applications to other interactive communication regimes:
  – Distributed joint computation & estimation;
  – Streaming;
  – Noisy coding…
- We will briefly discuss a non-communication application: two-prover games.
Two-prover games
- Closely connected to hardness of approximation:
  – Probabilistically Checkable Proofs and the Unique Games Conjecture.
- A nice way of looking at constraint satisfaction problems.
The odd cycle game
- Alice and Bob want to convince the verifier Victoria that the 7-cycle is 2-colorable.
- She asks them to color the same vertex or two adjacent vertices, and accepts if the answers are consistent (equal colors for the same vertex, different colors for adjacent ones).
The odd cycle game
[Diagram, repeated across three slides: the verifier V sends vertex v_a to Alice and vertex v_b to Bob (the same vertex or two adjacent vertices); each replies with a color, and V answers "OK" if the colors are consistent.]
The odd cycle game
- An example of a "unique game".
- If the cycle is even, Alice and Bob win with probability 1.
- For an odd cycle of length m, they win with probability p_1 = 1 − 1/(2m).
- What about winning many copies of the game simultaneously?
Simultaneous challenges
- Alice gets v_a^1, v_a^2, v_a^3, …, v_a^k and returns a vector of colors.
- Bob gets v_b^1, v_b^2, v_b^3, …, v_b^k and returns a vector of colors.
- They avoid jail if all k color pairs are consistent.
Parallel repetition
- Play k = m^2 copies.
- A naïve strategy (play each copy independently) wins with probability (1 − 1/(2m))^{m^2} = e^{−Θ(m)} ≪ 1 (see the sketch below).
- Can one do better?
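A quick numeric check of the naïve bound (m is an arbitrary example value):

```python
import math

m = 51                                # odd cycle length (example value)
p_single = 1 - 1 / (2 * m)            # success probability of one copy
p_naive = p_single ** (m * m)         # play all m^2 copies independently
print(p_naive)                        # vanishingly small (~1e-11)
print(math.exp(-m / 2))               # matches the e^{-Theta(m)} estimate
```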
Parallel repetition
- Play k = m^2 copies.
- A naïve strategy: (1 − 1/(2m))^{m^2} = e^{−Θ(m)} ≪ 1.
- It turns out that one can win m^2 copies of the odd cycle game with a constant probability [Raz'08].
- The proof is by exhibiting a strategy.
Connection to foams
- Connected to "foams": tilings of ℝ^d by a shape A such that A + ℤ^d = ℝ^d.
- How small can the surface area of A be?
[Feige-Kindler-O’Donnell’07]
Connection to foams
- Obvious upper bound: O(d) (e.g. the unit cube).
- Obvious lower bound (a sphere of volume 1): Ω(√d).
- [Feige-Kindler-O'Donnell'07]: noticed a connection between the two problems.
- [Kindler-O'Donnell-Rao-Wigderson'08]: a construction of foams of surface area O(√d), based on Raz's strategy.
An information-theoretic view
[Diagram: Alice and Bob receive challenge vertices v_a, v_b on the cycle.]
- "Advice" on where to cut the cycle wins the game with probability 1 if the cut does not pass through the challenge edge.
An information-theoretic view
- Merlin can give such advice at "information cost" O(1/m^2).
The distribution
[Diagram: the advice distribution as seen from two adjacent challenge vertices v_b and v_b′.]
- The KL divergence between the two distributions is Θ(1/m^2).
- The statistical distance is Θ(1/m). (A numeric check follows.)
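A sanity check using my own toy model (two Bernoulli distributions differing by ~1/m, standing in for the two advice distributions):

```python
import math

def kl_bits(p, q):
    """KL divergence D(Bern(p) || Bern(q)) in bits."""
    return p * math.log2(p / q) + (1 - p) * math.log2((1 - p) / (1 - q))

m = 100
p, q = 0.5, 0.5 + 1 / (2 * m)     # the distributions differ by ~1/m
print(kl_bits(p, q))              # ~7e-5: Theta(1/m^2)
print(abs(p - q))                 # 0.005: statistical distance Theta(1/m)
```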
Taking m^2 copies
- Total information revealed by Merlin: m^2 · O(1/m^2) = O(1).
- The advice can be simulated with O(1) communication, or with no communication at all with success probability Ω(1).
Parallel repetition
- Using similar intuition (but more technical work), one can obtain a general tight parallel repetition theorem in the "small value" regime [B.-Garg'15].
- If one copy of a game G has success probability ε < 1/2, then m copies have success probability < ε^{Ω(m)} (*for "projection games"; for general games the tight bound is a bit more complicated).
- [Dinur-Steurer'14] obtained the result for projection games using spectral techniques.
Challenges
- Information complexity beyond two communicating parties.
- Continuous measures of complexity.