Limits on Representing Functions by Linear Combinations of Simple Functions


  1. Limits on Representing Functions by Linear Combinations of Simple Functions
     Ryan Williams, MIT
     [Figure: can $f : \{0,1\}^n \to \{0,1\}$ be written ($\equiv$ ?) as a sum $\sum$ of six "simple" functions?]

  2. The $\mathbb{R}$-linear Representation Problem
     Let $\mathcal{C}$ be a class of "simple" functions (they take Boolean inputs, but need not be Boolean-valued).
     Which "interesting" functions $f$ can(not) be represented by "short" $\mathbb{R}$-linear combinations of functions from $\mathcal{C}$?
     [Figure: $f : \{0,1\}^n \to \{0,1\}$ written ($\equiv$ ?) as a sum $\sum$ of $\mathrm{poly}(n)$ ("size") simple functions, with example real coefficients $-\pi$, $2$, $\varphi$, $-e$.]
     Call this a $\sum \circ \mathcal{C}$ circuit.
     Note: If $\mathcal{C}$ spans the vector space of all functions $f : \{0,1\}^n \to \mathbb{R}$, then there is always a $\sum \circ \mathcal{C}$ circuit of size $\le 2^n$…

  3. The $\mathbb{R}$-linear Representation Problem
     Which "interesting" functions $f$ can(not) be represented by "short" $\mathbb{R}$-linear combinations of functions from $\mathcal{C}$?
     If $\mathcal{C}$ is the class of $2^n$ AND functions on $n$ variables: $\sum \circ \mathrm{AND} \equiv$ $0/1$ polynomials over $\mathbb{R}$.
     If $\mathcal{C}$ is the class of $2^n$ PARITY functions on $n$ variables: $\sum \circ \mathrm{PARITY} \equiv$ $-1/1$ polynomials over $\mathbb{R}$ (Fourier analysis of Boolean functions).
     These are well-understood: $\mathcal{C}$ is a basis for the vector space of functions $f : \{0,1\}^n \to \mathbb{R}$ $\Rightarrow$ the $\mathbb{R}$-linear representation of $f$ is unique, so the "shortest" is also the "longest"…
     More interesting cases: representations are not unique.
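
The uniqueness claim for the AND and PARITY bases can be checked by brute force on small $n$. The following is a minimal sketch, not taken from the slides: the helper names (`and_basis_coeffs`, `parity_basis_coeffs`, `support_size`) are hypothetical, and the PARITY-basis coefficients are left scaled by $2^n$ so they stay integral.

```python
from itertools import product

def and_basis_coeffs(f, n):
    """Coefficients c_S with f(x) = sum_S c_S * prod_{i in S} x_i (Moebius inversion)."""
    coeffs = {}
    for S in product([0, 1], repeat=n):           # S as an indicator vector
        total = 0
        for T in product([0, 1], repeat=n):       # keep only T that are subsets of S
            if all(t <= s for t, s in zip(T, S)):
                total += (-1) ** (sum(S) - sum(T)) * f(T)
        coeffs[S] = total
    return coeffs

def parity_basis_coeffs(f, n):
    """Fourier coefficients of f in the basis (-1)^(S.x), scaled by 2^n."""
    cube = list(product([0, 1], repeat=n))
    return {
        S: sum(f(x) * (-1) ** sum(s * xi for s, xi in zip(S, x)) for x in cube)
        for S in cube
    }

def support_size(coeffs):
    """Number of nonzero coefficients, i.e. the size of the (unique) representation."""
    return sum(1 for c in coeffs.values() if c != 0)

AND = lambda x: int(all(x))
PARITY = lambda x: sum(x) % 2

n = 3
print(support_size(and_basis_coeffs(AND, n)), support_size(parity_basis_coeffs(AND, n)))
print(support_size(and_basis_coeffs(PARITY, n)), support_size(parity_basis_coeffs(PARITY, n)))
```

Because each basis gives exactly one representation, the support size is both the shortest and the longest size; the same function can be tiny in one basis and large in the other.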

  4. This Paper: Three Simple Classes
     1. Linear Threshold Functions [LTF]
     2. Rectified Linear Units [ReLU]
     3. $GF(p)$-Polynomials of Degree $d$ [$\mathrm{POLY}d[p]$] ($p$ prime and $d \ge 2$)
     For all three classes:
     • There are $\gg 2^n$ functions on $n$ variables, so $\mathbb{R}$-linear representations are not unique: $2^{\Theta(n^2)}$ LTFs, $p^{\Theta(n^d)}$ degree-$d$ polys, $\infty$ ReLU functions.
     • $\mathbb{R}$-linear representations have been studied!
       $\sum \circ \mathrm{LTF}$ = special case of depth-2 threshold circuits
       $\sum \circ \mathrm{ReLU}$ = "depth-2 neural net with ReLU activation"
       $\sum \circ \mathrm{POLY}d[p]$ = "higher-order" Fourier analysis for $d \ge 2$

  5. Sums of Linear Threshold Functions
     Def. $f : \{0,1\}^n \to \{0,1\}$ is an LTF if $\exists\, w_1, \ldots, w_n, t \in \mathbb{R}$ such that $\forall (x_1, \ldots, x_n) \in \{0,1\}^n$: $f(x_1, \ldots, x_n) = 1 \iff \sum_i w_i x_i \ge t$.
     Depth-Two LTF Circuits ($\mathrm{LTF} \circ \mathrm{LTF}$): Major problem to find "nice" functions without $n^k$-gate $\mathrm{LTF} \circ \mathrm{LTF}$ circuits, for all $k$.
     [Hajnal et al. '91] $\exp(n)$ depth-two lower bounds for small $w_i$'s [Roychowdhury-Orlitsky-Siu '94]
     What about $\sum \circ \mathrm{LTF}$? It is a special case of $\mathrm{LTF} \circ \mathrm{LTF}$: the linear form for the output LTF must always evaluate to 0 or 1.
     Still, no $n^{1.5}$-gate lower bounds were known for $\sum \circ \mathrm{LTF}$!
     We prove:
     Thm: $\forall k$, $\exists f_k \in \mathrm{NP}$ without $n^k$-size $\sum \circ \mathrm{LTF}$.
     Thm: $\exists f \in \mathrm{NTIME}[n^{\log^{*} n}]$ without $\mathrm{poly}(n)$-size $\sum \circ \mathrm{LTF}$.
     Note: It is a major open problem to prove $\exists f \in \mathrm{NP}$ without $n^k$-size (unrestricted) circuits.
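
To make the $\sum \circ \mathrm{LTF}$ model concrete, here is a small sketch of a linear combination of $n$ threshold gates that computes PARITY exactly. The alternating-sum construction is a standard one and is an assumption of this example rather than something spelled out on the slide; `ltf`, `parity_as_sum_of_ltfs`, and `evaluate` are hypothetical helper names.

```python
from itertools import product

def ltf(weights, t):
    """Linear threshold function: outputs 1 iff sum_i w_i * x_i >= t."""
    return lambda x: 1 if sum(w * xi for w, xi in zip(weights, x)) >= t else 0

def parity_as_sum_of_ltfs(n):
    """PARITY(x) = sum_{k=1..n} (-1)^(k+1) * [x_1 + ... + x_n >= k].

    For an input of Hamming weight s, gates k = 1..s fire, and the
    alternating sum of s ones is 1 when s is odd and 0 when s is even.
    """
    return [((-1) ** (k + 1), ltf([1] * n, k)) for k in range(1, n + 1)]

def evaluate(terms, x):
    """Evaluate a sum-of-simple-functions circuit given as (coefficient, gate) pairs."""
    return sum(c * g(x) for c, g in terms)

n = 5
circuit = parity_as_sum_of_ltfs(n)
ok = all(evaluate(circuit, x) == sum(x) % 2 for x in product([0, 1], repeat=n))
print(len(circuit), ok)
```

Note that the linear form only ever takes the values 0 and 1 on Boolean inputs, which is exactly the restriction that separates $\sum \circ \mathrm{LTF}$ from general $\mathrm{LTF} \circ \mathrm{LTF}$.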

  6. Sums of ReLUs
     Def. $f : \mathbb{R}^n \to \mathbb{R}^{+}$ is a ReLU if $\exists\, w_1, \ldots, w_n, t \in \mathbb{R}$ such that $\forall (x_1, \ldots, x_n) \in \mathbb{R}^n$: $f(x_1, \ldots, x_n) = \max(0, \sum_i w_i x_i + t)$.
     $\sum \circ \mathrm{ReLU}$ generalizes $\sum \circ \mathrm{LTF}$.
     $\sum \circ \mathrm{ReLU}$ = "depth-two neural nets with ReLU activations". Very widely studied, thousands of references.
     Several recent references [see paper] give lower bounds for some "weird" $f : \mathbb{R}^n \to \mathbb{R}$ which vary sharply / are sensitive.
     No lower bounds were known for discrete-domain / Boolean functions (note: the "most sensitive" Boolean function PARITY has $O(n)$-size $\sum \circ \mathrm{LTF}$).
     We can generalize the $\sum \circ \mathrm{LTF}$ limits to $\sum \circ \mathrm{ReLU}$:
     Thm: $\forall k$, $\exists f_k \in \mathrm{NP}$ without $n^k$-size $\sum \circ \mathrm{ReLU}$.
     Thm: $\exists f \in \mathrm{NTIME}[n^{\log^{*} n}]$ without $\mathrm{poly}(n)$-size $\sum \circ \mathrm{ReLU}$.
     Again: it is a major open problem to prove $\exists f \in \mathrm{NP}$ without $n^k$-size (unrestricted) circuits.
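
One way to see why sums of ReLUs can simulate sums of LTFs on Boolean inputs is the sketch below. It assumes integer weights and an integer threshold, so the linear form is integer-valued; under that assumption a threshold gate is the difference of two ReLUs. The helper names are hypothetical, and the slides do not state which simulation the paper uses.

```python
from itertools import product

def relu(weights, t):
    """ReLU unit: max(0, sum_i w_i * x_i + t)."""
    return lambda x: max(0, sum(w * xi for w, xi in zip(weights, x)) + t)

def ltf(weights, t):
    """Linear threshold function: 1 iff sum_i w_i * x_i >= t."""
    return lambda x: 1 if sum(w * xi for w, xi in zip(weights, x)) >= t else 0

def ltf_as_two_relus(weights, t):
    """For integer s = w.x:  [s >= t] = relu(s - t + 1) - relu(s - t).

    If s >= t both arguments are nonnegative and the difference is 1;
    if s <= t - 1 both ReLUs output 0.
    """
    return [(1, relu(weights, -t + 1)), (-1, relu(weights, -t))]

def evaluate(terms, x):
    return sum(c * g(x) for c, g in terms)

weights, t = [2, -1, 3], 2          # illustrative integer weights and threshold
gate = ltf(weights, t)
pair = ltf_as_two_relus(weights, t)
print(all(evaluate(pair, x) == gate(x) for x in product([0, 1], repeat=3)))
```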

  7. Sums of Low-Degree $GF(p)$-Polys
     $\sum \circ \mathrm{POLY}d[p]$: linear combinations of functions $f : \{0,1\}^n \to \{0, 1, \ldots, p-1\}$ where for every $f$ there is a degree-$d$ polynomial $q(x)$ such that $\forall x \in \{0,1\}^n$: $f(x) = q(x) \bmod p$.
     The case $d = 2$, $p = 2$ is already very interesting!
     Compelling Conjecture ["Degree-Two Uncertainty Principle"]: AND (on $n$ inputs) requires $n^{\omega(1)}$-size $\sum \circ \mathrm{POLY}2[2]$.
     Known: AND requires $\Omega(2^n)$-size $\sum \circ \mathrm{POLY}1[2]$.
     AND has $O(2^{n/2})$-size $\sum \circ \mathrm{POLY}2[2]$.
     No non-trivial lower bounds were known for $\sum \circ \mathrm{POLY}2[p]$.
     We prove:
     Thm: $\forall d, k$, $\forall p$ prime, $\exists f_k \in \mathrm{NP}$ without $n^k$-size $\sum \circ \mathrm{POLY}d[p]$.
     Thm: $\exists f \in \mathrm{NTIME}[n^{\log^{*} n}]$ without $\mathrm{poly}(n)$-size $\sum \circ \mathrm{POLY}d[p]$, for all fixed $d$ and fixed prime $p$.

  8. A Key Theorem
     A new instance of "Circuit Analysis Algorithms $\Rightarrow$ Circuit Lower Bounds".
     Key Theorem: Let $\mathcal{C}$ be a class of functions $f : \{0,1\}^n \to \mathbb{R}$. Assume there is an $\varepsilon > 0$ and an algorithm $A$ so that for any given $f_1, \ldots, f_4 \in \mathcal{C}$, $A$ can compute the "sum-product"
     $$\sum_{a \in \{0,1\}^n} \prod_{i=1}^{4} f_i(a)$$
     in $2^{n(1-\varepsilon)}$ time.
     Then: $\forall k$, $\exists f \in \mathrm{NP}$ without $n^k$-size $\sum \circ \mathcal{C}$, and $\exists f \in \mathrm{NTIME}[n^{\log^{*} n}]$ without $\mathrm{poly}(n)$-size $\sum \circ \mathcal{C}$.
     Applies the new Easy Witness Lemma of [Murray-W '18].
     We show how to compute sum-products in $2^{n(1-\varepsilon)}$ time for LTFs, ReLUs, and low-degree polynomials.
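
For reference, the sum-product quantity can always be computed by brute force in about $2^n$ evaluations; the hypothesis of the Key Theorem asks for something noticeably faster. The sketch below is only that trivial baseline, with a hypothetical function name, and is not the algorithm $A$ from the theorem.

```python
from itertools import product
from math import prod

def sum_product_bruteforce(fs, n):
    """Compute sum over a in {0,1}^n of prod_i f_i(a) by enumerating all 2^n points."""
    return sum(prod(f(a) for f in fs) for a in product([0, 1], repeat=n))

# Illustrative inputs: two coordinate projections and two constant-1 functions.
fs = [lambda a: a[0], lambda a: a[1], lambda a: 1, lambda a: 1]
print(sum_product_bruteforce(fs, 3))  # counts points with a[0] = a[1] = 1
```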

  9. Major Ideas in the Key Theorem
     Assume:
     (1) There is a $2^{n(1-\varepsilon)}$-time sum-product algorithm $A$ for $\mathcal{C}$.
     (2) For some fixed $k$, all $f \in \mathrm{NP}$ have $n^k$-size $\sum \circ \mathcal{C}$.
     Goal: Derive a contradiction.
     (1) and (2) $\Rightarrow$ Given an (unrestricted) circuit $T$ with $n$ inputs and size $m$, we can guess-and-check an $m^k$-size $\sum \circ \mathcal{C}$ computing $T$, in $2^{n(1-\varepsilon)} \cdot m^{O(1)}$ time.
     Note: to guess, we need that the coefficients in our linear combinations have "small" bit complexity, WLOG.
     (1) $\Rightarrow$ Can solve Circuit-UNSAT in nondeterministic $2^{n(1-\varepsilon)} \cdot m^{O(1)}$ time.
     We can even solve #Circuit-SAT, because we can compute $\sum_{a \in \{0,1\}^n} (\sum \circ \mathcal{C})(a) = \sum \sum_{a} \mathcal{C}(a)$ by solving sum-product $m^k$ times.
     [Murray-W '18] $\Rightarrow$ $\forall k$, $\exists f \in \mathrm{NP}$ without $n^k$-size unrestricted circuits.
     Contradicts (2) when $\sum \circ \mathcal{C}$ can be simulated by Boolean circuits!
     The proof crucially relies on $\sum \circ \mathcal{C}$ computing a circuit exactly.
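
The counting step rests on swapping the order of summation: the number of accepting inputs of an exact sum-of-gates circuit is a weighted sum of per-gate totals, each of which is one sum-product call. The sketch below illustrates this with LTF gates, padding each call with the constant-1 LTF (zero weights, threshold 0). The brute-force `sum_product` is a stand-in for the fast algorithm $A$, and all names here are hypothetical.

```python
from itertools import product
from math import prod

def ltf(weights, t):
    """Linear threshold function: 1 iff sum_i w_i * x_i >= t."""
    return lambda x: 1 if sum(w * xi for w, xi in zip(weights, x)) >= t else 0

def sum_product(fs, n):
    """Brute-force stand-in for the assumed fast sum-product algorithm A."""
    return sum(prod(f(a) for f in fs) for a in product([0, 1], repeat=n))

def count_ones(terms, n):
    """sum_a sum_j c_j g_j(a) = sum_j c_j * (sum_a g_j(a)): one call per gate."""
    one = ltf([0] * n, 0)                      # constant-1 function, itself an LTF
    return sum(c * sum_product([g, one, one, one], n) for c, g in terms)

# Illustrative circuit: PARITY on n bits as an alternating sum of n threshold gates.
n = 4
parity_terms = [((-1) ** (k + 1), ltf([1] * n, k)) for k in range(1, n + 1)]
print(count_ones(parity_terms, n))  # number of odd-weight inputs
```

This only works because the linear combination equals the Boolean function on every input, which matches the closing remark that the proof relies on exact computation.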

  10. Sum-Product Algorithm for LTF
     Uses the (old) fact that #Subset-Sum is solvable in $\mathrm{poly}(n) \cdot 2^{n/2}$ time!
     Thm [HS '76]: #Subset-Sum on $n$ numbers is in $\mathrm{poly}(n) \cdot 2^{n/2}$ time.
     Proof: Given $w_1, \ldots, w_n, t$, we want to know the number of $S \subseteq [n]$ such that $\sum_{i \in S} w_i = t$.
     1. Enumerate all $2^{n/2}$ subsets $S$ of $\{w_1, \ldots, w_{n/2}\}$. Make a list $L_1$ of the $2^{n/2}$ subset sums, and SORT all sums in $L_1$.
     2. Enumerate all $2^{n/2}$ subsets $T$ of $\{w_{n/2+1}, \ldots, w_n\}$. For each $T$ summing to a value $v$, BINARY SEARCH for a value $v'$ in $L_1$ such that $v + v' = t$.
     3. To compute the total number of subsets summing to $t$: for each sum value $v'$ appearing in $L_1$, store the number $n_{v'}$ of subsets in $L_1$ which have value $v'$. Later, if value $v'$ is found in the binary search, add $n_{v'}$ to a running sum.
     Takes $\mathrm{poly}(n) \cdot 2^{n/2}$ time in total.
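
The three steps above can be sketched as follows. This is a minimal illustration assuming integer inputs (so equality tests are exact); the function names are hypothetical.

```python
from bisect import bisect_left

def subset_sums(nums):
    """All 2^len(nums) subset sums, with repetition."""
    sums = [0]
    for w in nums:
        sums += [s + w for s in sums]
    return sums

def count_subset_sum(weights, t):
    """Number of subsets S with sum_{i in S} w_i = t, in roughly 2^(n/2) * poly(n) time."""
    half = len(weights) // 2

    # Step 1: enumerate and sort the subset sums of the first half.
    left = sorted(subset_sums(weights[:half]))

    # Step 3 (preparation): collapse equal sums into (value, multiplicity) pairs.
    values, counts = [], []
    for s in left:
        if values and values[-1] == s:
            counts[-1] += 1
        else:
            values.append(s)
            counts.append(1)

    # Step 2: for each second-half sum v, binary search for v' = t - v.
    total = 0
    for v in subset_sums(weights[half:]):
        target = t - v
        i = bisect_left(values, target)
        if i < len(values) and values[i] == target:
            total += counts[i]          # add the stored multiplicity of v'
    return total

print(count_subset_sum([1, 2, 3, 4, 5], 5))
```

Storing multiplicities alongside the sorted distinct values is what turns the decision procedure into a counting procedure without extra searches.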

  11. Sum-Product Algorithm for LTF
     Uses the (old) fact that #Subset-Sum is solvable in $\mathrm{poly}(n) \cdot 2^{n/2}$ time!
     Thm: For any $f_1, \ldots, f_4 \in \mathrm{LTF}$, we can compute $\sum_{a \in \{0,1\}^n} \prod_{i=1}^{4} f_i(a)$ in $\mathrm{poly}(n) \cdot 2^{n/2}$ time.
     Proof: An Exact LTF (ELTF) has the form $g(x) = 1 \iff \sum_i w_i x_i = t$.
     #Subset-Sum in $\mathrm{poly}(n) \cdot 2^{n/2}$ time $\Rightarrow$ $\sum_a g(a)$ in $\mathrm{poly}(n) \cdot 2^{n/2}$ time.
     [HP, CCC '10]: Every LTF on $n$ inputs can be written as a sum of $\mathrm{poly}(n)$ ELTFs.
     So we can write
     $$\sum_{a \in \{0,1\}^n} \prod_{i=1}^{4} f_i(a) = \sum_{a \in \{0,1\}^n} \prod_{i=1}^{4} \sum_{j=1}^{\mathrm{poly}(n)} g_{i,j}(a)$$
     for ELTFs $g_{i,j}$.
     Simple algebra (expand the product, then swap the two sums; $j'$ ranges over $\mathrm{poly}(n)$ choices of one ELTF per factor):
     $$= \sum_{a \in \{0,1\}^n} \sum_{j'}^{\mathrm{poly}(n)} \prod_{i=1}^{4} g_{i,j'}(a) = \sum_{j'}^{\mathrm{poly}(n)} \sum_{a \in \{0,1\}^n} \prod_{i=1}^{4} g_{i,j'}(a)$$
     Each $\prod_{i=1}^{4} g_{i,j'}(x) = h(x)$ for some ELTF $h$, so each inner sum can be computed in $\mathrm{poly}(n) \cdot 2^{n/2}$ time!
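
The inner sum over a product of exact LTFs counts the inputs that satisfy several linear equations at once. The sketch below computes that count with a meet-in-the-middle over vector-valued subset sums stored in a hash table. This is a swapped-in variant chosen for brevity: the slide instead merges the product into a single ELTF $h$ and reuses the sorted-list #Subset-Sum routine. Integer weights are assumed, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

def vector_subset_sums(columns, dim):
    """All subset sums of the given column vectors, as tuples of length dim."""
    sums = [(0,) * dim]
    for col in columns:
        sums += [tuple(s + c for s, c in zip(vec, col)) for vec in sums]
    return sums

def sum_product_eltf(eltfs, n):
    """Count a in {0,1}^n with sum_i w_i a_i == t for every (w, t) in eltfs.

    This equals sum_a prod_k g_k(a) for exact LTFs g_k, using about 2^(n/2) work.
    """
    dim = len(eltfs)
    columns = [tuple(w[i] for w, _ in eltfs) for i in range(n)]   # one vector per variable
    target = tuple(t for _, t in eltfs)
    half = n // 2
    left = Counter(vector_subset_sums(columns[:half], dim))       # multiplicity of each left sum
    return sum(
        left[tuple(t - v for t, v in zip(target, vec))]
        for vec in vector_subset_sums(columns[half:], dim)
    )

# Illustrative instance: Hamming weight exactly 2 AND a_0 + a_3 exactly 1.
eltfs = [([1, 1, 1, 1], 2), ([1, 0, 0, 1], 1)]
print(sum_product_eltf(eltfs, 4))
```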
