Sample Amplification: Increasing Dataset Size even when Learning is Impossible
Brian Axelrod, Shivam Garg, Vatsal Sharan, Greg Valiant

[Image: GAN-generated portrait, captioned "Yours for $432,000"]
What does it mean that a GAN made this image? (Does it mean that GANs "know" the distribution of renaissance portraits?)
Amplifier: Input: n i.i.d. samples from D. Output: m > n "samples".
Verifier: Input: m samples and the distribution D. Output: ACCEPT or REJECT. Promise: if the input is m i.i.d. draws from D, it must ACCEPT with probability > 3/4.
The Verifier: 1. knows D; 2. is computationally unbounded; 3. does not know the training set.
(A toy instantiation of these roles is sketched below.)
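To make the roles concrete, here is a minimal Python sketch. The function names and the particular Verifier are illustrative assumptions, not from the talk: a naive amplifier that pads its input with resampled copies, and a Verifier for a continuous D that rejects any exact duplicate, since m genuine i.i.d. draws from a continuous distribution are pairwise distinct with probability 1.

    import numpy as np

    def naive_amplifier(samples, m):
        # Pad the n input samples with uniformly resampled copies to reach m "samples".
        n = len(samples)
        extra = samples[np.random.randint(0, n, size=m - n)]
        return np.concatenate([samples, extra])

    def duplicate_verifier(samples):
        # For a continuous D: genuine i.i.d. draws are all distinct with
        # probability 1, so ACCEPT (True) iff no two samples coincide.
        return len(np.unique(samples, axis=0)) == len(samples)

This Verifier accepts genuine i.i.d. samples from any continuous D with probability 1, yet always rejects the naive amplifier: copying is detectable, so amplification has to do something subtler.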
Definition: A class of distributions C admits (n, m)-amplification if there is an (n, m) Amplifier such that, for every D ∈ C, any Verifier will ACCEPT with probability > 2/3.

Since the Verifier knows D and is computationally unbounded, this is equivalent to requiring that the Amplifier can output m samples whose T.V. distance to m i.i.d. samples from D is small (note this is not the same as learning D).

Connection to GANs: Amplifier -> Generator, Verifier -> Discriminator? Not quite, but there are similarities in how samples are used and evaluated.
Thm 1: Let C be the class of discrete distributions supported on ≤ k elements. (n, n + n/sqrt(k))-amplification is possible (and optimal, up to constant factors).
* Nontrivial amplification is possible as soon as n > sqrt(k).
* Learning to nontrivial accuracy requires n = Θ(k) samples.
* Even with n >> k, one can never amplify by an arbitrary amount.
(One candidate scheme is sketched below.)
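The slide states only the achievable rate for Thm 1. A minimal sketch of one natural scheme consistent with that rate (an assumption on my part, not a transcription of the paper's algorithm) is to append about n/sqrt(k) fresh draws from the empirical distribution of the input:

    import numpy as np

    def discrete_amplifier(samples, k):
        # Append roughly n/sqrt(k) draws from the empirical distribution.
        # Assumption: one scheme matching the (n, n + n/sqrt(k)) rate of Thm 1;
        # the slide only asserts that this rate is achievable.
        n = len(samples)
        extra = np.random.choice(samples, size=int(n / np.sqrt(k)), replace=True)
        return np.concatenate([samples, extra])

Note that int(n / np.sqrt(k)) is 0 until n exceeds sqrt(k), matching the threshold at which nontrivial amplification becomes possible.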
Thm 2: Let C be the class of Gaussians in d dimensions, with fixed covariance (e.g. "isotropic") and unknown mean. (n, n + n/sqrt(d))-amplification is possible (and optimal, up to constant factors).
* Nontrivial amplification is possible as soon as n > sqrt(d).
* Learning to nontrivial accuracy requires n = Θ(d) samples.

Thm 3: If output ⊃ input samples, n > d/log d is required for nontrivial amplification. Intuitively, the issue is that the new "samples" would be too correlated with the originals.

Algorithm (a Python sketch follows below):
1) Draw x_{n+1}, ..., x_m using the empirical mean u* of the input samples.
2) "Decorrelate" each input sample x_i from u*.
3) Return x_{n+1}, ..., x_m along with the "decorrelated" original samples.
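A minimal sketch of the three steps. The slide leaves step 2 abstract; the mixing transform below (weight eps, fresh noise) is an illustrative stand-in that weakens each input's correlation with u* while roughly preserving N(mu, I) marginals, not the paper's exact transformation.

    import numpy as np

    def gaussian_amplifier(X, m, eps=None):
        n, d = X.shape
        mu_star = X.mean(axis=0)                    # empirical mean u* of the inputs
        new = mu_star + np.random.randn(m - n, d)   # step 1: x_{n+1}..x_m ~ N(u*, I)
        if eps is None:
            eps = 1.0 / np.sqrt(n)                  # mixing weight (heuristic assumption)
        # Step 2 (stand-in): shrink each input's deviation from u* and add fresh
        # noise, reducing the shared dependence on u* at ~unchanged marginals.
        X_dec = mu_star + np.sqrt(1 - eps**2) * (X - mu_star) + eps * np.random.randn(n, d)
        return np.concatenate([X_dec, new])         # step 3: return all m "samples"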
Amplification does not add new information, but could make original information more easily accessible.
[Figure: Data vs. Amplified Data]
Example: given examples (x, y) ~ D, estimate the error of the best linear model. Standard unbiased estimator: the error of the least-squares model, scaled down (the residual sum of squares divided by n - d).
[Plot] Error of the classical estimator vs. the same estimator on (n, n + 2) amplified samples; x ~ Gaussian(d = 50), y = θ*·x + Gaussian noise.
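A sketch of the setup behind this plot. Only d = 50 comes from the slide; n, the noise level, and θ* below are illustrative assumptions, and the amplification step (which would follow the Gaussian scheme above) is omitted.

    import numpy as np

    def unbiased_error_estimate(X, y):
        # Classical unbiased estimate of the best linear model's error:
        # least-squares residual sum of squares, scaled down by (n - d).
        n, d = X.shape
        beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        return np.sum((y - X @ beta_hat) ** 2) / (n - d)

    d, n = 50, 60                                # d = 50 from the slide; n assumed
    theta_star = np.random.randn(d)
    X = np.random.randn(n, d)                    # x ~ Gaussian(d = 50)
    y = X @ theta_star + np.random.randn(n)      # y = theta* . x + Gaussian noise
    print(unbiased_error_estimate(X, y))         # compare with the same estimator
                                                 # run on (n, n + 2) amplified samples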
Open questions:
* What property of a class of distributions determines the threshold at which nontrivial amplification is possible?
* More general amplification schemes?
* How much does the Verifier need to know about the n input samples to preclude amplification without learning? [How much do we need to know about a GAN's input to evaluate its output?]
* What if the Verifier doesn't know D, and only gets sample access? A MORE powerful Verifier? A LESS powerful Verifier?