SLIDE 1

Consistent Kernel Mean Estimation for Functions of Random Variables

Ilya Tolstikhin, jointly with C.-J. Simon-Gabriel, A. Ścibior, and B. Schölkopf

(NIPS 2016), Dagstuhl, December 2016

SLIDE 2

Motivation

Given:

◮ independent random variables X ∈ 𝒳 and Y ∈ 𝒴;
◮ i.i.d. samples {X_i}_{i=1}^N and {Y_j}_{j=1}^N;
◮ any function f : 𝒳 × 𝒴 → 𝒵.

Goal: construct a flexible representation of the distribution of Z = f(X, Y). We represent distributions using their kernel mean embeddings. The simplest estimator is

\hat{\mu}_Z^{(1)} := \frac{1}{N} \sum_{i=1}^{N} k_Z(f(X_i, Y_i), \cdot),

which is \sqrt{N}-consistent. Experiments show that the U-statistic estimator performs better:

\hat{\mu}_Z^{(2)} := \frac{1}{N^2} \sum_{i,j=1}^{N} k_Z(f(X_i, Y_j), \cdot),

which is also \sqrt{N}-consistent.
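The two estimators on this slide can be sketched numerically. The following is a minimal sketch, assuming a Gaussian kernel for k_Z and the illustrative choice f(x, y) = x + y (neither is specified on the slide); evaluating an embedding estimate at a test point z just means averaging kernel values.

```python
import numpy as np

def k_gauss(a, b, gamma=1.0):
    """Gaussian (RBF) kernel k(a, b) = exp(-gamma * (a - b)^2), elementwise."""
    return np.exp(-gamma * (np.asarray(a) - np.asarray(b)) ** 2)

def mu1_at(z, X, Y, f):
    """Simple estimator mu_Z^(1) at z: (1/N) sum_i k_Z(f(X_i, Y_i), z)."""
    return float(np.mean(k_gauss(f(X, Y), z)))

def mu2_at(z, X, Y, f):
    """U-statistic estimator mu_Z^(2) at z: (1/N^2) sum_{i,j} k_Z(f(X_i, Y_j), z)."""
    Z = f(X[:, None], Y[None, :])      # all N^2 pairwise values f(X_i, Y_j)
    return float(np.mean(k_gauss(Z, z)))

rng = np.random.default_rng(0)
N = 500
X = rng.normal(size=N)                 # illustrative distributions for X and Y
Y = rng.normal(size=N)
f = lambda x, y: x + y                 # illustrative choice of f

print(mu1_at(0.0, X, Y, f), mu2_at(0.0, X, Y, f))
```

Both evaluations approximate the same quantity; the second averages over all N² pairs, which motivates the computational concern on the next slide.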


SLIDE 4

Motivation

Experiments show that the U-statistic estimator performs better:

\hat{\mu}_Z^{(2)} := \frac{1}{N^2} \sum_{i,j=1}^{N} k_Z(f(X_i, Y_j), \cdot),

which is \sqrt{N}-consistent. Unfortunately, the N^2 kernel evaluations may be computationally prohibitive. Schölkopf et al. (2015): take n ≪ N and use reduced set methods to

1. approximate \frac{1}{N} \sum_{i=1}^{N} k(X_i, \cdot) \approx \sum_{i=1}^{n} w_i k(X'_i, \cdot);
2. approximate \frac{1}{N} \sum_{j=1}^{N} k(Y_j, \cdot) \approx \sum_{j=1}^{n} v_j k(Y'_j, \cdot);
3. use the following estimator:

\hat{\mu}_Z := \sum_{i,j=1}^{n} w_i v_j k_Z(f(X'_i, Y'_j), \cdot).

Question: is \hat{\mu}_Z consistent?
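The three steps above can be sketched as follows. As a stand-in for a full reduced set method (Schölkopf et al. use dedicated techniques), this sketch simply picks n random expansion points and obtains the weights by ridge-regularized least squares matching of the empirical embedding; the Gaussian kernel and f(x, y) = x + y are illustrative assumptions.

```python
import numpy as np

def gram(A, B, gamma=1.0):
    """Gaussian-kernel Gram matrix K[i, j] = exp(-gamma * (A_i - B_j)^2)."""
    return np.exp(-gamma * (A[:, None] - B[None, :]) ** 2)

def reduced_weights(S, Sp, lam=1e-3):
    """Weights w minimizing ||(1/N) sum_i k(S_i, .) - sum_j w_j k(Sp_j, .)||_H^2
    plus a ridge term -- a least-squares stand-in for a true reduced set method."""
    Kpp = gram(Sp, Sp)                      # n x n Gram of expansion points
    kps = gram(Sp, S).mean(axis=1)          # inner products with the empirical embedding
    return np.linalg.solve(Kpp + lam * np.eye(len(Sp)), kps)

rng = np.random.default_rng(1)
N, n = 2000, 30                             # n << N
X, Y = rng.normal(size=N), rng.normal(size=N)
Xp = rng.choice(X, size=n, replace=False)   # expansion points X'_i (steps 1-2)
Yp = rng.choice(Y, size=n, replace=False)   # expansion points Y'_j
w, v = reduced_weights(X, Xp), reduced_weights(Y, Yp)
f = lambda x, y: x + y                      # illustrative choice of f

def mu_hat_at(z, gamma=1.0):
    """Step 3: evaluate mu_Z at z using only n^2 kernel terms."""
    Zp = f(Xp[:, None], Yp[None, :])        # n x n values f(X'_i, Y'_j)
    return float(w @ np.exp(-gamma * (Zp - z) ** 2) @ v)

print(mu_hat_at(0.0))
```

Here n² = 900 kernel evaluations replace the N² = 4,000,000 needed by the U-statistic estimator.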


SLIDE 6

New results

Answer: yes, \hat{\mu}_Z is indeed consistent. Proof based on [SS16]. Assume:

◮ 𝒳 and 𝒵 are compact;
◮ f : 𝒳 → 𝒵 is continuous;
◮ k_𝒳, k_𝒵 are continuous positive-definite kernels on 𝒳 and 𝒵;
◮ k_𝒳 is c_0-universal;
◮ there exists C such that \sum_i |w_i| \le C independently of n.

Then:

\sum_{i=1}^{N} w_i k_𝒳(X_i, \cdot) \to \mu_X in H_{k_𝒳} \quad\Rightarrow\quad \sum_{i=1}^{N} w_i k_𝒵(f(X_i), \cdot) \to \mu_Z in H_{k_𝒵}.

◮ Importantly, w_1, \dots, w_N and X_1, \dots, X_N can be interdependent.
◮ Finite-sample guarantees for 𝒳 = R^d, 𝒵 = R^{d'} and Matérn kernels.
◮ Applications: probabilistic programming, privacy-preserving ML, ...
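The implication in the theorem can be checked numerically on a toy case. This sketch assumes 𝒳 = 𝒵 = R, Gaussian kernels, the illustrative continuous map f(x) = x², and uniform weights w_i = 1/N (the simplest weighting with \sum_i |w_i| ≤ C); a large reference sample stands in for the true embeddings. As the weighted embedding of X converges, so does the induced embedding of Z = f(X).

```python
import numpy as np

def gram(A, B, gamma=1.0):
    """Gaussian-kernel Gram matrix."""
    return np.exp(-gamma * (A[:, None] - B[None, :]) ** 2)

def emb_dist(x, wx, y, wy):
    """RKHS distance between sum_i wx_i k(x_i, .) and sum_j wy_j k(y_j, .)."""
    d2 = wx @ gram(x, x) @ wx - 2 * wx @ gram(x, y) @ wy + wy @ gram(y, y) @ wy
    return float(np.sqrt(max(d2, 0.0)))

rng = np.random.default_rng(2)
f = lambda x: x ** 2                    # illustrative continuous f
ref = rng.normal(size=4000)             # large-sample proxy for the true distribution
wr = np.full(ref.size, 1.0 / ref.size)

results = {}
for N in (50, 200, 1000):
    X = rng.normal(size=N)
    w = np.full(N, 1.0 / N)             # uniform weights: sum_i |w_i| = 1 for every N
    dx = emb_dist(X, w, ref, wr)        # embedding error for X
    dz = emb_dist(f(X), w, f(ref), wr)  # induced embedding error for Z = f(X)
    results[N] = (dx, dz)
    print(N, dx, dz)
```

Both distances shrink together as N grows, matching the slide's implication for this special case.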


SLIDE 8

Related results

◮ Minimax Estimation of Kernel Mean Embeddings (T., Sriperumbudur, Muandet, 2016, arXiv)
  Task: estimate \int_𝒳 k(x, \cdot) \, dP(x) based on the i.i.d. sample {X_i}_{i=1}^N.
  Result: for translation-invariant kernels you cannot do it faster than N^{-1/2}.

◮ Minimax Estimation of MMD with Radial Kernels (T., Sriperumbudur, Schölkopf, 2016, NIPS)
  Task: estimate \|\mu_P - \mu_Q\|_{H_k} based on i.i.d. samples {X_i}_{i=1}^N and {Y_i}_{i=1}^M.
  Result: for radial kernels you cannot do it faster than N^{-1/2} + M^{-1/2}.
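The MMD quantity in the second result can be estimated from samples. A minimal sketch with a Gaussian (radial) kernel, using the standard biased V-statistic form — an illustrative estimator choice; the minimax results above concern the achievable rate, not any particular estimator:

```python
import numpy as np

def mmd_biased(X, Y, gamma=1.0):
    """Biased (V-statistic) estimate of MMD ||mu_P - mu_Q||_{H_k}
    for the Gaussian kernel k(a, b) = exp(-gamma * (a - b)^2)."""
    Kxx = np.exp(-gamma * (X[:, None] - X[None, :]) ** 2)
    Kyy = np.exp(-gamma * (Y[:, None] - Y[None, :]) ** 2)
    Kxy = np.exp(-gamma * (X[:, None] - Y[None, :]) ** 2)
    # ||mu_P - mu_Q||^2 = <mu_P, mu_P> - 2 <mu_P, mu_Q> + <mu_Q, mu_Q>
    return float(np.sqrt(Kxx.mean() - 2 * Kxy.mean() + Kyy.mean()))

rng = np.random.default_rng(3)
same = mmd_biased(rng.normal(size=1000), rng.normal(size=1000))
diff = mmd_biased(rng.normal(size=1000), rng.normal(loc=1.0, size=1000))
print(same, diff)   # same-distribution MMD is near 0; the shifted pair is larger
```

The estimation error of such sample-based estimates is exactly what the N^{-1/2} + M^{-1/2} lower bound constrains.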