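SLIDE 1

Kernelized Stein Discrepancy
Stein Variational Gradient Descent

Let $T(x) = x + \epsilon\,\phi(x)$ with $\phi \in \mathcal{H}^d$, for $x \sim q$. Then

$$\nabla_\epsilon\, \mathrm{KL}(q_{[T]} \,\|\, p)\Big|_{\epsilon = 0} = -\mathbb{E}_{x \sim q}\big[\operatorname{trace}(\mathcal{A}_p \phi(x))\big],$$

and $\phi^*$ minimizes this derivative over $\|\phi\|_{\mathcal{H}^d} \le 1$ (up to normalization). Here

$$\phi^*(x) = \mathbb{E}_{y \sim q}\big[k(x, y)\, \nabla_y \log p(y) + \nabla_y k(x, y)\big].$$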

SLIDE 2

A Kernelized Stein Discrepancy (KSD) for Goodness-of-fit Tests
(Liu, Lee, Jordan, ICML '16)

Goodness-of-fit test: measuring how well the observed data correspond to the fitted model.

$H_0: p = q$ vs. $H_1: p \ne q$, given samples $\{X_i\}_{i=1}^n \sim p$, with $q$ known.

SLIDE 3

Two-sample test

Likelihood-based approaches: need to compute likelihoods (CDF).
  • Hidden-variable models, generative models: ✗
  • MCMC / variational methods: error large, hard to estimate
  • Curse of dimensionality

Traditional methods: $\chi^2$, Kolmogorov-Smirnov, etc.

Today: evaluating KSD, a likelihood-free approach

SLIDE 4

with guaranteed stated significance.

(i) Preliminaries: definition of KSD
(ii) Properties of KSD
(iii) Original SVGD approach
(iv) Open problems

(i) Stein's method (1970s) for distributional approximation

Probability metrics (cf. MMD):

$$d_{\mathcal{H}}(\mu, \nu) = \sup_{h \in \mathcal{H}} \Big| \int h \, d\mu - \int h \, d\nu \Big|$$

Test function family $\mathcal{H}$: rich enough.

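SLIDE 5

E.g.
  • $\mathcal{H} = \{\mathbf{1}[\cdot \le x] : x \in \mathbb{R}\}$: Kolmogorov distance $d_K$
  • $\mathcal{H} = \{h : \mathrm{Lip}(h) \le 1\}$: Wasserstein distance $d_W$
  • $\mathcal{H} = \{\mathbf{1}[\cdot \in A] : A \text{ Borel}\}$: total variation distance $d_{TV}$

Q: Given $X_1, \dots, X_n$ independent variables, how to bound

$$d\Big(\tfrac{1}{\sqrt{n}} \sum_{i=1}^n X_i,\ N(0, 1)\Big)?$$

CLT: important cvg rate.

Main idea: replace the characteristic function, typically used to show cvg in distribution, with a characterizing operator.

Define $\mathcal{A}: C^1(\mathbb{R}) \to C(\mathbb{R})$ by $\mathcal{A}f(x) = f'(x) - x f(x)$.

Let $\Phi$ be the cdf of $N(0, 1)$; then for each $x$, $f_x$ solves

$$f_x'(w) - w f_x(w) = \mathbf{1}[w \le x] - \Phi(x)$$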

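SLIDE 6

for $\forall\, w \in \mathbb{R}$ (ODE result).

Stein's lemma: a r.v. $W \sim N(0, 1)$ iff, for all a.c. $f$ with $\mathbb{E}|f'(Z)| < \infty$, $Z \sim N(0, 1)$,

$$\mathbb{E}[\mathcal{A}f(W)] = \mathbb{E}[f'(W) - W f(W)] = 0.$$

Cor 2: For a r.v. $W$,

$$|P(W \le x) - \Phi(x)| = \big|\mathbb{E}[f_x'(W) - W f_x(W)]\big| \quad \Rightarrow\ d_K.$$

Generalization:

$$d_{\mathcal{H}}(W, Z) = \sup_{h \in \mathcal{H}} \big|\mathbb{E}h(W) - \mathbb{E}h(Z)\big|.$$

For given $h \in \mathcal{H}$, let $f_h$ solve

$$f_h'(w) - w f_h(w) = h(w) - \mathbb{E}h(Z), \quad \text{where } Z \sim N(0, 1).$$

Then

$$d_{\mathcal{H}}(W, Z) = \sup_{h \in \mathcal{H}} \big|\mathbb{E}[f_h'(W) - W f_h(W)]\big|$$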
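A quick numerical illustration of Stein's lemma may help here. The sketch below is not from the slides: it assumes the test function $f = \sin$ (so $f' = \cos$) and estimates $\mathbb{E}[f'(W) - W f(W)]$ by Monte Carlo, once for $W \sim N(0, 1)$ and once for a shifted $W \sim N(1, 1)$. The name `stein_gap` is illustrative only.

```python
import math
import random


def stein_gap(samples, f, f_prime):
    """Monte Carlo estimate of E[f'(W) - W f(W)] from samples of W."""
    return sum(f_prime(w) - w * f(w) for w in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 100_000
normal = [rng.gauss(0.0, 1.0) for _ in range(n)]   # W ~ N(0, 1)
shifted = [w + 1.0 for w in normal]                # W ~ N(1, 1)

# Assumed test function: f = sin, f' = cos.
gap_normal = stein_gap(normal, math.sin, math.cos)
gap_shifted = stein_gap(shifted, math.sin, math.cos)
print(gap_normal, gap_shifted)
```

For the standard normal sample the gap should be close to 0. For the shifted sample, applying Stein's lemma to $g(z) = \sin(1 + z)$ gives the exact value $-\sin(1)\, e^{-1/2} \approx -0.51$, so the operator detects the mismatch.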

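SLIDE 7

for $W$ and $Z \sim N(0, 1)$.

Stein's Normal Approximation

Once we solve the equation and restrict our discussion to

$$\mathcal{F} = \Big\{ f : \|f\|_\infty,\ \|f''\|_\infty \le 2,\ \|f'\|_\infty \le \sqrt{2/\pi} \Big\},$$

we have

$$d_W(W, Z) \le \sup_{f \in \mathcal{F}} \big|\mathbb{E}[f'(W) - W f(W)]\big|.$$

Cor 2: for 0-mean independent $X_i$ with $\mathbb{E}X_i^2 = 1$, in $W = \frac{1}{\sqrt{n}} \sum_{i=1}^n X_i$, we have

$$d_W(W, Z) \le \frac{1}{n^{3/2}} \sum_{i=1}^n \mathbb{E}|X_i|^3 + \frac{\sqrt{2}}{\sqrt{\pi}\, n} \sqrt{\sum_{i=1}^n \mathbb{E}X_i^4}.$$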
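As a worked instance of this corollary (not on the slides), assume the bound reads $d_W(W, Z) \le n^{-3/2} \sum_i \mathbb{E}|X_i|^3 + \frac{\sqrt{2}}{\sqrt{\pi}\, n} \sqrt{\sum_i \mathbb{E}X_i^4}$ for mean-zero, unit-variance $X_i$, and take i.i.d. Rademacher signs $X_i = \pm 1$:

```latex
\mathbb{E}|X_i|^3 = 1, \quad \mathbb{E}X_i^4 = 1
\;\Longrightarrow\;
d_W(W, Z) \le \frac{n}{n^{3/2}} + \frac{\sqrt{2}}{\sqrt{\pi}\, n} \sqrt{n}
= \frac{1 + \sqrt{2/\pi}}{\sqrt{n}} \approx \frac{1.80}{\sqrt{n}}.
```

This recovers the familiar $n^{-1/2}$ rate of the CLT with an explicit constant.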

Generalizing [N. Ross]:

What if we are approximating a probability distribution with smooth density $q$?

Replace $-x$ by $\nabla_x \log q(x)$.

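SLIDE 8

On $\mathcal{X} \subset \mathbb{R}^d$, two smooth densities $p$, $q$ are identical iff

$$\mathbb{E}_p[\mathcal{A}_q f(x)] = \mathbb{E}_p\big[s_q(x) f(x) + \nabla_x f(x)\big] = 0 \quad \forall f,$$

where $s_q = \nabla_x \log q$ is the (Stein) score function. The linear operator $\mathcal{A}_q$ is called the Stein operator; it acts on the Stein class of $q$, i.e. $f \in C^1(\mathcal{X})$ with

$$\int_{\mathcal{X}} \nabla_x \big(f(x)\, q(x)\big)\, dx = 0.$$

For vector-valued $f = [f_1, \dots, f_d]^\top$: $\mathcal{A}_q f = s_q f^\top + \nabla_x f \in \mathbb{R}^{d \times d}$.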

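SLIDE 9

Gorham & Mackey '15:

$$\mathbb{S}(p, q) = \sup_{f \in \mathcal{F}} \big(\mathbb{E}_p[\operatorname{trace}(\mathcal{A}_q f)]\big)^2$$

requires a difficult variational optimization.

So: RKHS settings. $\mathcal{H}^d = \mathcal{H} \times \cdots \times \mathcal{H}$, RKHS for vector-valued functions; $k$ is strictly p.d. Then, in Liu, Lee, Jordan:

Definition: For $p, q \in \mathcal{P}(\mathbb{R}^d)$, the KSD $\mathbb{S}(p, q)$ is given by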

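SLIDE 10

$$\mathbb{S}(p, q) = \mathbb{E}_{x, x' \sim p}\big[(s_q(x) - s_p(x))^\top k(x, x')\, (s_q(x') - s_p(x'))\big],$$

i.e. a kernel quadratic form w.r.t. the score difference $\delta_{q,p} = s_q - s_p$.

$\mathbb{S}(p, q) \ge 0$, and $\mathbb{S}(p, q) = 0$ iff $p = q$.

(ii) Characterization and properties

Thm 3.6:

$$\mathbb{S}(p, q) = \mathbb{E}_{x, x' \sim p}\big[u_q(x, x')\big]$$

with kernel

$$u_q(x, x') = s_q(x)^\top k(x, x')\, s_q(x') + s_q(x)^\top \nabla_{x'} k(x, x') + \nabla_x k(x, x')^\top s_q(x') + \operatorname{trace}\big(\nabla_{x, x'} k(x, x')\big).$$

  • Not symmetric w.r.t. $(p, q)$.
  • A special kernel-based MMD (later).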
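To make the kernel $u_q$ concrete, here is a minimal 1-D sketch. It is not from the slides and rests on assumptions: an RBF kernel $k(x, x') = \exp(-(x - x')^2 / (2h^2))$ with bandwidth $h = 1$, and target $q = N(0, 1)$ so that $s_q(x) = -x$. The function names are hypothetical. It checks by Monte Carlo that $\mathbb{E}_{x \sim q}[u_q(x, x')] \approx 0$ for a fixed $x'$, which is what Stein's identity predicts.

```python
import math
import random


def u_q(x, y, score, h=1.0):
    """1-D Stein kernel u_q(x, y) for the RBF kernel k = exp(-(x - y)^2 / (2 h^2))."""
    d = x - y
    k = math.exp(-d * d / (2.0 * h * h))
    dk_dx = -d / (h * h) * k                       # derivative of k in its first argument
    dk_dy = d / (h * h) * k                        # derivative of k in its second argument
    d2k = (1.0 / (h * h) - d * d / h ** 4) * k     # mixed second derivative of k
    return score(x) * k * score(y) + score(x) * dk_dy + dk_dx * score(y) + d2k


def score_std_normal(x):
    """Score of the assumed target q = N(0, 1)."""
    return -x


rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) for _ in range(100_000)]   # x ~ q
x_fixed = 0.7                                        # arbitrary fixed second argument
mean_u = sum(u_q(x, x_fixed, score_std_normal) for x in xs) / len(xs)
print(mean_u)
```

Only the score of $q$ enters `u_q`, never its normalizing constant, which is why the resulting test is likelihood-free.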

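Proof 1: Notice $\mathbb{E}_p[\mathcal{A}_q f] = \mathbb{E}_p[(s_q - s_p) f^\top]$, i.e.

$$\mathbb{E}_p[\operatorname{trace}(\mathcal{A}_q f)] = \mathbb{E}_p[(s_q - s_p)^\top f].$$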

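SLIDE 11

Apply it on $k(x, \cdot)$ for fixed $x$: we have

$$\mathbb{S}(p, q) = \mathbb{E}_{x, x' \sim p}\big[(s_q(x) - s_p(x))^\top v(x, x')\big], \qquad v(x, x') = k(x, x')\, s_q(x') + \nabla_{x'} k(x, x').$$

Apply it again, on $v(\cdot, x')$ for fixed $x'$, to get $\mathbb{S}(p, q) = \mathbb{E}_{x, x' \sim p}[u_q(x, x')]$.

Thm 3.8: For $\beta(\cdot) = \mathbb{E}_{x \sim p}\big[s_q(x)\, k(x, \cdot) + \nabla_x k(x, \cdot)\big] = \mathbb{E}_{x \sim p}[\mathcal{A}_q k_{(\cdot)}(x)]$,

$$\mathbb{S}(p, q) = \|\beta\|_{\mathcal{H}^d}^2 = \max_{\|f\|_{\mathcal{H}^d} \le 1} \big(\mathbb{E}_p[\operatorname{trace}(\mathcal{A}_q f)]\big)^2.$$

Proof 2: Notice that, by the reproducing property,

$$u_q(x, x') = \big\langle s_q(x)\, k(x, \cdot) + \nabla_x k(x, \cdot),\ s_q(x')\, k(x', \cdot) + \nabla_{x'} k(x', \cdot) \big\rangle_{\mathcal{H}^d},$$

so

$$\mathbb{S}(p, q) = \mathbb{E}_{x, x' \sim p}[u_q(x, x')] = \langle \beta, \beta \rangle_{\mathcal{H}^d} = \|\beta\|_{\mathcal{H}^d}^2$$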

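SLIDE 12

(applied the same trick as above). While, for $f \in \mathcal{H}^d$,

$$\langle f, \beta \rangle_{\mathcal{H}^d} = \sum_{l=1}^d \Big\langle f_l,\ \mathbb{E}_{x \sim p}\big[s_q^l(x)\, k(x, \cdot) + \partial_{x_l} k(x, \cdot)\big] \Big\rangle_{\mathcal{H}} = \mathbb{E}_{x \sim p}\Big[\sum_{l=1}^d \big(s_q^l(x)\, f_l(x) + \partial_{x_l} f_l(x)\big)\Big] = \mathbb{E}_p\big[\operatorname{trace}(\mathcal{A}_q f)\big],$$

which finishes the proof. $\square$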

SLIDE 13

We are in a position to give an estimate of $\mathbb{S}(p, q)$. From Thm 3.6, the U-statistic

$$\hat{\mathbb{S}}_u(p, q) = \frac{1}{n(n-1)} \sum_{i \ne j} u_q(x_i, x_j)$$

can be used in applications (bootstrap).
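The U-statistic above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: 1-D data, an RBF kernel with bandwidth $h = 1$, target $q = N(0, 1)$, and hypothetical function names. It compares a sample drawn from $q$ itself with a sample drawn from $N(1, 1)$.

```python
import math
import random


def u_q(x, y, score, h=1.0):
    """1-D Stein kernel u_q(x, y) for the RBF kernel k = exp(-(x - y)^2 / (2 h^2))."""
    d = x - y
    k = math.exp(-d * d / (2.0 * h * h))
    dk_dx = -d / (h * h) * k
    dk_dy = d / (h * h) * k
    d2k = (1.0 / (h * h) - d * d / h ** 4) * k
    return score(x) * k * score(y) + score(x) * dk_dy + dk_dx * score(y) + d2k


def ksd_u_statistic(samples, score, h=1.0):
    """U-statistic estimate of S(p, q): average of u_q over all pairs i != j."""
    n = len(samples)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += u_q(samples[i], samples[j], score, h)
    return total / (n * (n - 1))


def score_q(x):
    """Score of the assumed target q = N(0, 1)."""
    return -x


rng = random.Random(2)
n = 400
from_q = [rng.gauss(0.0, 1.0) for _ in range(n)]   # p = q
from_p = [rng.gauss(1.0, 1.0) for _ in range(n)]   # p = N(1, 1), a mean shift

ksd_same = ksd_u_statistic(from_q, score_q)
ksd_shifted = ksd_u_statistic(from_p, score_q)
print(ksd_same, ksd_shifted)
```

In this particular setup the score difference is constant, $s_q - s_p = -1$, so the population value for the shifted sample is $\mathbb{E}[k(x, x')] = 1/\sqrt{3} \approx 0.58$, while the estimate for the matching sample should hover near 0. The double loop costs $O(n^2)$ kernel evaluations.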

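The connection with Fisher divergence & MMD

1. Fisher divergence:

$$\mathbb{F}(p, q) = \mathbb{E}_p\big[\|s_p - s_q\|_2^2\big].$$

KSD is a kernelized Fisher divergence:

$$|\mathbb{S}(p, q)| \le \sqrt{\mathbb{E}_{x, x' \sim p}\big[k(x, x')^2\big]}\ \cdot\ \mathbb{F}(p, q)$$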

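SLIDE 14

Moreover,

$$\mathbb{F}(p, q) = \|s_q - s_p\|_{L^2(p)}^2 = \max_{\|f\|_{L^2(p)} \le 1} \big(\mathbb{E}_p[\operatorname{trace}(\mathcal{A}_q f)]\big)^2, \quad \text{since } \mathbb{E}_p[\operatorname{trace}(\mathcal{A}_q f)] = \mathbb{E}_p[(s_q - s_p)^\top f].$$

2. MMD (kernelized version):

$$\mathrm{MMD}^2(p, q) = \sup_{\|f\|_{\mathcal{H}} \le 1} \big(\mathbb{E}_p f - \mathbb{E}_q f\big)^2 = \mathbb{E}\big[k(x, x') + k(y, y') - 2\, k(x, y)\big], \quad x, x' \sim p,\ y, y' \sim q \quad \text{[Gretton et al.]}$$

Notice that

$$\mathbb{E}_{y, y' \sim q}\big[u_q(y, y')\big] - 2\, \mathbb{E}_{x \sim p,\, y \sim q}\big[u_q(x, y)\big] = 0$$

from Stein's identity. Thus KSD is an MMD with the special, $q$-dependent kernel $u_q$ (which makes it asymmetric in $(p, q)$).

  • MMD: two-sample test
  • KSD: goodness-of-fit test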
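For contrast with the goodness-of-fit setting, the two-sample MMD can be estimated from samples of both distributions. The sketch below is an illustration only; it assumes 1-D data, an RBF kernel with bandwidth $h = 1$, and the standard unbiased U-statistic form of $\mathrm{MMD}^2$, with hypothetical function names.

```python
import math
import random


def rbf(x, y, h=1.0):
    """RBF kernel k(x, y) = exp(-(x - y)^2 / (2 h^2))."""
    d = x - y
    return math.exp(-d * d / (2.0 * h * h))


def mmd2_unbiased(xs, ys, h=1.0):
    """Unbiased estimate of MMD^2 = E k(x, x') + E k(y, y') - 2 E k(x, y)."""
    m, n = len(xs), len(ys)
    k_xx = sum(rbf(xs[i], xs[j], h) for i in range(m) for j in range(m) if i != j)
    k_yy = sum(rbf(ys[i], ys[j], h) for i in range(n) for j in range(n) if i != j)
    k_xy = sum(rbf(x, y, h) for x in xs for y in ys)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)


rng = random.Random(4)
n = 400
a = [rng.gauss(0.0, 1.0) for _ in range(n)]
b = [rng.gauss(0.0, 1.0) for _ in range(n)]   # same distribution as a
c = [rng.gauss(1.0, 1.0) for _ in range(n)]   # mean-shifted distribution

mmd_same = mmd2_unbiased(a, b)
mmd_shifted = mmd2_unbiased(a, c)
print(mmd_same, mmd_shifted)
```

Unlike the KSD estimate, this needs samples from both distributions and uses no score function, which is the practical difference between the two-sample and goodness-of-fit settings.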

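SLIDE 15

Some asymptotic results

  • For $p \ne q$, asymptotically normal:

$$\sqrt{n}\, \big(\hat{\mathbb{S}}_u(p, q) - \mathbb{S}(p, q)\big) \xrightarrow{d} N(0, \sigma_u^2), \quad \text{where } \sigma_u^2 = \operatorname{Var}_{x \sim p}\big(\mathbb{E}_{x' \sim p}[u_q(x, x')]\big).$$

  • For $p = q$, $\sigma_u = 0$ and

$$n\, \hat{\mathbb{S}}_u(p, q) \xrightarrow{d} \sum_j c_j (Z_j^2 - 1), \quad Z_j \text{ i.i.d. Gaussian},\ c_j \text{ the eigenvalues of } u_q.$$

All standard results of U-statistics.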

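SLIDE 16

Lu & Lu, "A Universal Approximation Thm of DNN for Expressing Distributions" (2020)

$$\mathrm{MMD}(p, \pi) = \sup_{\|f\|_{\mathcal{H}} \le 1} \big|\mathbb{E}_p f - \mathbb{E}_\pi f\big|, \qquad \mathrm{KSD}(p, \pi) = \sup_{\|f\|_{\mathcal{H}^d} \le 1} \mathbb{E}_p\big[\operatorname{trace}(\mathcal{A}_\pi f)\big].$$

Thm 4: Let $X_i \sim \pi$ i.i.d. and $P_n = \frac{1}{n} \sum_{i=1}^n \delta_{X_i}$. Then there exist realizations of $P_n$ s.t. the following inequalities hold:

1. for the Wasserstein distance,

$$W_1(P_n, \pi) \le C \cdot \begin{cases} n^{-1/2}, & d = 1 \\ n^{-1/2} \log n, & d = 2 \\ n^{-1/d}, & d \ge 3 \end{cases}$$

2. for p.d. bounded $k$,

$$\mathrm{MMD}(P_n, \pi) \le \frac{C}{\sqrt{n}}$$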

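SLIDE 17

3. for $k$ with bdd derivatives and sub-Gaussian, smooth $\pi$ with $\nabla \log \pi$ $L$-Lipschitz,

$$\mathrm{KSD}(P_n, \pi) \le C \sqrt{\frac{d}{n}}.$$

Or you can understand the above results as: with prob. at least $1 - \delta$, each bound holds for a constant $C = C_\delta$ independent of $n$.

Proof.

1. [27] Well known.

Sketch of 2. Here

$$\mathrm{MMD}(P_n, \pi) = \Big\| \int k(\cdot, x)\, d(P_n - \pi)(x) \Big\|_{\mathcal{H}}$$

(see Sriperumbudur, or a survey). As a function of the sample, $\varphi(X_1, \dots, X_n) := \mathrm{MMD}(P_n, \pi)$ satisfies

$$\big|\varphi(X_1, \dots, X_i, \dots, X_n) - \varphi(X_1, \dots, X_i', \dots, X_n)\big| \le \frac{2 \sqrt{K}}{n}, \qquad K = \sup_x k(x, x).$$

Then from McDiarmid's ineq. (Hoeffding type), with prob. at least $1 - \delta$,

$$\Big\| \int k(\cdot, x)\, d(P_n - \pi)(x) \Big\|_{\mathcal{H}} \le \mathbb{E}\Big\| \int k(\cdot, x)\, d(P_n - \pi)(x) \Big\|_{\mathcal{H}} + \sqrt{\frac{2 K \log(1/\delta)}{n}}.$$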

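SLIDE 18

By the standard symmetrization argument,

$$\mathbb{E}\Big\| \int k(\cdot, x)\, d(P_n - \pi)(x) \Big\|_{\mathcal{H}} \le 2\, \mathbb{E}\, \mathbb{E}_\epsilon \Big\| \frac{1}{n} \sum_{i=1}^n \epsilon_i\, k(\cdot, X_i) \Big\|_{\mathcal{H}}$$

(Rademacher averages, $\epsilon_i$ i.i.d. $= \pm 1$). Use McDiarmid again, and bound the RHS by

$$\mathbb{E}_\epsilon \Big\| \frac{1}{n} \sum_{i=1}^n \epsilon_i\, k(\cdot, X_i) \Big\|_{\mathcal{H}} \le \frac{1}{n} \Big(\sum_{i=1}^n k(X_i, X_i)\Big)^{1/2} \le \sqrt{\frac{K}{n}}.$$

Sketch of 3.

$$\mathrm{KSD}^2(P_n, \pi) = \frac{1}{n^2} \sum_{i, j} u_\pi(X_i, X_j).$$

Use a Bernstein-type ineq. for V-statistics (symmetric kernel): check that the kernel $u_\pi$ satisfies the conditions of the cited Thm, i.e. it is degenerate and

$$|u_\pi(x, y)| \le g(x)\, g(y), \qquad \mathbb{E}\big[g(x)^k\big] \le \xi^2 J^{k-2}\, k! / 2.$$

This gives an exponential tail bound of the form $P(\,\cdot\,) \le C \exp(\,\cdot\,)$ for $\mathrm{KSD}^2(P_n, \pi)$.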

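SLIDE 19

(iii) Original SVGD (Liu & Wang '16, NeurIPS)

Variational Inference Problem: find

$$q^* = \arg\min_{q \in \mathcal{Q}} \mathrm{KL}(q \,\|\, p), \qquad \mathcal{Q}: \text{a set of simpler distributions}.$$

Set $q_{[T]}(z)$ to be the density of $z = T(x)$, $x \sim q$:

$$q_{[T]}(z) = q\big(T^{-1}(z)\big) \cdot \big|\det\big(\nabla_z T^{-1}(z)\big)\big|.$$

Previous methods consider $T$ with a certain parametric form, then optimize the parameters.

Directly applying gradient descent to solve the above problem: in [Thm 3.1] they noticed that if $T(x) = x + \epsilon\, \phi(x)$, then for $x \sim q$ and $q_{[T]}$ defined above,

$$\nabla_\epsilon\, \mathrm{KL}(q_{[T]} \,\|\, p)\Big|_{\epsilon = 0} = -\mathbb{E}_{x \sim q}\big[\operatorname{trace}(\mathcal{A}_p \phi(x))\big]. \qquad (*)$$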

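SLIDE 20

Again, they were able to minimize $(*)$ in $\mathcal{H}^d$: the direction of steepest descent is

$$\phi^*(x) = \mathbb{E}_{y \sim q}\big[k(x, y)\, \nabla_y \log p(y) + \nabla_y k(x, y)\big],$$

i.e. for this $\phi^*$,

$$\nabla_\epsilon\, \mathrm{KL}(q_{[T]} \,\|\, p)\Big|_{\epsilon = 0} = -\mathbb{S}(q, p).$$

The metric $W_{\mathcal{H}}$ will be discussed later.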
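The steepest-descent direction turns into a particle update once the expectation over $q$ is replaced by an average over the current particles. The sketch below is a minimal 1-D illustration, not the authors' code. It assumes an RBF kernel with fixed bandwidth $h = 1$, a fixed step size, and a target $p = N(2, 1)$; all names are hypothetical.

```python
import math
import random


def svgd_step(particles, score, step=0.1, h=1.0):
    """One SVGD update x_i <- x_i + step * phi(x_i) in 1-D with an RBF kernel."""
    n = len(particles)
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            d = xj - xi
            k = math.exp(-d * d / (2.0 * h * h))
            # k(x_j, x_i) * score(x_j): pulls particles toward high density.
            # d/dx_j k(x_j, x_i) = -(x_j - x_i) / h^2 * k: repulsion between particles.
            phi += k * score(xj) - d / (h * h) * k
        updated.append(xi + step * phi / n)
    return updated


def score_target(x):
    """Score of the assumed target p = N(2, 1)."""
    return -(x - 2.0)


rng = random.Random(3)
particles = [rng.gauss(0.0, 1.0) for _ in range(50)]   # initial q = N(0, 1)
for _ in range(500):
    particles = svgd_step(particles, score_target)

mean = sum(particles) / len(particles)
var = sum((x - mean) ** 2 for x in particles) / len(particles)
print(mean, var)
```

The particle mean should move from about 0 to about 2, while the repulsive term keeps the particles from collapsing onto the mode. Practical implementations usually adapt the bandwidth and step size, which this sketch deliberately omits.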

Advantages: geometric structure.

(iv) Open problems

"On the geometry of SVGD" (Duncan et al.): under which conditions do we need to consider this gradient flow, or use

SLIDE 21

the Stein distance between measures? What can we benefit from moving particles smoothly?

Can we find some alternative gradient flows to improve SVGD w.r.t.
  • cvg rate
  • outliers

Understand the bias and variance of SVGD particles, or combine it with traditional MC.

References

[1] Nathan Ross, Fundamentals of

SLIDE 22

Stein's method

[2] Borisov, Approximation of distr. of V-stat. with multidimensional kernels

Thank you