Stein's method for diffusion approximation
  1. Stein’s method for diffusion approximation Thomas Bonis DataShape team, Inria

  2. K-nearest-neighbor graph. We draw n points X_1, . . . , X_n in R^d with X_i ∼ dν = f dλ.

  3. We add an edge between each point and its K nearest neighbors.

  4. When K, n → ∞ with K/n → 0 (plus other conditions on K), can we recover f using only the graph structure?
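
As a concrete illustration of the construction above, here is a minimal sketch (in Python with NumPy; not part of the talk, and the function name knn_graph is ours) that draws n points and builds the directed K-nearest-neighbor graph by brute-force pairwise distances:

```python
import numpy as np

def knn_graph(points, k):
    """For each point, return the indices of its k nearest neighbors."""
    # Pairwise squared Euclidean distances (n x n).
    diffs = points[:, None, :] - points[None, :, :]
    d2 = np.sum(diffs ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)          # a point is not its own neighbor
    # Indices of the k smallest distances in each row (out-degree K graph).
    return np.argsort(d2, axis=1)[:, :k]

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))          # n = 200 points drawn from some density f on R^2
nbrs = knn_graph(X, k=10)
print(nbrs.shape)                          # (200, 10): every vertex has out-degree K = 10
```

Brute force is O(n^2) and only meant to make the definition concrete; for large n one would use a spatial index instead.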

  5. Random walk on the K-nearest-neighbor graph. A random walk on the graph captures information on ν.

  6. The random walk approximates the diffusion process with generator f^{−2/d} (∇(log f)·∇ + (1/2)Δ).

  7. The diffusion has invariant measure μ with density proportional to f^{2+2/d}.

  8. Does π, the invariant measure of the random walk, converge to μ?

  9. Random walk on the ε-graph. Edge between X_i and X_j if ‖X_i − X_j‖ ≤ ε. The random walk approximates the diffusion process with generator ∇(log f)·∇ + (1/2)Δ, whose invariant measure μ has density proportional to f^2. π(X_i) is proportional to the degree of X_i (the ball density estimator → f). Since we have more points where f is large, π converges to a measure with density proportional to f^2.
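
The claim that π(X_i) is proportional to the degree can be checked directly: on any undirected graph, the degree-proportional law is stationary for the simple random walk. A small sketch (ours, not from the talk; uniform sample points and ε = 0.25 are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((150, 2))                       # n sample points in the unit square
eps = 0.25
d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
A = (d2 <= eps ** 2).astype(float)
np.fill_diagonal(A, 0.0)                       # edge iff ||X_i - X_j|| <= eps, no self-loops
deg = A.sum(axis=1)
assert deg.min() > 0                           # assume no isolated vertex

P = A / deg[:, None]                           # transition matrix of the random walk
pi = deg / deg.sum()                           # candidate stationary law: pi(X_i) ∝ degree
print(np.max(np.abs(pi @ P - pi)))             # stationarity residual, tiny up to float error
```

The identity is exact: (πP)_j = Σ_i (deg_i / Σdeg)(A_ij / deg_i) = deg_j / Σdeg = π_j.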

  10. Stein discrepancy. Let γ be the standard Gaussian measure and let Z be drawn from γ. Then, for all φ, E[−Z·∇φ(Z) + Δφ(Z)] = 0.

  11. Let X be drawn from ν. We say that ν admits a Stein kernel τ_ν with respect to γ if there exists τ_ν such that, for all φ, E[−X·∇φ(X) + ⟨τ_ν(X), Hess φ(X)⟩_HS] = 0.

  12. Intuitively, if τ_ν is close to I_d then ν is close to γ. The distance between τ_ν and I_d is quantified by the Stein discrepancy S(ν)^2 = E[‖τ_ν(X) − I_d‖^2].
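
The Gaussian identity of slide 10 is easy to check by Monte Carlo. A sketch (ours) in dimension 1 with the test function φ(z) = z^3, so that −zφ′(z) + φ″(z) = −3z^3 + 6z:

```python
import numpy as np

# Monte Carlo check of E[-Z phi'(Z) + phi''(Z)] = 0 for Z ~ N(0, 1),
# with phi(z) = z^3 (phi' = 3z^2, phi'' = 6z).
rng = np.random.default_rng(42)
Z = rng.standard_normal(1_000_000)
val = np.mean(-Z * 3 * Z ** 2 + 6 * Z)   # estimates -3 E[Z^3] + 6 E[Z] = 0
print(val)                               # close to 0, up to Monte Carlo error
```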

  13. Bounding the Wasserstein distance with S

  14. Theorem [Ledoux, Nourdin, Peccati 2015]. Let ν be a measure admitting a Stein kernel τ_ν and let S(ν) be the associated Stein discrepancy. Then W_2(ν, γ) ≤ S(ν).

  15. Problem: in the general case, discrete measures do not admit a Stein kernel. Example: if the Rademacher measure admitted a Stein kernel, there would exist τ such that for any smooth function φ, φ′(−1) − φ′(1) + τ(1)φ″(1) + τ(−1)φ″(−1) = 0; choosing φ with φ″(±1) = 0 but φ′(−1) ≠ φ′(1) yields a contradiction. This can be dealt with using a smoothing procedure (relying on the zero-bias distribution), but that is not practical in high dimensions.
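
The contradiction can be made fully explicit. A sketch (ours): take φ″(x) = 1 − x^2, so φ″(±1) = 0 and φ′(x) = x − x^3/3; then the τ terms vanish for any choice of τ(±1) while φ′(−1) − φ′(1) = −4/3 ≠ 0:

```python
from fractions import Fraction

# phi chosen so that phi''(+-1) = 0 while phi'(-1) != phi'(1).
def phi_prime(x):  return Fraction(x) - Fraction(x) ** 3 / 3   # x - x^3/3
def phi_second(x): return 1 - Fraction(x) ** 2                 # 1 - x^2

lhs_without_tau = phi_prime(-1) - phi_prime(1)                 # = -4/3
# Whatever tau(1), tau(-1) are, their contributions vanish:
for tau_1, tau_m1 in [(0, 0), (5, -7), (100, 3)]:
    lhs = lhs_without_tau + tau_1 * phi_second(1) + tau_m1 * phi_second(-1)
    assert lhs == Fraction(-4, 3)      # never 0, so no Stein kernel exists
print(lhs_without_tau)                 # -4/3
```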

  16. Generalizing the Stein kernel. Take X ∼ ν and an operator L such that for all φ, E[Lφ(X)] = 0; then compare L with −x·∇ + Δ.

  17. For instance, let X and X′ be drawn independently from ν. Then for all φ, E[φ(X′) − φ(X)] = 0, and by a Taylor expansion,

      E[ Σ_{k=1}^∞ ((X′ − X)^k / k!) φ^{(k)}(X) ] = 0.
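
For a polynomial φ the Taylor series terminates, so the identity can be verified numerically. A sketch (ours) with φ(x) = x^3, whose expansion around X stops at k = 3:

```python
import numpy as np

rng = np.random.default_rng(7)
X  = rng.standard_normal(500_000)    # X  ~ nu (standard normal, for illustration)
Xp = rng.standard_normal(500_000)    # X' ~ nu, an independent copy

# phi(x) = x^3: phi' = 3x^2, phi'' = 6x, phi''' = 6; the series is exact at k = 3.
Y = Xp - X
series = Y * 3 * X ** 2 + (Y ** 2 / 2) * 6 * X + (Y ** 3 / 6) * 6
val3 = np.mean(series)               # estimates E[phi(X') - phi(X)] = 0
print(val3)
```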

  18. Another bound on W_2. Theorem [dimension 1]: let ν be a measure on R, let X and (X_t)_{t≥0} be drawn from ν, and set Y_t = X_t − X. For any h > 0,

      W_2(ν, γ) ≤ ∫_0^∞ e^{−t} E[((1/h) E[Y_t | X] + X)^2]^{1/2} dt
                + (1/√2) ∫_0^∞ (e^{−2t} / √(1 − e^{−2t})) E[((1/(2h)) E[Y_t^2 | X] − 1)^2]^{1/2} dt
                + (1/h) Σ_{k>2} ∫_0^∞ (e^{−kt} / (√(k!) (1 − e^{−2t})^{(k−1)/2})) E[E[Y_t^k | X]^2]^{1/2} dt.

  19. The factors 1/h rescale the increments Y_t.

  20. First term: the rescaled first moment E[Y_t | X]/h must be close to −X.

  21. Second term: the rescaled second moment must be close to 1.

  22. Third term: the higher moments must be small; they start at 0 and grow as t increases, so roughly Y_t has to be bounded by √t.

  23. A similar result holds (in dimension 1 only!) for W_p, p ≥ 1.

  24. Another bound on W_2. Let μ be the invariant measure of an operator L = b·∇ + ⟨a, ∇^2⟩.

  25. A similar result holds under technical conditions on L (for example, under a curvature-dimension inequality). If
      • E[Y_t | X] is close to b(X),
      • E[Y_t^2 | X] is close to a(X),
      • E[Y_t^k] is small for k > 2,
  then W_2(ν, μ) is small.
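
For the concrete case b(x) = −x, a(x) = 1 (the operator −x·∇ + Δ of slide 10, whose invariant measure is N(0, 1)), a chain meeting these moment conditions is the Euler scheme of the corresponding diffusion. A sketch (ours; the step size h = 0.01 and run length are arbitrary):

```python
import numpy as np

# Euler chain for the diffusion with generator -x.grad + Laplacian:
# increments Y satisfy E[Y | X] = h * b(X) and E[Y^2 | X] ~ 2h * a(X).
rng = np.random.default_rng(3)
h, n_steps = 0.01, 500_000
x, samples = 0.0, np.empty(n_steps)
for i in range(n_steps):
    x = x + h * (-x) + np.sqrt(2 * h) * rng.standard_normal()
    samples[i] = x

m, v = samples.mean(), samples.var()
print(m, v)   # the long-run law is close to the invariant measure N(0, 1)
```

The discrete chain's invariant variance is 1/(1 − h/2), so the bias vanishes as h → 0, matching the picture that small-h moment matching gives a small W_2(ν, μ).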

  26. Convergence rates in the Central Limit Theorem. Consider i.i.d. random variables X_1, . . . , X_n with law ν, E[X_1] = 0 and E[X_1^2] = 1. The Central Limit Theorem gives S_n = n^{−1/2} Σ_{i=1}^n X_i → N(0, 1). How fast does it converge?

  27. Let X̃_1, . . . , X̃_n be i.i.d. copies of X_1, . . . , X_n and let I be a uniform random variable on {1, . . . , n}. We set (S_n)_t = S_n + n^{−1/2} (X̃_I − X_I).

  28. With truncation: (S_n)_t = S_n + n^{−1/2} (X̃_I − X_I) 1_{X_I, X̃_I ∈ [−√(tn), √(tn)]}.
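
The rate question can be explored empirically. A sketch (ours, not from the talk) with Rademacher X_i, estimating W_2 between samples of S_n and N(0, 1) via the one-dimensional quantile coupling of sorted samples:

```python
import numpy as np
from statistics import NormalDist

def w2_to_gaussian(samples):
    """Empirical W_2 between a 1-d sample and N(0, 1) (sorted-sample quantile coupling)."""
    s = np.sort(samples)
    m = len(s)
    q = np.array([NormalDist().inv_cdf((i + 0.5) / m) for i in range(m)])
    return np.sqrt(np.mean((s - q) ** 2))

rng = np.random.default_rng(0)

def sample_S_n(n, m):
    """m draws of S_n = n^{-1/2} sum X_i for Rademacher X_i (mean 0, variance 1)."""
    X = rng.choice([-1.0, 1.0], size=(m, n))
    return X.sum(axis=1) / np.sqrt(n)

w_small = w2_to_gaussian(sample_S_n(4, 4000))
w_large = w2_to_gaussian(sample_S_n(256, 4000))
print(w_small, w_large)   # the distance to the Gaussian shrinks as n grows
```

The estimate for large n is inflated by the sampling error of the empirical measure itself, so this only illustrates the trend, not the exact rate.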
