Adaptive Antithetic Sampling for Variance Reduction
Hongyu Ren*, Shengjia Zhao*, Stefano Ermon
*equal contribution
Goal

Estimation of $\mu = \mathbb{E}_{p(x)}[f(x)]$ is ubiquitous in machine learning problems.
- Reinforcement learning: $\mathbb{E}_{\pi}\left[\sum_t r(s_t, a_t)\right]$
- Variational autoencoder: $\mathbb{E}_{p(x)}\,\mathbb{E}_{q(z|x)}\left[\log \frac{p(x, z)}{q(z|x)}\right]$
- Generative adversarial nets: $\mathbb{E}_{p(x)}[\log D(x)] + \mathbb{E}_{p(z)}[\log(1 - D(G(z)))]$

[Figure: schematics of the three settings: the RL environment/agent loop (state, action, reward), the VAE encoder $q(z|x)$ and decoder $p(x|z)$, and the GAN generator $G$ and discriminator $D$ deciding real/fake.]
Standard estimator: draw $x_1, x_2$ i.i.d. from $p(x)$ and average, $\frac{1}{2}(f(x_1) + f(x_2))$. This estimator is unbiased:

$\mathbb{E}\left[\frac{1}{2}(f(x_1) + f(x_2))\right] = \mu$

[Figure: $x_1$ and $x_2$ sampled independently; each has marginal distribution $p(x)$.]
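As a quick numeric sanity check (a sketch of my own, not from the slides), the two-sample i.i.d. estimator can be verified to be unbiased for a toy choice $p = \mathcal{N}(0, 1)$, $f(x) = x^2$, where $\mu = 1$:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    return x ** 2  # toy integrand; E_{N(0,1)}[x^2] = 1

def two_sample_estimate(rng):
    # Draw x1, x2 i.i.d. from p = N(0, 1) and average f over them.
    x1, x2 = rng.standard_normal(2)
    return 0.5 * (f(x1) + f(x2))

# Averaging many independent two-sample estimates recovers mu = 1.
estimates = [two_sample_estimate(rng) for _ in range(200_000)]
print(np.mean(estimates))  # close to 1.0
```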
Antithetic sampling couples the two samples, e.g. $x_2 = -x_1$; for a symmetric $p$, each sample still has marginal $p(x)$.

- $f(x) = x^3$: then $f(x_1) + f(x_2) = 0$, so $\mathrm{Var}_{q(x_1, x_2)}\left[\frac{f(x_1) + f(x_2)}{2}\right] = 0$: no error for a sample size of 2! The estimate $\frac{f(x_1) + f(x_2)}{2} = 0$ matches $\mathbb{E}_{p(x)}[f(x)] = 0$ exactly.
- $f(x) = x^2$: then $f(x_1) = f(x_2)$, so $x_2$ is redundant and $\mathrm{Var}_{q(x_1, x_2)}\left[\frac{f(x_1) + f(x_2)}{2}\right]$ doubles!
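Both cases above can be reproduced numerically. This is an illustrative sketch with $p = \mathcal{N}(0, 1)$; the function names are mine, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def estimator_variance(f, antithetic, n=100_000):
    # Variance of the two-sample estimator (f(x1) + f(x2)) / 2 under
    # either i.i.d. sampling or the antithetic coupling x2 = -x1.
    x1 = rng.standard_normal(n)
    x2 = -x1 if antithetic else rng.standard_normal(n)
    return np.var(0.5 * (f(x1) + f(x2)))

def cube(x):
    return x ** 3    # odd function

def square(x):
    return x ** 2    # even function

# Odd f: f(x1) + f(-x1) = 0 exactly, so the antithetic variance is 0,
# while i.i.d. sampling gives Var(x^3) / 2 = 7.5.
print(estimator_variance(cube, antithetic=True))
print(estimator_variance(cube, antithetic=False))

# Even f: f(-x1) = f(x1), so x2 adds nothing and the variance doubles,
# from Var(x^2) / 2 = 1 (i.i.d.) to Var(x^2) = 2 (antithetic).
print(estimator_variance(square, antithetic=True))
print(estimator_variance(square, antithetic=False))
```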
Antithetic family: the set of distributions $q(x_1, x_2)$ that satisfy $q(x_1) = p(x_1)$ and $q(x_2) = p(x_2)$. Any $q$ with these correct marginals keeps the estimator unbiased.

[Figure: for $f_1(x) = x^3$, samples $(x_1, x_2)$ from two members of the family, one giving high variance and one giving low variance.]
The best member of the family depends on the function: a $q$ that gives low variance for $f_1(x) = x^3$ can give high variance for a different function, e.g. $f_2(x) = ax + 2x\tan(x)$, and vice versa.

[Figure: samples $(x_1, x_2)$ under different couplings $q$, labeled high or low variance for $f_1$ and $f_2$.]
Adaptive idea: given a family of functions $f_1, f_2, \ldots$, pick a single good $q$ that achieves low variance for all similar functions, by minimizing the expected variance:

$\min_q \; \mathbb{E}_{f \sim \mathcal{F}} \; \mathrm{Var}_{q(x_1, x_2)}\left[\frac{f(x_1) + f(x_2)}{2}\right]$
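A minimal sketch of this objective, using a toy one-parameter family of my own rather than the paper's learned sampler: for $p = \mathcal{N}(0, 1)$, setting $x_2 = -x_1$ with probability $\rho$ (else $x_2 = x_1$) keeps both marginals correct by symmetry, so every $\rho$ gives an unbiased estimator; we then pick the $\rho$ that minimizes the average estimator variance over a small set of integrands.

```python
import numpy as np

rng = np.random.default_rng(0)

def estimator_variance(f, rho, n=100_000):
    # q(x1, x2): x1 ~ N(0, 1); x2 = -x1 with probability rho, else x2 = x1.
    # Symmetry of N(0, 1) keeps the marginal of x2 correct for any rho.
    x1 = rng.standard_normal(n)
    signs = np.where(rng.random(n) < rho, -1.0, 1.0)
    x2 = signs * x1
    return np.var(0.5 * (f(x1) + f(x2)))

def f1(x):
    return x ** 3             # odd

def f2(x):
    return x + 0.1 * x ** 2   # mostly odd

# Grid-search rho to minimize the average variance over the function set.
fs = [f1, f2]
grid = np.linspace(0.0, 1.0, 11)
avg_var = [np.mean([estimator_variance(f, rho) for f in fs]) for rho in grid]
best_rho = grid[int(np.argmin(avg_var))]
print(best_rho)  # odd-dominated integrands favor full negation, rho = 1.0
```

The paper's method replaces this grid search over a hand-built family with a learned joint sampler, but the shape of the objective is the same: search within the unbiased family for the coupling with the lowest expected variance.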
References:
- Gulrajani, Ishaan, et al. "Improved Training of Wasserstein GANs." Advances in Neural Information Processing Systems, 2017.
- Burda, Yuri, Roger Grosse, and Ruslan Salakhutdinov. "Importance Weighted Autoencoders." arXiv preprint arXiv:1509.00519, 2015.
Experiments compare our method vs. negative sampling and our method vs. i.i.d. sampling.

[Figure: probability of improvement and log-likelihood improvement (higher is better).]