Asymptotics Review Harvard Math Camp - Econometrics Ashesh Rambachan Summer 2018

Outline
◮ Types of Convergence
  ◮ Almost sure convergence
  ◮ Convergence in probability
  ◮ Convergence in mean and mean-square
  ◮ Convergence in distribution
  ◮ How do they relate to each other?
◮ Slutsky’s Theorem and the Continuous Mapping Theorem
◮ $O_p$ and $o_p$ Notation
◮ Law of Large Numbers
◮ Central Limit Theorem
◮ The Delta Method

Why Asymptotics?
Can we still say something about the behavior of our estimators without strong parametric assumptions (e.g. i.i.d. normal errors)? We can in large samples.
◮ How would my estimator behave in very large samples?
◮ Use the limiting behavior of our estimator in infinitely large samples to approximate its behavior in finite samples.
Advantage: As the sample size gets infinitely large, the behavior of most estimators becomes very simple.
◮ Use appropriate version of CLT...
Disadvantage: This is only an approximation to the true, finite-sample distribution of the estimator, and this approximation may be quite poor.
◮ Two recent papers by Alwyn Young: “Channelling Fisher” and “Consistency without Inference.”

Stochastic Convergence
Recall the definition of convergence for a non-stochastic sequence of real numbers.
◮ Let $\{x_n\}$ be a sequence of real numbers. We say $\lim_{n \to \infty} x_n = x$ if for all $\epsilon > 0$, there exists some $N$ such that for all $n > N$, $|x_n - x| < \epsilon$.
We want to generalize this to the convergence of random variables, and there are many ways to do so.
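To make the epsilon-N definition concrete, here is a minimal sketch for the hypothetical sequence $x_n = 1/n$ with limit $x = 0$ (this sequence and the helper name `tail_index` are illustrative choices, not from the slides). It picks an $N$ for a given $\epsilon$ and spot-checks a finite stretch of the tail.

```python
import math

def tail_index(eps: float) -> int:
    """Return an N such that |1/n - 0| < eps for every n > N."""
    # If n > N >= 1/eps, then 1/n < eps.
    return math.ceil(1.0 / eps)

eps = 0.01
N = tail_index(eps)
# Spot-check the definition on a finite stretch of the tail (not a proof).
tail_ok = all(abs(1.0 / n - 0.0) < eps for n in range(N + 1, N + 10_000))
print(N, tail_ok)
```

The finite check only illustrates the definition; the comment inside `tail_index` carries the actual argument.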

Almost sure convergence
The sequence of random variables $\{X_n\}$ converges to the random variable $X$ almost surely if $P(\{\omega \in \Omega : \lim_{n \to \infty} X_n(\omega) = X(\omega)\}) = 1$. We write $X_n \xrightarrow{a.s.} X$.

Almost sure convergence: In English
For a given outcome $\omega$ in the sample space $\Omega$, we can ask whether $\lim_{n \to \infty} X_n(\omega) = X(\omega)$ holds using the definition of non-stochastic convergence. If the set of outcomes for which this holds has probability one, then $X_n \xrightarrow{a.s.} X$.

Convergence in probability
The sequence of random variables $\{X_n\}$ converges to the random variable $X$ in probability if for all $\epsilon > 0$, $\lim_{n \to \infty} P(|X_n - X| > \epsilon) = 0$. We write $X_n \xrightarrow{p} X$.

Convergence in probability: In English
Fix an $\epsilon > 0$ and compute $P_n(\epsilon) = P(|X_n - X| > \epsilon)$. This is just a number and so, we can check whether $P_n(\epsilon) \to 0$ using the definition of non-stochastic convergence. If $P_n(\epsilon) \to 0$ for all values $\epsilon > 0$, then $X_n \xrightarrow{p} X$.
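As a sketch of treating $P_n(\epsilon)$ as an ordinary sequence of numbers, assume (purely for illustration) that $X_n - X = Z/\sqrt{n}$ with $Z \sim N(0,1)$. Then $P_n(\epsilon) = P(|Z| > \epsilon\sqrt{n})$ has a closed form, and the function name `prob_far` is a hypothetical helper.

```python
import math

def prob_far(n: int, eps: float) -> float:
    """P(|X_n - X| > eps) when X_n - X = Z / sqrt(n), Z ~ N(0, 1)."""
    # For a >= 0, P(|Z| > a) = erfc(a / sqrt(2)).
    return math.erfc(eps * math.sqrt(n) / math.sqrt(2.0))

eps = 0.1
probs = [prob_far(n, eps) for n in (1, 10, 100, 1_000, 10_000)]
print(probs)  # a decreasing sequence of plain numbers heading to 0
```

Repeating the calculation for any other fixed `eps > 0` gives the same qualitative picture, which is what the definition requires.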

Convergence in mean and mean-square
The sequence of random variables $\{X_n\}$ converges in mean to the random variable $X$ if $\lim_{n \to \infty} E[|X_n - X|] = 0$. We write $X_n \xrightarrow{m} X$.
$\{X_n\}$ converges in mean-square to $X$ if $\lim_{n \to \infty} E[|X_n - X|^2] = 0$. We write $X_n \xrightarrow{m.s.} X$.
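One link between these two definitions can be worked out in a line. The following is a standard argument via Jensen's inequality (not taken from the slides) showing that mean-square convergence implies convergence in mean:

```latex
E\big[\,|X_n - X|\,\big]
  \;\le\; \Big( E\big[\,|X_n - X|^2\,\big] \Big)^{1/2}
  \;\longrightarrow\; 0
  \qquad \text{whenever } E\big[\,|X_n - X|^2\,\big] \to 0 .
```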

Convergence in distribution
Let $\{X_n\}$ be a sequence of random variables and let $F_n(\cdot)$ be the cdf of $X_n$. Let $X$ be a random variable with cdf $F(\cdot)$. $\{X_n\}$ converges in distribution, weakly converges or converges in law to $X$ if $\lim_{n \to \infty} F_n(x) = F(x)$ for all points $x$ at which $F(x)$ is continuous. There are many ways of writing this: $X_n \xrightarrow{d} X$, $X_n \xrightarrow{L} X$, $X_n \Rightarrow X$. We’ll use $X_n \xrightarrow{d} X$.

Convergence in distribution: In English
Convergence in distribution describes the convergence of the cdfs. It does not mean that the realizations of the random variables will be close to each other. Recall that $F(x) = P(X \le x) = P(\{\omega \in \Omega : X(\omega) \le x\})$. As a result, $F_n(x) \to F(x)$ does not make any statement about $X_n(\omega)$ getting close to $X(\omega)$ for any $\omega \in \Omega$.
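A small simulation can illustrate this point. The sketch below assumes a hypothetical example that is not in the slides: $X \sim N(0,1)$ and $X_n = -X$ for every $n$. By symmetry $X_n$ has the same cdf as $X$, so $X_n \xrightarrow{d} X$ trivially, yet $|X_n - X| = 2|X|$ never shrinks.

```python
import random

random.seed(0)
draws = [random.gauss(0.0, 1.0) for _ in range(50_000)]  # realizations of X
flipped = [-x for x in draws]                             # realizations of X_n = -X

def ecdf(sample, t):
    """Empirical cdf of `sample` evaluated at t."""
    return sum(v <= t for v in sample) / len(sample)

# The two cdfs agree (up to simulation noise) at the points checked ...
cdf_gap = max(abs(ecdf(draws, t) - ecdf(flipped, t)) for t in (-1.0, 0.0, 1.0))
# ... yet the realizations are usually far apart, since |X_n - X| = 2|X|.
share_far = sum(abs(a - b) > 0.5 for a, b in zip(draws, flipped)) / len(draws)
print(cdf_gap, share_far)
```

The share of draws with $|X_n - X| > 0.5$ stays large no matter how far along the sequence we go, because the sequence is constant in $n$.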

Convergence in distribution: Continuity?
Why is convergence in distribution restricted to the continuity points of $F(x)$?
Example: Let $X_n = 1/n$ with probability one and let $X = 0$ with probability one. Then, $F_n(x) = 1(x \ge 1/n)$ and $F(x) = 1(x \ge 0)$, with $F_n(0) = 0$ for all $n$ while $F(0) = 1$.
◮ As $n \to \infty$, $X_n$ is getting closer and closer to $X$ in the sense that for all $x \ne 0$, $F_n(x)$ is well approximated by $F(x)$, but NOT at $x = 0$!
◮ If we did not restrict convergence in distribution to the continuity points, we would have the strange case where a non-stochastic sequence $\{X_n\}$ converges to $X$ under the non-stochastic definition of convergence but does not converge in distribution.
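The example can be checked directly by evaluating the two cdfs. This is a minimal sketch; the function names `F_n` and `F` and the evaluation points $\pm 0.05$ are illustrative choices.

```python
def F_n(x: float, n: int) -> int:
    """cdf of the point mass at 1/n: 1(x >= 1/n)."""
    return int(x >= 1.0 / n)

def F(x: float) -> int:
    """cdf of the point mass at 0: 1(x >= 0)."""
    return int(x >= 0.0)

ns = (1, 10, 100, 1_000, 10_000)
at_zero = [F_n(0.0, n) for n in ns]     # stays at 0, while F(0) = 1
at_right = [F_n(0.05, n) for n in ns]   # equals F(0.05) = 1 once 1/n <= 0.05
at_left = [F_n(-0.05, n) for n in ns]   # always equals F(-0.05) = 0
print(at_zero, F(0.0), at_right, at_left)
```

The only point where $F_n$ fails to approach $F$ is $x = 0$, which is exactly the point where $F$ jumps.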

Multivariate Convergence
We can extend each of these definitions to random vectors.
◮ The sequence of random vectors $\{X_n\}$ converges almost surely to $X$ if each element of $X_n$ converges almost surely to the corresponding element of $X$. Analogous for convergence in probability.
◮ A sequence of random vectors converges in distribution to a random vector if we apply the definition above to the joint cumulative distribution function.
Cramér-Wold Device: Let $\{Z_n\}$ be a sequence of $k$-dimensional random vectors. Then, $Z_n \xrightarrow{d} Z$ if and only if $\lambda' Z_n \xrightarrow{d} \lambda' Z$ for all $\lambda \in \mathbb{R}^k$.
◮ Simpler characterization of convergence in distribution for random vectors.

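How do they relate to each other?
How do these different definitions of stochastic convergence relate to each other? The standard implications are:
◮ Almost sure convergence implies convergence in probability, which implies convergence in distribution.
◮ Convergence in mean-square implies convergence in mean, which implies convergence in probability.
◮ We will skip the results but see the notes if you want more details.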

Counter-examples
Almost sure convergence does not imply convergence in mean.
Example: Let $X_n$ be a random variable with $P(X_n = 0) = 1 - \frac{1}{n^2}$ and $P(X_n = 2^n) = \frac{1}{n^2}$. Then, $X_n \xrightarrow{a.s.} 0$ but $X_n$ does not converge in mean to 0.
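The two claims in this example can be verified in a couple of lines. The sketch below reads the nonzero value as $2^n$ (an assumption about the flattened superscript; with the value $2n$ the example would not work) and uses the Borel-Cantelli lemma for the almost sure part:

```latex
E\big[\,|X_n - 0|\,\big]
  = 0 \cdot \Big(1 - \frac{1}{n^2}\Big) + 2^n \cdot \frac{1}{n^2}
  = \frac{2^n}{n^2} \;\longrightarrow\; \infty ,
\qquad
\sum_{n=1}^{\infty} P(X_n \neq 0) = \sum_{n=1}^{\infty} \frac{1}{n^2} < \infty .
```

Because the probabilities $P(X_n \neq 0)$ are summable, with probability one $X_n \neq 0$ happens only finitely often, so $X_n \xrightarrow{a.s.} 0$, while the first display shows the mean distance to 0 diverges.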

Counter-examples
Almost sure convergence does not imply convergence in mean square.
Example: Let $X_n$ be a random variable with $P(X_n = 0) = 1 - \frac{1}{n^2}$ and $P(X_n = n) = \frac{1}{n^2}$. Then, $X_n \xrightarrow{a.s.} 0$ but $E[X_n^2] = 1$ for all $n$.
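The arithmetic behind this counter-example can be checked exactly. The sketch below uses hypothetical helper names (`second_moment`, `prob_nonzero`): it computes $E[X_n^2]$ with exact fractions and sums $P(X_n \neq 0)$ over a long stretch to illustrate that the series is bounded, which is what the Borel-Cantelli argument for almost sure convergence needs.

```python
from fractions import Fraction

def second_moment(n: int) -> Fraction:
    """E[X_n^2] when P(X_n = n) = 1/n^2 and P(X_n = 0) = 1 - 1/n^2."""
    p = Fraction(1, n * n)
    return (n ** 2) * p + 0 * (1 - p)

def prob_nonzero(n: int) -> float:
    """P(X_n != 0) = 1/n^2."""
    return 1.0 / (n * n)

moments = [second_moment(n) for n in (1, 2, 10, 100, 1_000)]
# Partial sum of sum_n P(X_n != 0); the full series converges to pi^2 / 6.
tail_mass = sum(prob_nonzero(n) for n in range(1, 10_001))
print(moments, tail_mass)
```

The second moment is pinned at 1 for every `n`, so it cannot converge to 0, while the partial sums of the probabilities stay bounded.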

Slutsky’s Theorem
Slutsky’s Theorem: Let $c$ be a constant. Suppose that $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{p} c$. Then,
1. $X_n + Y_n \xrightarrow{d} X + c$.
2. $X_n Y_n \xrightarrow{d} Xc$.
3. $X_n / Y_n \xrightarrow{d} X / c$ provided that $c \ne 0$.
If $c = 0$, then $X_n Y_n \xrightarrow{p} 0$.
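A quick Monte Carlo sketch can illustrate parts 1 and 2. Everything specific here is an assumption for illustration: $X_n \sim N(0,1)$ for every $n$ (so $X \sim N(0,1)$), $Y_n = c + U/n$ with $U$ uniform on $[-1, 1]$, and $c = 2$. Under these choices the theorem suggests $X_n + Y_n$ is approximately $N(c, 1)$ and $X_n Y_n$ approximately $N(0, c^2)$.

```python
import random
import statistics

random.seed(1)
c = 2.0        # limit of Y_n (hypothetical value)
n = 1_000      # position along the sequence
reps = 20_000  # Monte Carlo replications

sums, prods = [], []
for _ in range(reps):
    x_n = random.gauss(0.0, 1.0)             # X_n ~ N(0, 1)
    y_n = c + random.uniform(-1.0, 1.0) / n  # Y_n is within 1/n of c
    sums.append(x_n + y_n)
    prods.append(x_n * y_n)

mean_sum, var_sum = statistics.fmean(sums), statistics.pvariance(sums)
mean_prod, var_prod = statistics.fmean(prods), statistics.pvariance(prods)
print(mean_sum, var_sum)    # close to c and 1
print(mean_prod, var_prod)  # close to 0 and c**2
```

Matching two moments is only a rough check of the limiting distribution, not a substitute for the theorem.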

Continuous Mapping Theorem
Continuous Mapping Theorem: Let $g$ be a continuous function. Then,
1. If $X_n \xrightarrow{d} X$, then $g(X_n) \xrightarrow{d} g(X)$.
2. If $X_n \xrightarrow{p} X$, then $g(X_n) \xrightarrow{p} g(X)$.
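As a one-line worked instance of part 1 (a standard illustration, not taken from the slides), take the continuous function $g(x) = x^2$ and a sequence with a standard normal limit:

```latex
X_n \xrightarrow{d} Z \sim N(0, 1)
\quad \Longrightarrow \quad
X_n^2 = g(X_n) \xrightarrow{d} g(Z) = Z^2 \sim \chi^2_1 .
```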

big-$O$, little-$o$
Recall big-$O$ and little-$o$ notation for sequences of real numbers.
◮ Let $\{a_n\}$ and $\{g_n\}$ be sequences of real numbers. We have that $a_n = o(g_n)$ if $\lim_{n \to \infty} \frac{a_n}{g_n} = 0$, and $a_n = O(g_n)$ if there is a finite constant $M$ such that $\left| \frac{a_n}{g_n} \right| < M$ for all $n$.
We also extend big-$O$ and little-$o$ notation to random variables.
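A numeric sketch of the two definitions, using the hypothetical sequence $a_n = 3n^2 + 5n$ (chosen only for illustration): the ratio to $n^2$ stays bounded, so $a_n = O(n^2)$, and the ratio to $n^3$ goes to 0, so $a_n = o(n^3)$.

```python
def a(n: int) -> float:
    """Example sequence a_n = 3 n^2 + 5 n (hypothetical choice)."""
    return 3.0 * n**2 + 5.0 * n

ns = [10**k for k in range(1, 7)]
ratio_n2 = [a(n) / n**2 for n in ns]  # equals 3 + 5/n, bounded by 8 for n >= 1
ratio_n3 = [a(n) / n**3 for n in ns]  # equals 3/n + 5/n^2, tends to 0
print(ratio_n2)
print(ratio_n3)
```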

$O_p$ and $o_p$ definition
Suppose $\{A_n\}$ is a sequence of random variables. We write $A_n = o_p(G_n)$ if $\frac{A_n}{G_n} \xrightarrow{p} 0$, and $A_n = O_p(G_n)$ if for all $\epsilon > 0$, there exists $M \in \mathbb{R}$ such that $P\left( \left| \frac{A_n}{G_n} \right| < M \right) > 1 - \epsilon$ for all $n$.
◮ Often see $X_n = X + o_p(1)$ to denote $X_n \xrightarrow{p} X$.
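To see the $O_p$ definition at work, consider the hypothetical case $A_n = Z/\sqrt{n}$ with $Z \sim N(0,1)$ and $G_n = 1/\sqrt{n}$ (an illustrative choice, not from the slides). Then $A_n / G_n = Z$ for every $n$, so a single $M$ per $\epsilon$ works for all $n$. The sketch below finds such an $M$ using the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def bound_for(eps: float) -> float:
    """An M with P(|Z| < M) > 1 - eps for Z ~ N(0, 1)."""
    # The (1 - eps/4) quantile gives P(|Z| < M) = 1 - eps/2 > 1 - eps.
    return NormalDist().inv_cdf(1.0 - eps / 4.0)

def prob_within(M: float) -> float:
    """P(|A_n / G_n| < M) when A_n = Z / sqrt(n) and G_n = 1 / sqrt(n)."""
    # A_n / G_n = Z, so this probability does not depend on n.
    return math.erf(M / math.sqrt(2.0))

eps = 0.05
M = bound_for(eps)
print(M, prob_within(M))
```

In the same example $A_n = o_p(1)$ as well, since $Z/\sqrt{n} \xrightarrow{p} 0$.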
