SLIDE 1 Records in a changing world
Joachim Krug Institute of Theoretical Physics, University of Cologne
- What are records, and why do we care?
- Record-breaking temperatures and global warming
- Correlations between record events
- Records in random walks and the stock market
with Gregor Wergen, Jasper Franke and Miro Bogner
SLIDE 2
The fascination of records
SLIDE 3
The fascination of records
Most expensive transfer in the history of soccer
SLIDE 4
The fascination of records
World’s tallest building from 1880-1884
SLIDE 5 The world’s tallest buildings over time
[Figure: height (m) of the world's tallest building versus year, 1840-2000; labeled structures: Tour Eiffel, Empire State, Strasbourg, CN Tower]
SLIDE 6
Temperature records
SLIDE 7
The 2010 summer heat wave
http://www.spiegel.de/
SLIDE 8
The 2010 summer heat wave
http://climateprogress.org/2010/07/05/heat-wave-global-warming/
SLIDE 9 Temperature records in the USA
http://www.ucar.edu/news/releases/2009/maxmin.jsp
based on G.A. Meehl et al., Geophys. Res. Lett. 36 (2009) L23701
SLIDE 10 Daily temperature at Parque del Retiro, Madrid, on October 24
[Figure: daily maximum temperature (°C) on October 24 versus year (1920-2010) at Madrid, Parque del Retiro; measurements with upper and lower record values]
- 7 upper records and 6 lower records in 90 years
- How many records should we expect if the climate did not change?
SLIDE 11 Mathematical theory of records I
- N. Glick, Am. Math. Mon. 85, 2 (1978)
- A record is an entry in a sequence of random variables (RV’s) $X_n$ which is
larger (upper record) or smaller (lower record) than all previous entries
- Example: 1000 independent Gaussian random variables
⇒ 7 upper records
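The example above can be reproduced with a few lines of code. The following is a minimal sketch, assuming that the first entry of a sequence always counts as a record; the function names and the seed are illustrative choices, and the number of records found will vary from one realization to the next.

```python
import random


def count_upper_records(values):
    """Count entries that are strictly larger than all previous entries."""
    count = 0
    current_max = float("-inf")
    for x in values:
        if x > current_max:
            count += 1
            current_max = x
    return count


def count_lower_records(values):
    """Count entries that are strictly smaller than all previous entries."""
    return count_upper_records([-x for x in values])


if __name__ == "__main__":
    rng = random.Random(2010)  # arbitrary seed, chosen only for reproducibility
    sample = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    print("upper records:", count_upper_records(sample))
    print("lower records:", count_lower_records(sample))
```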
SLIDE 12 Mathematical theory of records II
- If the RV’s are independent and identically distributed (i.i.d.), the probability
for a record at time n is $P_n = 1/n$ by symmetry
- The expected number of records up to time n is therefore
$\langle R_n \rangle = \sum_{k=1}^{n} \frac{1}{k} = \ln(n) + \gamma + O(1/n)$
where γ ≈ 0.5772156649... is the Euler-Mascheroni constant
- Because record events are independent, the variance of the number of
records is
$\langle (R_n - \langle R_n \rangle)^2 \rangle = \sum_{k=1}^{n} \left( \frac{1}{k} - \frac{1}{k^2} \right) = \ln(n) + \gamma - \frac{\pi^2}{6} + O(1/n)$
- In a constant climate we expect 5 records in 90 years and only 7.5 records
in 1000 years!
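The numbers quoted for a constant climate follow from summing the i.i.d. record probability 1/k. The sketch below evaluates the exact sums for the mean and variance of the record number and compares the mean with ln(n) + γ; the function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant, as quoted in the text


def expected_records(n):
    """Exact mean number of records among n i.i.d. RVs: sum of 1/k."""
    return sum(1.0 / k for k in range(1, n + 1))


def variance_records(n):
    """Exact variance of the record number: sum of (1/k - 1/k^2)."""
    return sum(1.0 / k - 1.0 / k**2 for k in range(1, n + 1))


def expected_records_asymptotic(n):
    """Leading asymptotic form ln(n) + gamma."""
    return math.log(n) + EULER_GAMMA


if __name__ == "__main__":
    for n in (90, 1000):
        print(n, expected_records(n), expected_records_asymptotic(n),
              variance_records(n))
```

For n = 90 the mean is close to 5, and for n = 1000 it is close to 7.5, in line with the values on the slide.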
SLIDE 13 Record-breaking temperatures and global warming
R.E. Benestad (2003); S. Redner & M.R. Petersen (2006)
- Question: Does global warming significantly increase the occurrence of
record-breaking high daily temperatures?
The temperature on a given calendar day of the year is an independent Gaussian RV with constant standard deviation σ and a mean that increases at speed v
- Typical values: v ≈ 0.03 °C/yr, σ ≈ 3.5 °C ⇒ v/σ ≪ 1
SLIDE 14 The linear drift model
- R. Ballerini & S. Resnick (1985); J. Franke, G. Wergen, JK, JSTAT (2010) P10013
- General setting: Time series $X_n = Y_n + vn$ with i.i.d. RV’s $Y_n$ and v > 0
- For large n the record probability approaches a finite limit $\lim_{n \to \infty} P_n(v) > 0$
SLIDE 15 Approximate calculation of the record rate for small drift
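- Let $Y_n$ have probability density p(y) and probability distribution function
$q(x) = \int_{-\infty}^{x} dy \, p(y)$. Then
$P_n(v) = \int dx_n \, p(x_n - vn) \prod_{k=1}^{n-1} q(x_n - vk) = \int dx \, p(x) \prod_{k=1}^{n-1} q(x + vk)$
- For small v we have $q(x + vk) \approx q(x) + vk \, p(x)$
⇒ $P_n(v) \approx \frac{1}{n} + v I_n$
with $I_n = \frac{n(n-1)}{2} \int dx \, p(x)^2 q(x)^{n-2}$
- For the Gaussian distribution a saddle point approximation for large n yields
$P_n(v) \approx \frac{1}{n} + \frac{v}{\sigma} \frac{2\sqrt{\pi}}{e^2} \sqrt{\ln\left(\frac{n^2}{8\pi}\right)}$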
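The effect of a small drift on the record probability can also be checked numerically. The sketch below assumes a standard Gaussian for Y_n (so v is measured in units of σ) and evaluates the integral form P_n(v) = ∫ dx p(x) ∏_{k=1}^{n-1} q(x + vk) with a plain trapezoidal rule. The function names, the integration range and the grid size are illustrative choices, not taken from the cited papers.

```python
import math


def gaussian_pdf(x):
    """Standard Gaussian density p(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def gaussian_cdf(x):
    """Standard Gaussian distribution function q(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def record_probability(n, v, x_max=10.0, steps=4000):
    """Trapezoidal estimate of P_n(v) for unit-variance Gaussian Y_n."""
    h = 2.0 * x_max / steps
    total = 0.0
    for i in range(steps + 1):
        x = -x_max + i * h
        integrand = gaussian_pdf(x)
        for k in range(1, n):
            integrand *= gaussian_cdf(x + v * k)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * integrand
    return total * h


if __name__ == "__main__":
    n = 30
    p_stationary = record_probability(n, 0.0)   # should reproduce 1/n
    p_drift = record_probability(n, 0.014)      # v/sigma of the order quoted for Europe
    print(p_stationary, 1.0 / n, p_drift, p_drift / p_stationary)
```

Without drift the integral reduces to 1/n, which serves as a consistency check on the quadrature.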
SLIDE 16 Comparison to temperature data
- G. Wergen, JK, EPL 92 30008 (2010); G. Wergen, A. Hense, JK, arXiv:1210.5416
SLIDE 17 Data sets
European data
- Daily temperatures from 187 stations over the 30-year period 1976-2005
- Constant warming rate v ≈ 0.047 ± 0.003 °C/yr, standard deviation
σ ≈ 3.4 ± 0.3 °C ⇒ v/σ ≈ 0.014
- Gridded re-analysis data from the E-OBS project for 1950-2010 yield
v/σ ≈ 0.012 for the 30-year period 1980-2010
American data
- Daily temperatures from 207 stations over the 30-year period 1976-2005
- Lower warming rate and larger variability due to continental climate:
σ = 4.9 ± 0.1 °C, v = 0.025 ± 0.002 °C/yr ⇒ v/σ ≈ 0.005
- Monthly data from 1217 stations over the 50-year period 1960-2010 display
similar trend but lower variability ⇒ v/σ ≈ 0.010
SLIDE 18 European data 1976-2005: Mean daily maximum temperature
Full line: Sliding 3-year average
SLIDE 19
No trend in the standard deviation
SLIDE 20
Temperature fluctuations are Gaussian
SLIDE 21 Record frequency in Europe: 1976-2005
- Expected number of records in stationary climate: 365/30 ≈ 12
- Observed record rate is increased by about 40 % ⇒ 5 additional records
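The arithmetic behind these two numbers is short enough to spell out. The sketch assumes that each of the 365 calendar days is treated as a separate 30-year series, so that in the last year every day has a stationary record probability of 1/30; the variable names are illustrative.

```python
DAYS_PER_YEAR = 365
YEARS = 30
RELATIVE_EXCESS = 0.40  # "about 40 %", as quoted on the slide

# Stationary climate: each calendar day sets a record with probability 1/30.
stationary_records_per_year = DAYS_PER_YEAR / YEARS

# Additional records implied by a 40 % higher record rate.
additional_records_per_year = RELATIVE_EXCESS * stationary_records_per_year

if __name__ == "__main__":
    print(stationary_records_per_year, additional_records_per_year)
```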
SLIDE 22
Mean record number: 1976-2005
SLIDE 23
Correlations between record events
SLIDE 24 Records from broadening distributions
JK, JSTAT (2007) P07001
- RV’s $X_n$ drawn from $p_n(x) = n^{-\alpha} f(x/n^{\alpha})$ with α > 0
- Simulations show sub-Poissonian fluctuations in the number of records,
indicating that record events repel each other
Example: Uniform distribution
SLIDE 25 Record correlations in the linear drift model
- G. Wergen, J. Franke, JK, J. Stat. Phys. 144 (2011) 1206
- Consider the quantity
$l_{N,N-1}(v) = \frac{P_{N,N-1}}{P_N P_{N-1}}$
with $P_{N,N-1} = \mathrm{Prob}[X_N \text{ record and } X_{N-1} \text{ record}]$
- $l_{N,N-1}(0) = 1$ and $l_{N,N-1}(v) \equiv 1$ for Gumbel-distributed i.i.d. part
- $\lim_{N \to \infty} l_{N,N-1}(v)$ exists for v > 0 but not necessarily for v < 0
- Small v expansion yields $l_{N,N-1}(v) \approx 1 + v J_N(v)$ with
JN ≈ −1 2N4 d dN 2 N2IN
where κ is the extreme value index of $p(x) \sim (1 + \kappa x)^{-(\kappa+1)/\kappa}$
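The normalized joint record probability, that is the probability that both of the last two entries are records divided by the product of the two single record probabilities, can be estimated by direct simulation of X_k = Y_k + vk. The following is a minimal Monte Carlo sketch; the function names, the choice of distributions and the parameter values are illustrative assumptions and not the procedure of the cited paper.

```python
import random


def estimate_l(draw, n, v, trials, seed=0):
    """Monte Carlo estimate of l_{N,N-1}(v) for X_k = Y_k + v*k, with N = n >= 3.

    `draw` is a hypothetical callback that returns one i.i.d. sample Y_k
    from the supplied random.Random instance.
    """
    if n < 3:
        raise ValueError("n must be at least 3")
    rng = random.Random(seed)
    rec_last = rec_prev = rec_both = 0
    for _ in range(trials):
        x = [draw(rng) + v * k for k in range(1, n + 1)]
        max_before_prev = max(x[:n - 2])
        prev_is_record = x[n - 2] > max_before_prev
        last_is_record = x[n - 1] > max(max_before_prev, x[n - 2])
        rec_prev += prev_is_record
        rec_last += last_is_record
        rec_both += prev_is_record and last_is_record
    p_prev = rec_prev / trials
    p_last = rec_last / trials
    p_both = rec_both / trials
    return p_both / (p_last * p_prev)


def uniform_draw(rng):
    """Y_k uniform on [0, 1)."""
    return rng.random()


def pareto_draw(rng):
    """Y_k Pareto-distributed with tail exponent 2 (an illustrative choice)."""
    return rng.paretovariate(2.0)


if __name__ == "__main__":
    print("uniform, v = 0:    ", estimate_l(uniform_draw, 5, 0.0, 100000))
    print("uniform, v = 0.02: ", estimate_l(uniform_draw, 5, 0.02, 100000))
    print("Pareto,  v = 0.02: ", estimate_l(pareto_draw, 5, 0.02, 100000))
```

Without drift the estimate should scatter around 1, since record events of i.i.d. RV's are independent. The statistical error shrinks only as the inverse square root of the number of trials, so small deviations from 1 need long runs to resolve.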
SLIDE 26
- Records cluster (repel) for distributions broader (more narrow) than an
exponential:
[Figure: correlations l_{N,N-1}(c) versus number of events N (logarithmic axis, 1 to 100000) at drift c = 0.05 for Pareto (µ = 2, 3, 4), stretched exponential (0.5), exponential, Gaussian and uniform distributions; reference line at 1 for no drift (c = 0)]
SLIDE 27 A record-based test for heavy tailed distributions
- J. Franke, G. Wergen, JK, PRL 108 (2012) 064101
- Identifying heavy tailed distributions from small data sets is generally
problematic
Clauset, Shalizi, Newman, SIAM Rev. 51 661 (2009)
- For a set of N variables expected to be i.i.d., choose many random subsets
of size n < N
- Add a linear trend to the data in each subset and compute the normalized
joint record probability $l_{n,n-1}$ as a function of the drift strength v
- Then $l_{n,n-1}(v) > 1$ implies strong evidence for heavy tailed behavior
- The test is non-parametric and robust against the removal of outliers
- Fluctuations caused by the smallness of the original data set can be
reduced (to some extent) by the combinatorial proliferation of subsets
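The steps of the test can be sketched in code. This is a minimal illustration under stated assumptions, not the implementation used in the cited paper: subsets are drawn without replacement in random order, the trend added to the k-th entry of a subset is v·k, and the function name, arguments and sample data are hypothetical.

```python
import random


def normalized_joint_record_probability(data, n, v, subsets, seed=0):
    """Estimate l_{n,n-1}(v) from random subsets of size n of `data` (n >= 3)."""
    if n < 3 or n > len(data):
        raise ValueError("need 3 <= n <= len(data)")
    rng = random.Random(seed)
    rec_last = rec_prev = rec_both = 0
    for _ in range(subsets):
        subset = rng.sample(data, n)  # random subset in random order
        # Add a linear trend of strength v to the subset.
        x = [y + v * k for k, y in enumerate(subset, start=1)]
        max_before_prev = max(x[:n - 2])
        prev_is_record = x[n - 2] > max_before_prev
        last_is_record = x[n - 1] > max(max_before_prev, x[n - 2])
        rec_prev += prev_is_record
        rec_last += last_is_record
        rec_both += prev_is_record and last_is_record
    if rec_prev == 0 or rec_last == 0:
        raise ValueError("too few subsets to estimate the record probabilities")
    return (rec_both / subsets) / ((rec_last / subsets) * (rec_prev / subsets))


if __name__ == "__main__":
    rng = random.Random(64)  # arbitrary seed for the illustrative data set
    gaussian_data = [rng.gauss(0.0, 1.0) for _ in range(64)]
    for v in (0.0, 0.5, 1.0):
        print(v, normalized_joint_record_probability(gaussian_data, 16, v, 20000))
```

Because the joint record probability is of order 1/(n(n-1)), many subsets are needed before the estimate of l_{n,n-1} settles down.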
SLIDE 28 Example 1: Gaussian vs. Lévy RV’s for N = 64, n = 16
[Figure: l_{16,15}(c) versus drift c for Gaussian (σ = 1.0) and Lévy-stable (µ = 1.3) data, compared with curves from independent Gaussian RVs and from independent Lévy RVs; inset: F(x) versus x]
SLIDE 29 Example 2: Subsets of size N = 64 from ISI citation data
source: S. Redner, Eur. Phys. J. B 4 131 (1998)
[Figure: l_{16,15}(c) versus c for datasets 1-3; second panel: l_{n,n-1}(c*) versus n]
SLIDE 30 Record correlations induced by rounding effects
- G. Wergen, D. Volovik, S. Redner, JK, PRL 109, 164102 (2012)
[Figure: R_n^∆ versus n (logarithmic axis, 1 to 1e+60) for ∆ = 1, 2, 4, with the corresponding analytical curves]
Number of strong records from rounded i.i.d. Gaussian RV’s
SLIDE 31
Random walks & market fluctuations
SLIDE 32 Records in random walks
[Figure: random walk trajectory X_n over 1000 time steps]
⇒ 65 records in 1000 time steps
SLIDE 33 Records in random walks
S.N. Majumdar & R.M. Ziff, PRL 101, 050601 (2008)
- Simple one-dimensional random walk is defined by
$X_n = \sum_{k=1}^{n} \eta_k$
with i.i.d. RV’s $\eta_k$ drawn from a symmetric, continuous distribution φ(η)
- Based on a theorem of Sparre Andersen (1954), the probability of having
m records in n steps is found to be
$P(m,n) = \binom{2n-m+1}{n} 2^{-2n+m-1} \approx \frac{1}{\sqrt{\pi n}} \exp[-m^2/4n]$
- Mean number of records: $\langle R_n \rangle \approx \sqrt{4n/\pi} \gg \ln n + \gamma$
- This result does not require φ(η) to have finite variance
⇒ valid also for superdiffusive (Lévy) walks!
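The record statistics of the symmetric random walk can be checked both exactly and by simulation. The sketch below assumes the Majumdar-Ziff form P(m,n) = C(2n-m+1, n) 2^(-2n+m-1), in which the starting point X_0 = 0 counts as the first record, and uses Gaussian steps for the simulation. Function names and parameters are illustrative.

```python
import math
import random


def record_number_pmf(m, n):
    """Probability of m records in n steps (starting point counted as a record)."""
    if m < 1 or m > n + 1:
        return 0.0
    return math.comb(2 * n - m + 1, n) / 2 ** (2 * n - m + 1)


def mean_records_exact(n):
    """Mean record number from the exact distribution."""
    return sum(m * record_number_pmf(m, n) for m in range(1, n + 2))


def mean_records_asymptotic(n):
    """Large-n form sqrt(4n/pi)."""
    return math.sqrt(4.0 * n / math.pi)


def simulate_mean_records(n, walks, seed=0):
    """Average record number over simulated walks with standard Gaussian steps."""
    rng = random.Random(seed)
    total = 0
    for _ in range(walks):
        position = 0.0
        maximum = 0.0
        records = 1  # the starting point counts as the first record
        for _ in range(n):
            position += rng.gauss(0.0, 1.0)
            if position > maximum:
                maximum = position
                records += 1
        total += records
    return total / walks


if __name__ == "__main__":
    n = 1000
    print(mean_records_exact(n), mean_records_asymptotic(n))
    print(simulate_mean_records(400, 1000), mean_records_asymptotic(400))
```

The distribution of the record number is broad, with a standard deviation that also grows like the square root of n, so a single walk can deviate strongly from the mean.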
SLIDE 34 Biased random walks and stock market fluctuations
- G. Wergen, M. Bogner, JK, PRE 83 051109 (2011)
- Basic model of a fluctuating stock price $S_n$ is the geometric random walk
$S_n = e^{X_n} = \exp\left[\sum_{k=1}^{n} \eta_k\right]$
which has the same record statistics as the random walk itself, because the exponential function is monotonically increasing.
- Stock prices typically display an upward bias reflecting the long-term
interest rate ⇒ consider random walk with drift: Xn → Xn +vn
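- Leading order expansion in v yields
$\langle R_n \rangle \approx \frac{2\sqrt{n}}{\sqrt{\pi}} + \frac{v}{\sigma} \frac{\sqrt{2}}{\pi} \left( n \arctan(\sqrt{n}) - \sqrt{n} \right) \approx \frac{2\sqrt{n}}{\sqrt{\pi}} + \frac{vn}{\sqrt{2}\sigma}$
- For n → ∞ the record probability $P_n$ approaches a positive constant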
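The excess of upper records caused by a small drift can be illustrated by simulation. The sketch below assumes Gaussian steps with standard deviation σ, counts the starting point as the first record, and reuses the same random steps for the walk with and without drift so that the difference is less noisy. The large-n prediction √(4n/π) + vn/(√2 σ) is included for comparison. All names and parameter values are illustrative, and the first-order prediction is only expected to hold for small v.

```python
import math
import random


def count_walk_records(steps, drift):
    """Number of upper records of the walk with increments eta_k + drift."""
    position = 0.0
    maximum = 0.0
    records = 1  # the starting point counts as the first record
    for eta in steps:
        position += eta + drift
        if position > maximum:
            maximum = position
            records += 1
    return records


def mean_records_with_and_without_drift(n, v, walks, sigma=1.0, seed=0):
    """Return (mean records without drift, mean records with drift v)."""
    rng = random.Random(seed)
    total_zero = 0
    total_drift = 0
    for _ in range(walks):
        steps = [rng.gauss(0.0, sigma) for _ in range(n)]
        total_zero += count_walk_records(steps, 0.0)
        total_drift += count_walk_records(steps, v)
    return total_zero / walks, total_drift / walks


def predicted_mean_records(n, v, sigma=1.0):
    """Large-n form sqrt(4n/pi) + v*n/(sqrt(2)*sigma)."""
    return math.sqrt(4.0 * n / math.pi) + v * n / (math.sqrt(2.0) * sigma)


if __name__ == "__main__":
    n, v = 400, 0.025
    print(mean_records_with_and_without_drift(n, v, 1000))
    print(predicted_mean_records(n, 0.0), predicted_mean_records(n, v))
```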
SLIDE 35 The S&P 500 index 1.1.1990-31.3.2009
[Figure: stock price S(n) versus trading days n (starting from 01/01/1990) for Chevron, Pfizer, Wal Mart and Walt Disney]
SLIDE 36 The S&P 500 index 1.1.1990-31.3.2009
[Figure: logarithm of stock price, Log(S(n)/S(1)), versus trading days n (starting from 01/01/1990) for Chevron, Pfizer, Wal Mart and Walt Disney]
- logarithmic stock prices with linear fits
SLIDE 37 Upper and lower records in the S&P 500
[Figure: number of records m_n versus trading days (starting from 01/01/1990); S&P upper and lower records, simulations with v = 0.025 for upper and lower records, and the analytic curves m_n(0) and m_n(0) + 0.025/sqrt(2)*x]
- Record events averaged over 366 stocks
- Excess of upper records well predicted by analytic model with v/σ = 0.025
SLIDE 38 Conclusions
- Record statistics as a paradigm of non-stationary dynamics
of rare events
- Applications to random phenomena of broad societal interest
(sports, the weather, the stock market...)
- Global warming affects the rate of record-breaking temperatures
in a moderate but significant way
- Minimal model of biased random walk accounts quantitatively
for the occurrence of upper records in the S&P 500
- Departures from the i.i.d. setting generally induce correlations between record events that can be attractive or repulsive