Statistical physics of agent-based systems: Learning dynamics and complex co-operative behaviour in Minority Games
Tobias Galla
The Abdus Salam International Centre for Theoretical Physics and CNR/INFM SISSA Unit Trieste, Italy
– p. 1/58
Ginestra Bianconi (ICTP), Damien Challet (Oxford), Ton Coolen (London), Andrea De Martino (Rome), Matteo Marsili (ICTP), David Sherrington (Oxford), Yi-Cheng Zhang (Fribourg)
Support by the European Community’s Human Potential Programme, Research Training Network STIPCO; also EVERGROW and COMPLEX MARKETS
– p. 2/58
European Conference on Complex Systems, Saïd Business School, Oxford, UK, 25-29 September 2006
Satellite workshop on ‘Complex Adaptive Systems and Interacting Agents’
Andrea De Martino, Enzo Marinari, David Sherrington and myself
http://chimera.roma1.infn.it/CASIA/
– p. 3/58
... originally a simple model for inductive decision making of agents (El-Farol bar problem)
Interest by
economists: simple model of a market, stylised facts ...
physicists: phase transitions, ergodicity breaking, spin glass problem, ...
mathematicians: exact solutions
– p. 4/58
traders -> particles, spins, microscopic degrees of freedom
they observe a price time-series (and other information) -> externally and/or internally generated information, history, can be non-Markovian
based on this they buy/sell -> decision making (noise ...)
price is formed based on their actions -> global interaction, macroscopic observable, mean-field
they learn and adapt (some better than others maybe) -> dynamics, update rules, equations of motion
– p. 6/58
[Challet, Zhang 1997]
N traders i = 1, ..., N
given signal µ(t) ∈ {1, ..., P} at each time-step
here: random external information
then every player has to make a binary trading decision $b_i(t) \in \{-1, 1\}$
all players in minority are successful, players in majority unsuccessful
if A(t) is the total bid, $A(t) = \sum_i b_i(t)$, then payoff for i is $-b_i(t)\,A(t)$
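The payoff rule can be illustrated with a minimal Python sketch. The function name `minority_round` and the five-player example are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def minority_round(bids):
    """Return the total bid A and the payoff -b_i * A of every player."""
    bids = np.asarray(bids)
    total_bid = int(bids.sum())      # A(t) = sum_i b_i(t)
    payoffs = -bids * total_bid      # payoff of player i is -b_i(t) A(t)
    return total_bid, payoffs

# Hypothetical example: three players buy (+1), two sell (-1),
# so the sellers are in the minority and receive a positive payoff.
A, payoffs = minority_round([1, 1, 1, -1, -1])
print(A, payoffs)
```

Players whose bid has the opposite sign to A(t) are in the minority and gain; the others lose the same amount per player.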
– p. 7/58
How do players make trading decisions ?
everybody has S trading strategies $a_{i,s}$, $s = 1, \dots, S$, mapping $\mu \to a_i^{\mu} \in \{-1, 1\}$ (buy or sell)
Strategy is a table mapping µ onto binary decision:
µ: 1, 2, 3, 4, ..., P
$a_i^{\mu}$: one entry ±1 for each value of µ
Given history µ a strategy table tells me to play $a_i^{\mu}$.
– p. 8/58
Consider case S = 2 strategies per player in the following
strategy s = +1: table µ = 1, 2, 3, 4, ..., P -> $a_{i,s=+1}^{\mu} = \pm 1$
strategy s = −1: table µ = 1, 2, 3, 4, ..., P -> $a_{i,s=-1}^{\mu} = \pm 1$
Then what this player has to decide at time t is which of the two tables to use.
Assign scores to each strategy to measure their success.
– p. 9/58
aim: to be in the minority
which strategy to use ? The one which has performed best so far !
to assess performance keep a score for each strategy: $u_{i,s}(t+1) = u_{i,s}(t) + \left(-a_{i,s}^{\mu(t)} A(t)\right)$
strategies generated randomly before start of the game
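The learning rule above can be put together into a small simulation. This is a minimal sketch for S = 2; the function name, the default sizes and the tie-breaking rule (ties go to the first strategy) are assumptions made for illustration only.

```python
import numpy as np

def simulate_minority_game(n_agents=101, n_patterns=32, n_steps=2000, seed=0):
    """Simulate an on-line Minority Game with S = 2 strategies per player."""
    rng = np.random.default_rng(seed)
    # strategies[i, s, mu] in {-1, +1}: random lookup tables, fixed before the game
    strategies = rng.choice([-1, 1], size=(n_agents, 2, n_patterns))
    scores = np.zeros((n_agents, 2))             # u_{i,s}, tabula rasa start
    agents = np.arange(n_agents)
    signals = np.empty(n_steps, dtype=int)
    total_bids = np.empty(n_steps, dtype=int)
    for t in range(n_steps):
        mu = rng.integers(n_patterns)            # random external information
        best = np.argmax(scores, axis=1)         # assumption: ties go to strategy 0
        bids = strategies[agents, best, mu]      # each player uses the best table
        A = bids.sum()                           # total bid A(t)
        scores -= strategies[:, :, mu] * A       # u_{i,s} <- u_{i,s} - a^mu_{i,s} A
        signals[t] = mu
        total_bids[t] = A
    return signals, total_bids

signals, total_bids = simulate_minority_game()
print("alpha = P/N =", 32 / 101, " <A^2>/N =", np.mean(total_bids**2) / 101)
```

Note that both strategies of a player are scored at every step, whether or not they were actually played, exactly as in the update rule for $u_{i,s}$.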
– p. 10/58
[Marsili’s slide]
– p. 11/58
– p. 12/58
What are the interesting observables ? And what are the model parameters ?
– p. 13/58
Model parameters ... just one.
$\alpha = \frac{\text{number of values information can take}}{\text{number of agents}} = \frac{P}{N}$
i.e. high α: large information space and/or small market
low α means the opposite: large market and/or small information space
– p. 14/58
Predictability
$H = \frac{1}{P} \sum_{\mu=1}^{P} \langle A|\mu \rangle^2$
H > 0 ⇒ ⟨A|µ⟩ ≠ 0: statistically predictable
H = 0 ⇒ ⟨A|µ⟩ = 0: predictability zero
global performance/volatility
$\sigma^2 = \langle A^2 \rangle$ = −total gain
[Challet, Marsili, Zecchina]
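Both observables can be estimated from a recorded time series of signals and total bids. The sketch below is one possible estimator; the function name and the convention that unobserved values of µ contribute zero are assumptions, and the four-step series is a toy example rather than data from the talk.

```python
import numpy as np

def predictability_and_volatility(signals, total_bids, n_patterns):
    """Estimate H = (1/P) sum_mu <A|mu>^2 and sigma^2 = <A^2> from a time series."""
    signals = np.asarray(signals)
    total_bids = np.asarray(total_bids, dtype=float)
    cond_means = np.zeros(n_patterns)
    for mu in range(n_patterns):
        mask = signals == mu
        if mask.any():                            # assumption: unobserved mu contribute 0
            cond_means[mu] = total_bids[mask].mean()   # <A|mu>
    H = np.mean(cond_means**2)
    sigma2 = np.mean(total_bids**2)
    return H, sigma2

# Toy series with P = 2: <A|0> = 3, <A|1> = 0
H, sigma2 = predictability_and_volatility([0, 0, 1, 1], [2, 4, -1, 1], n_patterns=2)
print(H, sigma2)
```

Since the payoffs of all players add up to −A(t)², the time average of A² is minus the total gain, which is why σ² measures global performance.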
– p. 15/58
Phase transition between a predictable and an unpredictable phase
– p. 16/58
[Figure: logarithmic horizontal axis from 10^-1 to 10^1; curves for tabula rasa start and biased start; regions labelled 'non-ergodic, memory' and 'ergodic, no memory']
– p. 17/58
Phase transition between a non-ergodic and an ergodic phase
– p. 18/58
static susceptibilities of CuMn, field-cooling versus zero-field cooling
– p. 19/58
MG shares many features with spin-glass models
$H_{SK} = \sum_{i<j} J_{ij} s_i s_j$, $\overline{J_{ij}^2} = \frac{1}{N}$
[Sherrington-Kirkpatrick model, SK 1975]
frustration (not everybody can win)
quenched disorder (random strategy assignments)
mean-field interactions (interaction with everybody else)
– p. 20/58
S = 2 strategies per player:
$s_i = +1$, score $u_{i+}$: table µ = 1, 2, 3, 4, ..., P -> $a_{i,s=+1}^{\mu} = \pm 1$
$s_i = -1$, score $u_{i-}$: table µ = 1, 2, 3, 4, ..., P -> $a_{i,s=-1}^{\mu} = \pm 1$
Then what this player has to decide at time t is which of the two tables to use.
$s_i(t) = \mathrm{sgn}[u_{i+}(t) - u_{i-}(t)]$
– p. 21/58
$u_{i,+}(t+1) = u_{i,+}(t) - a_{i,+}^{\mu(t)} A(t)$
$u_{i,-}(t+1) = u_{i,-}(t) - a_{i,-}^{\mu(t)} A(t)$
(A(t): total action; $a_{i,\pm}^{\mu(t)}$: proposed action)
Evolution of score difference ($q_i = u_{i+} - u_{i-}$):
$q_i(t+1) = q_i(t) - \left(a_{i,+}^{\mu(t)} - a_{i,-}^{\mu(t)}\right) A(t)$
– p. 22/58
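On-line update for score difference ($q = u_+ - u_-$):
$q_i(t+1) = q_i(t) - \left(a_{i,+}^{\mu(t)} - a_{i,-}^{\mu(t)}\right) A(t)$ and $A(t) = \sum_j f(\mathrm{sgn}[q_j(t)] \,|\, \text{strategies of } j)$
Batch update for score difference (average over µ):
$q_i(t+1) = q_i(t) - \sum_j J_{ij}\,\mathrm{sgn}[q_j(t)] - h_i$
quenched disorder, spin glass problem
$J_{ij} = \frac{1}{P} \sum_{\mu=1}^{P} \frac{(a_{i+}^{\mu} - a_{i-}^{\mu})}{2}\,\frac{(a_{j+}^{\mu} - a_{j-}^{\mu})}{2}$ (Hebbian), $h_i = \frac{1}{P} \sum_{j=1}^{N} \sum_{\mu=1}^{P} \frac{(a_{i+}^{\mu} - a_{i-}^{\mu})}{2}\,\frac{(a_{j+}^{\mu} + a_{j-}^{\mu})}{2}$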
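The step from the on-line to the batch rule can be checked numerically. The sketch below builds $J_{ij}$ and $h_i$ from random strategy tables and compares the batch increment with the on-line increment averaged over all P values of µ. The variable names (`xi`, `omega`) and the small sizes are illustrative assumptions. With the definitions as written, the averaged on-line increment comes out as twice the batch increment; this overall factor can be absorbed into a rescaling of q, which leaves sgn[q] unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 6, 8                                    # small illustrative sizes
a_plus = rng.choice([-1, 1], size=(N, P))      # a^mu_{i,+}
a_minus = rng.choice([-1, 1], size=(N, P))     # a^mu_{i,-}

xi = (a_plus - a_minus) / 2                    # (a_{i+} - a_{i-}) / 2
omega = (a_plus + a_minus) / 2                 # (a_{i+} + a_{i-}) / 2

J = xi @ xi.T / P                              # J_ij = (1/P) sum_mu xi_i^mu xi_j^mu
h = xi @ omega.sum(axis=0) / P                 # h_i = (1/P) sum_j sum_mu xi_i^mu omega_j^mu

s = rng.choice([-1, 1], size=N)                # s_j = sgn[q_j(t)]
batch_increment = -J @ s - h

# Total bid for every mu: A^mu = sum_j a^mu_{j, s_j} = sum_j (omega_j^mu + s_j xi_j^mu)
A = (omega + s[:, None] * xi).sum(axis=0)
# On-line increment -(a_{i+} - a_{i-}) A^mu, averaged over mu
online_average = -((a_plus - a_minus) * A).mean(axis=1)

print(np.allclose(online_average, 2 * batch_increment))
```

The matrix J is symmetric and has the Hebbian form of a sum of outer products over the P patterns, which is what makes the spin-glass analogy concrete.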
– p. 23/58
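$q_i(t+1) - q_i(t) = -\sum_j J_{ij}\,\mathrm{sgn}[q_j(t)] - h_i$
but not
$q_i(t+1) - q_i(t) = -\sum_j J_{ij}\,q_j(t) - h_i = -\frac{\partial H[q]}{\partial q_i}$
No gradient-descent. No detailed balance. Still pseudo-Hamiltonian:
$H(s) = \frac{1}{2} \sum_{ij} J_{ij} s_i s_j + \sum_i h_i s_i$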
– p. 24/58
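$H(s) = \frac{1}{2} \sum_{ij} J_{ij} s_i s_j + \sum_i h_i s_i$, $J_{ij} = \frac{1}{\alpha N} \sum_{\mu} \xi_i^{\mu} \xi_j^{\mu}$
Hopfield model has
$H(s) = -\frac{1}{2} \sum_{ij} J_{ij} s_i s_j$
MG is an ‘unlearning’ game.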
– p. 25/58
– p. 26/58
[Heimel, Coolen PRE 2001]
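$q_i(t+1) - q_i(t) = -\sum_j J_{ij}\,\mathrm{sgn}[q_j(t)] - h_i + \vartheta(t)$
Dynamical partition function
$Z[\psi] = \int Dq\; \delta(\text{eq of motion})\, \exp\left( i \sum_{it} \psi_i(t)\,\mathrm{sgn}[q_i(t)] \right)$
$= \int Dq\, D\hat{q}\; \exp\left( i \sum_{it} \hat{q}_i(t) \left[ q_i(t+1) - q_i(t) + \sum_j J_{ij}\,\mathrm{sgn}[q_j(t)] + h_i - \vartheta(t) \right] \right) \times \exp\left( i \sum_{it} \psi_i(t)\,\mathrm{sgn}[q_i(t)] \right)$
Then path integrals, disorder-average, saddle-point equations ...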
– p. 27/58
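$q(t+1) = q(t) + \vartheta(t) - \alpha \sum_{t' \le t} [(I + G)^{-1}]_{tt'}\,\mathrm{sgn}[q(t')] + \sqrt{\alpha}\,\eta(t)$
with noise covariance
$\langle \eta(t)\eta(t') \rangle = [(I + G)^{-1} D (I + G^T)^{-1}]_{tt'}$, $D_{tt'} = 1 + C_{tt'}$
Dynamical order parameters:
$C_{tt'} = \langle \mathrm{sgn}[q(t)]\,\mathrm{sgn}[q(t')] \rangle$, $G_{tt'} = \frac{\partial}{\partial \vartheta(t')} \langle \mathrm{sgn}[q(t)] \rangle$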
[Heimel/Coolen PRE 2001] [Coolen/Heimel J Phys A 2001] [Coolen J Phys A 2005]
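For a finite number of time steps, the noise covariance of the effective single-agent process is a plain matrix expression. The sketch below evaluates it for given matrices C and G; the function name and the toy matrices are assumptions for illustration and are not solutions of the self-consistent theory.

```python
import numpy as np

def noise_covariance(C, G):
    """Return (I + G)^{-1} D (I + G^T)^{-1} with D_tt' = 1 + C_tt'."""
    T = C.shape[0]
    D = 1.0 + C                                  # adds 1 to every matrix element
    inv = np.linalg.inv(np.eye(T) + G)           # (I + G)^{-1}
    return inv @ D @ inv.T                       # inv.T equals (I + G^T)^{-1}

# Toy inputs: uncorrelated signs and a causal (strictly lower-triangular) response
C = np.eye(3)
G = np.array([[0.0, 0.0, 0.0],
              [0.2, 0.0, 0.0],
              [0.1, 0.2, 0.0]])
cov = noise_covariance(C, G)
print(cov)
```

With vanishing response, G = 0, the expression reduces to D = 1 + C, and for any G the result is symmetric as a covariance matrix must be.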
– p. 28/58
[Figure, EA parameter: c versus α (logarithmic α axis from 10^-1 to 10^2) for q(0) = 0.1, 0.5, 1, 10; exact result [Heimel/Coolen]]
[Figure, volatility: σ versus α (logarithmic axes) for q(0) = 0, 1, 10; approximation: drop transients [Heimel/Coolen]]
– p. 29/58
Replace
$q_i(t+1) - q_i(t) = -\sum_j J_{ij}\,\mathrm{sgn}[q_j(t)] - h_i$
by
$q_i(t+1) - q_i(t) = -\sum_j J_{ij}\,\phi_j(t) - h_i$, with continuous $\phi_j(t)$,
with
$\phi_i = \frac{q_i}{\lambda}$, $\sum_i \phi_i^2 = N$
[Galla, Coolen, Sherrington J Phys A 2003] [Galla, Sherrington JSTAT 2005]
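The spherical constraint fixes λ at every step: from $\phi_i = q_i/\lambda$ and $\sum_i \phi_i^2 = N$ it follows that $\lambda^2 = \frac{1}{N}\sum_i q_i^2$. A minimal sketch, assuming the positive root for λ and a score vector that is not identically zero (the function name is hypothetical):

```python
import numpy as np

def spherical_phi(q):
    """Map scores q_i to continuous phi_i = q_i / lambda with sum_i phi_i^2 = N."""
    q = np.asarray(q, dtype=float)
    lam = np.sqrt(np.mean(q**2))    # lambda^2 = (1/N) sum_i q_i^2, positive root assumed
    return q / lam, lam

phi, lam = spherical_phi([3.0, 4.0])
print(phi, lam, np.sum(phi**2))
```

Unlike sgn[q_i], the variables φ_i are not restricted to ±1; only their total squared length is fixed.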
– p. 30/58
$Z = \sum_{\{s_i = \pm 1\}} \exp(-\beta H)$
$Z = \int d\boldsymbol{\phi}\; \delta(|\boldsymbol{\phi}|^2 - N)\, \exp(-\beta H)$
[Kac, Berlin ‘The Spherical Model of a Ferromagnet’, Phys. Rev. 86, 821-835 (1952)]
– p. 31/58
conventional MG: [Figures: c versus α for q(0) = 0.1, 0.5, 1, 10 (exact theory); σ versus α for q(0) = 0, 1, 10 (approximation)]
spherical MG: [Figures: a λ-related quantity versus α (exact theory); σ² versus α (exact theory)]
– p. 32/58
– p. 33/58
$u_{i,s}(\ell+1) = u_{i,s}(\ell) - a_{i,s}^{\mu(\ell)} A^{\mu(\ell)}(\ell)$
batch learning: strategy switches allowed only after O(αN) steps
$u_{i,s}(t+1) = u_{i,s}(t) - \frac{1}{\alpha N} \sum_{\mu=1}^{\alpha N} a_{i,s}^{\mu} A^{\mu}(t)$
Does it make a difference ?
– p. 34/58
Not in the standard MG:
[Figure: σ versus α (logarithmic α axis from 10^-1 to 10^1); legend entry 'batch']
– p. 35/58
But in an MG with anti-correlated strategy assignments it does:
[Figures: σ² versus α (logarithmic axes); one panel with curves for |q_i(0)| = 0, 0.5, 1.0, 2.0, one panel labelled 'batch']
[Sherrington, Galla Physica A 2003] [Galla, Sherrington EPJB 2005]
– p. 36/58
Interpolation between on-line and batch: updates every M time-steps
[Figure: σ² versus α (logarithmic axes); curves for batch and for M = 0.1P, 0.5P, P, 2P, 5P, 10P, 100P]
– p. 37/58
$\chi = \sum_{\tau} G(\tau)$ = infinite ?
[Figure: response function G(τ) versus τ, for τ up to 50]
α < αc, χ = ∞, H = 0: non-ergodic, perturbations persist, memory
α > αc, χ < ∞, H > 0: ergodic, perturbations decay, no memory
– p. 38/58
No replica symmetry breaking in standard MG.
[Marsili]
– p. 39/58
[Figure, dilute MG: phase diagram in the (α, c) plane with RS and RSB regions separated by the AT line]
[Figure, MG with impact correction: phase diagram in the (α, η) plane with RS and RSB regions; αc marked]
[Galla JSTAT 2005] [Heimel, De Martino, J Phys A 2001] [De Martino, Marsili J Phys A 2001]
also in El-Farol with heterogeneous resource level [De Sanctis, Galla, in preparation]
– p. 40/58
This is all very nice ...
... but does one see anything like the features of real-market data in this model ?
– p. 43/58
– p. 44/58
– p. 45/58
[Figures: time series of A(t) for times 40000 to 50000; histogram P(A) of the return A]
– p. 46/58
What are the minimal additions one has to make to make it more realistic ?
give the agents the choice not to play -> grand-canonical MGs
give them dynamically evolving capitals
Both things have similar effects: the trading volume is no longer constant (= N up to now), but can evolve in time.
– p. 47/58
critical region at finite N: stylised facts + interesting dynamical features
anomalous phase in thermodynamic limit
first order transition line
– p. 48/58
efficient phase H = 0
– p. 49/58
[Figures: $N_s^{act}$ and r(t) versus time, for times 10000 to 15000]
[Challet, Marsili, Zhang 2001]
– p. 50/58
MG with 2 strategies per player and dynamical capitals:
[Challet, Chessa, Marsili, Zhang (2001)]
[Figure: ⟨r²⟩ and 10^-3 W versus α; legend entries 'exponential' and 'power']
[Figure: p(x) versus x = [p(t+dt) − p(t)]/σ for dt = 1, 4, 16, 64, at α = 0.32 and α = 0.64]
[Figure: absolute returns autocorrelation versus time lag for α = 0.80 and α = 0.64, with reference line y = x^-0.64]
stylised facts, but only close to/below the phase transition
– p. 51/58
MG with 2 strategies per player and dynamical capitals:
complicated/tedious:
and slow ones (the capitals)
– p. 52/58
Simple MG with dynamical capitals:
$c_i(t+1) = c_i(t) - \underbrace{\varepsilon c_i(t)}_{\text{investment}}\; a_i^{\mu(t)}\, \frac{A(t)}{V(t)}$
Similar to a replicator system with random couplings.
[T. Galla, ‘Random replicators with Hebbian interactions’, JSTAT 2005]
– p. 53/58
One strategy only per player - exact analytical solution:
Transition persists, and wealth → ∞ at transition in the infinite system
[Figure: ⟨r²⟩ and 10^-3 W versus α; legend entries 'exponential' and 'power']
[Figure: v' and H' versus α (logarithmic α axis) for the unimodal and exponential cases, compared with theory]
– p. 54/58
Distribution of returns (re-scaled to unit variance):
[Figure: p(r/σ) versus r/σ on a logarithmic vertical axis, for α = 1.5, α = 0.6, α = 0.27 and α = 0.2 < αc, compared with a standard Gaussian]
Gaussian far from transition, but fat-tailed near and below.
– p. 55/58
Distribution of wealth (re-scaled to unit variance):
[Figure: p(c/σ) versus c/σ on a logarithmic vertical axis, for α = 1.5, 1, 0.8, 0.5 and 0.27, compared with a standard Gaussian]
[Figure: p(c/σ) versus c/σ on a logarithmic vertical axis, for α = 0.2 with N = 100, 200 and 300, compared with a Gaussian]
Fat-tailed non-Gaussian distribution not a finite-size effect below transition ?
– p. 56/58
Tax revenue as function of trading fee
[Figure: revenue versus ε (logarithmic ε axis); curves: R_s theory, R_s simulation, R_p = εp, R_tot = R_s + R_p theory]
[Bianconi, Galla, Marsili 2006] [Galla, Zhang in progress]
– p. 57/58
MG has attracted attention from physics, mathematics and economics
physics: spin glass problem with off-equilibrium dynamics
solution in non-ergodic region
critical exponents, RG ...
relation to spin-glass models and Hopfield model
also to do: find more realistic extensions which are still analytically tractable
– p. 58/58