Bounded Rationality in Finite Automata
Christos A. Ioannou
University of Vienna, February 18, 2011
Overview
            Cooperate   Defect
Cooperate   3,3         0,5
Defect      5,0         1,1
Table: Prisoner’s Dilemma Matrix
Finite Automata
The strategies of the agents are represented by Moore machines.
Genetic Algorithm
The GA utilizes Darwinian mechanics.
Bounded Rationality
Machines commit errors in the implementation of actions. Machines commit errors in the perception of actions.
Christos A. Ioannou (Southampton) Bounded Rationality in Finite Automata February 18, 2011 2 / 29
Related Literature
Optimization routines
Genetic Algorithm - Holland (1975)
Simulated Annealing - Kirkpatrick, Gelatt & Vecchi (1983)
Tabu Search - Glover & Laguna (1993)
Finite Automata
Abreu & Rubinstein (1988)
Banks & Sundaram (1990)
Axelrod’s seminal work
Computational simulations point to Tit-For-Tat.
Bendor, Kramer & Stout (1991)
Is the evolution of cooperation then inevitable?
Objectives
How does bounded rationality impact the evolution of cooperation?
How do different levels of errors impact the automata?
What characteristics do the automata exhibit under these different error-levels?
What are the prevailing (surviving) automata under these different error-levels?
What automaton would you pick to play this game?
Results
1 The evolution of cooperation becomes less likely as implementation
and perception errors become more likely.
2 The study identifies a threshold error-level.
At and above the threshold error-level, the prevailing structures converge to the open-loop automaton Always-Defect. Below the threshold, the prevailing automata are closed-loop and diverse.
3 Prevailing automata tend to be less complex, exhibit low reciprocal
cooperation and low tolerance to defections as the likelihood of errors increases.
Thought Experiment
30 agents are to play the PD game. Each agent initially chooses a strategy at random, and the agents play the game against each other in a round-robin structure. Once all matches of the round are complete, the strategies and scores become common knowledge. Based on this information, each agent is allowed to adjust her strategy for the next generation. Agents choose their new strategies, and a new generation is initiated.
Moore Machines
A Moore machine for player i in an infinitely repeated game G is a four-tuple (Q_i, q_i^0, f_i, τ_i), where
Q_i is a finite set of internal states,
q_i^0 is specified to be the initial internal state,
f_i : Q_i → A_i is an output function that assigns an action to every state,
τ_i : Q_i × A_{-i} → Q_i is the transition function that assigns a state to every pair of a state and an opponent’s action (an element of A_{-i}).
Example
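[State diagram: the machine starts in state C; state C loops to itself on C and moves to state D on D; state D loops to itself on C,D.]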
Figure: Grim-Trigger Machine
Q_i = {q_C, q_D}
q_i^0 = q_C
f_i(q_C) = C and f_i(q_D) = D
τ_i(q, a_{-i}) = q_C if (q, a_{-i}) = (q_C, C), and q_D otherwise
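The four-tuple definition and the Grim-Trigger example can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the class name `MooreMachine`, the helper `run`, and the state labels "qC" and "qD" are assumptions chosen to mirror the notation on the slides.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class MooreMachine:
    """Hypothetical container mirroring the four-tuple (Q_i, q_i^0, f_i, tau_i)."""
    states: Set[str]                         # Q_i: finite set of internal states
    initial: str                             # q_i^0: initial internal state
    output: Dict[str, str]                   # f_i: state -> own action
    transition: Dict[Tuple[str, str], str]   # tau_i: (state, opponent action) -> state


def run(machine: MooreMachine, opponent_actions: str) -> List[str]:
    """Return the actions the machine plays against a fixed opponent sequence."""
    state = machine.initial
    played = []
    for opp in opponent_actions:
        played.append(machine.output[state])       # act according to the current state
        state = machine.transition[(state, opp)]   # then move on the opponent's action
    return played


# Grim-Trigger: cooperate until the opponent defects once, then defect forever.
GRIM = MooreMachine(
    states={"qC", "qD"},
    initial="qC",
    output={"qC": "C", "qD": "D"},
    transition={
        ("qC", "C"): "qC",
        ("qC", "D"): "qD",
        ("qD", "C"): "qD",
        ("qD", "D"): "qD",
    },
)

print(run(GRIM, "CCDCC"))
```

Against the opponent sequence C, C, D, C, C the machine cooperates for three periods and defects from the fourth period on, because the transition out of qC is triggered only after the opponent's defection has been observed.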
Errors
Implementation Errors
The machine of agent i in the PD game commits an implementation error with probability ε when, for any given state q, the machine’s output function returns the action f_i(q) with probability 1 − ε and draws another action “f_i(q)”, where “f_i(q)” ≠ f_i(q), otherwise.
Perception Errors
The machine of agent i in the PD game commits a perception error with probability δ when, for any given opponent’s action a_{-i}, the machine inputs the opponent’s action a_{-i} into the transition function with probability 1 − δ and inputs another action “a_{-i}”, where “a_{-i}” ≠ a_{-i}, into the transition function otherwise.
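The two error types can be illustrated with a short Python sketch. The function names `other`, `implement` and `perceive` are hypothetical, and the sketch assumes the two errors are drawn independently, which the slides do not state explicitly. With only two actions in the PD game, "another action" is simply the opposite one.

```python
import random


def other(action: str) -> str:
    """Return the only other action in the two-action PD game."""
    return "D" if action == "C" else "C"


def implement(intended: str, eps: float, rng: random.Random) -> str:
    """Implementation error: play the intended action with probability 1 - eps."""
    return other(intended) if rng.random() < eps else intended


def perceive(actual: str, delta: float, rng: random.Random) -> str:
    """Perception error: observe the opponent's true action with probability 1 - delta."""
    return other(actual) if rng.random() < delta else actual


rng = random.Random(2011)  # arbitrary seed, used only for reproducibility
draws = [implement("C", 0.04, rng) for _ in range(100_000)]
error_rate = draws.count("D") / len(draws)  # empirical frequency of implementation errors
print(round(error_rate, 3))
```

In a simulated match, `implement` would wrap the output function and `perceive` would wrap the opponent's action before it is passed to the transition function.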
The Genetic Algorithm
The GAs operate on 3 fundamental principles. The algorithms require:
1 A coding of the parameter set.
2 The assignment of a measure of performance on each machine.
3 The imposition of genetic operators onto the machines.
The selection-dynamics reflect the limited ability of the agents to receive, decode and act upon the information they get in the course of the evolution.
Coding of Strings
A Moore machine is defined by a string of 25 elements.
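[State diagram: the machine starts in state C; state C loops to itself on C and moves to state D on D; state D loops to itself on D and returns to state C on C.]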
Figure: Tit-For-Tat Machine
Element(s)   Values
start        0
state 0      1 0 1
state 1      0 0 1
state 2      0 0 0
state 3      0 0 0
state 4      0 0 0
state 5      0 0 0
state 6      0 0 0
state 7      0 0 0
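A possible reading of the layout above, sketched in Python. The interpretation is an assumption inferred from the Tit-For-Tat example rather than something the slide spells out: the first element is taken to be the starting state, and each of the eight three-element packets is taken to hold the state's action (1 for Cooperate, 0 for Defect), the next state if the opponent cooperates, and the next state if the opponent defects. The names `decode`, `respond` and `TFT` are hypothetical.

```python
def decode(string):
    """Split a 25-element string into (start_state, [(action, on_C, on_D), ...])."""
    if len(string) != 25:
        raise ValueError("expected 25 elements: 1 start state + 8 packets of 3")
    start = string[0]
    states = []
    for k in range(8):
        action_bit, on_c, on_d = string[1 + 3 * k: 4 + 3 * k]
        # Assumed encoding: 1 = Cooperate, 0 = Defect.
        states.append(("C" if action_bit == 1 else "D", on_c, on_d))
    return start, states


def respond(string, opponent_actions):
    """Play the decoded machine against a fixed sequence of opponent actions."""
    state, states = decode(string)
    played = []
    for opp in opponent_actions:
        action, on_c, on_d = states[state]
        played.append(action)
        state = on_c if opp == "C" else on_d
    return played


# Tit-For-Tat as on the slide: start in state 0; states 2 to 7 are unused.
TFT = [0, 1, 0, 1, 0, 0, 1] + [0, 0, 0] * 6

print(respond(TFT, "CDDC"))
```

Under this reading the machine opens with cooperation and then repeats the opponent's previous action, which is the behavior of the Tit-For-Tat diagram.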
The (Pseudo)code
Specify error-level
Fix max-periods = 200
Create initial population: 30 agents (seed randomly)
Initiate round-robin tournament
For t = 1 to 500 do
    For all agent-pairs do
        For p = 1 to max-periods do
            Award utils to each agent based on the PD matrix
        End loop
        Output performance score
    End loop
    Apply subroutine for the offspring-population-creation
    Store agent results
End loop
Subroutine for the Creation of the Offspring Population
Sort agents based on performance score
Copy top 20 agents to offspring-population
Select 10 agent-pairs via probabilities biased by performance scores
For each of 10 pairs do
    Create new agent as a copy of the winner of the pair’s match
    Mutate new agent by switching one element at random
End loop
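The subroutine can be sketched in Python under several assumptions that the slide leaves open. Selection probabilities are taken to be proportional to the (positive) performance scores; the "winner" of a pair is taken to be the agent with the higher score; and mutation replaces one randomly chosen element with a different admissible value (0 or 1 for action elements, 0 to 7 otherwise). All function names are hypothetical.

```python
import random


def admissible_values(pos):
    """Values allowed at position pos of a 25-element string (assumed encoding)."""
    if pos >= 1 and (pos - 1) % 3 == 0:
        return [0, 1]            # action element of a state packet
    return list(range(8))        # start state or transition target


def random_machine(rng):
    """Seed one agent at random."""
    return [rng.choice(admissible_values(pos)) for pos in range(25)]


def create_offspring(population, scores, rng, n_keep=20, n_new=10):
    """Return the next generation: top n_keep agents plus n_new mutated copies."""
    indices = list(range(len(population)))
    ranked = sorted(indices, key=lambda i: scores[i], reverse=True)
    offspring = [list(population[i]) for i in ranked[:n_keep]]
    for _ in range(n_new):
        # Assumption: a pair is drawn with probabilities proportional to score.
        a, b = rng.choices(indices, weights=scores, k=2)
        # Assumption: the pair's winner is the agent with the higher score.
        winner = a if scores[a] >= scores[b] else b
        child = list(population[winner])
        pos = rng.randrange(len(child))
        options = [v for v in admissible_values(pos) if v != child[pos]]
        child[pos] = rng.choice(options)   # switch exactly one element
        offspring.append(child)
    return offspring


rng = random.Random(0)  # arbitrary seed
population = [random_machine(rng) for _ in range(30)]
scores = [rng.uniform(1.0, 5.0) for _ in range(30)]  # placeholder scores, not results
next_generation = create_offspring(population, scores, rng)
print(len(next_generation))
```

The population size stays at 30 because the 20 copied agents and the 10 new agents together replace the previous generation. The scores here are placeholders; in the full procedure they would come from the round-robin tournament.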
Time-Homogeneous Markov Chains
A population is inhabited by clones playing the PD game.
Consider a system that in time-step n can be in one of four possible states in state space S = (s1, s2, s3, s4).
Let a strategy have transition rule p = (p1, p2, p3, p4), where 0 ≤ pi ≤ 1 denotes the probability of cooperating after the corresponding outcome of the previous period.
For a pure strategy, si is 1 if the strategy plays Cooperate, and 0 if the strategy plays Defect, after outcome i (i = 1, 2, 3, 4).
Transition Rule for STFT & SALLD with Errors
Let ε be the probability of committing an implementation error.
Let δ be the probability of committing a perception error.
S_TFT: (1, 0, 1, 0) → (1 − δ − ε(1 − 2δ), δ + ε(1 − 2δ), 1 − δ − ε(1 − 2δ), δ + ε(1 − 2δ))
S_ALLD: (0, 0, 0, 0) → (ε, ε, ε, ε)
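The Tit-For-Tat entries can be checked with a short derivation, assuming the perception error and the implementation error are drawn independently (an assumption; the slide does not state it). Tit-For-Tat ends up cooperating after an opponent's defection either by misperceiving the defection and then implementing correctly, or by perceiving correctly and then making an implementation error:

```latex
\begin{align*}
\Pr(C \mid \text{opponent played } D)
  &= \delta(1-\epsilon) + (1-\delta)\epsilon
   = \delta + \epsilon(1-2\delta), \\
\Pr(C \mid \text{opponent played } C)
  &= (1-\delta)(1-\epsilon) + \delta\epsilon
   = 1 - \delta - \epsilon(1-2\delta).
\end{align*}
```

For example, with ε = δ = 0.04 the probability of cooperating after a defection is 0.04 + 0.04 × 0.92 = 0.0768. Always-Defect ignores the opponent's action, so only the implementation error matters and each entry is ε.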
Transition Matrix
Rule p = (p1, p2, p3, p4) matched against rule q = (q1, q2, q3, q4) yields a Markov process where the transition probabilities between the four possible states are given by the following matrix:
p1q1   p1(1 − q1)   (1 − p1)q1   (1 − p1)(1 − q1)
p2q3   p2(1 − q3)   (1 − p2)q3   (1 − p2)(1 − q3)
p3q2   p3(1 − q2)   (1 − p3)q2   (1 − p3)(1 − q2)
p4q4   p4(1 − q4)   (1 − p4)q4   (1 − p4)(1 − q4)
The system has time-homogeneous transition probabilities and the Markov property.
Stationary Distribution
Since p and q are in the interior of the cube, all the entries of the matrix are strictly positive; hence there exists a unique stationary distribution π = (π1, π2, π3, π4) for n → ∞.
The payoff for a player i using p against a player −i using q is given by:
A(p, q) = 3π1 + 5π3 + π4    (1)
Assuming that implementation errors and perception errors are each kept constant at 4%, the invariant distributions of Tit-For-Tat and Always-Defect yield:
A(S_TFT, S_TFT) = 2.25
A(S_ALLD, S_ALLD) = 1.12
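The two payoffs can be reproduced numerically with a small Python sketch. It builds the transition matrix from the rules with errors, approximates the stationary distribution by repeated multiplication (a simple power iteration, chosen here for brevity; the slides do not say how the distribution was computed), and applies equation (1). All function names are hypothetical.

```python
def transition_matrix(p, q):
    """4x4 matrix over the four outcomes, rows built from (p1,q1), (p2,q3), (p3,q2), (p4,q4)."""
    p1, p2, p3, p4 = p
    q1, q2, q3, q4 = q
    pairs = [(p1, q1), (p2, q3), (p3, q2), (p4, q4)]
    return [[a * b, a * (1 - b), (1 - a) * b, (1 - a) * (1 - b)] for a, b in pairs]


def stationary(matrix, iterations=10_000):
    """Approximate the stationary distribution by iterating pi <- pi * M."""
    pi = [0.25] * 4
    for _ in range(iterations):
        pi = [sum(pi[i] * matrix[i][j] for i in range(4)) for j in range(4)]
    return pi


def payoff(p, q):
    """Equation (1): A(p, q) = 3*pi1 + 5*pi3 + pi4."""
    pi = stationary(transition_matrix(p, q))
    return 3 * pi[0] + 5 * pi[2] + pi[3]


def tft_rule(eps, delta):
    """Tit-For-Tat with implementation error eps and perception error delta."""
    x = delta + eps * (1 - 2 * delta)
    return (1 - x, x, 1 - x, x)


def alld_rule(eps):
    """Always-Defect with implementation error eps."""
    return (eps, eps, eps, eps)


tft = tft_rule(0.04, 0.04)
alld = alld_rule(0.04)
print(round(payoff(tft, tft), 2), round(payoff(alld, alld), 2))
```

When Tit-For-Tat plays itself with errors, every column of the matrix sums to one, so the stationary distribution is uniform and the payoff is (3 + 0 + 5 + 1) / 4 = 2.25. For Always-Defect against itself the value is 1.1184, which rounds to the 1.12 reported on the slide.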
Computational Treatments
Treatment @ 4%
The likelihood of committing an implementation and a perception error is kept constant throughout the evolution at 4%.
Treatment @ 2%
The likelihood of committing an implementation and a perception error is kept constant throughout the evolution at 2%.
Treatment @ 1%
The likelihood of committing an implementation and a perception error is kept constant throughout the evolution at 1%.
Treatment @ 0%
The likelihood of committing an implementation and a perception error is kept constant throughout the evolution at 0%.
Evolution of Payoffs
1 The incorporation of errors is sufficient to alter the evolution of
cooperative outcomes.
Prevailing Machines
The open-loop machine Always-Defect was the clear winner in the treatments at 4% and 2%.
[State diagram: a single state D, which is the start state and loops to itself on C,D.]
Figure: Always-Defect
The structures that prevailed in the treatments at 1% and 0% were quite diverse and closed-loop.
Summary Measures
A state is accessible if, given the automaton’s starting state, there is some possible combination of the opponent’s possible actions that will result in a transition to that state.
The cooperation-reciprocity is the proportion of accessible states that respond to an observed cooperation by the opponent with a cooperation.
The defection-reciprocity is the proportion of accessible states that respond to an observed defection by the opponent with a defection.
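The three measures can be sketched in Python. Two points are interpretations rather than statements from the slides: the starting state is counted as accessible, and a state's "response" to an observed action is taken to be the action of the state it transitions to. A machine is represented here as a start index plus a list of hypothetical (action, next-on-C, next-on-D) tuples.

```python
def accessible_states(start, states):
    """States reachable from the start state under some sequence of opponent actions."""
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        _, on_c, on_d = states[state]
        for nxt in (on_c, on_d):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def summary_measures(start, states):
    """Return (number of accessible states, cooperation-reciprocity, defection-reciprocity)."""
    reach = accessible_states(start, states)
    # Response to an observed C (or D) = action of the state moved to (an interpretation).
    coop = sum(1 for s in reach if states[states[s][1]][0] == "C")
    defect = sum(1 for s in reach if states[states[s][2]][0] == "D")
    return len(reach), coop / len(reach), defect / len(reach)


TFT = (0, [("C", 0, 1), ("D", 0, 1)])
GRIM = (0, [("C", 0, 1), ("D", 1, 1)])
ALLD = (0, [("D", 0, 0)])

for name, (start, states) in [("TFT", TFT), ("GRIM", GRIM), ("ALLD", ALLD)]:
    print(name, summary_measures(start, states))
```

Under these interpretations Tit-For-Tat reciprocates both cooperation and defection in every accessible state, Grim-Trigger reciprocates cooperation in only one of its two states, and Always-Defect never reciprocates cooperation.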
Accessible States
1 The average number of accessible states is decreasing in the
probability of errors.
Cooperation-Reciprocity
2 Cooperation-reciprocity is decreasing in the probability of errors.
Defection-Reciprocity
3 Defection-reciprocity is increasing in the probability of errors.
Machine Characteristics
Characteristic 1
The average number of accessible states is decreasing in the probability of errors.
Characteristic 2
Cooperation-reciprocity is decreasing in the probability of errors.
Characteristic 3
Defection-reciprocity is increasing in the probability of errors.
Example 1
Rubik’s cube has about 43 quintillion (4.3 × 10^19) possible initial positions. Minimizing the number of moves to solve the cube would require an extremely complex pattern of adjustment from one particular scrambled position to another. Cube experts have developed rigidly structured solving procedures that employ a small repertoire of solving patterns to unscramble the cube.
Example 2
This was the first book on card-counting. The book emphasized sophisticated card-counting and bet-variation methods. Later books steadily evolved towards more rigidly-structured methods.
No Need to Count (1980)
Winning Casino Blackjack for the Non-Counter (1981)
Revisiting Bendor, Kramer & Stout (1991)
The success of NAF is limited to the particular ecology. NAF’s generosity lacks generalizability.
Difference-aversion models Reciprocity models Competitive models
Laboratory research has identified several factors that might diminish generosity among human strategists.
Dal Bo & Frechette (2011) - “... these results cast doubt on the common assumption that agents will make the most of the opportunity to cooperate whenever it is possible to do so in equilibrium.”
Summary measures of the present study point in a very different direction.
The evolving machines exhibit readiness to sneak on the opponent. The evolving machines are relentless punishers of defections.
Summary Results
1 The evolution of cooperation becomes less likely as implementation
and perception errors become more likely.
2 The study identifies a threshold error-level.
At and above the threshold error-level, the prevailing structures converge to the open-loop automaton Always-Defect. Below the threshold, the prevailing automata are closed-loop and diverse.
3 Prevailing machines tend to be less complex as the likelihood of errors
increases.
4 Cooperation-reciprocity is decreasing in the probability of errors.
5 Defection-reciprocity is increasing in the probability of errors.