Bounded Rationality in Finite Automata


SLIDE 1

Bounded Rationality in Finite Automata

Christos A. Ioannou University of Vienna February 18, 2011

SLIDE 2

Overview

                Cooperate   Defect
Cooperate       3,3         0,5
Defect          5,0         1,1

Table: Prisoner’s Dilemma Matrix

Finite Automata

The strategies of the agents are represented by Moore machines.

Genetic Algorithm

The GA utilizes Darwinian mechanics.

Bounded Rationality

Machines commit errors in the implementation of actions. Machines commit errors in the perception of actions.

Christos A. Ioannou (Southampton) Bounded Rationality in Finite Automata February 18, 2011 2 / 29

SLIDE 3

Related Literature

Optimization routines

Genetic Algorithm - Holland (1975)
Simulated Annealing - Kirkpatrick, Gelatt & Vecchi (1983)
Tabu Search - Glover & Laguna (1993)

Finite Automata

Abreu & Rubinstein (1988)
Banks & Sundaram (1990)

Axelrod’s seminal work

Computational simulations point to Tit-For-Tat.
Bendor, Kramer & Stout (1991)
Is the evolution of cooperation then inevitable?


SLIDE 4

Objectives

How does bounded rationality impact the evolution of cooperation?
How do different levels of errors impact the automata?
What characteristics do the automata exhibit under these different error-levels?
What are the prevailing (surviving) automata under these different error-levels?
What automaton would you pick to play this game?


SLIDE 5

Results

1 The evolution of cooperation becomes less likely as implementation and perception errors become more likely.

2 The study identifies a threshold error-level.

At and above the threshold error-level, the prevailing structures converge to the open-loop automaton Always-Defect.
Below the threshold, the prevailing automata are closed-loop and diverse.

3 Prevailing automata tend to be less complex, exhibit low reciprocal cooperation and low tolerance to defections as the likelihood of errors increases.


SLIDE 6

Thought Experiment

30 agents are to play the PD game.
Initially, each agent chooses a strategy at random, and the agents play the game against each other in a round-robin structure.
Once all round-robin matches are complete, the strategies and scores become common knowledge.
Based on this information, each agent is allowed to adjust her strategy for the next generation.
Agents choose their new strategies, and a new generation is initiated.


SLIDE 7

Moore Machines

A Moore machine for player i in an infinitely repeated game G is a four-tuple $(Q^i, q_0^i, f^i, \tau^i)$ where

$Q^i$ is a finite set of internal states,
$q_0^i$ is specified to be the initial internal state,
$f^i : Q^i \to A^i$ is an output function that assigns an action to every state,
$\tau^i : Q^i \times A^{-i} \to Q^i$ is the transition function that assigns a state to every pair of a state and the opponent’s action ($A^{-i}$).
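The four-tuple above maps directly onto a small data structure. The following is a minimal sketch, not the author's implementation: the class name `MooreMachine`, its field names, the `run` helper, and the state labels `"qC"` and `"qD"` are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class MooreMachine:
    """Hypothetical container for the four-tuple (Q, q0, f, tau)."""
    states: Set[str]                         # Q: finite set of internal states
    start: str                               # q0: initial internal state
    output: Dict[str, str]                   # f: state -> own action
    transition: Dict[Tuple[str, str], str]   # tau: (state, opponent action) -> state

    def run(self, opponent_actions: List[str]) -> List[str]:
        """Return the actions played against a fixed opponent sequence."""
        state, played = self.start, []
        for a_opp in opponent_actions:
            played.append(self.output[state])          # act according to f
            state = self.transition[(state, a_opp)]    # move according to tau
        return played


# Tit-For-Tat as a two-state machine: copy the opponent's last action.
tft = MooreMachine(
    states={"qC", "qD"},
    start="qC",
    output={"qC": "C", "qD": "D"},
    transition={
        ("qC", "C"): "qC", ("qC", "D"): "qD",
        ("qD", "C"): "qC", ("qD", "D"): "qD",
    },
)

print(tft.run(["C", "D", "D", "C"]))  # ['C', 'C', 'D', 'D']
```

The output depends only on the current state, which is what distinguishes a Moore machine from a Mealy machine.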


SLIDE 8

Example

Figure: Grim-Trigger Machine

$Q^i = \{q_C, q_D\}$

$q_0^i = q_C$

$f^i(q_C) = C$ and $f^i(q_D) = D$

$\tau^i(q, a^{-i}) = \begin{cases} q_C & \text{if } (q, a^{-i}) = (q_C, C) \\ q_D & \text{otherwise} \end{cases}$
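The Grim-Trigger definition can be written out as a short sketch. The names `Q`, `q0`, `f`, `tau` mirror the symbols on the slide; the `play` helper and the string labels for states and actions are assumptions made for illustration.

```python
# Grim-Trigger, following the four-tuple on the slide.
Q = {"qC", "qD"}              # internal states
q0 = "qC"                     # initial state
f = {"qC": "C", "qD": "D"}    # output function


def tau(q, a_opp):
    """Stay in qC only while the opponent cooperates; otherwise go to qD forever."""
    return "qC" if (q, a_opp) == ("qC", "C") else "qD"


def play(opponent_actions):
    """Hypothetical helper: actions Grim-Trigger plays against a fixed sequence."""
    q, history = q0, []
    for a_opp in opponent_actions:
        history.append(f[q])
        q = tau(q, a_opp)
    return history


print(play(["C", "C", "D", "C", "C"]))  # ['C', 'C', 'C', 'D', 'D']
```

A single defection by the opponent moves the machine into the absorbing state `qD`, so later cooperation is never rewarded.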


SLIDE 9

Errors

Implementation Errors

The machine of agent i in the PD game commits an implementation error with probability $\epsilon$ when, for any given state q, the machine’s output function returns the action $f^i(q)$ with probability $1 - \epsilon$ and draws another action $\tilde{f}^i(q)$, where $\tilde{f}^i(q) \neq f^i(q)$, otherwise.

Perception Errors

The machine of agent i in the PD game commits a perception error with probability $\delta$ when, for any given opponent’s action $a^{-i}$, the machine inputs the opponent’s action $a^{-i}$ into the transition function with probability $1 - \delta$ and inputs the action $\tilde{a}^{-i}$, where $\tilde{a}^{-i} \neq a^{-i}$, into the transition function otherwise.
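Since the PD has only two actions, "another action" is simply the opposite one. The two error types can be sketched as follows, under that assumption; the function names `flip`, `implement` and `perceive` are hypothetical.

```python
import random


def flip(action):
    """Return the other PD action."""
    return "D" if action == "C" else "C"


def implement(intended, epsilon, rng):
    """Implementation error: play the intended action w.p. 1 - epsilon,
    the opposite action w.p. epsilon."""
    return flip(intended) if rng.random() < epsilon else intended


def perceive(opponent_action, delta, rng):
    """Perception error: feed the true opponent action into the transition
    function w.p. 1 - delta, the opposite action w.p. delta."""
    return flip(opponent_action) if rng.random() < delta else opponent_action


rng = random.Random(0)  # seeded only to make the sketch reproducible
played = implement("C", epsilon=0.04, rng=rng)
observed = perceive("D", delta=0.04, rng=rng)
```

The implementation error distorts what the machine does, while the perception error distorts what the machine believes the opponent did; only the latter affects the machine's own state transition.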


SLIDE 10

The Genetic Algorithm

The GAs operate on 3 fundamental principles. The algorithms require:

A coding of the parameter set.

The assignment of a measure of performance on each machine.

The imposition of genetic operators onto the machines.

The selection-dynamics reflect the limited ability of the agents to receive, decode and act upon the information they get in the course of the evolution.


SLIDE 11

Coding of Strings

A Moore machine is defined by a string of 25 elements.

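Figure: Tit-For-Tat Machine

String layout for Tit-For-Tat: one start element, followed by one three-element packet for each of the eight states.

state 0: 1 0 1
state 1: 0 0 1
state 2: 0 0 0
state 3: 0 0 0
state 4: 0 0 0
state 5: 0 0 0
state 6: 0 0 0
state 7: 0 0 0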
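A decoder for the 25-element string might look like the sketch below. It rests on assumptions the slide does not spell out: that the first element is the start state, that each packet reads (action, next state if the opponent cooperates, next state if the opponent defects), and that an action value of 1 means Cooperate. The start element of the Tit-For-Tat string is also assumed to be 0 here. The function name `decode` is hypothetical.

```python
def decode(string):
    """Decode a 25-element string into (start_state, states).

    Assumed layout: string[0] is the start state; then eight packets of
    (action, next state on observed C, next state on observed D).
    """
    if len(string) != 25:
        raise ValueError("expected 25 elements: 1 start + 8 states x 3")
    start = string[0]
    states = []
    for k in range(8):
        action_bit, on_c, on_d = string[1 + 3 * k: 4 + 3 * k]
        action = "C" if action_bit == 1 else "D"   # assumption: 1 = Cooperate
        states.append((action, on_c, on_d))
    return start, states


# Tit-For-Tat: state 0 = (1, 0, 1), state 1 = (0, 0, 1), states 2-7 unused.
TFT_STRING = [0, 1, 0, 1, 0, 0, 1] + [0, 0, 0] * 6

start, states = decode(TFT_STRING)
print(start, states[:2])  # 0 [('C', 0, 1), ('D', 0, 1)]
```

Under this reading, states 2 to 7 are present on the string but can never be reached from the start state, which is why a two-state machine still occupies all 25 elements.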


SLIDE 12

The (Pseudo)code

Specify error-level
Fix max-periods = 200
Create initial population: 30 agents (seed randomly)
Initiate round-robin tournament
For t = 1 to 500 do
    For all agent-pairs do
        For p = 1 to max-periods do
            Award utils to each agent based on the PD matrix
        End loop
        Output performance score
    End loop
    Apply subroutine for the offspring-population-creation
    Store agent results
End loop
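One generation of the round-robin loop can be sketched in Python as follows. This is an illustration under stated assumptions rather than the author's code: machines are represented as `(start, states)` pairs with each state a tuple `(action, next_on_C, next_on_D)`, every distinct pair meets exactly once (no self-play), and the function names are hypothetical.

```python
import itertools
import random

# Stage-game payoffs from the PD matrix: (row player, column player).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def noisy(action, prob, rng):
    """Flip the action with the given error probability."""
    if rng.random() < prob:
        return "D" if action == "C" else "C"
    return action


def play_match(m1, m2, periods, epsilon, delta, rng):
    """Play one match; return the total utils of each machine."""
    (s1, t1), (s2, t2) = m1, m2
    u1 = u2 = 0
    for _ in range(periods):
        a1 = noisy(t1[s1][0], epsilon, rng)    # implementation errors
        a2 = noisy(t2[s2][0], epsilon, rng)
        p1, p2 = PAYOFF[(a1, a2)]
        u1, u2 = u1 + p1, u2 + p2
        seen1 = noisy(a2, delta, rng)          # perception errors
        seen2 = noisy(a1, delta, rng)
        s1 = t1[s1][1] if seen1 == "C" else t1[s1][2]
        s2 = t2[s2][1] if seen2 == "C" else t2[s2][2]
    return u1, u2


def round_robin(population, periods=200, epsilon=0.0, delta=0.0, rng=None):
    """Return each agent's performance score for one generation."""
    rng = rng or random.Random()
    scores = [0] * len(population)
    for i, j in itertools.combinations(range(len(population)), 2):
        ui, uj = play_match(population[i], population[j],
                            periods, epsilon, delta, rng)
        scores[i] += ui
        scores[j] += uj
    return scores


TFT = (0, [("C", 0, 1), ("D", 0, 1)])
ALLD = (0, [("D", 0, 0)])
ALLC = (0, [("C", 0, 0)])
print(round_robin([TFT, ALLD, ALLC]))  # [799, 1204, 600] without errors
```

Wrapping `round_robin` in an outer loop over 500 generations, with the offspring subroutine applied between generations, would complete the pseudocode.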


SLIDE 13

Subroutine for the Creation of the Offspring Population

Sort agents based on performance score
Copy top 20 agents to offspring-population
Select 10 agent-pairs via probabilities biased by performance scores
For each of 10 pairs do
    Create new agent as a copy of the winner of the pair’s match
    Mutate new agent by switching one element at random
End loop
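The subroutine could be sketched as below. Several details are assumptions, since the slide leaves them open: "winner" is read as the higher-scoring member of the pair, pairs are drawn with replacement with probability proportional to score, and a mutation flips an action bit or redraws a state index depending on the position hit. Agents are 25-element strings with the start element first; all function names are hypothetical.

```python
import random


def mutate(string, rng, n_states=8):
    """Switch one randomly chosen element of a 25-element string."""
    child = list(string)
    pos = rng.randrange(len(child))
    if pos >= 1 and (pos - 1) % 3 == 0:
        child[pos] = 1 - child[pos]        # action bit: flip 0 <-> 1
    else:
        # start or transition element: draw a different state index
        child[pos] = rng.choice([s for s in range(n_states) if s != child[pos]])
    return child


def next_generation(population, scores, rng, n_keep=20, n_new=10):
    """Build the offspring population from the current one."""
    ranked = sorted(range(len(population)),
                    key=lambda i: scores[i], reverse=True)
    offspring = [list(population[i]) for i in ranked[:n_keep]]   # copy top 20
    weights = [max(s, 0) + 1e-9 for s in scores]   # selection biased by score
    for _ in range(n_new):
        i, j = rng.choices(range(len(population)), weights=weights, k=2)
        winner = i if scores[i] >= scores[j] else j   # assumption: higher score wins
        offspring.append(mutate(population[winner], rng))
    return offspring


def random_string(rng, n_states=8):
    """Hypothetical random seeding of one agent."""
    s = [rng.randrange(n_states)]
    for _ in range(n_states):
        s += [rng.randrange(2), rng.randrange(n_states), rng.randrange(n_states)]
    return s


rng = random.Random(1)
population = [random_string(rng) for _ in range(30)]
scores = list(range(30))                 # placeholder scores for illustration
offspring = next_generation(population, scores, rng)
print(len(offspring))  # 30
```

Keeping the top 20 unchanged makes the scheme elitist: the best machines of a generation always survive, and novelty enters only through the 10 mutated copies.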


SLIDE 14

Time-Homogeneous Markov Chains

A population is inhabited by clones playing the PD game.
Consider a system that in time-step n can be in one of four possible states in state space $S = (s_1, s_2, s_3, s_4)$, one for each outcome of the stage game.
Let a strategy have transition rule $p = (p_1, p_2, p_3, p_4)$ where $0 \le p_i \le 1$ denotes the probability of cooperating after the corresponding outcome of the previous period.
For a pure strategy, $p_i$ is 1 if the strategy plays Cooperate, and 0 if the strategy plays Defect, after outcome i (i = 1, 2, 3, 4).


SLIDE 15

Transition Rule for STFT & SALLD with Errors

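Let $\epsilon$ be the probability of committing an implementation error.
Let $\delta$ be the probability of committing a perception error.

$S_{TFT}$: $(1, 0, 1, 0) \to (1 - \delta - \epsilon(1 - 2\delta),\ \delta + \epsilon(1 - 2\delta),\ 1 - \delta - \epsilon(1 - 2\delta),\ \delta + \epsilon(1 - 2\delta))$

$S_{ALLD}$: $(0, 0, 0, 0) \to (\epsilon, \epsilon, \epsilon, \epsilon)$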
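These noisy rules can be derived mechanically from the pure rules. The sketch below assumes the outcomes are ordered (CC, CD, DC, DD) with the player's own action first, and that a perception error affects only the opponent's action, so it swaps outcome 1 with 2 and outcome 3 with 4. The function name `noisy_rule` is hypothetical.

```python
def noisy_rule(pure_rule, epsilon, delta):
    """Cooperation probabilities after each outcome, given both error types.

    pure_rule: tuple of 0/1 intentions after outcomes (CC, CD, DC, DD).
    """
    misperceived = (1, 0, 3, 2)   # outcome index seen when the opponent's action is flipped
    result = []
    for i, intended in enumerate(pure_rule):
        # Probability that the machine intends to cooperate.
        intend_c = (1 - delta) * intended + delta * pure_rule[misperceived[i]]
        # Probability that it actually plays C after an implementation error.
        play_c = (1 - epsilon) * intend_c + epsilon * (1 - intend_c)
        result.append(play_c)
    return tuple(result)


tft = noisy_rule((1, 0, 1, 0), epsilon=0.04, delta=0.04)
alld = noisy_rule((0, 0, 0, 0), epsilon=0.04, delta=0.04)
print(tft)   # approximately (0.9232, 0.0768, 0.9232, 0.0768)
print(alld)  # approximately (0.04, 0.04, 0.04, 0.04)
```

For Tit-For-Tat after a cooperative outcome this gives $(1-\delta)(1-\epsilon) + \delta\epsilon = 1 - \delta - \epsilon(1 - 2\delta)$, matching the first entry of the slide's rule. Always-Defect ignores its input, so only the implementation error matters.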


SLIDE 16

Transition Matrix

Rule $p = (p_1, p_2, p_3, p_4)$ matched against rule $q = (q_1, q_2, q_3, q_4)$ yields a Markov process where the transition probabilities between the four possible states are given by the following matrix:

$$
\begin{pmatrix}
p_1 q_1 & p_1(1 - q_1) & (1 - p_1)q_1 & (1 - p_1)(1 - q_1) \\
p_2 q_3 & p_2(1 - q_3) & (1 - p_2)q_3 & (1 - p_2)(1 - q_3) \\
p_3 q_2 & p_3(1 - q_2) & (1 - p_3)q_2 & (1 - p_3)(1 - q_2) \\
p_4 q_4 & p_4(1 - q_4) & (1 - p_4)q_4 & (1 - p_4)(1 - q_4)
\end{pmatrix}
$$

The system has time-homogeneous transition probabilities and the Markov property.
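The matrix can be built programmatically; the sketch below uses plain Python lists and a hypothetical function name `transition_matrix`. The only subtlety is the index swap in rows 2 and 3: an outcome that player i sees as (C, D) is seen by the opponent as (D, C), so the opponent's rule is read at $q_3$ in row 2 and at $q_2$ in row 3.

```python
def transition_matrix(p, q):
    """4x4 transition matrix for rule p matched against rule q.

    States and columns are ordered (CC, CD, DC, DD), own action first.
    """
    q_view = (q[0], q[2], q[1], q[3])   # opponent's view of each state
    rows = []
    for p_i, q_i in zip(p, q_view):
        rows.append([p_i * q_i,
                     p_i * (1 - q_i),
                     (1 - p_i) * q_i,
                     (1 - p_i) * (1 - q_i)])
    return rows


p = (0.9, 0.2, 0.8, 0.1)   # illustrative rules, not taken from the slides
q = (0.7, 0.3, 0.6, 0.4)
M = transition_matrix(p, q)
for row in M:
    print([round(x, 3) for x in row], "sum =", round(sum(row), 6))
```

Each row sums to one because the two players randomize independently given the previous outcome.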


SLIDE 17

Stationary Distribution

Since p and q are in the interior of the cube, all the entries of the matrix are strictly positive; hence there exists a unique stationary distribution $\pi = (\pi_1, \pi_2, \pi_3, \pi_4)$ for $n \to \infty$.

The payoff for a player i using p against a player −i using q is given by:

$$A(p, q) = 3\pi_1 + 5\pi_3 + \pi_4 \qquad (1)$$

Assuming that implementation errors and perception errors are each kept constant at 4%, the invariant distributions of Tit-For-Tat and Always-Defect yield:

$A(S_{TFT}, S_{TFT}) = 2.25$
$A(S_{ALLD}, S_{ALLD}) = 1.12$
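The two payoffs can be checked numerically. The sketch below is one possible approach, using simple power iteration in pure Python rather than an eigenvector routine; the function names are hypothetical, and the noisy rules are written out from the formulas for $S_{TFT}$ and $S_{ALLD}$ with $\epsilon = \delta = 0.04$.

```python
def transition_matrix(p, q):
    """Rows and columns ordered (CC, CD, DC, DD), own action first."""
    q_view = (q[0], q[2], q[1], q[3])
    return [[a * b, a * (1 - b), (1 - a) * b, (1 - a) * (1 - b)]
            for a, b in zip(p, q_view)]


def stationary(matrix, iterations=5000):
    """Approximate the stationary distribution by repeated multiplication."""
    pi = [0.25] * 4
    for _ in range(iterations):
        pi = [sum(pi[i] * matrix[i][j] for i in range(4)) for j in range(4)]
    return pi


def payoff(p, q):
    """A(p, q) = 3*pi_1 + 5*pi_3 + pi_4, as in equation (1)."""
    pi = stationary(transition_matrix(p, q))
    return 3 * pi[0] + 5 * pi[2] + pi[3]


eps = delta = 0.04
a = 1 - delta - eps * (1 - 2 * delta)   # TFT cooperates after observing C
b = delta + eps * (1 - 2 * delta)       # TFT cooperates after observing D
s_tft = (a, b, a, b)
s_alld = (eps, eps, eps, eps)

print(round(payoff(s_tft, s_tft), 4))    # 2.25
print(round(payoff(s_alld, s_alld), 4))  # 1.1184
```

For Tit-For-Tat against itself the matrix is doubly stochastic, so the stationary distribution is uniform and the payoff is (3 + 0 + 5 + 1)/4 = 2.25. For Always-Defect every row equals the stationary distribution, giving $3\epsilon^2 + 5\epsilon(1-\epsilon) + (1-\epsilon)^2 = 1.1184$, which rounds to the 1.12 reported on the slide.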


SLIDE 18

Computational Treatments

Treatment @ 4%

The likelihood of committing an implementation and a perception error is kept constant throughout the evolution at 4%.

Treatment @ 2%

The likelihood of committing an implementation and a perception error is kept constant throughout the evolution at 2%.

Treatment @ 1%

The likelihood of committing an implementation and a perception error is kept constant throughout the evolution at 1%.

Treatment @ 0%

The likelihood of committing an implementation and a perception error is kept constant throughout the evolution at 0%.


SLIDE 19

Evolution of Payoffs

1 The incorporation of errors is sufficient to alter the evolution of cooperative outcomes.


SLIDE 20

Prevailing Machines

The open-loop machine Always-Defect was the clear winner in the treatments at 4% and 2%.

Figure: Always-Defect

The structures that prevailed in the treatments at 1% and 0% were quite diverse and closed-loop.


SLIDE 21

Summary Measures

A state is accessible if, given the automaton’s starting state, there is some possible combination of the opponent’s actions that will result in a transition to that state.
The cooperation-reciprocity is the proportion of accessible states that respond to an observed cooperation by the opponent with a cooperation.
The defection-reciprocity is the proportion of accessible states that respond to an observed defection by the opponent with a defection.
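The three measures can be computed with a short graph search. The sketch below relies on an interpretation the slide does not make explicit: a state "responds to an observed cooperation with a cooperation" if the state it moves to after observing C outputs C, and likewise for defection. Machines are `(start, states)` pairs with each state a tuple `(action, next_on_C, next_on_D)`; all names are hypothetical.

```python
def accessible_states(machine):
    """Indices of states reachable from the start state."""
    start, states = machine
    seen, frontier = {start}, [start]
    while frontier:
        q = frontier.pop()
        for nxt in states[q][1:]:          # successors on observed C and D
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def reciprocity(machine):
    """Return (cooperation-reciprocity, defection-reciprocity)."""
    _, states = machine
    acc = accessible_states(machine)
    coop = sum(states[states[q][1]][0] == "C" for q in acc) / len(acc)
    defect = sum(states[states[q][2]][0] == "D" for q in acc) / len(acc)
    return coop, defect


# Tit-For-Tat padded with six unused states, Grim-Trigger, Always-Defect.
TFT = (0, [("C", 0, 1), ("D", 0, 1)] + [("D", 0, 0)] * 6)
GRIM = (0, [("C", 0, 1), ("D", 1, 1)])
ALLD = (0, [("D", 0, 0)])

print(sorted(accessible_states(TFT)))  # [0, 1]
print(reciprocity(TFT))                # (1.0, 1.0)
print(reciprocity(GRIM))               # (0.5, 1.0)
print(reciprocity(ALLD))               # (0.0, 1.0)
```

Restricting the measures to accessible states matters because a string can carry states that are never visited and so have no effect on behavior.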


SLIDE 22

Accessible States

1 The average number of accessible states is decreasing in the probability of errors.


SLIDE 23

Cooperation-Reciprocity

2 Cooperation-reciprocity is decreasing in the probability of errors.

SLIDE 24

Defection-Reciprocity

3 Defection-reciprocity is increasing in the probability of errors.

SLIDE 25

Machine Characteristics

Characteristic 1

The average number of accessible states is decreasing in the probability of errors.

Characteristic 2

Cooperation-reciprocity is decreasing in the probability of errors.

Characteristic 3

Defection-reciprocity is increasing in the probability of errors.


SLIDE 26

Example 1

Rubik’s cube has about 43 quintillion (4.3 × 10^19) possible initial positions.
Minimizing the number of moves to solve the cube would require an extremely complex pattern of adjustment from one particular scrambled position to another.
Cube experts have developed rigidly structured solving procedures that employ a small repertoire of solving patterns to unscramble the cube.


SLIDE 27

Example 2

This was the first book on card-counting. The book emphasized sophisticated card-counting and bet-variation methods. Later books steadily evolved towards more rigidly-structured methods.

No Need to Count (1980)
Winning Casino Blackjack for the Non-Counter (1981)


SLIDE 28

Revisiting Bendor, Kramer & Stout (1991)

The success of NAF is limited to the particular ecology.
NAF’s generosity lacks generalizability.

Difference-aversion models
Reciprocity models
Competitive models

Laboratory research has identified several factors that might diminish generosity among human strategists.

Dal Bó & Fréchette (2011) - “... these results cast doubt on the common assumption that agents will make the most of the opportunity to cooperate whenever it is possible to do so in equilibrium.”

Summary measures of the present study point to a very different direction.

The evolving machines exhibit readiness to sneak on the opponent. The evolving machines are relentless punishers of defections.


SLIDE 29

Summary Results

1 The evolution of cooperation becomes less likely as implementation and perception errors become more likely.

2 The study identifies a threshold error-level.

At and above the threshold error-level, the prevailing structures converge to the open-loop automaton Always-Defect.
Below the threshold, the prevailing automata are closed-loop and diverse.

3 Prevailing machines tend to be less complex as the likelihood of errors increases.

4 Cooperation-reciprocity is decreasing in the probability of errors.

5 Defection-reciprocity is increasing in the probability of errors.