Models of Language Evolution, Session 09: Evolution of Pragmatic Strategies
Roland Mühlenbernd, University of Tübingen

SLIDE 1

SIGNALING GAME BEHAVIOURAL STRATEGIES & UPDATE DYNAMICS MODELING PRAGMATIC PHENOMENA

Models of Language Evolution

Session 09: Evolution of Pragmatic Strategies
Roland Mühlenbernd, University of Tübingen

SLIDE 2

SIGNALING GAME: DEFINITION

A signaling game is a tuple SG = ⟨{S, R}, T, Pr, M, A, U_S, U_R⟩ with

◮ {S, R}: set of players
◮ T: set of states
◮ Pr ∈ Δ(T): prior beliefs over states
◮ M: set of messages
◮ A: set of receiver actions
◮ U_S, U_R: utility functions T × M × A → ℝ

SLIDE 3

SIGNALING GAME: EXAMPLE

A standard Lewis game is defined as:

◮ Set of players {S, R}
◮ Set of states T = {t1, t2}
◮ Equiprobable prior beliefs: Pr(t1) = Pr(t2) = .5
◮ Set of messages M = {m1, m2} (no costs)
◮ Set of actions A = {a1, a2}
◮ Utility function U_{S,R}(ti, mj, ak) = 1 if i = k, else 0:

        a1    a2
  t1   1,1   0,0
  t2   0,0   1,1

SLIDE 4

STRATEGIES

The players' "actions" can be represented as pure strategies. For the Lewis game there are 4 pure strategies for each player (all functions from states to messages for the sender, and from messages to actions for the receiver):

S1: t1 → m1, t2 → m2
S2: t1 → m2, t2 → m1
S3: t1 → m1, t2 → m1
S4: t1 → m2, t2 → m2

R1: m1 → a1, m2 → a2
R2: m1 → a2, m2 → a1
R3: m1 → a1, m2 → a1
R4: m1 → a2, m2 → a2

SLIDE 5

EXPECTED UTILITIES

The expected utility for a combination of strategies is given as:

EU(Si, Rj) = Σ_{t∈T} Pr(t) × U(t, Si(t), Rj(Si(t)))    (1)

        R1    R2    R3    R4
  S1     1     0    .5    .5
  S2     0     1    .5    .5
  S3    .5    .5    .5    .5
  S4    .5    .5    .5    .5

Table: Expected utilities for all strategy combinations of the Lewis game
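The table above can be reproduced in a few lines of Python. A minimal sketch, assuming states, messages, and actions are encoded as the indices 1 and 2, and that S3/S4 and R3/R4 are the pooling strategies:

```python
# Expected utility of a pure strategy pair in the Lewis game:
# EU(S_i, R_j) = sum over t of Pr(t) * U(t, S_i(t), R_j(S_i(t)))

PR = {1: 0.5, 2: 0.5}                      # equiprobable priors over t1, t2

def utility(t, a):
    """Lewis game payoff: 1 iff the receiver's action index matches the state index."""
    return 1.0 if t == a else 0.0

# Pure strategies: state -> message (sender), message -> action (receiver).
SENDERS   = [{1: 1, 2: 2}, {1: 2, 2: 1}, {1: 1, 2: 1}, {1: 2, 2: 2}]
RECEIVERS = [{1: 1, 2: 2}, {1: 2, 2: 1}, {1: 1, 2: 1}, {1: 2, 2: 2}]

def eu(sender, receiver):
    """Equation (1): expected utility of a pure strategy pair."""
    return sum(p * utility(t, receiver[sender[t]]) for t, p in PR.items())

table = [[eu(s, r) for r in RECEIVERS] for s in SENDERS]
```

Printing `table` reproduces the matrix: only the signaling systems ⟨S1, R1⟩ and ⟨S2, R2⟩ reach expected utility 1.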

SLIDE 6

SIGNALING GAMES AS STATIC GAMES

◮ static games: agents choose simultaneously
◮ SG as static game: agents choose strategies
◮ a strategy represents a "contingency plan": what would an agent do in each state

SLIDE 7

SIGNALING GAMES AS STATIC GAMES

Extensions:

1. multi-agent system
   ◮ graph G = ⟨N, V⟩ as interaction structure
   ◮ nodes represent agents, edges represent connections for interaction
   ◮ different structures: grid, small world...
2. update rules
   ◮ imitate the majority
   ◮ imitate the best
   ◮ conditional imitation
   ◮ best response
     ◮ against mixed strategy over all neighbours
     ◮ against mixed strategy over preplayed rounds (fictitious play)

SLIDE 8

SIGNALING GAMES AS DYNAMIC GAMES

◮ dynamic games: agents choose in sequence
◮ SG as dynamic game: agents play behavioural strategies
◮ a behavioural strategy represents a behaviour: what would an agent do for a given situation

SLIDE 9

BEHAVIOURAL STRATEGIES

Behavioural strategies are functions that map choice points to probability distributions over the actions available at that choice point.

◮ behavioural sender strategy: σ ∈ S = (Δ(M))^T
◮ behavioural receiver strategy: ρ ∈ R = (Δ(A))^M

Example:

σ: t1 → [m1 → .9, m2 → .1]; t2 → [m1 → .5, m2 → .5]
ρ: m1 → [a1 → .33, a2 → .67]; m2 → [a1 → 1, a2 → 0]

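In code, a behavioural strategy is naturally a nested mapping from choice points to distributions. A small sketch of the two example strategies (dict-based encoding is an illustrative choice, not part of the slides):

```python
# A behavioural strategy maps each choice point (a state for the sender,
# a message for the receiver) to a probability distribution over the
# options available there.

sigma = {'t1': {'m1': 0.9,  'm2': 0.1},
         't2': {'m1': 0.5,  'm2': 0.5}}     # sender: one element of Delta(M) per state

rho   = {'m1': {'a1': 0.33, 'a2': 0.67},
         'm2': {'a1': 1.0,  'a2': 0.0}}     # receiver: one element of Delta(A) per message

def is_behavioural(strategy, tol=1e-9):
    """True iff every choice point carries a proper probability distribution."""
    return all(abs(sum(dist.values()) - 1.0) < tol
               and all(p >= 0.0 for p in dist.values())
               for dist in strategy.values())
```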
SLIDE 10

SIGNALING GAMES AS DYNAMIC GAMES

Extensions:

1. multi-agent system
   ◮ graph G = ⟨N, V⟩ as interaction structure
   ◮ nodes represent agents, edges represent connections for interaction
   ◮ different structures: grid, small world...
2. update rules
   ◮ imitate the majority
   ◮ imitate the best
   ◮ conditional imitation
   ◮ best response
     ◮ against mixed strategy over all neighbours
     ◮ against mixed strategy over preplayed rounds (fictitious play)

SLIDE 11

BEHAVIOURAL STRATEGIES

◮ Behavioural strategies represent probabilistic behaviour

Example: σ(m1|t2) = .5: in state t2 the sender sends message m1 with a probability of .5

◮ Behavioural strategies represent beliefs

Example: ρ(a1|m1) = .33: the sender believes that the receiver construes message m1 as a1 with a probability of .33

σ: t1 → [m1 → .9, m2 → .1]; t2 → [m1 → .5, m2 → .5]
ρ: m1 → [a1 → .33, a2 → .67]; m2 → [a1 → 1, a2 → 0]

SLIDE 12

EXPECTED UTILITY & BEST RESPONSE

◮ While for a static SG the expected utility EU(Si, Rj) was defined for a strategy pair ⟨Si, Rj⟩, for a dynamic SG it is defined in the following way:

EU_S(m | t, ρ) = Σ_{a∈A} ρ(a|m) × U(t, m, a)    (2)

EU_R(a | m, σ) = Σ_{t∈T} σ(m|t) × U(t, m, a)    (3)

◮ The behavioural strategies σ and ρ represent beliefs about the interlocutor
◮ Just as for static games, to play best response means to maximize expected utility
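Equations (2) and (3) translate directly into code. A sketch for the Lewis game, assuming state and action names carry their index as the last character (an encoding chosen here for brevity):

```python
# EU_S(m | t, rho)   = sum over a of rho(a|m)   * U(t, m, a)   -- eq. (2)
# EU_R(a | m, sigma) = sum over t of sigma(m|t) * U(t, m, a)   -- eq. (3)

def utility(t, m, a):
    """Lewis game payoff: 1 iff state index and action index match."""
    return 1.0 if t[-1] == a[-1] else 0.0

def eu_sender(m, t, rho):
    return sum(p * utility(t, m, a) for a, p in rho[m].items())

def eu_receiver(a, m, sigma):
    return sum(dist[m] * utility(t, m, a) for t, dist in sigma.items())

def best_response_sender(t, rho):
    return max(rho, key=lambda m: eu_sender(m, t, rho))

def best_response_receiver(m, sigma, actions=('a1', 'a2')):
    return max(actions, key=lambda a: eu_receiver(a, m, sigma))
```

With the beliefs from the belief-learning example later in the deck (ρ(a1|m1) = .8, σ(m1|t1) = .6), these functions return m1 and a1 as best responses.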

SLIDE 13

BELIEF LEARNING

◮ Behavioural strategies represent beliefs about the interlocutor
◮ The belief is a result of observation
◮ Example:

Sender's observation counts (SO):

        a1    a2
  m1     8     2
  m2     7    13

→ ρ: m1 → [a1 → .8, a2 → .2]; m2 → [a1 → .35, a2 → .65]

Receiver's observation counts (RO):

        t1    t2
  m1     6     0
  m2     4     4

→ σ: t1 → [m1 → .6, m2 → .4]; t2 → [m1 → 0, m2 → 1]

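Belief learning then amounts to normalising observation counts into distributions. A sketch using the counts from the slide above:

```python
# Turn per-choice-point observation counts into a behavioural strategy
# (the agent's belief about the interlocutor).

def belief(counts):
    return {cp: {x: n / sum(obs.values()) for x, n in obs.items()}
            for cp, obs in counts.items()}

# Sender's observations of the receiver, per message:
SO = {'m1': {'a1': 8, 'a2': 2}, 'm2': {'a1': 7, 'a2': 13}}
# Receiver's observations of the sender, per state:
RO = {'t1': {'m1': 6, 'm2': 4}, 't2': {'m1': 0, 'm2': 4}}

rho   = belief(SO)   # m1 -> {a1: .8, a2: .2},  m2 -> {a1: .35, a2: .65}
sigma = belief(RO)   # t1 -> {m1: .6, m2: .4},  t2 -> {m1: 0,  m2: 1}
```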
SLIDE 14

BELIEF LEARNING & BEST RESPONSE

◮ After a played game both interlocutors can observe the resulting game path and update their beliefs accordingly
◮ Example:
◮ Given the following observations and the corresponding beliefs:

  SO:   a1    a2        RO:   t1    t2
  m1   8+1     2        m1   6+1     0
  m2     7    13        m2     4     4

◮ the sender is faced with state t1
◮ EU(m1|t1, ρ) = .8, EU(m2|t1, ρ) = .35
◮ m1 maximises EU, thus sending m1 is the best response
◮ the receiver has to construe message m1:
◮ EU(a1|m1, σ) = .6, EU(a2|m1, σ) = 0
◮ a1 maximises EU, thus playing a1 is the best response
◮ both players observe the resulting game path ⟨t1, m1, a1⟩ and update observation counts and beliefs accordingly
SLIDE 15

EXAMPLE: RESULT IN A SW NETWORK

Figure: Resulting structure after 30 simulation steps of 100 BL agents playing the Lewis game on a SW network. The colours blue and green represent both signaling systems as target strategies.
SLIDE 16

REINFORCEMENT LEARNING

For a given signaling game SG:

◮ the sender has an urn ℧t for each t ∈ T, filled with balls of a type m ∈ M
◮ the receiver has an urn ℧m for each m ∈ M, filled with balls of a type a ∈ A
◮ Example:

         m1    m2               a1    a2
  ℧t1     8     2        ℧m1     2     3
  ℧t2     7    13        ℧m2     7     1

SLIDE 17

REINFORCEMENT LEARNING

◮ If the sender is faced with state t, she draws a ball from urn ℧t and sends the message appropriate to the ball type m.
◮ If the receiver receives message m, he draws a ball from urn ℧m and plays the action appropriate to the ball type a.
◮ Behavioural strategies represent probabilistic behaviour
◮ After a played round, successful communication will be reinforced

σ: t1 → [m1 → .8, m2 → .2]; t2 → [m1 → .35, m2 → .65]
ρ: m1 → [a1 → .4, a2 → .6]; m2 → [a1 → .875, a2 → .125]

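The urn dynamics can be sketched in a few lines of Python. A minimal Roth-Erev style sketch, assuming the urn contents from the previous slide and the index-matching success criterion of the Lewis game:

```python
import random

# The sender draws from her urn for state t, the receiver from his urn
# for message m; after a successful round the drawn ball types are
# reinforced by adding one ball of the same type.

sender_urns   = {'t1': {'m1': 8, 'm2': 2}, 't2': {'m1': 7, 'm2': 13}}
receiver_urns = {'m1': {'a1': 2, 'a2': 3}, 'm2': {'a1': 7, 'a2': 1}}

def draw(urn, rng=random):
    """Draw a ball type with probability proportional to its count."""
    types = list(urn)
    return rng.choices(types, weights=[urn[b] for b in types])[0]

def play_round(t, rng=random):
    m = draw(sender_urns[t], rng)
    a = draw(receiver_urns[m], rng)
    success = t[-1] == a[-1]                # t_i communicated iff a_i is played
    if success:
        sender_urns[t][m] += 1              # reinforce the successful draws
        receiver_urns[m][a] += 1
    return (t, m, a), success
```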
SLIDE 18

REINFORCEMENT LEARNING

Example: given the following urn settings:

         m1    m2               a1    a2
  ℧t1     8   2+1        ℧m1     2     3
  ℧t2     7    13        ℧m2   7+1     1

Play 1:
◮ the sender is faced with t1
◮ she draws ball type m2 from urn ℧t1
◮ the receiver has to construe m2
◮ he draws ball type a1 from urn ℧m2
◮ communication via ⟨t1, m2, a1⟩ is successful: reinforcement

Play 2:
◮ the sender is faced with t2
◮ she draws ball type m1 from urn ℧t2
◮ the receiver has to construe m1
◮ he draws ball type a1 from urn ℧m1
◮ communication via ⟨t2, m1, a1⟩ isn't successful: no reinforcement

SLIDE 19

REINFORCEMENT LEARNING

Possible extensions:

◮ negative reinforcement: decrease the number of appropriate balls if communication is not successful
◮ lateral inhibition: for successful communication, not only increase the number of appropriate balls but also decrease the number of all other balls in the same urn
◮ limited memory: consider only the last n observations
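The lateral-inhibition variant can be sketched in a few lines. The floor of 1 ball (so no option ever becomes impossible to draw) is an assumption of this sketch, not part of the slide:

```python
# On success, add `amount` balls of the drawn type and remove `amount`
# balls of every other type in the same urn, never dropping below `floor`.

def reinforce_lateral_inhibition(urn, drawn, amount=1, floor=1):
    for ball in urn:
        if ball == drawn:
            urn[ball] += amount
        else:
            urn[ball] = max(floor, urn[ball] - amount)
    return urn
```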

SLIDE 20

EXAMPLE: RESULT IN A SW NETWORK

Figure: Resulting structure after 300 simulation steps of 100 RL agents playing the Lewis game (with lateral inhibition) on a SW network. The colours blue and green represent both signaling systems as target strategies.

SLIDE 21

BELIEF LEARNING VS. REINFORCEMENT LEARNING

            behavioural   rational   learning speed
  BL + BR        √            √          fast
  RL             √                       slow
SLIDE 22

NEO-GRICEAN PRAGMATICS

◮ A conversational implicature is a pragmatic phenomenon where an utterance's intended meaning differs from its literal meaning.
◮ Interlocutors can resolve the difference between the intended pragmatic interpretation (PI) and the literal interpretation (LI) by cooperation principles. Levinson (2000) subdivided GCIs into:
   ◮ Q-implicature
   ◮ I-implicature
   ◮ M-implicature

SLIDE 23

Q-IMPLICATURE

(1) "Some boys came to the party."
    LI: Some, maybe all, boys came. ∃ = (∃¬∀) ∨ ∀
    PI: Some but not all boys came. ∃¬∀

[Diagrams: strategies for LI and PI over states {t∀, t∃¬∀}, messages {mall, msome, msbna}, and actions {a∀, a∃¬∀}]

SLIDE 24

MODELLING Q-IMPLICATURE

Parameter settings:

◮ T = {t∀, t∃¬∀}
◮ M = {mall, msome, msbna}
◮ A = {a∀, a∃¬∀}
◮ Pr(t∀) = Pr(t∃¬∀) = .5
◮ κ(msbna) = 1, κ(mall) = κ(msome) > 1
◮ Initial LI strategy (urn setting; blank cells are 0):

          mall   msome   msbna                 a∀    a∃¬∀
  ℧t∀      50      50       0        ℧mall    100      0
  ℧t∃¬∀     0      50      50        ℧msome    50     50
                                     ℧msbna     0    100
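The initial urn setting can be written down directly. Blank table cells are read as 0 here, matching the idea that the LI urns contain only literally compatible messages and actions; the ASCII identifiers are illustrative:

```python
# Initial urns for the Q-implicature game under the literal
# interpretation (LI): m_all is true only of t_all, m_sbna ("some but
# not all") only of t_sbna, and m_some of both states.

sender_urns = {
    't_all':  {'m_all': 50, 'm_some': 50, 'm_sbna': 0},
    't_sbna': {'m_all': 0,  'm_some': 50, 'm_sbna': 50},
}
receiver_urns = {
    'm_all':  {'a_all': 100, 'a_sbna': 0},
    'm_some': {'a_all': 50,  'a_sbna': 50},
    'm_sbna': {'a_all': 0,   'a_sbna': 100},
}
```

Normalising each urn recovers the behavioural LI strategy, e.g. σ(m_all | t_all) = 50/100 = .5.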

SLIDE 25

SIMULATION & RESULTS

◮ 200 RL agents play the Q-implicature game repeatedly on a total network with random partners
◮ all agents start with the initial urn setting that represents LI
◮ the simulation ends once all agents have learned a pure strategy

Results: [diagrams: the emerging pure strategies, and a plot of the % of agents per outcome against κ(msome), κ(mall) ranging from 1 to 5]

SLIDE 26

I-IMPLICATURE

"What is expressed simply is stereotypically exemplified."

(2) "Billy drank a glass of milk."
    LI: A glass of any kind of milk. tc, tg
    PI: A glass of cow's milk. tc

[Diagrams: strategies for LI and PI over states {tc, tg}, messages {mcm, mm, mgm}, and actions {ac, ag}]

SLIDE 27

MODELLING I-IMPLICATURE

Parameter settings:

◮ T = {tc, tg}
◮ M = {mm, mcm, mgm}
◮ A = {ac, ag}
◮ Pr(tc) = .8 > Pr(tg) = .2
◮ κ(mm) = 2, κ(mcm) = κ(mgm) = 1
◮ Initial LI strategy (urn setting, with n = ⌊100 × p⌋; blank cells are 0):

          mcm     mm     mgm                 ac    ag
  ℧tc    100−n     n       0        ℧mcm    100     0
  ℧tg       0      n    100−n       ℧mm      50    50
                                    ℧mgm      0   100

SLIDE 28

SIMULATION & RESULTS

◮ 200 RL agents play the I-implicature game repeatedly on a total network with random partners
◮ all agents start with the initial urn setting that represents LI
◮ the simulation ends once all agents have learned a pure strategy

Results: [diagrams: the emerging pure strategies, and a plot of the % of agents per outcome against p ranging from .3 to .6]

SLIDE 29

M-IMPLICATURE

"What's said in an abnormal way isn't normal."

(3) "Billy caused the sheriff to die."
    LI: Billy killed the sheriff in any way. tp, tr
    PI: Billy killed the sheriff in an abnormal way. tr

[Diagrams: strategies for LI and PI over states {tp, tr}, messages {mk, mctd}, and actions {ap, ar}]

SLIDE 30

MODELLING THE M-IMPLICATURE

Parameter settings:

◮ T = {tp, tr}
◮ M = {mk, mctd}
◮ A = {ap, ar}
◮ κ(mk) = 2, κ(mctd) = 1
◮ Pr(tp) > Pr(tr)
◮ Initial LI strategy (urn setting):

         mk    mctd               ap    ar
  ℧tp    50     50       ℧mk      50    50
  ℧tr    50     50       ℧mctd    50    50

SLIDE 31

SIMULATION & RESULTS

◮ 200 RL agents play the M-implicature game repeatedly on a total network with random partners
◮ all agents start with the initial urn setting that represents LI
◮ the simulation ends once all agents have learned a pure strategy

Results: [diagrams: the emerging pure strategies, and a plot of the % of agents per outcome against Pr(tp) ranging from .51 to .57]

SLIDE 32

SUMMARY

◮ the difference between
   ◮ a static SG (agents play pure strategies)
   ◮ a dynamic SG (agents play behavioural strategies)
◮ update dynamics for dynamic signaling games
   ◮ belief learning + best response
   ◮ reinforcement learning
◮ signaling games for neo-Gricean implicature types