Peer Pressure as a Driver of Adaptation in Agent Societies (Hugo Carr)


  1. Peer Pressure as a Driver of Adaptation in Agent Societies
     Hugo Carr (1), Jeremy Pitt (1) and Alexander Artikis (2,1)
     (1) Imperial College London
     (2) National Centre for Scientific Research "Demokritos"
     {h.carr,j.pitt}@imperial.ac.uk, a.artikis@iit.demokritos.gr
     ESAW 2008, St Etienne, France, Sep 2008
     Thanks to: UK EPSRC, EU FP6 Project 027958 ALIS
     Peer Pressure . . . 1

  2. Background
     • Characteristics of networks
       – open: agents are heterogeneous and may have competing or conflicting goals
       – fault-tolerant: agents may not conform to the system specification
       – volatile-tolerant: agents may come/go, join/leave the system
       – decentralised: there is no central control mechanism
       – partial: local knowledge, (possibly) inconsistent global union
     • Agent Societies
       – Accountable governance, market economy, Rule of Law
       – Mutable: "tomorrow can be different from today"
       – Socio-cognitive relations: trust/forgiveness, gossiping

  3. Motivation
     • Resource allocation scenario where not all requirements can be satisfied
       – Common feature of e.g. ad hoc networks
     • Two options:
       – Free for all: short-term gain, long-term annihilation
       – Do what people do: form committee, make up rules, . . .
     • Previous work (OAMAS08)
       – Allocation according to vote, change the voting rules
       – Showed: population of 'responsible' agents stabilised the system
       – Now: given a stable system, show resistance to 'selfish' behaviour
       – Moreover: given a choice (responsible/selfish), agents 'choose' responsible (or have it chosen for them...)

  4. How you gonna do that?
     • Voting
       – voting about the rule
       – voting for each other
     • Learning (individual behaviour)
     • Reputation (individual opinion formation)
     • Show that Organised Adaptation
       – is stable
       – is robust

  5. Formal Model
     • Let M be a multi-agent system (MAS) at time t:
       $$M_t = \langle U, \langle A, \rho, B, f, \tau \rangle_t \rangle$$
       – $U$ = the set of agents
       – $A_t \subseteq U$, the set of present agents at $t$
       – $\rho_t : U \to \{0, 1\}$, the presence function s.t. $\rho_t(a) = 1 \leftrightarrow a \in A_t$
       – $B_t : \mathbb{Z}$, the 'bank', indicating the overall system resources available
       – $\tau_t : \mathbb{N}$, the threshold number of votes to be allocated resources
       – $f_t : A_t \to \mathbb{N}_0$
     The resource allocation function $f_t$ determines who gets allocated resources according to the value of $\tau_t$ and the votes cast (see below)
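As an illustration of the tuple above, the per-timeslice state could be held in a small record. This is only a sketch: the class name `MASState`, its field names, and the choice of a dictionary for $f_t$ are assumptions made for the example, not part of the presented model.

```python
from dataclasses import dataclass, field


@dataclass
class MASState:
    """Hypothetical snapshot of M_t for one timeslice."""
    universe: frozenset   # U: the set of all agents
    present: set          # A_t: agents present at time t (subset of U)
    bank: int             # B_t: overall system resources available
    tau: int              # tau_t: votes needed to be allocated resources
    allocation: dict = field(default_factory=dict)  # f_t: agent -> amount granted

    def __post_init__(self):
        # Enforce A_t ⊆ U from the definition.
        if not self.present <= self.universe:
            raise ValueError("present agents must be a subset of the universe")

    def rho(self, agent) -> int:
        """Presence function rho_t: 1 iff the agent is in A_t."""
        return 1 if agent in self.present else 0


state = MASState(
    universe=frozenset({"a1", "a2", "a3"}),
    present={"a1", "a3"},
    bank=100,
    tau=2,
)
```

Keeping `rho` as a method derived from `present` means the two can never disagree, which mirrors the "if and only if" in the definition.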

  6. Scenario
     • System operation is divided into timeslices; during each timeslice, each 'present' agent a will
       – Phase 1: Vote for threshold value for τ (change a rule)
       – Phase 2: Offer ($O_a$) / Request ($R_a$) resources ($R_a > O_a$)
       – Phase 3: Vote for a candidate(s) to receive resources
       – Phase 4: Update its satisfaction and learning metrics with respect to the outcome of the vote

  7. Phase 1: Voting for τ
     • Tau (τ) represents the threshold number of votes required to receive resources (at time t)
       $$f_t(a) = \begin{cases} R_a^t, & \mathrm{card}(\{ b \mid b \in A_t \wedge v_b^t(\ldots) = a \}) \geq \tau_t \\ 0, & \text{otherwise} \end{cases}$$
     • The value of τ is context dependent and crucial for 'collective well-being'
       – If τ is too low, too many resources will be distributed, and this will result in the "Tragedy of the Commons"
       – If τ is too high, too few resources will be distributed, and this will result in "Voting with your Feet" (satisfaction)
     • Each timeslice t, two-round election
       – round 1: each present agent proposes a value for τ
       – round 2: run-off election between two most popular selections
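The threshold rule and the two-round election can be sketched as follows. Several details are assumptions for the sake of a runnable example: each voter casts a single vote, ties are broken by first occurrence, and in the run-off each agent backs the finalist closest to its own proposal (the slide does not say how agents choose in round 2). The names `allocate`, `elect_tau` and `closest` are hypothetical.

```python
from collections import Counter


def allocate(requests, votes, tau):
    """f_t: grant agent a its request R_a if it received at least tau votes, else 0.

    requests: dict agent -> requested amount
    votes:    dict voter -> candidate voted for (one vote per voter, an assumption)
    """
    tally = Counter(votes.values())
    return {a: (r if tally[a] >= tau else 0) for a, r in requests.items()}


def elect_tau(proposals, preferred):
    """Two-round election for tau.

    Round 1: every agent proposes a value; the two most popular values advance.
    Round 2: every agent picks one finalist via preferred(agent, finalists).
    """
    finalists = [value for value, _ in Counter(proposals.values()).most_common(2)]
    if len(finalists) == 1:
        return finalists[0]
    runoff = Counter(preferred(agent, finalists) for agent in proposals)
    return runoff.most_common(1)[0][0]


requests = {"a": 5, "b": 3, "c": 4}
votes = {"a": "b", "b": "a", "c": "a"}
granted = allocate(requests, votes, tau=2)

proposals = {"a": 2, "b": 2, "c": 3, "d": 4, "e": 3}


def closest(agent, finalists):
    # Assumed round-2 behaviour: back the finalist nearest to one's own proposal.
    return min(finalists, key=lambda v: abs(v - proposals[agent]))


winner = elect_tau(proposals, closest)
```

In this toy run only agent "a" reaches the threshold of two votes, and the run-off between the proposals 2 and 3 is decided by agent "d", whose own proposal of 4 is nearer to 3.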

  8. Phase 2: Reputation Management
     • Vote for τ is an indicator of selfish/responsible behaviour
     • For experimentation, require a method that computes τ 'responsibly', supports discrimination, and isn't random
       – define a family of predictor functions, randomly initialised, a subset of which is given to each agent
       – functions which return a 'good' value have increased weight
       $$w_i = \frac{x_i}{\sum_{\forall j} x_j} \qquad pred_\tau = \sum_{i=0}^{j} w_i \cdot a_i$$
     • Agent uses other agents' τ-voting to update opinion of those agents
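The weighted prediction can be illustrated with a few lines of Python. The slide does not define $x_i$ or $a_i$; this sketch assumes $x_i$ is a non-negative score accumulated by predictor $i$ and $a_i$ is the value that predictor currently outputs. The function name `predict_tau` is hypothetical.

```python
def predict_tau(scores, outputs):
    """pred_tau = sum_i w_i * a_i, with w_i = x_i / sum_j x_j.

    scores:  x_i, assumed to be the accumulated score of predictor i
    outputs: a_i, assumed to be the tau value predictor i currently suggests
    """
    total = sum(scores)
    if total == 0:
        raise ValueError("at least one predictor needs a non-zero score")
    weights = [x / total for x in scores]
    return sum(w * a for w, a in zip(weights, outputs))


# Two predictors: the second has three times the score of the first,
# so its suggestion (4) dominates the first one's suggestion (2).
prediction = predict_tau(scores=[1, 3], outputs=[2, 4])
```

Because the weights are normalised to sum to one, the prediction always lies between the smallest and largest predictor output.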

  9. Phase 3: Voting to Allocate Resources
     • Plurality Protocol is ineffective
       – Does not provide information to effectively judge selfish or responsible behaviour
       – Punishment in the form of lost votes is not sufficient motivation to behave responsibly
     • Borda Protocol
       – Agents vote using preference lists derived from reputation score
       – Points are allocated based on 'most preferred'
       – Agents are forced to give their opinion of their neighbours
         ∗ Allows a participant to see more easily who is behaving responsibly or selfishly
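A Borda-style tally driven by reputation scores might look like the sketch below. It uses the classic scoring (with n candidates on a ballot, first place earns n-1 points and last place earns 0), which is an assumption: the slide only says that points follow 'most preferred'. The names `ballot_from_reputation` and `borda_tally` are hypothetical.

```python
def ballot_from_reputation(reputation):
    """Turn one agent's reputation scores into a preference list, best first."""
    return sorted(reputation, key=reputation.get, reverse=True)


def borda_tally(ballots):
    """Sum Borda points over all ballots.

    ballots: dict voter -> list of candidates, most preferred first.
    Classic scoring is assumed: position p on a ballot of length n earns n-1-p.
    """
    points = {}
    for ranking in ballots.values():
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            points[candidate] = points.get(candidate, 0) + (n - 1 - position)
    return points


# Each agent holds an opinion (reputation score) about its neighbours.
opinions = {
    "a": {"b": 0.9, "c": 0.2},
    "b": {"a": 0.5, "c": 0.7},
    "c": {"a": 0.4, "b": 0.8},
}
ballots = {voter: ballot_from_reputation(rep) for voter, rep in opinions.items()}
points = borda_tally(ballots)
```

Unlike a plurality vote, every ballot here ranks all neighbours, so a low total such as the one agent "a" receives is visible to everyone as a signal of poor standing.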

  10. Phase 4: Reinforcement Learning
     • Used to demonstrate how an initially selfish agent can be 'rehabilitated' through peer pressure
     • Unbiased evaluation of sets of actions
     • A Q-Value is a metric which measures, from a history of length m, how successful an action x has been in a certain state s when each action is assigned a reward r
       $$Q_{t+1}(s, x) = \frac{1}{m} \sum_{i=1}^{m} \left( r_{k_i} + \gamma V_{k_i}(s_{k_i}) \right) + \epsilon$$
       where $V_t = \max_{x \in X} Q_t(s, x)$, $r_k \in [0, 1]$, $\gamma \in [0, 1]$
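The averaged update above can be written out directly. This is a sketch under stated assumptions: the history is passed in as (reward, value) pairs for the last m times the action was taken in the state, and ε is treated as a plain additive term because the slide does not say what it represents. The names `q_update` and `state_value` are hypothetical.

```python
def state_value(q_row):
    """V = max over actions x of Q(s, x), for one state's action -> Q mapping."""
    return max(q_row.values())


def q_update(history, gamma, epsilon=0.0):
    """Q_{t+1}(s, x) = (1/m) * sum_i (r_i + gamma * V_i) + epsilon.

    history: list of (reward, value) pairs for the last m uses of action x in
             state s, where value is V of the state recorded at that step.
    epsilon: additive term from the formula; its role is not given in the source.
    """
    m = len(history)
    if m == 0:
        raise ValueError("history must contain at least one sample")
    return sum(r + gamma * v for r, v in history) / m + epsilon


# Two remembered outcomes for the action "responsible" in some state.
history = [(1.0, 0.5), (0.0, 0.5)]
q_responsible = q_update(history, gamma=0.5)

# The value of the state is the best Q-value over the available actions.
v = state_value({"responsible": q_responsible, "selfish": 0.4})
```

With rewards and γ both in [0, 1], averaging over a fixed-length history keeps the estimate bounded and lets old experience drop out as the window moves.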

  11. Experiment
     • Initially we show that the system is stable amongst a group (size 10) of these agents who have already established a stable system
     • We then add a destabilising element to the system at timecycle 3000, consisting of a set of agents (size 5) behaving selfishly
       – Agents who learn to behave responsibly are forgiven and assimilated into society
       – Agents who fail to learn are permanently ostracised and leave the system (through dissatisfaction)
     • Use a certain 'well-known' MAS animator, PreSAGE

  12. Results (1.1): Satisfaction for Responsible Agents
     [Figure: Graph of Agent Satisfaction. x-axis: Simulation Timeslice (0 to 5000); y-axis: Satisfaction (0 to 1). Series: Satisfaction of Responsible Population (10 Agents); Satisfaction of Initially Selfish Population which turned Responsible (5 Agents).]

  13. Results (1.2): Q-Values for Responsible Agents
     [Figure: x-axis: Simulation Cycle (0 to 6000); y-axis: Q Values (0 to 1). Series: Average Responsible metric for the main responsible population; Responsible Q Value estimate for selfish agents who are learning; Selfish Q Value estimate for selfish agents who are learning.]

  14. Results (2.1): Satisfaction for a Selfish Agent
     [Figure: x-axis: Simulation Cycle (0 to 6000); y-axis: Satisfaction (0 to 1). Series: Satisfaction of the main population of responsible agents; Satisfaction of the agent who was initially selfish and did not learn to behave responsibly.]

  15. Results (2.2): Q-Values for a Selfish Agent
     [Figure: x-axis: Simulation Cycle (0 to 6000); y-axis: Q Values (0 to 1). Series: Responsible Q Value estimate for agent13; Selfish Q Value estimate for agent13; Average Responsible metric for the main responsible population.]

  16. Summary (and duck)
     • Additional supporting evidence for Axelrod's study of emergent norms
     • Organised adaptation:
       – the introspective application of soft-wired local computations, with respect to physical rules, the environment and conventional rules, in order to achieve intended and coordinated global outcomes
     • as opposed to
     • Emergent adaptation:
       – the non-introspective application of hard-wired local computations, with respect to physical rules and/or the environment, which achieve unintended or unknown global outcomes
