Agent-Based Systems: Multiagent Interactions

  1. Agent-Based Systems, Lecture 8 – Multiagent Interactions
     Michael Rovatsos (mrovatso@inf.ed.ac.uk)

Where are we?

Last time . . .
• Coordination: managing interactions effectively
• Different methods for coordination
  • Partial global planning: achieving a global view through information exchange
  • Joint intentions: extending the BDI paradigm to include joint intentions, collective commitments and conventions
  • Mutual modelling: taking the role of the other to predict their actions
  • Norms and social laws: coordination through offline/emergent constraints on agent behaviour
  • Multiagent planning and synchronisation, plan merging

Today . . .
• Multiagent Interactions

Multiagent interactions
• We have looked at agent communication, but not described how it is used in actual agent interactions
• In itself, communication does not have much effect on the agents
• Now, we are going to look at interactions in which agents affect each other through their actions
• Assume agents have “spheres of influence” that they control in the environment
• Also, we assume that the welfare (goal achievement, utility) of each agent at least partially depends on the actions of others
• This part of the lecture will deal with what agents should do in the presence of other agents (which also do stuff)

Preferences and utilities
• We first need an abstract model of interactions
• Assume $O = \{o_1, \dots, o_n\}$ is a set of possible outcomes (e.g. possible “runs” of the system until final states are reached)
• A preference ordering $\succ_i \subseteq O \times O$ for agent $i$ is a total, antisymmetric, transitive relation on $O$, i.e.
  • $o \succ_i o' \Rightarrow o' \not\succ_i o$
  • $o \succ_i o' \land o' \succ_i o'' \Rightarrow o \succ_i o''$
  • $\forall o, o' \in O$ with $o \neq o'$: either $o \succ_i o'$ or $o' \succ_i o$
• Such an ordering can be used to express strict preferences of an agent over $O$ (write $\succeq_i$ if also reflexive, i.e. $o \succeq_i o$)
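The three conditions on a strict preference ordering can be checked mechanically on a small outcome set. The following is a minimal sketch, assuming outcomes are plain strings and the relation is given as a Python predicate; the names `is_strict_preference`, `ranking` and `cycle` are hypothetical and not part of the lecture.

```python
from itertools import product

def is_strict_preference(outcomes, prefers):
    """Check the three slide conditions for a strict preference relation."""
    # o > o' implies not o' > o
    asymmetric = all(not (prefers(a, b) and prefers(b, a))
                     for a, b in product(outcomes, repeat=2))
    # o > o' and o' > o'' implies o > o''
    transitive = all(prefers(a, c)
                     for a, b, c in product(outcomes, repeat=3)
                     if prefers(a, b) and prefers(b, c))
    # any two distinct outcomes are comparable
    total = all(prefers(a, b) or prefers(b, a)
                for a, b in product(outcomes, repeat=2) if a != b)
    return asymmetric and transitive and total

# Hypothetical outcomes, listed from most to least preferred.
ranking = ["o1", "o2", "o3"]

def prefers(a, b):
    return ranking.index(a) < ranking.index(b)

# A cyclic relation (o1 > o2 > o3 > o1) violates transitivity.
cycle = {("o1", "o2"), ("o2", "o3"), ("o3", "o1")}

def prefers_cyclic(a, b):
    return (a, b) in cycle

print(is_strict_preference(ranking, prefers))         # True
print(is_strict_preference(ranking, prefers_cyclic))  # False
```

The cyclic case illustrates why transitivity matters: an agent with such "preferences" could not rank its outcomes at all.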

  2. Preferences and utilities
• Preferences are often expressed through a utility function $u_i : O \to \mathbb{R}$:
  $u_i(o) > u_i(o') \Leftrightarrow o \succ_i o'$, $u_i(o) \geq u_i(o') \Leftrightarrow o \succeq_i o'$
• Utilities make representing preferences easier because the ordering follows naturally if we use real numbers
• Often, people falsely associate utility directly with money!
• Intuitively, the utility of money depends on how much money one already has
• Therefore, utility does not increase proportionally with monetary wealth

Preferences and utilities
The utility of money:
• Empirical evidence suggests that the utility of money is often very close to a logarithmic function for humans
• This shows that the utility function depends on the agent’s attitude to risk (the value of an additional unit depends on current “wealth”)

Multiagent encounters
• Applying the above to a multiagent setting, we need to consider several agents’ actions and the outcomes they lead to
• For now, we restrict ourselves to two players and identical sets of actions
• Abstract architecture: the state transformer function becomes
  $\tau : Ac \times Ac \to O$
  where $Ac$ is the set of actions of each of the two agents
• Outcome depends on the other’s actions!
• For pairs $(a_1, a_2), (a_1', a_2') \in Ac \times Ac$ we can write
  $(a_1, a_2) \succeq_i (a_1', a_2')$ iff $\tau(a_1, a_2) \succeq_i \tau(a_1', a_2')$
  (similarly for $\succ_i$ and utilities $u_{1/2}(\tau(a_1, a_2))$)
• We consider agents to be rational if they prefer actions that lead to preferred outcomes

Example: The Prisoner’s Dilemma
• Two men are collectively charged with a crime and held in separate cells, with no way of meeting or communicating. They are told that:
  • if one confesses and the other does not, the confessor will be freed, and the other will be jailed for three years;
  • if both confess, then each will be jailed for two years.
  Both prisoners know that if neither confesses, then they will each be jailed for one year.
• Payoff matrix for this game (rows: agent 1, columns: agent 2; entries are utilities $(u_1, u_2)$, not years; C = cooperate, i.e. do not confess, D = defect, i.e. confess):

                 2: C     2: D
        1: C    (3,3)    (0,5)
        1: D    (5,0)    (1,1)
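The state transformer and the payoff matrix can be written down directly. This is a minimal sketch, assuming outcomes are represented by two-letter labels such as "CD" (agent 1 cooperates, agent 2 defects); the helper names `tau`, `utility` and `weakly_prefers` are illustrative choices, not the lecture's notation in code form.

```python
# Payoffs (u1, u2) copied from the Prisoner's Dilemma matrix above.
PAYOFF = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def tau(a1, a2):
    """State transformer: map a joint action to an outcome label (assumed encoding)."""
    return a1 + a2

def utility(agent, outcome):
    """Utility of an outcome label for agent 1 or agent 2."""
    return PAYOFF[(outcome[0], outcome[1])][agent - 1]

def weakly_prefers(agent, joint, other_joint):
    """(a1, a2) is weakly preferred to (a1', a2') iff its outcome has at least the same utility."""
    return utility(agent, tau(*joint)) >= utility(agent, tau(*other_joint))

# The outcome of agent 1's action depends on what agent 2 does.
for a2 in "CD":
    print("agent 1 plays C, agent 2 plays", a2, "-> u1 =", utility(1, tau("C", a2)))

print(weakly_prefers(1, ("D", "C"), ("C", "C")))  # True: 5 >= 3
print(weakly_prefers(2, ("D", "C"), ("C", "C")))  # False: 0 < 3
```

The last two lines show that the same pair of joint actions is ranked in opposite ways by the two agents, which is what makes the encounter strategic.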

  3. Game theory
• Mathematical study of interaction problems of this sort
• Basic model: agents perform simultaneous actions (potentially over several stages); the actual outcome depends on the combination of actions chosen by all agents
• Normal-form games: final result reached in a single step (in contrast to extensive-form games)
• Agents $\{1, \dots, n\}$, $S_i$ = set of (pure) strategies for agent $i$, $S = \times_{i=1}^{n} S_i$ space of joint strategies
• Utility functions $u_i : S \to \mathbb{R}$ map joint strategies to utilities
• A probability distribution $\sigma_i : S_i \to [0, 1]$ is called a mixed strategy of agent $i$ (can be extended to joint strategies)
• Game theory is concerned with the study of this kind of game (in particular, developing solution concepts for games)

Dominance and Best Response Strategies
• Two simple and very common criteria for rational decision making in games
• Strategy $s \in S_i$ is said to dominate $s' \in S_i$ iff
  $\forall s_{-i} \in S_{-i}: \; u_i(s, s_{-i}) \geq u_i(s', s_{-i})$
  ($s_{-i} = (s_1, \dots, s_{i-1}, s_{i+1}, \dots, s_n)$, same abbreviation used for $S$)
• Dominated strategies can be safely deleted from the set of strategies; a rational agent will never play them
• Some games are solvable in dominant strategy equilibrium, i.e. all agents have a single (pure/mixed) strategy that dominates all other strategies

Dominance and Best Response Strategies
• Strategy $s \in S_i$ is a best response to strategies $s_{-i} \in S_{-i}$ iff
  $\forall s' \in S_i, s' \neq s: \; u_i(s, s_{-i}) \geq u_i(s', s_{-i})$
• Weaker notion, only considers the optimal reaction to a specific behaviour of the other agents
• Unlike dominant strategies, best-response strategies (trivially) always exist
• Strict versions of the above relations require that “$>$” holds for at least one $s'$
• Replace $s_i / s_{-i}$ above by $\sigma_i / \sigma_{-i}$ and you can extend the definitions for dominant/best-response strategies to mixed strategies

Nash Equilibrium
• Nash (1951) defined the most famous equilibrium concept for normal-form games
• A joint strategy $s \in S$ is said to be in (pure-strategy) Nash equilibrium (NE) iff
  $\forall i \in \{1, \dots, n\} \; \forall s_i' \in S_i: \; u_i(s_i, s_{-i}) \geq u_i(s_i', s_{-i})$
• Intuitively, this means that no agent has an incentive to deviate from this strategy combination
• Very appealing notion, because it can be shown that a (mixed-strategy) NE always exists
• But also some problems:
  • Not always unique: how to agree on one of them?
  • Proof of existence does not provide a method to actually find it
  • Many games do not have a pure-strategy NE
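The definitions of dominance, best response and pure-strategy Nash equilibrium translate into brute-force checks for small normal-form games. The sketch below assumes a game is a dict from joint strategies (tuples) to payoff tuples and numbers players from 0; the function names are hypothetical. The first game uses the Prisoner's Dilemma payoffs from this lecture; the second (matching pennies) is not from the slides and is added only as an assumed example of a game with no pure-strategy NE.

```python
from itertools import product

def replace(joint, i, s):
    """Return the joint strategy with player i's component swapped for s."""
    return joint[:i] + (s,) + joint[i + 1:]

def dominates(payoffs, strategies, i, s, s_alt):
    """s dominates s_alt for player i: at least as good against every opponent profile."""
    others = [strategies[j] for j in range(len(strategies)) if j != i]
    for rest in product(*others):
        joint = rest[:i] + (s,) + rest[i:]
        joint_alt = rest[:i] + (s_alt,) + rest[i:]
        if payoffs[joint][i] < payoffs[joint_alt][i]:
            return False
    return True

def is_best_response(payoffs, strategies, i, joint):
    """Player i cannot gain by deviating unilaterally from joint."""
    return all(payoffs[joint][i] >= payoffs[replace(joint, i, s)][i]
               for s in strategies[i])

def pure_nash_equilibria(payoffs, strategies):
    """Enumerate joint strategies in which every player plays a best response."""
    return [joint for joint in product(*strategies)
            if all(is_best_response(payoffs, strategies, i, joint)
                   for i in range(len(strategies)))]

# Prisoner's Dilemma payoffs as used in this lecture.
pd_game = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
pd_strategies = [("C", "D"), ("C", "D")]

# Matching pennies: an assumed illustration, not taken from the slides.
pennies = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
           ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}
pennies_strategies = [("H", "T"), ("H", "T")]

print(dominates(pd_game, pd_strategies, 0, "D", "C"))     # True
print(pure_nash_equilibria(pd_game, pd_strategies))       # [('D', 'D')]
print(pure_nash_equilibria(pennies, pennies_strategies))  # []
```

Exhaustive enumeration is only practical for tiny games, since the joint strategy space grows exponentially with the number of players.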

  4. Example
The Prisoner’s Dilemma: the Nash equilibrium is not Pareto efficient (or: no one will dare to cooperate although mutual cooperation is preferred over mutual defection)

                 2: C     2: D
        1: C    (3,3)    (0,5)
        1: D    (5,0)    (1,1)

General conditions on utilities: $DC \succ CC \succ DD \succ CD$ (from the first player’s point of view) and $u(CC) > \frac{u(DC) + u(CD)}{2}$

Example
The Coordination Game: no temptation to defect, but two equilibria (hard to know which one will be chosen by the other party)

                 2: A      2: B
        1: A    (1,1)    (-1,-1)
        1: B   (-1,-1)    (1,1)

The evolution of cooperation?
• In zero-sum/constant-sum games one agent loses what the other wins (e.g. chess): no potential for cooperation
• Typical non-zero-sum game: there is a potential for cooperation, but how should it emerge among self-interested agents?
• This situation occurs in many real-life cases:
  • Nuclear arms race
  • Tragedy of the commons
  • “Free rider” problems
• Axelrod’s tournament (1984): a very interesting study of such interaction situations
• The Iterated Prisoner’s Dilemma was played among many different strategies (how to play against different opponents?)

The evolution of cooperation?
• In the single-shot PD, defection is the rational solution
• In the (infinitely) iterated case, cooperation is the rational choice in the PD
• But not if the game has a fixed, known length (“backward induction” problem)
• The TIT FOR TAT strategy performed best against a variety of strategies (this does not mean it is the best strategy, though!)
• Axelrod’s conclusions from this:
  • don’t be envious, don’t be the first to defect, reciprocate defection and cooperation (don’t hold grudges), don’t be too clever
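An iterated Prisoner's Dilemma can be simulated in a few lines to see how TIT FOR TAT behaves. This is a minimal sketch under stated assumptions: the payoffs are the ones used in this lecture, the opponent strategies and the fixed round count of 10 are arbitrary illustrative choices, and nothing here reproduces the actual setup of Axelrod's tournament.

```python
# Prisoner's Dilemma payoffs (u1, u2) as used in this lecture.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(own_history, opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opp_history else opp_history[-1]

def always_defect(own_history, opp_history):
    """Hypothetical opponent that defects in every round."""
    return "D"

def play(strategy1, strategy2, rounds):
    """Play the iterated game and return the total payoffs of both players."""
    h1, h2 = [], []
    score1 = score2 = 0
    for _ in range(rounds):
        a1 = strategy1(h1, h2)
        a2 = strategy2(h2, h1)
        p1, p2 = PAYOFF[(a1, a2)]
        score1 += p1
        score2 += p2
        h1.append(a1)
        h2.append(a2)
    return score1, score2

print(play(tit_for_tat, tit_for_tat, 10))      # (30, 30)
print(play(tit_for_tat, always_defect, 10))    # (9, 14)
print(play(always_defect, always_defect, 10))  # (10, 10)
```

TIT FOR TAT loses its single pairing against the defector (9 versus 14), yet two TIT FOR TAT players earn far more together than two defectors. This matches the "don't be envious" conclusion: the strategy does well on aggregate without ever beating its direct opponent.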
