Learning to Cooperate via Conditional Commitments (Jobst Heitzig, PIK)


SLIDE 1

Learning to Cooperate via Conditional Commitments

Jobst Heitzig, PIK RD4
RD4 seminar, 21 April 2020

SLIDE 2

Heitzig Learning to Cooperate via Conditional Commitments 2

Overview

  • Motivation, Inspiration, Rationale, Example
  • Theoretical background
  • Formal results
  • Learning & Agent-based modeling
  • Simulation results
SLIDE 6

Motivation: International Climate Mitigation

  • GHG reductions are a positive externality → free-riding → need for cooperation
  • How to establish cooperation?
  • negotiate a “grand” treaty (UNFCCC/COP, Kyoto, Paris)
  • slow, not yet very successful, may lead to only unambitious treaties
  • but concept of INDCs contains idea of conditional commitments
  • dynamically form small then larger coalitions “bottom-up”
  • ongoing process, not yet very successful, but may succeed eventually

(Auer et al. Sci.Rep. 2015; Heitzig & Kornek, NCC 2018)

  • leads to a hierarchy of bi- or multilateral treaties
  • unilateral approaches without formal treaties
  • e.g. some countries pioneer unconditionally & hope for others to follow
  • or: use unilateral but binding, mutually conditional commitments
SLIDE 7

Inspiration: The NPVIC

Scheme: Agents unilaterally (!) but bindingly commit to behave in a certain way if others behave in certain ways.
Here: US states pass state laws
Internationally: Countries pass domestic laws?

SLIDE 10

Rationale

  • Without prior international negotiations, a country could pass a domestic law that requires it to take specific climate protection measures as soon as (and as long as) certain other countries have passed similar laws that specify at least a certain amount of certain measures.
  • e.g.: I’ll reduce emissions by 20% if you invest 1% of GDP into the Green Climate Fund
  • If the ambition is low enough initially, this gives the other countries incentives to indeed pass similar laws.
  • At each point in time, the set of laws currently in force implies a set of current obligations for all participating countries.
  • These laws can be adjusted more easily than international treaties to react to circumstances and to increase ambition.
  • Hypothesis: over time, an “efficient” level of mitigation will arise!
SLIDE 11

Example

–20% if USA –10% –5% if China –15%
Legend: currently unfulfilled conditions

SLIDE 12

Example

–20% if USA –10% –20% if USA –10% –10% if EU –20%, –20% if EU&China –30% → –20% → –10% –5% if China –15% –5% if China –15% –25% if Japan neutral
Legend: currently unfulfilled conditions; bold: currently fulfilled conditions & resulting obligations

SLIDE 13

Example

–20% if USA –10% –20% if USA –10% –20% if USA –10%, –40% if USA –20% –10% if EU –20%, –20% if EU&China –30% → –20% → –10% –5% if China –15% –5% if China –15% –5% if China –15% –25% if Japan neutral –25% if Japan neutral neutral –10% if EU –20%, –20% if EU&China –30% → –10% → –20% → –5% → –25% an unconditional commitment

SLIDE 14

Example

–20% if USA –10% –20% if USA –10% –20% if USA –10%, –40% if USA –20% –10% if EU –20%, –20% if EU&China –30% → –20% → –10% –5% if China –15% –5% if China –15% –5% if China –15% –5% if China –15% –25% if Japan neutral –25% if Japan neutral neutral –10% if EU –20%, –20% if EU&China –30% –50% –10% if EU –20%, –20% if EU&China –30% neutral –20% if USA –10%, –40% if USA –20% → –10% → –20% → –20% → –40% → –5% → –25% → –5%

SLIDE 15

Overview

  • Motivation, Inspiration, Rationale, Example
  • Theoretical background
  • Formal results
  • Learning & Agent-based modeling
  • Simulation results
SLIDE 16

Theoretical background

  • Cooperative Game Theory, Efficiency, Bargaining Solutions

→ the “core” of a cooperative game

  • Non-cooperative Game Theory & Forms of Strategic Equilibrium

→ “strong” equilibria of a non-cooperative game

  • The Nash Program & Mechanism Design
SLIDE 18

Cooperative Game Theory

Example: Cournot duopoly (e.g., two non-OPEC countries reducing output)

[Figure: “Action space”. Axes: 1's output reduction a1 vs. 2's output reduction a2, each from 0.0 to 1.0. Legend: 1's and 2's indifference curves through the Cournot point; 1's and 2's indifference curves through the focal point; Pareto-efficient line; satisficers' outcome; maximizers' outcome.]

A combination of actions is …

  • (Pareto-)efficient: no other combination gives all players more payoff
  • in the “bargaining set”: all players get at least what they would get at the disagreement point (here: (0,0))
  • in the “core” of the game: no group can get more by changing their actions, assuming all others will then react by doing nothing
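
The three notions can be written compactly as below. This is a sketch only: the symbols N, u_i, d, S and the reading of "doing nothing" as the zero action are assumptions not taken from the slides.

```latex
% Assumed notation: players N = {1, ..., n}, action profile a = (a_1, ..., a_n),
% payoff functions u_i, disagreement point d = (0, ..., 0),
% and 0_{-S} = "everyone outside S does nothing".
\begin{align*}
\text{(Pareto-)efficient:} \quad
  & \nexists\, a' :\; u_i(a') > u_i(a) \quad \forall i \in N \\
\text{in the bargaining set:} \quad
  & u_i(a) \ge u_i(d) \quad \forall i \in N \\
\text{in the core:} \quad
  & \nexists\, S \subseteq N,\ S \neq \emptyset,\ a'_S :\;
    u_i(a'_S, 0_{-S}) > u_i(a) \quad \forall i \in S
\end{align*}
```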

SLIDE 19

Non-cooperative Game Theory

  • No binding agreements are possible
  • Players may use complex strategies rather than just plain actions
  • A combination of strategies is a …
  • Nash equilibrium: no individual player has an incentive to deviate unilaterally
    → many games have too many Nash equilibria → concept too weak
  • strong equilibrium: no group of players has an incentive to deviate together
    → takes possibility to communicate into account, but may not exist…
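
The difference between the two equilibrium notions can be checked by brute force in a small game. The sketch below uses a textbook prisoner's dilemma with made-up payoff numbers (an assumption, not from the talk). A profile counts as unstable against a group if some joint deviation makes every member of that group strictly better off.

```python
from itertools import combinations, product

# Hypothetical prisoner's dilemma: (action of player 0, action of player 1) -> payoffs.
payoff = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 4),
    ("D", "C"): (4, 0),
    ("D", "D"): (1, 1),
}
actions = ("C", "D")
players = (0, 1)


def deviations(profile, group):
    """Yield all other profiles in which only members of `group` changed actions."""
    for choice in product(actions, repeat=len(group)):
        new = list(profile)
        for player, action in zip(group, choice):
            new[player] = action
        if tuple(new) != profile:
            yield tuple(new)


def is_stable(profile, groups):
    """True if no group has a deviation that strictly benefits all its members."""
    for group in groups:
        for alt in deviations(profile, group):
            if all(payoff[alt][p] > payoff[profile][p] for p in group):
                return False
    return True


singletons = [(p,) for p in players]
all_groups = [g for size in (1, 2) for g in combinations(players, size)]


def is_nash(profile):
    return is_stable(profile, singletons)


def is_strong(profile):
    return is_stable(profile, all_groups)
```

In this example game, mutual defection is a Nash equilibrium but not a strong one, and no profile is a strong equilibrium at all, which illustrates the "may not exist" caveat.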

SLIDE 20

The Nash Program & Mechanism Design

Nash (1953): Reduce certain cooperative solutions (e.g. the “Nash bargaining solution”) to certain non-cooperative equilibria (e.g. “Markov-perfect equilibrium”) of suitable non-cooperative versions of a cooperative game! (e.g. the Rubinstein bargaining protocol in case of bargaining)

Mechanism Design: Construct a non-cooperative game form (e.g. a type of auction) so that in certain types of equilibrium, behaviour will meet a given goal! (e.g. revelation of preferences or maximal revenue for the auctioneer)

SLIDE 21

Our achievement will be:

Nash (1953): Reduce certain coop. solutions to certain non-cooperative equilibria of suitable non-cooperative versions of a cooperative game!

Mechanism Design: Construct a non-cooperative game form so that in certain types of equilibrium, behaviour will meet a given goal!

Based on the idea of conditional commitments, construct a game form for positive externality problems so that all strong equilibria correspond to core outcomes and hence agents behave in a “jointly optimal” way.

SLIDE 22

A game form based on conditional commitments

Strategy spaces: Each player i chooses a conditional commitment function (CCF) c_i = a map from others’ action combinations a_−i to max. own actions c_i(a_−i)
(interpretation: “if they do at least a_−i, I do at least a_i”)

Outcome: each player is committed to perform the action a_i given by the largest action profile a that meets all conditions a_i ≤ c_i(a_−i).

Such a unique largest feasible action combination exists if all action spaces are supremum-complete partially ordered sets.
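
The outcome rule can be sketched numerically for real-valued actions in [0, 1]. This is an illustrative sketch under assumptions: CCFs are taken to be nondecreasing Python callables, and the largest feasible profile is approximated by iterating downward from the top action. The names `ccf_outcome` and `step` are hypothetical helpers, not part of the paper.

```python
from typing import Callable, Sequence, Tuple

# A CCF maps the tuple of the other players' actions to the max. own action.
CCF = Callable[[Tuple[float, ...]], float]


def ccf_outcome(ccfs: Sequence[CCF], top: float = 1.0,
                tol: float = 1e-12, max_iter: int = 10_000) -> Tuple[float, ...]:
    """Approximate the largest profile a with a_i <= c_i(a_-i) for all i.

    Assumes every CCF is nondecreasing in the others' actions, so that
    lowering actions step by step from `top` never skips a feasible profile.
    """
    n = len(ccfs)
    a = [top] * n
    for _ in range(max_iter):
        new = [min(a[i], ccfs[i](tuple(a[:i] + a[i + 1:]))) for i in range(n)]
        if max(abs(x - y) for x, y in zip(new, a)) <= tol:
            return tuple(new)
        a = new
    return tuple(a)


def step(threshold: float, offer: float) -> CCF:
    """Two-player step CCF: do `offer` if the other does at least `threshold`."""
    return lambda others: offer if others[0] >= threshold else 0.0


matched = ccf_outcome([step(0.3, 0.4), step(0.4, 0.3)])
mismatched = ccf_outcome([step(0.3, 0.4), step(0.5, 0.3)])
```

With the matching step functions both conditions are fulfilled at (0.4, 0.3); when player 2 asks for more than player 1 offers, the commitments unravel to (0, 0).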

SLIDE 24

Example: Output Reduction (Cournot Duopoly)

[Figure: action space. Axes: 1's output reduction a1 vs. 2's output reduction a2, each from 0.0 to 1.0. Legend: a continuous CCF for 1; a continuous CCF for 2; CCF mechanism outcome; a best reply CCF of player 2 to player 1’s current CCF; a best reply CCF of player 1 to player 2’s current CCF.]

Not a Nash equilibrium: each player has an incentive to switch to a different CCF.

SLIDE 26

Example: Output Reduction (Cournot Duopoly)

[Figure, first panel: axes 1's output reduction a1 vs. 2's output reduction a2, each from 0.0 to 1.0. Legend: a continuous CCF for 1; a continuous CCF for 2; CCF mechanism outcome. Caption: no Nash equilibrium.]

[Figure, second panel: same axes. Legend: 1's and 2's indifference curves through the Cournot point; any mutually profitable point; 1's and 2's indifference curves through the chosen point; an equilibrium CCF for 1; an equilibrium CCF for 2; a better combination of CCFs for both. Caption: Nash equilibrium but not strong: the players have an incentive to switch together to different CCFs.]

SLIDE 27

Example: Output Reduction (Cournot Duopoly)

[Figure: three plots on the same axes (1's output reduction a1 vs. 2's output reduction a2, each from 0.0 to 1.0).
Legend 1: a continuous CCF for 1; a continuous CCF for 2; CCF mechanism outcome.
Legend 2: 1's and 2's indifference curves through the Cournot point; satisficers' outcome; Pareto-efficient line; 1's and 2's indifference curves through the focal point; maximizers' outcome.
Legend 3: 1's and 2's indifference curves through the Cournot point; any mutually profitable point; 1's and 2's indifference curves through the chosen point; an equilibrium CCF for 1; an equilibrium CCF for 2.]

a strong equilibrium that is even “focal”

SLIDE 28

More players → Cournot Oligopoly

[Figure labels: a “canonical” CCF for player 3 (= simple step function); “core” of this game]

SLIDE 29

Theorem 1

Assume the CCF mechanism is applied to any Costly Positive Externality Problem (CPEP, a certain fairly broad class of games) and certain fairly weak conditions apply. Then the outcomes that result from strong equilibria are exactly the “core” outcomes, and to sustain them it suffices to use “canonical” CCFs.

(Heitzig 2019, seven times desk-rejected, ssrn.com/abstract=3449004)

SLIDE 30

Other examples of CPEPs

  • Public good provision, e.g. emission reduction
  • Bilateral trade with a broker
  • Political package deals like Helmut Schmidt’s Bonn 1978 G7 deal

(discrete but high-dimensional action spaces)

  • Supply chains/networks with uncertain costs and capacities

(multi-dimensional continuous action spaces)

  • Commodity exchanges (e.g. electricity markets)

(many players having only little information)

(Heitzig 2019, ssrn.com/abstract=3449004)

SLIDE 31

Overview

  • Motivation, Inspiration, Rationale, Example
  • Theoretical background
  • Formal results
  • Learning & Agent-based modeling
  • Simulation results
SLIDE 32

Some forms of learning

  • Individual learning
      • based on own experiences (e.g. trial and error, regret matching, reinforcement learning)
      • based on others’ behaviour (e.g. simple imitation)
      • based on others’ experiences (e.g. “social” learning by observing others’ performance)
  • Collective learning
      • cooperatively (e.g. by sharing experiences)
      • non-cooperatively (e.g. by alternating trial and error)
SLIDE 33

A simple agent-based model of collective learning using conditional commitment functions

All players start with a zero CCF.
At random time points, a random player updates her CCF:

  • she finds her “favourite” point x on the joint CCF of the other players
  • she determines (1) her canonical CCF leading through x and (2) her indifference curve through x
  • she uses any curve lying between the two curves as her new CCF (thereby she offers to do more if others do more without risking a loss)
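
A single update step of this rule might look as follows for two players. This is a rough sketch under strong assumptions, not the model used in the talk: the payoff `other - own**2`, the action grid, and the fixed blend `weight` that picks one curve between the canonical CCF and the indifference curve are all made up for illustration. With two players, the "joint CCF of the other players" is simply the other player's CCF.

```python
import math


def utility(own: float, other: float) -> float:
    """Hypothetical payoff: benefit from the other's action, quadratic own cost."""
    return other - own ** 2


GRID = [k / 100 for k in range(101)]  # candidate actions in [0, 1]


def favourite_point(other_ccf):
    """Best point (own, other) for the updating player on the other's CCF."""
    own = max(GRID, key=lambda a: utility(a, other_ccf(a)))
    return own, other_ccf(own)


def updated_ccf(other_ccf, weight: float = 0.5):
    """Return (new CCF, canonical CCF, indifference curve) for the updating player."""
    x_own, x_other = favourite_point(other_ccf)
    level = utility(x_own, x_other)

    def canonical(other: float) -> float:
        # Step function through x: do x_own once the other does at least x_other.
        return x_own if other >= x_other else 0.0

    def indifference(other: float) -> float:
        # Own action that keeps utility at `level` for this payoff, capped at 1.
        return min(1.0, math.sqrt(max(0.0, other - level)))

    def new_ccf(other: float) -> float:
        # One of many admissible curves: a fixed blend of the two bounds.
        return (1 - weight) * canonical(other) + weight * indifference(other)

    return new_ccf, canonical, indifference


def zero_ccf(other: float) -> float:
    return 0.0


ccf2, canonical2, indifference2 = updated_ccf(zero_ccf)  # player 2 moves first
ccf1, canonical1, indifference1 = updated_ccf(ccf2)      # then player 1 reacts
```

Repeating such updates in random order and evaluating the CCF mechanism outcome after each one would give a toy version of the simulation; convergence behaviour of this particular sketch is not claimed here.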

SLIDE 34

Example from before

[Figure: action space. Axes: 1's output reduction a1 vs. 2's output reduction a2, each from 0.0 to 1.0. Legend: a continuous CCF for 1; a continuous CCF for 2; CCF mechanism outcome; player 1’s CCF; player 2’s old CCF; old outcome; player 2’s favourite point x on player 1’s CCF = new outcome; her canonical CCF through x; her indifference curve through x; her new CCF.]

SLIDE 35

Simulation results for 3 countries’ GHG emissions reductions

Almost all runs converge very fast to a “core” outcome

Very rarely the process seems to get stuck with some player mitigating nothing (might be a numerical error)

“utilitarian” optimum (point of largest total utility) may lie outside the core

SLIDE 36

Theorem 2

Assume all players use in some CPEP the above collective CCF-learning rule and some fairly weak conditions apply. Then the outcomes almost surely converge to a “core” outcome.

Thank you! – Questions?

(Heitzig 2019, seven times desk-rejected, ssrn.com/abstract=3449004)