
SLIDE 1

Cooperation rather than contention: Coopetition Paradigms in Competitive Wireless Networks

Mihaela van der Schaar (with Fangwen Fu, Yi Su)

Electrical Engineering, UCLA Multimedia Communications and Systems Lab http://medianetlab.ee.ucla.edu/

SLIDE 2

Motivation

  • Multi-user wireless networks
    – Competitive environments: strategic users compete for available resources to maximize their utility
    – Cooperation is required to increase both network and users' efficiency
  • Characterization of multi-user interaction in current literature: coopetition

SLIDE 3

New dimensions of multi-user interactions?

  • Heterogeneous users
    – different utility functions, which also change over time
    – various standards and architectures
    – ability to sense the environment and gather information
    – intelligence in determining actions and strategies (bounded rationality)
  • Dynamics
    – environment (channel, but also source and other competing users)
  • Differentiate among users' non-collaborative behaviors
    – strategic users – maximize their own utility
    – malicious users
    – bounded rationality
  • Consider the impact of information decentralization
    – private information
    – information history – depends on the user's observations/protocols
    – common knowledge – may differ across users
    – strategic message exchanges
  • Ability of users to learn
    – not single-agent, but multi-agent learning
  • Focus on equilibrium selection rather than characterization
SLIDE 4

Information/Knowledge-driven Decision Making

[Block diagram: a Wireless User and a Central Spectrum Moderator (CSM) / Policy Maker interact over the Wireless Network (Spectrum Access Market) through actions, sensing (existing techniques), learning rules, and resource negotiation messages {Explicit/Implicit}; labels include {Incomplete}, {Myopic, Foresighted, etc.}, and {Fairness/Efficiency}.]

Possible new view for network design?

  • Design network interactions as dynamic, stochastic games played among strategic and heterogeneous agents
  • The game is played with incomplete information and heterogeneous, decentralized knowledge
  • Users can learn about their environment and competing users based on observations or explicitly exchanged information
  • Foresighted (proactive) participation in the competition for resources rather than myopic adaptation
  • Collaboration needs to be mutually beneficial, not imposed

vdSchaar – NSF Career 2004

SLIDE 5

Distributed stochastic games

Numerous networking/computing games:

  • Networks: power control games, contention games, peer-to-peer games, etc.

Example: frequency-selective interference channels. Goal: optimal transmit PSD design that maximizes strategic users' rates.

Su, vdSchaar – 2007

SLIDE 6

Multi-user Power Control Games in Interference Channels

  • Existing algorithms
    – Non-cooperative solutions
      • Homogeneous users
    – Cooperative solutions
      • Central controllers required
      • Non-convex optimization
  • Nash Equilibrium
    – Reached using iterative water-filling
    – Best response in the competitive optimality sense
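The iterative water-filling best response can be sketched as follows. This is a minimal sketch, not the algorithm from the cited papers: the bisection water-filling routine, the channel-gain layout `H[i][j]`, and all numeric values in the usage note are illustrative assumptions.

```python
import numpy as np

def waterfill(effective_noise, power_budget):
    """Single-user water-filling over frequency bins: find the water
    level mu such that sum(max(mu - n_k, 0)) == power_budget."""
    # Bisection on the water level (simple and robust).
    lo, hi = 0.0, power_budget + effective_noise.max()
    for _ in range(100):
        mu = 0.5 * (lo + hi)
        p = np.maximum(mu - effective_noise, 0.0)
        if p.sum() > power_budget:
            hi = mu
        else:
            lo = mu
    return np.maximum(mu - effective_noise, 0.0)

def iterative_waterfilling(H, noise, budgets, n_iter=50):
    """Iterative water-filling: each user repeatedly best-responds
    (water-fills) against the other users' current PSDs, which under
    suitable conditions converges to a Nash equilibrium.

    H[i][j]: per-bin gain from transmitter j to receiver i.
    noise[i]: per-bin noise power at receiver i.
    """
    n_users, n_bins = len(budgets), noise.shape[1]
    P = np.zeros((n_users, n_bins))
    for _ in range(n_iter):
        for i in range(n_users):
            interf = noise[i] + sum(
                H[i][j] * P[j] for j in range(n_users) if j != i)
            # Effective noise seen by user i, normalized by its own gain.
            P[i] = waterfill(interf / H[i][i], budgets[i])
    return P
```

For example, two symmetric users with flat gains and a unit power budget each end up spreading power over all bins, exactly the myopic behavior the next slides contrast with foresighted interference avoidance.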

SLIDE 7

Some Reflections…

  • Can we do better than Nash equilibrium in decentralized environments, and how?
  • Stackelberg stage game
    – (Down, Left): Nash equilibrium
    – (Up, Right) is better!
    – Heterogeneity in information availability
      • Nash: myopic
      • Stackelberg: foresighted
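The slide's 2x2 game can be made concrete. The payoff numbers below are illustrative assumptions (the deck gives none), chosen so that (Down, Left) is the unique simultaneous-move Nash equilibrium, while a foresighted leader who commits to Up steers play to (Up, Right), which is better for both players:

```python
import itertools

# payoffs[(row, col)] = (leader_payoff, follower_payoff); rows are the
# leader's actions, columns the follower's. Values are illustrative.
payoffs = {
    ("Up", "Left"): (1, 0), ("Up", "Right"): (3, 2),
    ("Down", "Left"): (2, 1), ("Down", "Right"): (4, 0),
}
ROWS, COLS = ("Up", "Down"), ("Left", "Right")

def pure_nash():
    """Cells where neither player gains by a unilateral deviation."""
    eqs = []
    for r, c in itertools.product(ROWS, COLS):
        row_ok = all(payoffs[(r, c)][0] >= payoffs[(r2, c)][0] for r2 in ROWS)
        col_ok = all(payoffs[(r, c)][1] >= payoffs[(r, c2)][1] for c2 in COLS)
        if row_ok and col_ok:
            eqs.append((r, c))
    return eqs

def stackelberg():
    """Foresighted leader: commit to the row that maximizes the
    leader's payoff, anticipating the follower's myopic best response."""
    def follower_br(r):
        return max(COLS, key=lambda c: payoffs[(r, c)][1])
    best_row = max(ROWS, key=lambda r: payoffs[(r, follower_br(r))][0])
    return best_row, follower_br(best_row)
```

Here the myopic (Nash) outcome is (Down, Left) with payoffs (2, 1), while the foresighted (Stackelberg) outcome is (Up, Right) with payoffs (3, 2): commitment plus anticipation makes both players better off.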
SLIDE 8

Illustrative Results

[Figure: transmit PSDs P1(f) and P2(f) for User 1 and User 2 over 200 frequency bins, comparing the IW Algorithm (myopic) with Algorithm 1 (foresighted)]

  • Foresighted action:

– interference avoidance instead of water-filling

  • Collaboration – leads to improved performance for both users!

Su, vdSchaar – 2007

SLIDE 9

Multi-user Power Control Games in Interference Channels

  • Insights
    – Value of information
    – Foresighted action is beneficial even if others are myopic (coopetition)
  • How to acquire the required information?
    – Estimation
    – Information exchange
    – Multi-agent learning

Fu, vdSchaar – 2007; Su, vdSchaar – 2007

SLIDE 10

Centralized general stochastic game model

Numerous networking games:

  • Networks: 802.11 networks (polling-based), cellular networks, cognitive radio networks
  • Existing work focuses on one-stage, static games and equilibrium characterization rather than on how to arrive at / select an equilibrium

SLIDE 11

How to play the stochastic game?

  • Observation – part of the game's history: $o_i^t \subset h^t$
  • Policy: $\pi_i^t : O_i^t \times B_i^t \to A_i \times \mathcal{B}_i$, with $(a_i^t, b_i^t) = \pi_i^t(\cdot)$; per-user policy $\pi_i = (\ldots, \pi_i^t, \ldots)$ and joint policy $\pi^t = (\pi_1^t, \ldots, \pi_M^t) = (\pi_i^t, \pi_{-i}^t)$
  • Best response: $\beta_i(\pi_{-i}^t) = \arg\max_{\pi_i} Q_i\big((\pi_i, \pi_{-i}^t) \mid s^t, w^t\big)$

How to solve this problem? Multi-agent learning!

Fu, vdSchaar – 2006

SLIDE 12

Multi-agent learning - definition

We define a learning algorithm $L_i$ as:

$\pi_i^{t+1} = L_i\big(a_i^t, b_i^t, s_i^t, B_{s_{-i}}^t, B_{\pi_{-i}}^t, B_w^t\big)$

Output of the multi-user interaction game: $\Omega^t = \mathrm{Game}(s^t, a^t, w^t)$

Observation of SU $i$: $o_i^t = O(s_i^t, \Omega^t, b_i^t)$, where $O$ is the observation function, which depends on the current state, the current game output, and the current internal action taken.

Policy update: $\pi_i^{t+1} = F_i(\pi_i^t, o_i^t, I_{-i}^t)$, where $F_i$ is the update function for the beliefs and policies and $I_{-i}^t$ is the information exchanged with the other SUs.

Beliefs about the other SUs' states $s_{-i}$, policies $\pi_{-i}$, and the network resource state $w$:

$B_{\pi_{-i}}^{t+1} = F_{\pi_{-i}}(B_{\pi_{-i}}^t, o_i^t, I_{-i}^t)$,
$B_w^{t+1} = F_w(B_w^t, o_i^t, I_{-i}^t)$,
$B_{s_{-i}}^{t+1} = F_{s_{-i}}(B_{s_{-i}}^t, o_i^t, I_{-i}^t)$
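One such belief-update map $B^{t+1} = F(B^t, o^t, I^t)$ can be sketched as follows. This is a hypothetical instantiation, not the update rule from the cited work: it assumes the belief is a probability vector, uses a Bayesian update from the local observation, and blends in exchanged information with a linear opinion pool; the function name and pooling weight are illustrative.

```python
import numpy as np

def update_belief(belief, observation_likelihood, exchanged=None, weight=0.5):
    """One belief-update step B^{t+1} = F(B^t, o^t, I^t) (illustrative).

    belief: prior probability vector over the hidden quantity
            (e.g. another SU's state s_{-i} or the resource state w).
    observation_likelihood: P(o^t | hypothesis) for each hypothesis.
    exchanged: optional probability vector reported by other SUs (I^t).
    weight: how much trust to place in the exchanged information.
    """
    # Bayesian update from the local observation o^t.
    posterior = belief * observation_likelihood
    posterior = posterior / posterior.sum()
    if exchanged is not None:
        # Linear opinion pool with the exchanged information I^t.
        posterior = (1 - weight) * posterior + weight * np.asarray(exchanged)
        posterior = posterior / posterior.sum()
    return posterior
```

For instance, a flat prior [0.5, 0.5] with likelihoods [0.9, 0.1] yields the posterior [0.9, 0.1]; pooling that with an exchanged flat belief at weight 0.5 pulls it back toward [0.7, 0.3].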

SLIDE 13

Multi-agent learning - illustration

[Diagram: user $i$'s policy $\pi_i$ maps its state $s_i^t$ to actions $(a_i^t, b_i^t)$; information $I_{-i}$ is explicitly exchanged with the other users]

Solutions depend on the information availability:

  • Reinforcement learning (no explicit modeling of other users)
  • Fictitious Play (explicit modeling of other users – needs to know

what actions opponents took, but not their strategies)

  • Regret Matching
  • Model-based

Fu, vdSchaar – 2006; Shiang, vdSchaar – 2007
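Fictitious play, as described above, can be sketched in a few lines: each player best-responds to the empirical frequency of the opponent's past actions, so it only needs to observe actions, not strategies. The smoothed initial counts and the coordination-game payoffs in the usage note are illustrative assumptions.

```python
import numpy as np

def fictitious_play(payoff_a, payoff_b, n_rounds=2000):
    """Two-player fictitious play.

    payoff_a[i, j]: player A's payoff when A plays row i and B plays column j.
    payoff_b[i, j]: player B's payoff for the same action pair.
    Returns both players' empirical action frequencies after n_rounds.
    """
    n_a, n_b = payoff_a.shape
    seen_b = np.ones(n_b)  # A's (smoothed) counts of B's observed actions
    seen_a = np.ones(n_a)  # B's (smoothed) counts of A's observed actions
    for _ in range(n_rounds):
        # Each player best-responds to the opponent's empirical mixture.
        i = int(np.argmax(payoff_a @ (seen_b / seen_b.sum())))
        j = int(np.argmax((seen_a / seen_a.sum()) @ payoff_b))
        seen_a[i] += 1
        seen_b[j] += 1
    return seen_a / seen_a.sum(), seen_b / seen_b.sum()
```

On a simple coordination game, e.g. common payoffs [[2, 0], [0, 1]], both players lock onto the efficient first action; fictitious play is known to converge in several game classes (zero-sum, 2x2, potential games) but not in general.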

SLIDE 14

Multi-agent learning - illustration

[Diagram repeated from Slide 13: policy $\pi_i$, state $s_i^t$, actions $(a_i^t, b_i^t)$, explicitly exchanged information $I_{-i}$]

Value of Learning:

$V_i(L_i, T) = \frac{1}{T} \sum_{t=1}^{T} R_i\big(\pi_i^t(I_i),\, \pi_{-i}^t(I_{-i})\big)$

How much to learn for a desired performance (utility)?
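The time-averaged value $V_i(L_i, T)$ above, and the "how much to learn" question, can be read empirically from a logged per-round reward trace. A minimal sketch (function names and the target threshold are illustrative, not from the slides):

```python
import numpy as np

def value_of_learning(reward_trace):
    """Running time-average V_i(L_i, T) = (1/T) * sum_{t=1}^{T} R_i(...),
    one value per prefix length T of the logged reward trace."""
    r = np.asarray(reward_trace, dtype=float)
    return r.cumsum() / np.arange(1, len(r) + 1)

def rounds_needed(reward_trace, target):
    """Smallest T whose running average meets a desired utility,
    i.e. 'how much to learn for a desired performance?'.
    Returns None if the target is never reached on this trace."""
    v = value_of_learning(reward_trace)
    hits = np.nonzero(v >= target)[0]
    return int(hits[0]) + 1 if hits.size else None
```

For a trace [0, 1, 1, 1] the running averages are [0, 0.5, 2/3, 0.75], so a target utility of 0.7 is first met at T = 4 and a target of 0.9 is never met.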

SLIDE 15

Proposed Goal

  • Next-generation multi-user networks should explicitly consider users' strategic behavior, dynamics, heterogeneity, information availability, and decentralized knowledge
  • Collaboration should be reached by mutual agreement rather than being imposed on users
  • Opens opportunities for new theoretical foundations and algorithm designs; new metrics are needed
  • Performance improvements
  • Backwards compatibility with existing protocols should be respected

SLIDE 16

For more information

  • For our research paper on Learning, Distributed Decision Making and Games in Networking and Computing Systems, see our group's website: http://medianetlab.ee.ucla.edu/