CS 680: GAME AI WEEK 4: DECISION MAKING IN RTS GAMES 2/6/2012 - - PowerPoint PPT Presentation



slide-1
SLIDE 1

CS 680: GAME AI

WEEK 4: DECISION MAKING IN RTS GAMES

2/6/2012 Santiago Ontañón santi@cs.drexel.edu https://www.cs.drexel.edu/~santi/teaching/2012/CS680/intro.html

slide-2
SLIDE 2

Reminders

  • Projects:
  • Project 1 is simpler than it seems:
  • 1) Implement a basic AI (doesn’t have to play well)
  • 2) Pick a path-finding/decision making algorithm and experiment with it
  • Progress self-check indicator:
  • Your progress is good if you already have a basic AI that can play a complete game of S3/Starcraft (whether it wins or not).

slide-3
SLIDE 3

Outline

  • Student Presentation:

“Near Optimal Hierarchical Pathfinding”

  • Student Presentation:

“Intelligent Moving of Groups in Real-Time Strategy Games”

  • Decision Making
  • Basics: Hardcoded Methods
  • Decision Theory
  • Adversarial Search
  • Project Discussion
slide-4
SLIDE 4

Outline

  • Student Presentation:

“Near Optimal Hierarchical Pathfinding”

  • Student Presentation:

“Intelligent Moving of Groups in Real-Time Strategy Games”

  • Decision Making
  • Basics: Hardcoded Methods
  • Decision Theory
  • Adversarial Search
  • Project Discussion
slide-5
SLIDE 5

Decision Making

  • A situation is characterized by:
  • Known information about the state of the world
  • Unknown information about the state of the world
  • Set of possible actions to execute
  • Problem:
  • Given a situation, which of the possible actions is the best?

slide-6
SLIDE 6

Example: RTS Games

  • Known information:
  • Player data, explored terrain
  • Unknown:
  • Unexplored terrain
  • Enemy strategy
  • Actions:
  • Build barracks
  • Build refinery
  • Build supply depot
  • Wait
  • Explore
slide-7
SLIDE 7

Example: Final Fantasy VI

  • Known information:
  • Party information
  • Two enemies
  • Unknown:
  • Resistances
  • Attack power
  • Remaining health
  • Actions:
slide-8
SLIDE 8

Basic RTS AI Diagram

[Architecture diagram: Perception → Strategy → Give Orders → Execute Orders, with modules Unit Analysis, Map Analysis, Strategy, Economy, Logistics, Attack, Building Placer, Pathfinder, Arbiter, and one Unit AI per unit]

slide-9
SLIDE 9

Basic RTS AI Diagram

[Architecture diagram: Perception → Strategy → Give Orders → Execute Orders, with modules Unit Analysis, Map Analysis, Strategy, Economy, Logistics, Attack, Building Placer, Pathfinder, Arbiter, and one Unit AI per unit. Annotation: decision making is key at these levels.]

slide-10
SLIDE 10

Basic RTS AI Diagram

[Architecture diagram: Perception → Strategy → Give Orders → Execute Orders, with modules Unit Analysis, Map Analysis, Strategy, Economy, Logistics, Attack, Building Placer, Pathfinder, Arbiter, and one Unit AI per unit. Annotation: decision making is most important at the high-level Strategy module.]

slide-11
SLIDE 11

Example Basic RTS AI: Strategy

  • Finite-State Machine

[FSM diagram — three strategy states: (1) resource spending 80% economy / 20% military, 2 workers on wood, 1 on metal, army composition 100% footmen; “after training 4 footmen” → (2) resource spending 20% economy / 80% military, 2 workers on wood, 4 on metal, army composition 100% knights; “if enemy has flying units” → (3) army composition 50% knights / 50% archers; “if enemy has no more flying units” → back to (2)]

slide-12
SLIDE 12

Outline

  • Student Presentation:

“Near Optimal Hierarchical Pathfinding”

  • Student Presentation:

“Intelligent Moving of Groups in Real-Time Strategy Games”

  • Decision Making
  • Basics: Hardcoded Methods
  • Decision Theory
  • Adversarial Search
  • Project Discussion
slide-13
SLIDE 13

Finite State Machines

[FSM diagram — states: Train SCVs, Harvest Minerals, Build Barracks, Train Marines, Explore, Attack Enemy; transitions labeled: less than 4 SCVs, 4 SCVs, 4 SCVs harvesting, barracks, no marines, 4 marines & enemy seen, 4 marines & enemy unseen, enemy seen]

slide-14
SLIDE 14

Finite State Machines

  • Easy to implement:

switch (state) {
  case START: {
    if (numSCVs < 4) state = TRAIN_SCVs;
    if (numHarvestingSCVs >= 4) state = BUILD_BARRACKS;
    Unit *SCV = findIdleSCV();
    Unit *mineral = findClosestMineral(SCV);
    SCV->harvest(mineral);
    break;
  }
  case TRAIN_SCVs: {
    if (numSCVs >= 4) state = START;
    Unit *base = findIdleBase();
    base->train(UnitType::SCV);
    break;
  }
  case BUILD_BARRACKS:
    …
}

slide-15
SLIDE 15

Basic RTS AI Diagram

[Diagram: the architecture with the Strategy and Give Orders layers merged into one FSM — Perception (Unit Analysis, Map Analysis) → Strategy/Give Orders (FSM: Train SCVs, Harvest Minerals, Build Barracks, Train Marines, Explore, Attack Enemy) → Execute Orders (Unit AI per unit, Building Placer, Pathfinder)]

slide-16
SLIDE 16

Basic RTS AI Diagram

[Diagram: the architecture with the Strategy and Give Orders layers merged into one FSM — Perception (Unit Analysis, Map Analysis) → Strategy/Give Orders (FSM: Train SCVs, Harvest Minerals, Build Barracks, Train Marines, Explore, Attack Enemy) → Execute Orders (Unit AI per unit, Building Placer, Pathfinder)]

For simple games or simple AIs, we can replace both the Strategy and “Give Orders” layers with a single FSM.

slide-17
SLIDE 17

Basic RTS AI Diagram

[Diagram: the architecture with the Strategy and Give Orders layers merged into one FSM — Perception (Unit Analysis, Map Analysis) → Strategy/Give Orders (FSM: Train SCVs, Harvest Minerals, Build Barracks, Train Marines, Explore, Attack Enemy) → Execute Orders (Unit AI per unit, Building Placer, Pathfinder)]

For more complex AIs, FSMs are too restrictive; it is better to use the full architecture as explained in Week 2 of class.

slide-18
SLIDE 18

Finite State Machines

  • Good for simple AIs
  • Become unmanageable for complex tasks
  • Hard to maintain. Example:
  • Imagine we want to add the behavior:
  • if “enemy inside base” then “attack him with everything we have”
  • We would have to add a new state and transitions from every other state!

slide-19
SLIDE 19

Finite State Machines (Add a new state)

[FSM diagram — the previous FSM plus a new state “Attack Inside Enemy”, with an “Enemy Inside Base” transition from every other state]

slide-20
SLIDE 20

Hierarchical Finite State Machines

[HFSM diagram: outer states “Standard Strategy” and “Attack Inside Enemy”, with transitions “Enemy Inside Base” / “No Enemy Inside Base”]

  • FSM inside of the state of another FSM
  • As many levels as needed
  • Can alleviate complexity problem to some extent
slide-21
SLIDE 21

Hierarchical Finite State Machines

[HFSM diagram: the full FSM (Train SCVs, Harvest Minerals, Build Barracks, Train Marines, Explore, Attack Enemy) nested inside the “Standard Strategy” outer state; outer transitions “Enemy Inside Base” / “No Enemy Inside Base” to and from “Attack Inside Enemy”]

  • FSM inside of the state of another FSM
  • As many levels as needed
  • Can alleviate complexity problem to some extent
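A minimal sketch of the idea in code (state names, inputs, and thresholds here are hypothetical, not actual S3/BWAPI code): the outer FSM owns the global “enemy inside base” check, so the inner FSM needs no extra transitions.

```cpp
#include <cassert>

// Hierarchical FSM sketch: one FSM per level. The outer FSM handles the global
// condition; the inner FSM only advances while the outer state is NORMAL, so no
// transition to ATTACK_INTRUDER has to be added to every inner state.
enum OuterState { NORMAL, ATTACK_INTRUDER };
enum InnerState { HARVEST, BUILD_BARRACKS, TRAIN_MARINES };

struct World {               // observed game facts (assumed inputs)
    bool enemyInsideBase;
    int  numSCVsHarvesting;
    bool hasBarracks;
};

struct StrategyAI {
    OuterState outer = NORMAL;
    InnerState inner = HARVEST;

    void update(const World& w) {
        // Outer FSM: one transition pair covers every inner state.
        if (w.enemyInsideBase) { outer = ATTACK_INTRUDER; return; }
        outer = NORMAL;
        // Inner FSM: the normal build-up behavior.
        switch (inner) {
            case HARVEST:
                if (w.numSCVsHarvesting >= 4) inner = BUILD_BARRACKS;
                break;
            case BUILD_BARRACKS:
                if (w.hasBarracks) inner = TRAIN_MARINES;
                break;
            case TRAIN_MARINES:
                break;
        }
    }
};
```

Note that the inner state is preserved while the outer FSM is in ATTACK_INTRUDER, so the AI resumes its build-up where it left off once the intruder is gone.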
slide-22
SLIDE 22

Decision Trees

  • In the FSM examples before, decisions were quite simple:
  • If “4 SCVs” then “build barracks”
  • But those conditions can easily become complex
  • Decision trees offer a way to encode complex decisions in an easy and organized way

slide-23
SLIDE 23

Example of Complex Decision

  • Decide when to attack the enemy in an RTS game, and what kind of units to build

  • We could try to define a set of rules:
  • If we have not seen the enemy, then build ground units
  • If we have seen the enemy and he has no air units and we have more units than him, then attack
  • If we have seen the enemy and he has air units and we do not have air units, then build air units
  • etc.
  • Problems:
  • Hard to know whether we are missing any scenario
  • The conditions of the rules might grow very complex
slide-24
SLIDE 24

Example of Complex Decision: Decision Tree

[Decision tree diagram — root: Enemy seen? Internal nodes: Does he have air units? Do we have antiair units? Do we have more units than him? Leaves: Build antiair units, Attack!, Build more units]

  • The same decision can be easily captured in a decision tree
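Transcribed as nested if-then-else (the branch ordering below is one plausible reading of the figure, not necessarily the slide’s exact layout):

```cpp
#include <cassert>

enum Decision { BUILD_MORE_UNITS, BUILD_ANTIAIR, ATTACK };

struct Intel {                 // what we currently believe about the game
    bool enemySeen;
    bool enemyHasAirUnits;
    bool weHaveAntiairUnits;
    bool weHaveMoreUnits;
};

// The decision tree as nested if-then-else: each internal node of the figure
// becomes one condition, each leaf one return value.
Decision decide(const Intel& s) {
    if (!s.enemySeen) return BUILD_MORE_UNITS;
    if (s.enemyHasAirUnits) {
        if (!s.weHaveAntiairUnits) return BUILD_ANTIAIR;
        return s.weHaveMoreUnits ? ATTACK : BUILD_MORE_UNITS;
    }
    return s.weHaveMoreUnits ? ATTACK : BUILD_MORE_UNITS;
}
```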

slide-25
SLIDE 25

Decision Trees

  • Intuitive
  • Help us determine whether we are forgetting a case
  • Easy to implement:
  • Decision trees can be used as a “paper and pencil” technique to think about the problem, and then implemented with nested if-then-else statements
  • They can also be implemented in a generic way, with graphical editors exposed to game designers

slide-26
SLIDE 26

Finite State Machines with Decision Trees

  • In complex FSMs, the conditions on transitions might get complex
  • Each state could have a decision tree to determine which state to go to next

[Diagram: state S1 with a decision tree over conditions C1, C2 selecting the next state, S2 or S3]

slide-27
SLIDE 27

Example Basic RTS AI: Strategy

Finite-State Machine used as an example in Week 2 of class

[FSM diagram — three strategy states: (1) resource spending 80% economy / 20% military, 2 workers on wood, 1 on metal, army composition 100% footmen; “after training 4 footmen” → (2) resource spending 20% economy / 80% military, 2 workers on wood, 4 on metal, army composition 100% knights; “if enemy has flying units” → (3) army composition 50% knights / 50% archers; “if enemy has no more flying units” → back to (2)]

slide-28
SLIDE 28

Other Approaches

  • Rule-based systems:
  • Not extremely common in games, but very well studied in AI (expert systems)
  • Collection of rules plus an inference engine
  • Problems: hard to scale up (rules have complex interactions when there are many of them)

  • Behavior Trees:
  • Combination of Hierarchical FSMs with planning and execution
  • Very popular in modern games (not so popular in RTS games)
  • Covered in Intro to Game AI (offered next quarter)
slide-29
SLIDE 29

Outline

  • Student Presentation:

“Near Optimal Hierarchical Pathfinding”

  • Student Presentation:

“Intelligent Moving of Groups in Real-Time Strategy Games”

  • Decision Making
  • Basics: Hardcoded Methods
  • Decision Theory
  • Adversarial Search
  • Project Discussion
slide-30
SLIDE 30

Authoring AI vs Autonomous AI

  • FSMs, Decision Trees, Rule-based Systems, Behavior Trees, etc. are useful to hardcode decisions:
  • Game designer tools to make the AI behave the way they want
  • The AI will never do anything the game designers didn’t foresee (except for bugs)
  • We will now turn our attention to techniques that let the AI make decisions autonomously
  • The AI makes decisions on its own, and can generate strategies not foreseen by the game designers

slide-31
SLIDE 31

Decision Theory

  • Given a situation, decide which action to perform depending on the desirability of its immediate outcome
  • Desirability of a situation: utility function U(s)
  • Decision theory is based upon the idea that such a utility function exists

  • Example utility function, Chess:
  • Score of a player: 10 points for the queen + 5 points per rook + 3 points per knight or bishop + 1 point per pawn
  • Utility for white pieces: Uw(s) = Score(white) – Score(black)
  • Utility for black pieces: Ub(s) = Score(black) – Score(white)
slide-32
SLIDE 32

Example Utility Function for RTS games:

  • Similarly to chess, we can score the counted material per unit type:

U(s) = Σt wt (ct^friendly − ct^enemy)

where ct is the number of units of type t (enemy counts must be estimated) and wt is a per-type weight

  • This is oversimplified, but it is good as a first approach. It can be improved by adding:
  • Resources (minerals, gas, gold, wood, etc.)
  • Research done
  • Territory under control
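A direct transcription of the weighted-material utility above (the unit types, weights, and counts below are illustrative, not from any real game balance):

```cpp
#include <cassert>
#include <vector>

// One entry per unit type t: weight w_t, friendly count, and estimated enemy count.
struct UnitType { double weight; int friendlyCount; int enemyCountEstimate; };

// U(s) = sum over types t of w_t * (c_t^friendly - c_t^enemy)
double utility(const std::vector<UnitType>& types) {
    double u = 0.0;
    for (const auto& t : types)
        u += t.weight * (t.friendlyCount - t.enemyCountEstimate);
    return u;
}
```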

slide-33
SLIDE 33

Is it Realistic to Have a Utility Function?

  • Yes:
  • In hardcoded approaches (FSM, decision trees, etc.), the AI designer has to tell the AI HOW to play
  • A utility function only captures the goal of the game
  • In the simplest form, utility is simply 1 for a win, -1 for a loss, and 0 otherwise
  • But the more information conveyed in the utility function, the better the AI can decide what to do
  • With a utility function, the AI designer only has to tell the AI WHAT the goal is
  • It is easier to define a utility function than to hardcode a strategy, since the utility function carries less information (only WHAT, not HOW)

slide-34
SLIDE 34

Decision Theory

  • Effect of an action a on the state s: Result(s,a)
  • Since we might not know the exact state we are in, we can only estimate the result of an action:
  • P(Result(s,a) = s’ | e)     (e is the information we know about s)
  • Example:
  • a = “attack enemy supply depot with 4 marines”
  • e = “we haven’t observed any enemy unit around the supply depot”
  • Result(s,a) = “supply depot destroyed in 250 cycles, 4 marines intact”
  • But we don’t know if there were cloaked units, so we can only guess the result of a

slide-35
SLIDE 35

Maximum Expected Utility Principle (MEU)

  • Select the action with the maximum expected utility:

EU(a|e) = Σs’ P(Result(a,s) = s’ | e) · U(s’)

  • Requires:
  • Utility function (hardcoded or machine learned)
  • Estimation of action effects (hardcoded or machine learned)
  • The AI has to know what the actions do!
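A sketch of MEU action selection given those two ingredients. The outcome distributions are assumed inputs (the test values mirror the marine-attack numbers used later in the lecture):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each action has a list of (probability, utility-of-resulting-state) pairs
// estimated from the evidence e.
struct Outcome { double prob; double utility; };
struct Action  { const char* name; std::vector<Outcome> outcomes; };

// EU(a|e) = sum over outcomes s' of P(Result(a,s)=s' | e) * U(s')
double expectedUtility(const Action& a) {
    double eu = 0.0;
    for (const auto& o : a.outcomes) eu += o.prob * o.utility;
    return eu;
}

// MEU principle: return the index of the action with maximum expected utility.
std::size_t bestAction(const std::vector<Action>& actions) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < actions.size(); ++i)
        if (expectedUtility(actions[i]) > expectedUtility(actions[best]))
            best = i;
    return best;
}
```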

slide-36
SLIDE 36

Example: Target Selection

  • Utility function:
  • 60 points per footman
  • 400 points per barracks
  • 200 points per lumber mill
  • Which action?
  • Attack enemy footman
  • Attack enemy barracks
  • Attack enemy lumber mill

Player = blue Enemy = red

slide-37
SLIDE 37

Example: Target Selection

  • Utility function:
  • 60 points per footman
  • 400 points per barracks
  • 200 points per lumber mill
  • Which action?
  • Attack enemy footman:
  • 2 footmen can kill 1 footman
  • U(s’) = 2*60 – 400 – 200 = -480

Player = blue Enemy = red

slide-38
SLIDE 38

Example: Target Selection

  • Utility function:
  • 60 points per footman
  • 400 points per barracks
  • 200 points per lumber mill
  • Which action?
  • Attack enemy barracks:
  • During the time it takes to destroy the barracks, the enemy footman can kill our 2 footmen
  • U(s’) = -60 – 400 – 200 = -660

Player = blue Enemy = red

slide-39
SLIDE 39

Example: Target Selection

  • Utility function:
  • 60 points per footman
  • 400 points per barracks
  • 200 points per lumber mill
  • Which action?
  • Attack enemy lumber mill:
  • During the time it takes to destroy the lumber mill, the enemy footman can kill our 2 footmen
  • U(s’) = -60 – 400 – 200 = -660

Player = blue Enemy = red

slide-40
SLIDE 40

Example: Target Selection

  • Utility function:
  • 60 points per footman
  • 400 points per barracks
  • 200 points per lumber mill
  • Which action?
  • Attack enemy footman: -480
  • Attack enemy barracks: -660
  • Attack enemy lumber mill: -660

Player = blue Enemy = red

slide-41
SLIDE 41

Basic RTS AI Diagram

[Architecture diagram: Perception → Strategy → Give Orders → Execute Orders, with modules Unit Analysis, Map Analysis, Strategy, Economy, Logistics, Attack, Building Placer, Pathfinder, Arbiter, and one Unit AI per unit]

Decision Theory can be useful at these three levels. Notice that, as presented here, it only considers one action at a time. If used in the Attack module, it is more natural to have one “decision theoretic module” per “squad”, so each can take individual decisions.

slide-42
SLIDE 42

Value of Information

  • How do we know when it is worth spending resources in exploring?
  • Value of perfect information, i.e.:
  • How much utility can we expect to gain if we knew the value of an unknown variable E (that can take k different values e1, …, ek)?

VPIe(E) = ( Σk P(E = ek) · EU(a*k | e, E = ek) ) − EU(a* | e)

where a* is the best action given only e, and a*k is the best action once we also know E = ek
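For a binary unknown (e.g. “does the enemy base have static defenses?”), the formula reduces to a few lines. The numbers in the test are the Starcraft example worked through on the following slides; the function itself is a sketch, not from the slides:

```cpp
#include <cassert>

// Value of perfect information for a binary unknown E:
//   VPI = [ P(E) * EU(best action | E) + (1-P(E)) * EU(best action | not E) ]
//         - EU(best action without knowing E)
double vpiBinary(double pTrue,
                 double euBestIfTrue, double euBestIfFalse,
                 double euBestWithoutInfo) {
    double expectedWithInfo =
        pTrue * euBestIfTrue + (1.0 - pTrue) * euBestIfFalse;
    return expectedWithInfo - euBestWithoutInfo;
}
```

If the resulting VPI is larger than the cost of scouting, it is worth sending a unit to find out.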

slide-43
SLIDE 43

Example

  • Starcraft:
  • Player force: 4 marines (60 points each)
  • 1 Enemy Command Center spotted (500 points)
  • Enemy defenses: unknown
  • Which action to perform?
  • Attack
  • Train more marines

[Figure: our 4 marines and the spotted enemy command center, with unknown defenses (?)]

slide-44
SLIDE 44

Example

  • Starcraft:
  • Player force: 4 marines (60 points each)
  • 1 Enemy Command Center spotted (500 points)
  • Enemy defenses: unknown
  • Which action to perform?
  • Attack:
  • U(s’) = 0.5 U(winning) + 0.5 U(losing)
  • U(s’) = 0.5 * (4 marines) + 0.5 * (-1 command center) = 0.5 * 240 − 0.5 * 500 = -130
  • Train more marines:
  • U(s’) = 6 marines – (1 command center + 2 SCVs) = -380

[Figure: our 4 marines and the spotted enemy command center, with unknown defenses (?)]

slide-45
SLIDE 45

Example

  • Starcraft:
  • Player force: 4 marines (60 points each)
  • 1 Enemy Command Center spotted (500 points)
  • Enemy defenses: unknown
  • Which action to perform?
  • Attack:
  • U(s’) = 0.5 * (4 marines) + 0.5 (-1 command center) = -130
  • Train more marines:
  • U(s’) = 6 marines – (1 command center + 2 SCVs) = -380
  • If we knew there were no defenses, then for Attack:
  • U(s’) = 4 marines = 240
  • If we knew there were defenses, then for Attack:
  • U(s’) = -1 command center = -500

[Figure: our 4 marines and the spotted enemy command center, with unknown defenses (?)]

EU(Attack | Defenses) = -500
EU(Attack | No Defenses) = 240

slide-46
SLIDE 46

Example

  • Starcraft:
  • Player force: 4 marines (60 points each)
  • 1 Enemy Command Center spotted (500 points)
  • Enemy defenses: unknown
  • Which action to perform?
  • Attack:
  • U(s’) = 0.5 * (4 marines) + 0.5 * (-1 command center) = -130
  • Train more marines:
  • U(s’) = 6 marines – (1 command center + 2 SCVs) = -380

VPI(defenses) = (0.5 * EU(More Marines | Defenses) + 0.5 * EU(Attack | No Defenses) ) – (EU(Attack))

[Figure: our 4 marines and the spotted enemy command center, with unknown defenses (?)]

slide-47
SLIDE 47

Example

  • Starcraft:
  • Player force: 4 marines (60 points each)
  • 1 Enemy Command Center spotted (500 points)
  • Enemy defenses: unknown
  • Which action to perform?
  • Attack:
  • U(s’) = 0.5 * (4 marines) + 0.5 * (-1 command center) = -130
  • Train more marines:
  • U(s’) = 6 marines – (1 command center + 2 SCVs) = -380

VPI(defenses) = (0.5 * -380 + 0.5 * 240 ) – (-130) = -70 + 130 = 60

[Figure: our 4 marines and the spotted enemy command center, with unknown defenses (?)]

slide-48
SLIDE 48

Decision Theory

  • Basic principles for taking rational decisions
  • The goal of game AI is to be fun:
  • The utility function doesn’t have to be tuned for “optimal play”, but for “fun play”
  • For example, the utility function may penalize too many attacks on the player in a given period of time, in order to let the player breathe
  • Deals only with immediate utility; no look-ahead or adversarial planning:
  • To deal with that: adversarial search (next section)
slide-49
SLIDE 49

Outline

  • Student Presentation:

“Near Optimal Hierarchical Pathfinding”

  • Student Presentation:

“Intelligent Moving of Groups in Real-Time Strategy Games”

  • Decision Making
  • Basics: Hardcoded Methods
  • Decision Theory
  • Adversarial Search
  • Project Discussion
slide-50
SLIDE 50

Adversarial Search

  • Decision theory is good for lower-level reactive decisions (e.g. the Attack module)
  • At a higher level (strategy), better decisions could be made if the AI could plan future actions:
  • E.g.: If I attack with 4 marines, I’ll be left with 2, then the enemy will overpower me with his 4 Dragoons. In that case I could lure them into the upper defenses, and annihilate them with my 8 tanks.
  • Solution: game tree search
slide-51
SLIDE 51

Game Tree

[Game tree figure: current situation, one layer of player 1 actions, leaf utilities U(s)]

  • Decision theory deals with immediate decisions:
slide-52
SLIDE 52

Game Tree

[Game tree figure: current situation, one layer of player 1 actions, leaf utilities U(s)]

  • Decision theory deals with immediate decisions:

Pick the action that leads to the state with maximum expected utility

slide-53
SLIDE 53

Game Tree

[Game tree figure: current situation, a layer of player 1 actions, then a layer of player 2 actions, leaf utilities U(s)]

  • Game trees capture the effects of successive action executions:

slide-54
SLIDE 54

Game Tree

[Game tree figure: current situation, a layer of player 1 actions, then a layer of player 2 actions, leaf utilities U(s)]

  • Game trees capture the effects of successive action executions:

Pick the action that leads to the state with maximum expected utility after taking into account what the other players might do

slide-55
SLIDE 55

Game Tree

[Game tree figure: current situation, a layer of player 1 actions, then a layer of player 2 actions, leaf utilities U(s)]

  • Game trees capture the effects of successive action executions:

In this example, we look ahead only one player 1 action and one player 2 action, but we could grow the tree arbitrarily deep.

slide-56
SLIDE 56

Minimax Principle

[Game tree figure: two-ply tree (player 1 actions, then player 2 actions) with leaf utilities U(s) = -1, 0, -1, 0, 0, 0]

  • Positive utility is good for player 1, and negative for player 2
  • Player 1 chooses actions that maximize U, player 2 chooses

actions that minimize U

slide-57
SLIDE 57

Minimax Principle

[Game tree figure: the same two-ply tree, with player 2’s layer marked (min)]

  • Positive utility is good for player 1, and negative for player 2
  • Player 1 chooses actions that maximize U, player 2 chooses

actions that minimize U

slide-58
SLIDE 58

Minimax Principle

[Game tree figure: the same two-ply tree, with player 2’s layer marked (min)]

  • Positive utility is good for player 1, and negative for player 2
  • Player 1 chooses actions that maximize U, player 2 chooses

actions that minimize U

[Values backed up to the min nodes: U(s) = -1, -1, 0]

slide-59
SLIDE 59

Minimax Principle

[Game tree figure: the same two-ply tree, with player 1’s layer marked (max) and player 2’s marked (min)]

  • Positive utility is good for player 1, and negative for player 2
  • Player 1 chooses actions that maximize U, player 2 chooses

actions that minimize U

[Values backed up to the min nodes: U(s) = -1, -1, 0]

slide-60
SLIDE 60

Minimax Algorithm

Minimax(state, player, MAX_DEPTH)
  IF MAX_DEPTH == 0 RETURN (U(state), null)
  BestAction = null
  BestScore = null
  FOR Action in actions(player, state)
    (Score, Action2) = Minimax(result(Action, state), nextplayer(player), MAX_DEPTH - 1)
    IF BestScore == null ||
       (player == 1 && Score > BestScore) ||
       (player == 2 && Score < BestScore)
      BestScore = Score
      BestAction = Action
  ENDFOR
  RETURN (BestScore, BestAction)
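As a runnable illustration (not from the slides), the same algorithm on a deliberately tiny game: a pile of tokens, players alternately take 1 or 2, and whoever takes the last token wins (+1 for player 1, -1 for player 2). The game and its utility are illustrative stand-ins for a real game state:

```cpp
#include <cassert>
#include <climits>

struct Result { int score; int action; };   // action = tokens taken at this node

// Minimax on the token game. At tokens == 0, the *previous* player took the
// last token and won. At depth 0 we fall back to a neutral utility of 0.
Result minimax(int tokens, int player, int maxDepth) {
    if (tokens == 0)
        return { player == 1 ? -1 : +1, 0 };
    if (maxDepth == 0) return { 0, 0 };     // U(state): unknown, treat as draw
    Result best { player == 1 ? INT_MIN : INT_MAX, 0 };
    for (int take = 1; take <= 2 && take <= tokens; ++take) {
        Result r = minimax(tokens - take, player == 1 ? 2 : 1, maxDepth - 1);
        // Player 1 maximizes the score, player 2 minimizes it.
        if ((player == 1 && r.score > best.score) ||
            (player == 2 && r.score < best.score))
            best = { r.score, take };
    }
    return best;
}
```

With 2 tokens, player 1 takes both and wins; with 3 tokens, player 1 loses against perfect play whatever they do.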

slide-61
SLIDE 61

Minimax Algorithm

  • Needs:
  • Utility function U
  • A way to determine which actions a player can execute in a given state
  • MAX_DEPTH controls how deep the search tree will be:
  • The size of the tree is exponential in MAX_DEPTH
  • The branching factor is the number of moves that can be executed per state

  • The higher MAX_DEPTH, the better the AI will play
  • There are ways to increase speed: alpha-beta pruning
slide-62
SLIDE 62

Successes of Minimax

  • Deep Blue defeated Kasparov in Chess (1997)
  • Checkers was completely solved by Jonathan Schaeffer’s team (2007):
  • If neither player makes mistakes, the game is a draw (like tic-tac-toe)
  • Go:
  • Using a variant of minimax based on Monte Carlo search (UCT),
  • in 2011 the program Zen19S reached 4 dan (professional humans are rated between 1 and 9 dan)

slide-63
SLIDE 63

Game Tree Search in RTS Games

  • Classic minimax assumes (Chess, Checkers, Go…):
  • 2 players
  • Perfect information
  • Turn-taking game
  • Given a state and an action, we can predict the next state
  • It is easily generalizable to multiplayer turn-taking games (the max^n algorithm)

  • RTS games:
  • Real-time, not turn-taking, simultaneous actions
  • Lots of possible actions: branching factor too large!
  • We cannot exactly predict the next state
  • Imperfect information
slide-64
SLIDE 64

Game Tree Search in RTS Games

  • Problem:
  • Lots of possible actions, branching factor too large!
  • Solution:
  • ???
  • Problems:
  • real-time, no turn taking, simultaneous actions
  • Solution:
  • ???
slide-65
SLIDE 65

Game Tree Search in RTS Games

  • Problem:
  • Lots of possible actions, branching factor too large!
  • Solution:
  • Sampling (Monte-Carlo Search)
  • Problems:
  • real-time, no turn taking, simultaneous actions
  • Solution:
  • ???
slide-66
SLIDE 66

Monte-Carlo Tree Search: UCT

  • Monte-Carlo Search:
  • For each possible action: play N games at random until the end, starting with that action
  • If N is large, the average win ratio converges to the expected utility of the action
  • Upper Confidence Tree (UCT) is a state-of-the-art, simple variant of Monte-Carlo search, responsible for the recent success of computer Go programs
  • Idea:
  • Instead of opening the whole minimax tree or playing N random games:
  • Open only the upper part of the tree, and play random games from there
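A sketch of the plain Monte-Carlo part (before the tree is added), on a tiny self-contained toy game: a pile of tokens, players alternately take 1 or 2, whoever takes the last token wins. The game and the LCG-based “random” moves are illustrative assumptions, not an RTS model:

```cpp
#include <cassert>

// Play one game at random to the end; returns +1 if player 1 wins, -1 otherwise.
int randomPlayout(int tokens, int playerToMove, unsigned& seed) {
    while (tokens > 0) {
        seed = seed * 1664525u + 1013904223u;   // tiny deterministic LCG
        int take = 1 + (seed >> 16) % 2;        // "random" move: take 1 or 2
        if (take > tokens) take = tokens;
        tokens -= take;
        if (tokens == 0) return playerToMove == 1 ? +1 : -1;
        playerToMove = (playerToMove == 1) ? 2 : 1;
    }
    return 0;   // not reached when called with tokens > 0
}

// Average result of N random games after player 1 opens by taking `take` tokens.
// With large N this estimates the expected utility of that opening action.
double monteCarloValue(int tokens, int take, int N) {
    if (take >= tokens) return +1.0;   // taking the last token wins outright
    unsigned seed = 12345u;
    double total = 0.0;
    for (int i = 0; i < N; ++i)
        total += randomPlayout(tokens - take, 2, seed);   // opponent moves next
    return total / N;
}
```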

slide-67
SLIDE 67

Minimax vs Monte-Carlo

Minimax: Monte-Carlo:

[Figure: side-by-side comparison — minimax tree (left) vs Monte-Carlo playouts (right), utilities U at the leaves]

slide-68
SLIDE 68

Minimax vs Monte-Carlo

Minimax: Monte-Carlo:

[Figure: minimax tree expanded to a fixed depth, utilities U at the leaves.] Minimax opens the complete tree (all possible moves) up to a fixed depth. Then the utility function is applied to the leaves.

slide-69
SLIDE 69

Minimax vs Monte-Carlo

Minimax: Monte-Carlo:

[Figure: Monte-Carlo playouts — each line from the root is a complete random game.] Monte-Carlo search runs, for each possible move at the root node, a fixed number K of random complete games. There is no need for a utility function (though one can be used).

slide-70
SLIDE 70

UCT

[UCT figure: current state at the root (counter 0/0); tree search in the upper part, Monte-Carlo playouts below.] A counter w/t records how many of the games played through a node have been won (w) out of the total games explored through it (t) in the current search.

slide-71
SLIDE 71

UCT

[UCT figure: after one playout ending in a win, the root counter is 1/1]

slide-72
SLIDE 72

UCT

[UCT figure: root 1/2, new child 0/1 after a playout ending in a loss.] At each iteration, one node of the tree (the upper part) is selected and expanded (one node is added to the tree). From this new node, a complete game is played out at random (Monte-Carlo).

slide-73
SLIDE 73

UCT

[UCT figure: root 2/3, children 0/1 and 1/1 after another iteration ending in a win]

slide-74
SLIDE 74

UCT

[UCT figure: root 3/4, children 0/1 and 2/2, grandchild 1/1 after a win.] The counts w/t are used to determine which nodes to explore next. Exploration/exploitation: 50% of the time expand the best node in the tree, 50% expand a node at random.

slide-75
SLIDE 75

UCT

[UCT figure: root 3/5, children 0/1 and 2/3, grandchildren 1/1 and 0/1 after a loss.] The tree ensures all relevant actions are explored (greatly alleviating the randomness that affects Monte-Carlo methods).

slide-76
SLIDE 76

UCT

[UCT figure: root 3/5, children 0/1 and 2/3, grandchildren 1/1 and 0/1.] The random games played from each node of the tree serve to estimate the utility function. They can be fully random, or use an opponent model (if available).
slide-77
SLIDE 77

UCT

  • After a fixed number of iterations K (or when the assigned time is over), UCT analyzes the resulting tree, and the selected action is the one with the highest win ratio.
  • UCT can search games with much larger state spaces than minimax. It is the standard algorithm in modern (2008 to present) Go-playing programs.

slide-78
SLIDE 78

Game Tree Search in RTS Games

  • Problem:
  • Lots of possible actions, branching factor too large!
  • Solution:
  • Sampling (Monte-Carlo Search)
  • Problems:
  • real-time, no turn taking, simultaneous actions
  • Solution:
  • Strategy simulation, rather than turn-based action taking
slide-79
SLIDE 79

Strategy Simulation: Example

  • Assume we want to use UCT for the Strategy module of an RTS game AI

[Architecture diagram: Perception → Strategy → Give Orders → Execute Orders, with modules Unit Analysis, Map Analysis, Strategy, Economy, Logistics, Attack, Building Placer, Pathfinder, Arbiter, and one Unit AI per unit]

slide-80
SLIDE 80

Strategy Simulation: Example

  • Assume we want to use UCT for the Strategy module of an RTS game AI
  • Define a collection of “high level actions” (or strategies) that make sense for the game. For example, in S3:

  • S1: Attack with the units we have
  • S2: Train 4 footmen
  • S3: Train 4 archers
  • S4: Train 4 catapults
  • S5: Train 4 knights
  • S6: Build 2 defense Towers
  • S7: Build 2 defense Towers around a Gold Mine
  • S8: Build 2 defense Towers around a group of Trees
  • S9: Bring units back to the base
  • S10: Train 2 more peasants to gather resources
slide-81
SLIDE 81

Strategy Simulation: Example

  • Instead of taking turns executing actions, we assign a “strategy” to each player and simulate it until completion:

[Figure — standard minimax (left): Player 1, Action 1 → Player 2, Action 2 → Player 1, Action 3. Strategy simulation (right): Player 1: S2 (ETA 240) / Player 2: S3 (ETA 400) → Player 1: S1 (ETA 400) / Player 2: S3 (ETA 160) → Player 1: S1 (ETA 240) / Player 2: S1 (ETA 400) → Player 1: S1 / Player 2: S1]
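The core simulation step can be sketched as follows (an illustrative skeleton, not S3/Starcraft code): each player commits to a strategy with an estimated completion time (ETA); the simulation advances to the earliest ETA, applies that strategy’s abstract effect, and lets that player pick a new strategy.

```cpp
#include <cassert>
#include <vector>

// One strategy currently being executed by a player, with its remaining ETA.
struct RunningStrategy { int player; int strategyId; int eta; };

// Advance the simulation to the first strategy that completes: return its index,
// and subtract the elapsed time from every player's remaining ETA. The caller
// then applies that strategy's (abstract) effect and assigns a new strategy.
int step(std::vector<RunningStrategy>& running) {
    int first = 0;
    for (int i = 1; i < (int)running.size(); ++i)
        if (running[i].eta < running[first].eta) first = i;
    int elapsed = running[first].eta;
    for (auto& r : running) r.eta -= elapsed;
    return first;
}
```

With the slide’s numbers (player 1 runs S2 with ETA 240, player 2 runs S3 with ETA 400), the first step completes player 1’s strategy and leaves player 2 with ETA 160, matching the figure.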

slide-82
SLIDE 82

Strategy Simulation

  • Requires:
  • A way to simulate strategies: typically a very simplified model
  • E.g. battles decided just by who has more units, or by the added damage of the units (taking air/ground units into account)
  • No pathfinding, etc.
  • An abstracted version of the game, e.g.: divide the map into regions, and just count the number of units of each type in each region
  • Utility function (optional):
  • If available, there is no need to simulate games to the end when using Monte-Carlo
  • If not available, simply simulate games to the end
slide-83
SLIDE 83

UCT for RTS Games

  • Applicable to:
  • Strategy (previous example)
  • Attack: where the high-level actions are things like attack enemy X, retreat, etc.
  • Economy
  • In turn-based games, minimax is executed each turn
  • For RTS games: execute every K cycles (e.g. once per second), or once the current action has finished, or when an important event happens (e.g. a new enemy is sighted)
  • State of the art:
  • No current commercial games use it
  • Research in experimental games shows its potential
slide-84
SLIDE 84

Overview of Decision Making

  • Hardcoded:
  • FSM/Decision Trees: good for simple AIs and for game designers to author the exact behavior they want the AI to have
  • The AI author needs to decide HOW the AI plays
  • Autonomous:
  • Decision Theory: good for reactive behavior (e.g. unit/squad control, the Attack module)
  • Adversarial Search: good for high-level strategy
  • The AI author only needs to decide WHAT the AI should accomplish; the AI will figure out HOW automatically

slide-85
SLIDE 85

RTS Game AI Overview

  • Week 2: How to create a basic RTS AI
  • Hierarchical system:
  • Strategy: FSM
  • Giving Orders: Different modules (Economy, Logistics, Attack)
  • Pathfinding
  • Building Placer
  • Week 3: Pathfinding
  • A* or TBA* / LRTA* for small games
  • D* Lite for larger games with very dynamic environments
  • Week 4: Decision Making
  • FSMs / Decision Trees: for hardcoding AI behavior (useful for simple games where the game designers can control what the AI does)
  • Decision Theory / Minimax: to let the AI decide what to do in order to maximize the utility function

slide-86
SLIDE 86

RTS Game AI Overview

[Architecture diagram: Perception → Strategy → Give Orders → Execute Orders, with modules Unit Analysis, Map Analysis, Strategy, Economy, Logistics, Attack, Building Placer, Pathfinder, Arbiter, and one Unit AI per unit]

slide-87
SLIDE 87

Outline

  • Student Presentation:

“Near Optimal Hierarchical Pathfinding”

  • Student Presentation:

“Intelligent Moving of Groups in Real-Time Strategy Games”

  • Decision Making
  • Basics: Hardcoded Methods
  • Decision Theory
  • Adversarial Search
  • Project Discussion
slide-88
SLIDE 88

Project 1: RTS Games

  • Issues with Starcraft / BWAPI?
  • http://code.google.com/p/bwapi/wiki/UsingBWAPI
  • Issues with S3?
  • Questions about Pathfinding?
  • Options for Project 1 (pick one):
  • Pathfinding:
  • TBA*,
  • LRTA*,
  • D* Lite
  • Decision Making:
  • Utility Function for High-level strategy or attack.
  • Game Tree Search (minimax, Monte Carlo, or UCT) for strategy or attack
  • Other (with permission from instructor)
slide-89
SLIDE 89

Project 2: Drama Management

  • IFGameEngine demo