Artificial Intelligence in Robotics, Lecture 13: Patrolling (Viliam Lisý) - PowerPoint PPT Presentation



SLIDE 1

Artificial Intelligence in Robotics

Lecture 13: Patrolling

Viliam Lisý

Artificial Intelligence Center Department of Computer Science, Faculty of Electrical Eng. Czech Technical University in Prague

SLIDE 2

Mathematical programming

LP MILP

Some of the variables are integer; the objective and constraints are still linear

Convex program

Optimize a convex function over a convex set

Non-convex program
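For reference, the general shape of these programs can be written out (notation mine, not from the slides): an LP and a MILP share the linear objective and constraints, and the MILP additionally restricts some variables to integers.

```latex
\begin{aligned}
\min_{x}\; & c^{\top} x \\
\text{s.t. } & A x \le b \\
& x_i \in \mathbb{Z} \quad \text{for } i \in I, \qquad
  x_j \in \mathbb{R} \quad \text{for } j \notin I
\end{aligned}
```

With I = ∅ this is an LP (solvable in polynomial time); with I nonempty it is a MILP, which is NP-hard in general.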


SLIDE 3

Task Taxonomy


Robin, C., & Lacroix, S. (2016). Multi-robot target detection and tracking: taxonomy and survey. Autonomous Robots, 40(4), 729–760.

SLIDE 4

Resource allocation games

Developed by the team of Prof. M. Tambe at USC (2008-now)
In daily use by various organizations and security agencies


SLIDE 5

Resource allocation games


[Figure: map with targets 1-8]

Target            t1    t2    t3    t4    t5    t6    t7    t8
Unprotected       10    11     9    15    11    15    14     6
Protected          5     4     5     7     6     5     7     3
Optimal strategy   0  0.14     0  0.62  0.20  0.49  0.56     0

SLIDE 6

Resource allocation games

Set of targets: T = {t_1, ..., t_n}
Limited (homogeneous) security resources: m ∈ ℕ

Each resource can fully protect (cover) a single target

The attacker attacks a single target
Attacker's utility for a covered/uncovered attack: U_a^c(t) < U_a^u(t)
Defender's utility for a covered/uncovered attack: U_d^c(t) > U_d^u(t)
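The optimal coverage in the slide-5 example can be reproduced by a water-filling computation in the spirit of ORIGAMI (Kiekintveld et al., 2009): pick a level λ of attacker expected utility, cover every target whose uncovered value exceeds λ just enough to pull it down to λ, and binary-search λ so the total coverage equals the number of resources. A minimal sketch, assuming two resources (inferred from the coverage sum) and using only the attacker's values; the small differences from the slide's numbers suggest the slide was computed from the full non-zero-sum formulation.

```python
# Water-filling equalization of attacker expected utility (ORIGAMI-style sketch).
# Values from slide 5; resources = 2 is an assumption inferred from the coverage sum.
unprotected = [10, 11, 9, 15, 11, 15, 14, 6]   # attacker utility, uncovered
protected   = [5, 4, 5, 7, 6, 5, 7, 3]         # attacker utility, covered
resources = 2

def coverage_for_level(lam):
    """Coverage pulling every target with uncovered utility > lam down to lam."""
    cov = []
    for u, p in zip(unprotected, protected):
        if u > lam:
            # c solves c*p + (1 - c)*u = lam  =>  c = (u - lam) / (u - p)
            cov.append(min(1.0, (u - lam) / (u - p)))
        else:
            cov.append(0.0)
    return cov

# Total coverage decreases as lam grows: binary-search lam to spend all resources.
lo, hi = min(protected), max(unprotected)
for _ in range(100):
    mid = (lo + hi) / 2
    if sum(coverage_for_level(mid)) > resources:
        lo = mid
    else:
        hi = mid
cov = coverage_for_level((lo + hi) / 2)
print([round(c, 2) for c in cov])   # close to the slide's optimal strategy
```

Every target in the resulting attack set gives the attacker the same expected utility, so the attacker gains nothing by switching targets.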


SLIDE 7

Stackelberg equilibrium

the leader π‘š – publicly commits to a strategy the follower (𝑔) – plays a best response to leader arg max

πœπ‘šβˆˆΞ” π΅π‘š ; πœπ‘”βˆˆπΆπ‘†π‘”(πœπ‘š) 𝑠 π‘š(πœπ‘š, 𝜏 𝑔)

Why?

The defender needs to commit in practice (laws, regulations, etc.)
It may lead to better expected utility

Example:


      L      R
U  (4,2)  (6,1)
D  (3,1)  (5,2)
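The example can be checked numerically: in this game the simultaneous-move (Nash) outcome is (U, L), giving the leader 4, while committing to a mixed strategy and letting the follower best-respond (ties broken in the leader's favor, as in strong Stackelberg equilibrium) yields 5.5. A small brute-force sketch over the commitment probability:

```python
# Leader/follower payoffs for the 2x2 game on the slide.
# Rows U, D; columns L, R; entries are (leader, follower).
leader   = {("U", "L"): 4, ("U", "R"): 6, ("D", "L"): 3, ("D", "R"): 5}
follower = {("U", "L"): 2, ("U", "R"): 1, ("D", "L"): 1, ("D", "R"): 2}

best, best_p = -float("inf"), None
for i in range(101):                 # p = probability the leader commits to D
    p = i / 100
    fol  = {c: (1 - p) * follower[("U", c)] + p * follower[("D", c)] for c in ("L", "R")}
    lead = {c: (1 - p) * leader[("U", c)]   + p * leader[("D", c)]   for c in ("L", "R")}
    # Follower best-responds; ties broken in the leader's favor (strong SE).
    top = max(fol.values())
    resp = max((c for c in ("L", "R") if fol[c] == top), key=lambda c: lead[c])
    if lead[resp] > best:
        best, best_p = lead[resp], p
print(best_p, best)   # p = 0.5 on D gives the leader expected utility 5.5
```

At p = 0.5 the follower is indifferent between L and R; breaking the tie toward R gives the leader 6 - p = 5.5, strictly more than any pure commitment.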

SLIDE 8

Solving resource allocation games

Kiekintveld et al.: Computing Optimal Randomized Resource Allocations for Massive Security Games, AAMAS 2009

Only the coverage vector c_t matters; Z is a sufficiently large constant
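From the cited paper, the ERASER MILP has roughly the following shape (a_t ∈ {0,1} selects the attacked target, c_t is the coverage, d and k are the defender's and attacker's equilibrium utilities, Z the big-M constant; treat this as a sketch rather than the exact formulation):

```latex
\begin{aligned}
\max\; & d \\
\text{s.t. } & \textstyle\sum_t a_t = 1, \qquad \sum_t c_t \le m,
             \qquad a_t \in \{0,1\},\; c_t \in [0,1] \\
& d - U_d(t, c) \le (1 - a_t)\, Z && \forall t \\
& 0 \le k - U_a(t, c) \le (1 - a_t)\, Z && \forall t \\
& U_i(t, c) = c_t\, U_i^{c}(t) + (1 - c_t)\, U_i^{u}(t) && i \in \{d, a\}
\end{aligned}
```

The big-M constraints are active only for the attacked target (a_t = 1), forcing the attacker to a best response and tying the objective to the defender's utility there.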


SLIDE 9

Sampling the coverage vector

[Figure: comb sampling of the coverage vector c over targets t1-t6 with two resources s_1, s_2]
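One way to read the comb idea on this slide: lay the coverage probabilities end to end on a strip of total length s, draw a single uniform offset r in [0, 1), and place resources at r, r + 1, ..., r + s - 1. Each target is then covered with exactly its marginal probability, and exactly s resources are used. A sketch with illustrative numbers (not the slide's):

```python
import random

def sample_allocation(cov, rng):
    """Sample a pure allocation covering target t with probability cov[t].

    Assumes sum(cov) is an integer s and every cov[t] <= 1, so each
    target's interval contains at most one comb point.
    """
    s = round(sum(cov))
    r = rng.random()                      # one shared uniform offset
    points = [r + i for i in range(s)]    # comb teeth, one per resource
    chosen, acc = [], 0.0
    for t, c in enumerate(cov):
        lo, hi = acc, acc + c
        if any(lo <= p < hi for p in points):
            chosen.append(t)
        acc = hi
    return chosen

rng = random.Random(0)
cov = [0.5, 0.25, 0.75, 0.5]              # illustrative; sums to 2 resources
counts = [0] * len(cov)
for _ in range(20000):
    for t in sample_allocation(cov, rng):
        counts[t] += 1
freq = [n / 20000 for n in counts]
print(freq)                               # empirical coverage close to cov
```

Unlike sampling each target independently, the shared offset guarantees the resource budget is met in every single sample, not just in expectation.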

SLIDE 10

Scalability

25 resources, 3000 targets => about 5 × 10^61 defender's actions

no chance for matrix game representation

The algorithm explained above is ERASER
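The count is the number of ways to choose which 25 of the 3000 targets the identical resources cover, C(3000, 25); a one-liner confirms the order of magnitude:

```python
import math

# Defender's pure strategies: choose 25 of 3000 targets to cover.
actions = math.comb(3000, 25)
print(f"{actions:.1e}")   # ~5e61, far too many to enumerate as a payoff matrix
```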


SLIDE 11

Studied extensions

Complex structured defender strategies
Probabilistically failing actions
Attacker's types
Resource types and teams
Boundedly rational attackers


SLIDE 12

Resource allocation (security) games

Advantages

Wide existing literature (many variations)
Good scalability
Real-world deployments

Limitation

The attacker cannot react to observations (e.g., defender's position)


SLIDE 13

Perimeter patrolling

Agmon et al.: Multi-Robot Adversarial Patrolling: Facing a Full-Knowledge Opponent. JAIR 2011.


The attacker can see the patrol!

SLIDE 14

Perimeter patrolling

Polygon P, perimeter split into N segments
Defender has k > 1 homogeneous resources (robots)

move 1 segment per time step
turn to the opposite direction in τ time steps

Attacker can wait infinitely long and sees everything

chooses a segment where to attack
requires t time steps to penetrate


SLIDE 15

Interesting parameter settings

Let d = N / k be the distance between equidistant robots

There is a perfect deterministic patrol strategy if t ≥ d

the robots can just continue in one direction

What about t = (4/5) d?

The attacker can guarantee success if t + 1 < d - t - τ, i.e., t < (d - τ - 1) / 2

[Diagram: perimeter distances d, τ, t, t + 1, and d - (t - τ)]
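The condition can be wrapped in a quick check (d, t, τ as above; this is just the inequality t + 1 < d - t - τ, equivalently t < (d - τ - 1) / 2):

```python
def attacker_wins(d, t, tau):
    """True iff the slide's condition t + 1 < d - t - tau holds,
    i.e. the attacker can guarantee a successful penetration."""
    return t + 1 < d - t - tau

# Worked check with d = 10, tau = 1:
assert attacker_wins(10, 3, 1) is True    # 4 < 6: attacker wins
assert attacker_wins(10, 4, 1) is False   # 5 < 5 fails: no guarantee
```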

SLIDE 16

Optimal patrolling strategy

Class of strategies: continue with probability q, else turn around

Theorem: In the optimal strategy, all robots are equidistant and face in the same direction.

Proof sketch:
1. the probability of visiting the worst-case segment within the penetration time decreases as the distance between the robots increases
2. making moves in different directions increases the distance


SLIDE 17

Probability of penetration

For simplicity assume τ = 1

Probability of visiting segment s_j at least once in the next t steps

= probability of reaching the absorbing end state from s_j, summing the two directions of first visit separately
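The visit probability can be computed numerically by unrolling the Markov chain: track the distribution over (position, direction) states, absorb on the first visit to the attacked segment, and accumulate the absorbed mass over t steps. A toy sketch for a single robot (τ = 1 so a turn costs one step in place; segment indices are relative to the robot's start, positive in its facing direction; all of this is an illustrative simplification, not the paper's symbolic derivation):

```python
def visit_probability(q, target, t):
    """P(robot visits `target` within t steps) under the continue-with-prob-q walk.

    State: (position, direction). Each step: with prob q move one segment in
    the current direction; with prob 1 - q turn around in place (tau = 1).
    The chain absorbs (detection) on the first visit to `target`.
    """
    if target == 0:
        return 1.0
    dist = {(0, +1): 1.0}            # robot starts at segment 0 facing +1
    absorbed = 0.0
    for _ in range(t):
        nxt = {}
        for (pos, d), p in dist.items():
            for prob, state in ((q, (pos + d, d)), (1 - q, (pos, -d))):
                if prob == 0.0:
                    continue
                if state[0] == target:
                    absorbed += prob * p     # first visit: absorb
                else:
                    nxt[state] = nxt.get(state, 0.0) + prob * p
        dist = nxt
    return absorbed

# Sanity checks: with q = 1 the robot marches straight ahead.
print(visit_probability(1.0, 2, 3))    # segment +2 reached in 2 steps -> 1.0
print(visit_probability(1.0, -1, 3))   # never turns -> 0.0
print(round(visit_probability(0.9, -1, 5), 4))
```

Evaluating this on a grid of q values reproduces the shape of the ppd curves numerically, where the paper instead keeps q symbolic and gets polynomials.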


SLIDE 18

Probability of penetration

18

All computations are symbolic. The results are functions ppd_j : [0,1] → [0,1].

SLIDE 19

Optimal turn probability

Maximin value for q
Each line represents one segment (ppd_j)
Iterate over all pairs of intersection points and maximal points to find the solution

these are all polynomials in q
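The paper finds the exact maximin q by examining intersections and maxima of the polynomials; the same maximin shape can be seen numerically with a coarse grid over q, taking the minimum across segments and then the maximum over q. A toy sketch with two made-up quadratics standing in for ppd curves (not the paper's actual functions):

```python
# Two illustrative 'ppd' polynomials on [0, 1]: one increasing, one decreasing.
curves = [lambda q: q * q, lambda q: (1 - q) ** 2]

best_q, best_val = None, -1.0
for i in range(1001):
    q = i / 1000
    val = min(f(q) for f in curves)    # worst-case segment at this q
    if val > best_val:
        best_q, best_val = q, val
print(best_q, best_val)                # maximin at the intersection q = 0.5
```

For these two curves the worst case switches from one segment to the other exactly at their intersection, which is why the exact algorithm only needs to inspect intersection and maximal points.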


SLIDE 20

Perimeter patrol – summary

Split the perimeter into segments traversable in unit time
Distribute the patrollers uniformly along the perimeter
Coordinate them to always face the same way
Continue with probability q, turn around with probability (1 - q)


SLIDE 21

Area patrolling

Basilico et al.: Patrolling security games: Definition and algorithms for solving large instances with single patroller and single intruder. AIJ 2012.


SLIDE 22

Area patrolling - Formal model

Environment represented as a graph
Targets T = {6, 8, 12, 14, 18}
Penetration time d(t)
Target values (v_d(t), v_a(t))

Defender: Markov policy
Attacker: wait, attack(t)


SLIDE 23

Solving zero-sum patrolling game

We assume ∀t ∈ T : v_a(t) = v_d(t)

a(j, k) = 1 if the patrol can move from j to k in one step; else 0

c(t, h) is the probability of stopping an attack at target t started when the patrol was at node h

δ_{j,k}^{x,t} is the probability that the patrol reaches node k from j in x steps without visiting target t

α_{j,k} is the probability of moving from j to k

SLIDE 24

AI (game-theoretic) problems can often be solved by transformation to mathematical programming
