Task-Oriented Active Perception and Planning in Environments with Partially Known Semantics. Mahsa Ghasemi, Erdem Arınç Bulgur, and Ufuk Topcu. International Conference on Machine Learning, July 12-18, 2020.


SLIDE 1

Task-Oriented Active Perception and Planning in Environments with Partially Known Semantics

MAHSA GHASEMI, ERDEM ARINÇ BULGUR, AND UFUK TOPCU

INTERNATIONAL CONFERENCE ON MACHINE LEARNING JULY 12-18, 2020

SLIDE 2

Integrating Data into Decision Making Process

Setting

  • Sequential decision making
  • Partial knowledge of environment
  • Continual information gathering

[Figure labels: perception, planning]

Challenge

How can an agent perceive and plan simultaneously, with efficiency and performance guarantees?

Contributions

1. Provide a guarantee on task success
2. Characterize information utility
3. Guide active perception while planning

SLIDE 3

Task-Oriented Active Perception and Planning

[Figure: framework flowchart. Blocks: Gather information; Update belief; Divergence test (high/low); MAP estimation of state attributes; Synthesize optimal policy; Risk due to uncertainty (high/low); Take one action; Find an “informative state”; Go to informative state]

SLIDE 4

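System Dynamics as Markov Decision Process

[Figure: An MDP. Three states 𝑡0, 𝑡1, 𝑡2; transitions labeled (action, probability): (a, 0.7), (a, 0.3), (b, 1), (c, 1), (c, 1), (d, 1), (d, 1)]

An MDP is a tuple $\mathcal{M} = (S, s_{\text{init}}, A, P)$, where

  • $S$ is a finite discrete state space
  • $s_{\text{init}} \in S$ is an initial state
  • $A$ is a finite discrete action space
  • $P : S \times A \times S \to [0, 1]$ is a probabilistic transition function such that for all $s \in S$ and for all $a \in A$ available in $s$, $\sum_{s' \in S} P(s, a, s') = 1$

Memoryless deterministic policies: $\pi : S \to A$

Induced Markov chain:

  • $\mathcal{M}^{\pi} = (S, s_{\text{init}}, P^{\pi})$ is such that for all $s, s' \in S$, $P^{\pi}(s, s') = P(s, \pi(s), s')$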
SLIDE 5

Environment Model and Observation Model

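An environment model is a tuple $\mathcal{E} = (S, AP, L)$, where

  • $S$ is a finite discrete state space
  • $AP$ is a set of atomic propositions
  • $L : S \to 2^{AP}$ is a true labeling function

[Figure: An MDP with partial semantics. The example MDP with states 𝑡0, 𝑡1, 𝑡2, each annotated with a belief over proposition 𝑞: (𝑞: 0.8, ¬𝑞: 0.2), (𝑞: 1.0, ¬𝑞: 0.0), (𝑞: 0.0, ¬𝑞: 1.0)]

Belief at time $t$ is a probabilistic labeling function $\hat{L}_t : S \times AP \to [0, 1]$ such that for all $s \in S$ and $p \in AP$, $\hat{L}_t(s, p)$ is the estimated probability that $p$ holds at $s$.

An observation model is a joint probability distribution.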

SLIDE 6

Task Specification with Linear Temporal Logic

  • Linear temporal logic (LTL): A formal language with logical and temporal operators

▪ Suitable for high-level task specification
▪ Verifiable: qualitatively (almost surely) or quantitatively (probabilistically)
▪ Close to human language: formal translation of natural language instructions into LTL specifications [e.g., LTLMoP toolkit by Finucane, Jing, and Kress-Gazit, 2010]

SLIDE 7

Automaton Representation of Task

  • Task specification as LTL formula (with probabilistic guarantee)

Do not crash with obstacles until you reach door 1, or do not go to door 2 until you find the key and do not crash with obstacles until you reach door 2

[Figure: An automaton]

  • An LTL formula can be transformed into an automaton

▪ A transition system for a task
▪ Captures task progress
▪ A run ending in the accepting state completes the task
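
To make the automaton view concrete, here is a small Python sketch of a deterministic automaton for one fragment of the example task, "do not crash with obstacles until you reach door 1". The state names `q0`, `acc`, `rej`, the proposition names, and the functions `step` and `run` are hypothetical choices for illustration, not taken from the slides.

```python
def step(q, label):
    """One automaton transition on the set of propositions true at the current state.

    q0  : task in progress
    acc : door 1 reached without crashing (accepting, absorbing)
    rej : crashed before reaching door 1 (absorbing)
    """
    if q != "q0":
        return q  # acc and rej are absorbing
    if "door1" in label:
        return "acc"
    if "obstacle" in label:
        return "rej"
    return "q0"


def run(word, q="q0"):
    """Feed a finite sequence of label sets to the automaton and return the final state."""
    for label in word:
        q = step(q, label)
    return q


good = [set(), set(), {"door1"}]
bad = [set(), {"obstacle"}, {"door1"}]
print(run(good), run(bad))
```

The automaton state summarizes how much of the task has been completed, so a planner only needs the pair (environment state, automaton state) rather than the full history.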

SLIDE 8

Formal Problem Statement

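Given

  • An MDP $\mathcal{M}$
  • An environment model with unknown labeling function $L$
  • An observation model
  • A syntactically co-safe LTL task specification $\varphi$

Find

A policy $\pi^*$ that maximizes the probability of satisfying the task conditioned on the true labeling function, i.e.,

$$\pi^* \in \arg\max_{\pi} \Pr\left(\mathcal{M}^{\pi} \models \varphi \mid L\right)$$

where $\mathcal{M}^{\pi}$ is the Markov chain induced by $\pi$.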

SLIDE 9

Task-Oriented Active Perception and Planning

SLIDE 10

Task-Oriented Active Perception and Planning

Perception module receives data sampled according to the observation model

SLIDE 11

Task-Oriented Active Perception and Planning

The agent updates its learned model of the environment using a Bayesian update

  • Assumption: Atomic propositions are mutually independent
  • Frequentist update if an observation model is unavailable
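
The belief update on this slide can be illustrated for a single (state, proposition) pair. The sketch below assumes a binary sensor described by a true-positive rate and a false-positive rate. These two parameters, and the function names `bayes_update` and `frequentist_estimate`, are assumptions for illustration rather than the authors' observation model.

```python
def bayes_update(belief, observed_true, tp_rate, fp_rate):
    """Posterior probability that a proposition holds after one binary observation.

    belief        : prior probability that the proposition is true
    observed_true : the sensor reading (True or False)
    tp_rate       : Pr(reading True | proposition true)   (assumed sensor parameter)
    fp_rate       : Pr(reading True | proposition false)  (assumed sensor parameter)
    """
    if observed_true:
        like_true, like_false = tp_rate, fp_rate
    else:
        like_true, like_false = 1.0 - tp_rate, 1.0 - fp_rate
    evidence = like_true * belief + like_false * (1.0 - belief)
    if evidence == 0.0:
        return belief  # observation impossible under the model; keep the prior
    return like_true * belief / evidence


def frequentist_estimate(num_true, num_total, default=0.5):
    """Fallback without an observation model: fraction of readings that were True."""
    return default if num_total == 0 else num_true / num_total


posterior = bayes_update(0.5, True, tp_rate=0.9, fp_rate=0.2)
print(posterior)
```

Because the propositions are assumed mutually independent, the update can be applied to each (state, proposition) pair separately.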

SLIDE 12

Task-Oriented Active Perception and Planning

The agent checks whether its learned model of the environment has significantly changed

  • Jensen-Shannon divergence
  • A hyperparameter determining the frequency of replanning

The agent estimates the most probable environment configuration

  • According to the current model of the environment
  • Maximum a posteriori estimation
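
The divergence test and the MAP step can be sketched as follows. The belief is assumed to be stored as a dictionary from (state, proposition) pairs to probabilities. Summing the per-pair divergences and comparing against a single `threshold` is one possible aggregation, chosen here for illustration, and all function names are hypothetical.

```python
import math


def kl_bernoulli(p, m):
    """KL divergence (base 2) between Bernoulli(p) and Bernoulli(m), with 0*log(0) = 0."""
    total = 0.0
    for x, y in ((p, m), (1.0 - p, 1.0 - m)):
        if x > 0.0:
            total += x * math.log2(x / y)
    return total


def jsd_bernoulli(p, q):
    """Jensen-Shannon divergence between two Bernoulli beliefs; lies in [0, 1]."""
    m = 0.5 * (p + q)
    return 0.5 * kl_bernoulli(p, m) + 0.5 * kl_bernoulli(q, m)


def needs_replanning(old_belief, new_belief, threshold):
    """Replan when the belief has moved by more than the threshold (assumed aggregation: sum)."""
    total = sum(jsd_bernoulli(old_belief[k], new_belief[k]) for k in old_belief)
    return total > threshold


def map_labeling(belief):
    """Under independence, the MAP configuration sets each proposition to its likelier value."""
    labels = {}
    for (state, prop), prob in belief.items():
        labels.setdefault(state, set())
        if prob >= 0.5:
            labels[state].add(prop)
    return labels


old = {("s1", "obstacle"): 0.5, ("s2", "door1"): 0.6}
new = {("s1", "obstacle"): 0.1, ("s2", "door1"): 0.9}
print(needs_replanning(old, new, threshold=0.1), map_labeling(new))
```

A larger threshold triggers replanning less often, which is the sense in which the hyperparameter controls the frequency of replanning.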

SLIDE 13

Task-Oriented Active Perception and Planning

The agent synthesizes an optimal policy according to the estimated environment configuration

  • Generating the product MDP (dynamics + task)
  • Computing the optimal policy using a linear program
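
The product construction can be sketched in a few lines of Python. Note the substitution: the slide computes the optimal policy with a linear program, whereas this sketch uses value iteration for the maximum probability of reaching an accepting product state, which yields the same optimal values on this small example and needs no solver. The example MDP, the labels, and the `step` automaton are hypothetical.

```python
def step(q, label):
    """Hypothetical automaton for 'avoid obstacles until door 1'; acc and rej are absorbing."""
    if q != "q0":
        return q
    if "door1" in label:
        return "acc"
    return "rej" if "obstacle" in label else "q0"


def build_product(P, label, s_init):
    """Reachable product MDP over pairs (environment state, automaton state)."""
    start = (s_init, step("q0", label[s_init]))
    prod, frontier = {}, [start]
    while frontier:
        node = frontier.pop()
        if node in prod:
            continue
        s, q = node
        prod[node] = {}
        for a, dist in P[s].items():
            out = {}
            for t, pr in dist.items():
                nxt = (t, step(q, label[t]))  # automaton reads the successor's label
                out[nxt] = out.get(nxt, 0.0) + pr
                frontier.append(nxt)
            prod[node][a] = out
    return start, prod


def max_satisfaction(prod, iters=1000, tol=1e-10):
    """Value iteration for the maximum probability of reaching an accepting pair."""
    v = {n: 1.0 if n[1] == "acc" else 0.0 for n in prod}
    for _ in range(iters):
        delta = 0.0
        for n, actions in prod.items():
            if n[1] != "q0" or not actions:
                continue  # absorbing automaton states keep their value
            best = max(sum(pr * v[t] for t, pr in d.items()) for d in actions.values())
            delta = max(delta, abs(best - v[n]))
            v[n] = best
        if delta < tol:
            break
    return v


P = {
    "s0": {"left": {"s1": 1.0}, "right": {"s2": 0.8, "s3": 0.2}},
    "s1": {"stay": {"s1": 1.0}},
    "s2": {"stay": {"s2": 1.0}},
    "s3": {"stay": {"s3": 1.0}},
}
label = {"s0": set(), "s1": {"obstacle"}, "s2": {"door1"}, "s3": set()}
start, prod = build_product(P, label, "s0")
values = max_satisfaction(prod)
print(values[start])
```

Extracting a policy from these values takes some care when several actions tie, since a greedy choice can select an action that never makes progress.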

SLIDE 14

Task-Oriented Active Perception and Planning

The agent assesses the risk due to the perception uncertainties

  • Statistical verification of the induced Markov chain
  • Defining a risk parameter: a hyperparameter that sets how much risk the agent is willing to accept
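
One way to picture the risk assessment is a Monte Carlo check of the induced Markov chain. This sketch samples environment configurations from the belief, simulates the chain, and estimates how often the task succeeds. The sampling scheme, the `risk_threshold` parameter, and every name below are assumptions for illustration and not necessarily the statistical verification procedure used by the authors.

```python
import random


def step(q, label):
    """Hypothetical automaton for 'avoid obstacles until door 1'; acc and rej are absorbing."""
    if q != "q0":
        return q
    if "door1" in label:
        return "acc"
    return "rej" if "obstacle" in label else "q0"


def estimate_success(chain, belief, s_init, horizon, n_samples, rng):
    """Fraction of sampled runs that reach the accepting automaton state."""
    successes = 0
    for _ in range(n_samples):
        # Sample one labeling from the belief (propositions treated as independent).
        label = {s: {p for p, b in props.items() if rng.random() < b}
                 for s, props in belief.items()}
        s, q = s_init, step("q0", label[s_init])
        for _ in range(horizon):
            if q != "q0":
                break
            successors, probs = zip(*chain[s].items())
            s = rng.choices(successors, weights=probs)[0]
            q = step(q, label[s])
        successes += q == "acc"
    return successes / n_samples


chain = {"s0": {"s1": 1.0}, "s1": {"s2": 1.0}, "s2": {"s2": 1.0}}
belief = {"s0": {}, "s1": {"obstacle": 0.1}, "s2": {"door1": 1.0}}
rng = random.Random(0)
success = estimate_success(chain, belief, "s0", horizon=5, n_samples=2000, rng=rng)
risk = 1.0 - success
risk_threshold = 0.2  # assumed hyperparameter
print(risk, risk > risk_threshold)
```

In this toy chain the only uncertainty is whether `s1` holds an obstacle, so the estimated risk should be close to the 0.1 belief assigned to it.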

SLIDE 15

Task-Oriented Active Perception and Planning

The agent finds an active perception strategy to reduce its perception uncertainty

  • Local search over a bounded horizon
  • Criteria:

▪ Forward and backward reachability
▪ Remaining in the same stage of the task
▪ Reducing task-related uncertainty
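
The three criteria can be combined into a simple bounded-horizon search, sketched below. Several pieces are assumptions: the `visible` map describing which states can be observed from each candidate, the `same_stage` predicate, and the use of binary entropy as the uncertainty score. The search also checks the stage only at the candidate itself, not along the path to it.

```python
import math
from collections import deque


def entropy(b):
    """Binary entropy (bits) of a belief b that a proposition holds."""
    if b <= 0.0 or b >= 1.0:
        return 0.0
    return -(b * math.log2(b) + (1.0 - b) * math.log2(1.0 - b))


def within(graph, start, horizon):
    """States reachable from start in at most `horizon` steps (breadth-first search)."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        if depth[s] == horizon:
            continue
        for t in graph[s]:
            if t not in depth:
                depth[t] = depth[s] + 1
                queue.append(t)
    return set(depth)


def informative_state(graph, current, horizon, same_stage, visible, belief):
    """Pick the candidate whose observable states carry the most uncertainty."""
    reverse = {s: set() for s in graph}
    for s, succs in graph.items():
        for t in succs:
            reverse[t].add(s)
    # Forward reachability (can get there) and backward reachability (can come back).
    candidates = within(graph, current, horizon) & within(reverse, current, horizon)
    candidates = sorted(s for s in candidates if same_stage(s))

    def score(s):
        return sum(entropy(b) for v in visible[s] for b in belief[v].values())

    return max(candidates, key=score, default=None)


graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["d"]}
visible = {"a": {"a"}, "b": {"b", "c"}, "c": {"c", "d"}, "d": {"d"}}
belief = {"a": {"q": 1.0}, "b": {"q": 0.0}, "c": {"q": 0.9}, "d": {"q": 0.5}}
best = informative_state(graph, "a", 3, lambda s: True, visible, belief)
print(best)
```

State `d` is excluded even though it is the most uncertain, because the agent could not return from it. The search instead selects a state from which `d` can be observed.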

SLIDE 16

Drone Navigation in Simulated Urban Environment

[Figure panels: Drone’s view, Segmented view, Depth view]

  • AirSim[1] simulation environment
  • A drone navigating in an urban environment
  • Task: Reach a flagged building while avoiding collision
  • Dynamics: Planar motion with constant altitude
  • Sensing:

▪ Exact localization
▪ 4 RGB cameras with 90° field of view
▪ 4 depth sensing cameras with 90° field of view

[1] From https://github.com/microsoft/AirSim

SLIDE 17

Processing Image and Depth Data

SLIDE 18

Simulation Results

Navigation with exact knowledge of the semantic labeling

Navigation with the proposed task-oriented active perception and planning

SLIDE 19

Conclusion and Future Directions

Conclusion:

  • Studied planning in environments with partially known semantics

▪ Guarantee over task performance
▪ Assessment of risk due to imperfect knowledge

  • Proposed a task-oriented active perception and planning framework that integrates learning through perception with decision-making under uncertainty

Future Directions:

  • Extending the framework to settings with uncertain or unknown dynamics
  • Using calibrated neural networks for the perception module
  • Incorporating side knowledge on the correlation between the atomic propositions

SLIDE 20

Thank you!

Task-Oriented Active Perception and Planning in Environments with Partially Known Semantics

Mahsa Ghasemi, Erdem Arınç Bulgur, and Ufuk Topcu