SLIDE 1

Towards Best-Effort Autonomy

Rüdiger Ehlers

University of Bremen

Dagstuhl Seminar 17071, February 2017

Based on joint work with Salar Moarref & Ufuk Topcu (CDC 2016)

SLIDE 2

Motivation

Highly autonomous systems...

... degrade in performance over time
... need to work correctly in off-nominal conditions
... need to adapt without the need for a human operator

Problem:

We do not always know in advance how they are degrading...
...so we should be able to synthesize an adapted strategy in the field.

SLIDE 3

Connecting Theory and Practice...

[Diagram: blocks labeled Environment, Estimated Probabilities, MDP, Specification, Control Policy Computation, and Result.]

SLIDE 4

ω-regular control of MDPs – basic setting

[Diagram: a policy/controller chooses the actions of an MDP, inducing a random trace X0, X1, X2, . . .; a trace ρ = ρ0ρ1ρ2 . . . should satisfy the specification, ρ |= ψ.]

SLIDE 5

Simple example: patrolling

P(ρ |= ψ) ≥ (0.8)^4

[Figure: grid workspace with motion primitives; transition probabilities 0.8 and 0.2.]

Specification:

GF(green) ∧ GF(red) ∧ GF(purple) ∧ GF(blue)

SLIDE 6

Using ω-regular specifications

Ideas

• By assuming that traces are infinitely long, we can abstract from the unknown time until the system goes out of service.
• Using temporal logic operators such as “finally” and “globally” in the specification, we do not need to set time bounds for reaching the system goals, which helps with maximizing the probability for a trace to satisfy the specification.
• ω-regular specifications allow us to specify relatively complex behaviors easily.

SLIDE 7

But do ω-regular specifications always make sense?

A thought experiment

• Assume that a robot has to patrol between two regions (i.e., it needs to visit both regions infinitely often).
• At every second, P(robot breaks) > 10^-10.
• What is the maximum probability of satisfying the specification that some control policy can achieve?
• It is 0, as the robot will almost surely eventually break down.
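To make the almost-sure breakdown concrete, here is a minimal numerical sketch (plain Python, not from the talk; it uses the slide's per-second failure bound of 10^-10, so the printed values are upper bounds on the survival probability):

```python
# A minimal sketch: with a per-second failure probability of at least 1e-10,
# P(robot still running after t seconds) <= (1 - 1e-10) ** t, which tends
# to 0, so no policy can patrol forever with positive probability.
for t in [1, 10**10, 10**11, 10**12]:
    print(f"t = {t:>13d} s: P(still running) <= {(1.0 - 1e-10) ** t:.6f}")
# t =             1 s: P(still running) <= 1.000000
# t =   10000000000 s: P(still running) <= 0.367879   (about 1/e)
# t =  100000000000 s: P(still running) <= 0.000045
# t = 1000000000000 s: P(still running) <= 0.000000
```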

SLIDE 8

Main question of this paper

How can we compute policies that work towards the satisfaction of ω-regular specifications, even in the case of inevitable non-satisfaction?

SLIDE 9

Motivational example problem

SLIDE 10

Solving the problem by intuition

A fact

We will all die, and it can happen at any moment!

Human behavior

But that does not keep us from planning for the long term (e.g., getting a PhD)!

Rationale

We normally ignore the risk of catastrophic but very rare events in decision making.

However...

... while planning for the long term, humans minimize the risk of catastrophic events.

Example

Avoiding risky driving

So what we want is...

...a method to compute risk-averse policies that are at the same time optimistic that the catastrophic event does not happen.

SLIDE 11

Towards optimistic, but risk-averse policies (1)

Try 1

Compute policies that after reaching a goal maximize the probability of reaching the respective next goal.

Example

[Figure: a workspace with two regions, Goal 1 and Goal 2.]

Specification: GF(goal1) ∧ GF(goal2) ∧ G(¬crash)

Probability that the car breaks: 10^-10 (every second)

SLIDE 12

Towards optimistic, but risk-averse policies (2)

Try 2 (similar to the work by Svorenova et al., 2013)

Compute policies that maximize some value p such that whenever a goal is reached, the probability of reaching the respective next goal is at least p.

The same example as before

[Figure: the same workspace with two regions, Goal 1 and Goal 2.]

Specification: GF(goal1) ∧ GF(goal2) ∧ G(¬crash)

Probability that the car breaks: 10^-10 (every second)

SLIDE 13

Towards optimistic, but risk-averse policies (3)

But what about general ω-regular specifications?

Example:

(GF(red) ∧ (¬blue U green)) ∨ (FG(¬blue) ∧ GF(yellow))

What are the goals here and how can we compute risk-averse policies?

Idea

Let the policy declare the goals. Then we can compute a policy together with its declaration.

SLIDE 14

Declaring goals

Specification:

FG(red) ∨ F(blue ∧ XG¬green) ∨ G¬green ∨ GF(green ∧ (¬blue U red))

[Figure: a deterministic parity automaton for this specification, with two states colored 1 and 2 and edges labeled green, ¬green, red, ¬red, blue, and ¬blue ∧ ¬red.]

SLIDE 15

Declaring goals (2)

[Figure: the deterministic parity automaton from the previous slide.]

Definition of parity acceptance

A parity automaton accepts a trace if the highest color that occurs infinitely often along the automaton’s run for the trace is even.

So what are possible goals to be reached?

Colors 0 and 2.
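As a small illustration of this definition, the following sketch (plain Python, not from the talk; the encoding of an ultimately periodic run as a finite stem plus a repeated cycle of colors is a hypothetical convenience) checks acceptance:

```python
def parity_accepts(stem_colors, cycle_colors):
    """Parity acceptance for a lasso-shaped run: the colors occurring
    infinitely often are exactly those on the repeated cycle, so the run
    is accepted iff the highest color on the cycle is even."""
    return max(cycle_colors) % 2 == 0

print(parity_accepts(stem_colors=[1, 1], cycle_colors=[0, 2]))  # True
print(parity_accepts(stem_colors=[2], cycle_colors=[1]))        # False
```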

SLIDE 16

Declaring goals (3)

Main idea

We require the system to decrease goal colors at most k times (for some k ∈ N), and whenever an odd-colored state is visited, the goal color must be higher than the odd color.

Effect

All infinite traces satisfying this new condition satisfy the original parity objective as well.
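A minimal sketch of this side condition as a runtime monitor (plain Python; the class and method names are hypothetical, not from the paper):

```python
class GoalColorMonitor:
    """Checks the declared goal colors along a trace: every declared goal
    color is even, the goal color decreases at most k times, and it is
    strictly higher than every odd state color that is visited."""

    def __init__(self, k):
        self.k = k            # allowed number of goal-color decreases
        self.decreases = 0
        self.goal = None      # currently declared goal color

    def step(self, state_color, declared_goal):
        """Feed one visited state color and the policy's current goal
        declaration; returns False once the condition is violated."""
        if declared_goal % 2 != 0:
            return False      # goal colors must be even
        if self.goal is not None and declared_goal < self.goal:
            self.decreases += 1
            if self.decreases > self.k:
                return False  # goal color decreased more than k times
        self.goal = declared_goal
        if state_color % 2 == 1 and declared_goal <= state_color:
            return False      # odd color not dominated by the goal color
        return True
```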

SLIDE 17

Overall workflow

[Diagram: the MDP and the specification FG(red) ∨ F(blue ∧ XG¬green) ∨ G¬green ∨ GF(green ∧ (¬blue U red)) are combined, via the deterministic parity automaton, into a parity MDP, which is the input to the policy computation.]
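A minimal sketch of the product step in this pipeline (plain Python; the dictionary encodings of the MDP and the automaton are hypothetical, and the convention that the automaton reads a state's label on entering it is an assumption, not the paper's definition):

```python
def product_parity_mdp(mdp, dpa):
    """Synchronous product of a labeled MDP and a deterministic parity
    automaton (DPA); the result is a parity MDP over pairs (s, q) whose
    color is the color of the automaton component q.

    mdp: {"init": s0, "label": {s: label}, "trans": {s: {a: {s2: prob}}}}
    dpa: {"q0": q0, "delta": {(q, label): q2}, "color": {q: int}}
    """
    def step(q, s):
        return dpa["delta"][(q, mdp["label"][s])]

    init = (mdp["init"], step(dpa["q0"], mdp["init"]))
    trans, color, stack = {}, {}, [init]
    while stack:
        s, q = stack.pop()
        color[(s, q)] = dpa["color"][q]
        trans[(s, q)] = {}
        for a, dist in mdp["trans"][s].items():
            trans[(s, q)][a] = {}
            for s2, prob in dist.items():
                succ = (s2, step(q, s2))  # automaton reads the next label
                trans[(s, q)][a][succ] = prob
                if succ not in trans and succ not in stack:
                    stack.append(succ)
    return {"init": init, "trans": trans, "color": color}
```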

SLIDE 18

What exactly is now being computed?

We compute for some values of p and k ...

A control policy for a parity MDP that always declares its respective next goal such that:
• From every goal state, the next goal state is visited with probability at least p.
• Goal states have an even color, and the color of the goal states can be decreased at most k times along a trace.
• The goal color is always greater than or equal to the odd colors of the states visited along the trace.

Computing the best policies

We perform a bisection search over p and compute whether there is a k such that a p-risk-averse policy exists.
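A minimal sketch of that outer search (plain Python; exists_risk_averse_policy is a hypothetical stand-in for the inner check, which must itself search over k and construct the policy):

```python
def max_risk_averseness(exists_risk_averse_policy, tol=1e-4):
    """Bisection search for (approximately) the largest p in [0, 1] for
    which a p-risk-averse policy exists, assuming monotonicity: if the
    check succeeds for p, it also succeeds for every p' < p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if exists_risk_averse_policy(mid):  # inner check: is there some k?
            lo = mid                        # feasible, try a higher p
        else:
            hi = mid
    return lo

# Toy oracle that is feasible exactly for p <= 0.73:
print(max_risk_averseness(lambda p: p <= 0.73))  # ~0.73
```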

SLIDE 19

Motivational example problem (revisited)

GF(green) ∧ GF(red) ∧ (FG(gray1) ∨ FG(gray2)) ∧ GF(purple ∧ ((white ∨ purple) U (blue ∧ ((blue ∨ white) U light blue))))

[Figure: the deterministic parity automaton for this specification, with states 1, 2, 3, . . .]

Workflow: System Dynamics → Abstraction (MDP) → Parity MDP → Policy Computation Procedure

SLIDE 20

Conclusion & End

SLIDE 21

Conclusion

p-risk-averse policies

• They allow us to find reasonable policies even if the specification cannot be fulfilled in the long run.
• They work with all ω-regular specifications.
• Computation can be done efficiently.
• Tool available: http://progirep.github.io/ramps/

Future work

• What about 2.5-player games?
• Combination with costs?

SLIDE 22

References I

Maria Svorenova, Ivana Cerna, and Calin Belta. Optimal control of MDPs with temporal logic constraints. In 52nd IEEE Conference on Decision and Control (CDC), pages 3938–3943, 2013.

SLIDE 23

Computing a p-risk-averse policy

Approach in the paper

• For every p ∈ [0, 1], a p-risk-averse control policy has a finite number of states.
• Optimal strategies can be computed by solving a series of optimal reachability policy computations in MDPs.
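For the reachability subproblems, a standard value-iteration sketch (plain Python; the dictionary encoding is hypothetical, and value iteration is just one common way to solve them, not necessarily the paper's):

```python
def max_reach_prob(trans, targets, eps=1e-8):
    """Value iteration for the maximal probability of eventually reaching
    `targets` in an MDP given as trans[s][a] = {successor: probability}.
    Starting from 0 converges to the least fixed point, which is the
    maximal reachability probability; a maximizing memoryless policy can
    be read off by picking, in each state, an action attaining the max."""
    v = {s: 1.0 if s in targets else 0.0 for s in trans}
    delta = 1.0
    while delta > eps:
        delta = 0.0
        for s in trans:
            if s in targets:
                continue  # target states keep value 1
            best = max(sum(p * v[t] for t, p in dist.items())
                       for dist in trans[s].values())
            delta, v[s] = max(delta, abs(best - v[s])), best
    return v

# Toy example: action 'a' reaches goal g with prob. 0.8, else stays in s0.
print(max_reach_prob({"s0": {"a": {"g": 0.8, "s0": 0.2}},
                      "g": {"a": {"g": 1.0}}}, targets={"g"})["s0"])  # ~1.0
```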
SLIDE 24

Formal definition

Definition: p-risk-averse strategy

Let M = (S, A, Σ, P, C, s0) be a parity MDP. We say that some control policy f : S∗ → A has a risk-averseness probability p ∈ [0, 1] if there exist labelings l : S∗ → N and l′ : S∗ → B (B the Booleans, with tt denoting true) and a Markov chain C′ induced by M and f with the following properties:
• There exists some number k ∈ N such that for all t0 t1 t2 . . . ∈ Sω, there are at most k many indices i ∈ N for which we have l(t0 . . . ti) > l(t0 . . . ti ti+1).
• For all t0 t1 . . . tn ∈ S∗, we have that l(t0 . . . tn) is even, and l′(t0 . . . tn) = tt implies that C(tn) ≥ l(t0 . . . tn) and that C(tn) is even.
• For all t0 t1 . . . tn ∈ S∗, if C(tn) is odd, then l(t0 . . . tn) > C(tn).
• For all t = t0 t1 . . . tn ∈ S∗ with either (a) l′(t) = tt or (b) t = s0, the probability measure in C′ to reach some state t t′0 . . . t′m ∈ S∗ with l′(t t′0 . . . t′m) = tt from state t is at least p.