Introduction Empirical Studies Future Directions/Ideas Summary & Conclusions Adversarial Decision-Making Brian J. Stankiewicz University of Texas, Austin Department Of Psychology & Center for Perceptual Systems & Consortium for Cognition and Computation February 7, 2006 Stankiewicz MIT MURI 2006

Collaborators
- University of Texas, Austin: Matthew deBrecht, Kyler Eastman, JP Rodman, Chris Goodson, Anthony Cassandra
- University of Minnesota: Gordon E. Legge, Erik Schlicht, Paul Schrater
- SUNY Plattsburgh: J. Stephan Mansfield
- Army Research Lab: Sam Middlebrooks
- University XXI / Army Research Labs
- National Institute of Health
- Air Force Office of Scientific Research

Overview
1. Description of sequential decision making with uncertainty.
2. Description of Optimal Decision Maker
   - Partially Observable Markov Decision Process
3. Adversarial Sequential Decision Making Task
   - Variant of "Capture the Flag"
   - Empirical studies comparing human performance to optimal performance in Adversarial Decision Making Task.
4. Future Directions and Ideas
   - How to model and understand "Policy Shifts"

Sequential Decision Making with Uncertainty
- Many decision-making tasks involve a sequence of decisions in which actions have both immediate and long-term effects.
- There is a certain amount of uncertainty about the true state.
- The true state is not directly observable but must be inferred from actions and observations.

SDMU: Examples
- Medical diagnosis and intervention
- Business investment and development
- Politics
- Military Decision Making
- Career Development

Questions
- How efficiently do humans solve sequential decision making with uncertainty tasks?
- If subjects are inefficient, can we isolate the Cognitive Bottleneck?
  - Memory
  - Computation
  - Strategy

SDMU: Problem Space
1. We are interested in defining problems such that 'rational' answers can be computed.
2. This gives us a 'benchmark' against which to compare humans.
3. Partially Observable Markov Decision Process

Standard MDP Notation (Putterman 1994)
- S: Set of states in the domain. E.g., the set of possible ailments that a patient can have (cancer, cold, flu, etc.).
- A: Set of actions an agent can perform. E.g., measure blood pressure, prescribe antibiotics, etc.
- O: S × A → O, the set of observations generated. E.g., "Normal" blood pressure.
- T: S × A → S′ (transition function). E.g., the probability of becoming "Healthy" given antibiotics.
- R: S × A → ℜ, the Environment/Action Reward. E.g., $67.00 to measure blood pressure.

Belief Updating

$$ p(s' \mid b, o, a) = \frac{p(o \mid s', b, a) \, p(s' \mid b, a)}{p(o \mid b, a)} \tag{1} $$

- Update the current belief given the previous action (a), the current observation (o), and the belief vector (b).
- E.g., "What is the likelihood that the patient has cancer given that his/her blood pressure is normal?"
- Belief is updated for all possible states.
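
Equation (1) can be sketched in a few lines of Python. This is a minimal illustration, not the talk's implementation: the function name `update_belief`, the dictionary layout, and all numbers in the blood-pressure example are hypothetical, and it assumes (as is standard for a POMDP) that the observation depends on the belief only through the new state, so p(o | s', b, a) = p(o | s', a).

```python
def update_belief(belief, action, observation, transition, observation_model):
    """Bayes update of a discrete belief, following equation (1).

    belief[s]                          : prior probability of state s
    transition[action][s][s2]          : p(s2 | s, action)
    observation_model[action][s2][obs] : p(obs | s2, action)
    """
    states = list(belief)
    # Prediction step: p(s' | b, a) = sum_s p(s' | s, a) * b(s)
    predicted = {
        s2: sum(transition[action][s][s2] * belief[s] for s in states)
        for s2 in states
    }
    # Correction step: weight each state by the likelihood p(o | s', a)
    unnormalized = {
        s2: observation_model[action][s2][observation] * predicted[s2]
        for s2 in states
    }
    evidence = sum(unnormalized.values())  # p(o | b, a), the denominator
    if evidence == 0:
        raise ValueError("observation has zero probability under this belief")
    return {s2: w / evidence for s2, w in unnormalized.items()}


# Hypothetical numbers, chosen only to illustrate the slide's question:
# "How likely is cancer given that blood pressure is normal?"
T = {"measure_bp": {  # measuring does not change the patient's state
    "cancer": {"cancer": 1.0, "healthy": 0.0},
    "healthy": {"cancer": 0.0, "healthy": 1.0},
}}
Z = {"measure_bp": {
    "cancer": {"normal": 0.6, "high": 0.4},
    "healthy": {"normal": 0.9, "high": 0.1},
}}
prior = {"cancer": 0.1, "healthy": 0.9}
posterior = update_belief(prior, "measure_bp", "normal", T, Z)
print(posterior)
```

The update runs over every state at once, which matches the slide's point that belief is updated for all possible states.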

Computing Expected Value

$$ V(b) = \max_{a \in A} \left[ \rho(b, a) + \sum_{b' \in B} \tau(b, a, b') \, V(b') \right] \tag{2} $$

- $\rho(b, a)$: Immediate reward for doing action $a$ given the current belief $b$.
- $\tau(b, a, b')$: Probability of transition to new belief $b'$ from current belief $b$ given action $a$.
- $V(b')$: The expected value in the new belief state $b'$.
- Optimal observer chooses the action that maximizes the expected reward.

Tiger Problem
- Simple example of a Sequential Decision Making under Uncertainty task.
- Illustration to provide an intuitive understanding of the POMDP architecture.

Tiger Problem: States
Two doors:
- Behind one door is a tiger.
- Behind the other door is a "pot of gold".

Tiger Problem: Actions
Three Actions:
1. Listen
2. Open Left-Door
3. Open Right-Door

Tiger Problem: Observations
Two Observations:
1. Hear Tiger Left (Hear Left)
2. Hear Tiger Right (Hear Right)
Observation Structure:
- p(Hear Left | Tiger Left, Listen) = 0.85
- p(Hear Right | Tiger Right, Listen) = 0.85
- p(Hear Right | Tiger Left, Listen) = 0.15
- p(Hear Left | Tiger Right, Listen) = 0.15

Tiger Problem: Rewards
Table: Reward Structure for Tiger Problem

Action        Tiger=Left   Tiger=Right
Listen            -1           -1
Open-Left       -100           10
Open-Right        10         -100

Tiger Problem: Immediate Reward
[Figure: Immediate rewards.]
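
The immediate reward for a belief can be computed directly from the reward table for the Tiger Problem by weighting each entry by the probability of the corresponding state. The sketch below is illustrative only: the names `REWARD`, `immediate_reward`, and `best_immediate_action` are hypothetical, and the belief is summarized by the single number p(Tiger Left).

```python
# Reward table from the "Tiger Problem: Rewards" slide.
REWARD = {
    "listen":     {"tiger_left": -1,   "tiger_right": -1},
    "open_left":  {"tiger_left": -100, "tiger_right": 10},
    "open_right": {"tiger_left": 10,   "tiger_right": -100},
}


def immediate_reward(p_tiger_left, action):
    """Expected one-step reward rho(b, a) for belief b = p(Tiger Left)."""
    r = REWARD[action]
    return p_tiger_left * r["tiger_left"] + (1 - p_tiger_left) * r["tiger_right"]


def best_immediate_action(p_tiger_left):
    """Action with the highest immediate reward (ignores future rewards)."""
    return max(REWARD, key=lambda a: immediate_reward(p_tiger_left, a))


for p in (0.5, 0.85, 0.95):
    print(p, {a: immediate_reward(p, a) for a in REWARD}, best_immediate_action(p))
```

Under these rewards, opening the right door has a higher immediate reward than listening only when 110 p - 100 > -1, that is, when p(Tiger Left) exceeds 0.9. This considers the immediate reward alone, not the value of future actions.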

Tiger Problem: Expected Reward
[Figure: Expected reward functions for multiple future actions with an infinite horizon.]
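
To make the expected-reward computation concrete, here is a rough sketch of the value recursion for the Tiger Problem. It deliberately differs from the slide in one respect: it uses a short finite horizon with no discounting rather than an infinite horizon, so it only approximates the curves described above. It also assumes that opening a door restarts the problem with belief 0.5, and all function names are hypothetical. Because there are only two observations, the sum over new beliefs b' reduces to a sum over observations, with tau(b, a, b') = p(o | b, a).

```python
P_CORRECT = 0.85  # p(Hear Left | Tiger Left, Listen) from the observation slide

REWARD = {
    "listen":     (-1, -1),     # (reward if tiger left, reward if tiger right)
    "open_left":  (-100, 10),
    "open_right": (10, -100),
}


def rho(b, action):
    """Immediate expected reward for belief b = p(Tiger Left)."""
    r_left, r_right = REWARD[action]
    return b * r_left + (1 - b) * r_right


def listen_update(b, heard_left):
    """Return (new belief, probability of this observation) after Listen."""
    like_left = P_CORRECT if heard_left else 1 - P_CORRECT
    like_right = 1 - P_CORRECT if heard_left else P_CORRECT
    evidence = like_left * b + like_right * (1 - b)  # p(o | b, Listen)
    return like_left * b / evidence, evidence


def value(b, horizon):
    """Finite-horizon, undiscounted value of belief b with `horizon` actions left."""
    if horizon == 0:
        return 0.0
    best = float("-inf")
    for action in REWARD:
        total = rho(b, action)
        if action == "listen":
            for heard_left in (True, False):
                b_next, p_obs = listen_update(b, heard_left)
                total += p_obs * value(b_next, horizon - 1)
        else:
            # Assumption: opening a door resets the belief to 0.5.
            total += value(0.5, horizon - 1)
        best = max(best, total)
    return best


for h in (1, 2, 3, 4):
    print(h, value(0.5, h))
```

The recursion branches four ways per step, so it is only practical for small horizons; exact infinite-horizon solutions need a dedicated POMDP solver.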

Tiger Problem: Policy
- From the expected reward, generate the optimal Policy (π).
- The policy chooses the action (a) that maximizes the expected reward for the current belief.

Tiger Problem: Policy
Table: Belief Updating for Tiger Problem

Act. Num   Action       Observation   p(Tiger Left)
0          --           --            0.5
1          Listen       Hear Left     0.85
2          Listen       Hear Left     0.9698
3          Open-Right   Reward        0.5
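
The belief values in this table follow from applying Bayes' rule with the 0.85 / 0.15 listening probabilities. A minimal sketch that reproduces the first three rows is given below; the function and variable names are hypothetical, and the reset to 0.5 after opening a door is taken from the last row of the table rather than computed.

```python
P_CORRECT = 0.85  # p(Hear Left | Tiger Left, Listen) = p(Hear Right | Tiger Right, Listen)


def listen_update(p_tiger_left, observation):
    """Posterior p(Tiger Left) after one Listen action and its observation."""
    if observation == "hear_left":
        like_left, like_right = P_CORRECT, 1 - P_CORRECT
    elif observation == "hear_right":
        like_left, like_right = 1 - P_CORRECT, P_CORRECT
    else:
        raise ValueError(f"unknown observation: {observation}")
    numerator = like_left * p_tiger_left
    return numerator / (numerator + like_right * (1 - p_tiger_left))


belief = 0.5              # row 0: no information yet
trace = [belief]
for obs in ("hear_left", "hear_left"):   # rows 1 and 2
    belief = listen_update(belief, obs)
    trace.append(belief)

print([round(b, 4) for b in trace])
```

Two consistent "Hear Left" observations are enough to push the belief above the point where opening the right door becomes attractive, which is why the third action in the table is Open-Right.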

POMDP: Computing Expected Value
1. Using a POMDP we can generate the optimal policy graph for a Sequential Decision Making Under Uncertainty task.
   - The policy graph provides us with the optimal action given a belief about the true state.
2. Using a POMDP we can compute the Expected Reward given the initial belief state and optimal action selection.
   - Using the optimal expected reward structure we can compare human performance to the optimal performance.
   - By comparing human behavior to the optimal Expected Reward we can get a measure of efficiency.

Empirical studies
Capture The Flag
- Enemy is attempting to capture your 'flag'.
- Locate and "destroy" the enemy before the flag is captured.
- When the enemy is destroyed, 'Declare' Mission Accomplished.
- Maximize reward.

Capture The Flag: Task
- 5x5 arena
- Single enemy
- Reconnaissance to any of the 25 locations
- Artillery to any of the 25 locations
- Enemy starts in the upper two rows.
- Goal: Locate and destroy the enemy before it reaches the flag.
