SLIDE 1

Counteracting Adversarial Attacks in Autonomous Driving

Qi Sun¹, Arjun Ashok Rao¹, Xufeng Yao¹, Bei Yu¹, Shiyan Hu²

¹The Chinese University of Hong Kong   ²University of Southampton

SLIDE 2

Vision-Based Object Detection

Classification
- output: a class label

Localization
- output: a bounding box in the image

Object Detection
- a class label l
- a bounding box in the image, represented as a vector (x, y, w, h)
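As a small illustration, the (x, y, w, h) vector above maps naturally onto a data structure; the assumption that (x, y) is the top-left corner is mine, since the slide does not fix a convention:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str    # class label l
    box: tuple    # (x, y, w, h); (x, y) assumed to be the top-left corner

def to_corners(box):
    """Convert (x, y, w, h) to (x1, y1, x2, y2) corner form."""
    x, y, w, h = box
    return (x, y, x + w, y + h)

det = Detection(label="car", box=(10.0, 20.0, 50.0, 30.0))
```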

SLIDE 3

Vision-Based Object Detection

Region Proposal Network (RPN)
- Objectness scores
- Bounding box regression
- Generate k boxes; regress label scores and coordinates for the k boxes.
- Use metrics such as Intersection over Union (IoU) to measure the quality of the boxes.
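A minimal sketch of the IoU quality metric mentioned above, assuming axis-aligned (x, y, w, h) boxes with (x, y) as the top-left corner:

```python
def iou(a, b):
    """Intersection over Union of two axis-aligned (x, y, w, h) boxes."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))  # intersection width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))  # intersection height
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0
```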

SLIDE 4

Vision-Based Object Detection

Faster R-CNN: a vision-based object detection model.

Pipeline: image → conv layers → feature maps → Region Proposal Network → proposals → RoI pooling → classifier
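To make the RoI-pooling stage concrete, here is a naive single-channel NumPy sketch of max-pooling a proposal region into a fixed grid; real Faster R-CNN implementations operate on multi-channel feature maps with sub-pixel box coordinates:

```python
import numpy as np

def roi_max_pool(feat, roi, out=(2, 2)):
    """Max-pool feat[r0:r1, c0:c1] into an out[0] x out[1] grid."""
    r0, c0, r1, c1 = roi
    region = feat[r0:r1, c0:c1]
    H, W = region.shape
    oh, ow = out
    pooled = np.empty(out)
    for i in range(oh):
        for j in range(ow):
            # split the region into (roughly) equal bins, at least 1 pixel each
            rs, re = i * H // oh, max((i + 1) * H // oh, i * H // oh + 1)
            cs, ce = j * W // ow, max((j + 1) * W // ow, j * W // ow + 1)
            pooled[i, j] = region[rs:re, cs:ce].max()
    return pooled
```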

SLIDE 5

Stereo-Based Vision System

A typical stereo-based multi-task object detection model:
- Two sibling branches (e.g., RPN modules) that take the left and right images as inputs.
- A single branch that conducts a regression task, e.g., viewpoint prediction. There may be several independent single branches.

SLIDE 6

Stereo-Based Vision System

- Take advantage of the left and right images to detect cars.
- Conduct multiple 3D regression tasks, in (x, y, z) space, based on the joint detection results.

[Figure: stereo geometry, with corresponding points in the left and right images.]

SLIDE 7

Adversarial Attacks

- Vision-based systems suffer from image perturbations (noise, low light, altered signs, etc.).
- Deep learning models are vulnerable to these perturbations.
- The security risk is especially dangerous for 3D object detection in autonomous driving.
- Adversarial attacks have been widely studied to simulate these perturbations.
- Two typical and widely used attack methods: Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD).

SLIDE 8

Generate Adversarial Images

Fast Gradient Sign Method (FGSM)
- Direction of the gradient: sign(∇x L(θ, x, y)), with loss function L(θ, x, y).
- Generate a new input image with a constrained perturbation δ:

  x′ = x + δ = x + ε · sign(∇x L(θ, x, y)),  s.t. ‖δ‖ ≤ ε.   (1)
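Equation (1) can be sketched in NumPy; the quadratic loss below is a toy stand-in for the detector loss L(θ, x, y):

```python
import numpy as np

def fgsm(x, grad_fn, eps):
    """One-step FGSM: x' = x + eps * sign(grad_x L)."""
    return x + eps * np.sign(grad_fn(x))

# toy stand-in loss L(x) = 0.5 * ||x - t||^2, so grad_x L = x - t
t = np.array([1.0, -2.0])
grad_fn = lambda x: x - t
x_adv = fgsm(np.zeros(2), grad_fn, eps=0.1)  # each coordinate moves by exactly eps
```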

SLIDE 9

Generate Adversarial Images

Projected Gradient Descent (PGD)
- Contains several attack steps, each followed by a projection Π onto the allowed perturbation set x + S:

  x^{t+1} = Π_{x+S} (x^t + α · sign(∇x L(θ, x, y))).   (2)
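Equation (2) in NumPy, with the projection Π onto x + S realized as a clip to the ℓ∞ ball of radius ε (an illustrative choice of S); the quadratic loss is again a toy stand-in:

```python
import numpy as np

def pgd(x, grad_fn, eps, alpha, steps):
    """Iterated gradient-sign ascent, each step projected back onto the eps-ball."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # projection Π onto x + S
    return x_adv

t = np.array([1.0, -2.0])                 # toy loss L(x) = 0.5 * ||x - t||^2
x_adv = pgd(np.zeros(2), lambda x: x - t, eps=0.1, alpha=0.05, steps=5)
```

With α smaller than ε, the iterate walks to the boundary of the ball and the clip keeps it there.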

SLIDE 10

Adversarial Training

Traditional Training Method
- Most adversarial training algorithms train the target model on adversarial images.
- They perform the following min-max training strategy:

  min_θ max_δ L(x + δ, θ; y),  s.t. ‖δ‖_p ≤ ε,

where ‖·‖_p is the ℓp-norm.

SLIDE 11

Adversarial Training

Stereo-Based Training Method
- Extend the min-max strategy to perturb both views:

  min_θ max_{δl, δr} L(xl + δl, xr + δr, θ; y),  s.t. ‖δl‖_p ≤ ε, ‖δr‖_p ≤ ε,

where xl and xr are the left and right images, and δl and δr are the perturbations on the left and right images, respectively.

SLIDE 12

Stereo-Based Regularizer

For sibling branches
- Let fl(·) and fr(·) denote the features learned from the left and right images.
- Distance between the left and right images:

  d(xl, xr) = ‖fl(xl) − fr(xr)‖_n.

- Distance between the two perturbed images:

  d(xl + δl, xr + δr) = ‖fl(xl + δl) − fr(xr + δr)‖_n.

- Add a margin m to reinforce the optimization of the distance function:

  d(xl, xr) = ‖fl(xl) − fr(xr) + m‖_n,
  d(xl + δl, xr + δr) = ‖fl(xl + δl) − fr(xr + δr) + m‖_n.
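The margin-augmented distance written out for ℓ2 (one choice of the norm order n); broadcasting the scalar margin m across the feature dimensions is my assumption:

```python
import numpy as np

def margin_dist(fl_feat, fr_feat, m=0.5):
    """d = || f_l(x_l) - f_r(x_r) + m ||, with the 2-norm standing in for n."""
    return np.linalg.norm(fl_feat - fr_feat + m)

# when the two branches agree, only the margin contributes
d_clean = margin_dist(np.array([1.0, 0.0]), np.array([1.0, 0.0]))
```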

SLIDE 13

Stereo-Based Regularizer

For sibling branches
- The distance after attacks should be close to the original distance:

  Lb = | d(xl + δl, xr + δr) − d(xl, xr) |.

SLIDE 14

Stereo-Based Regularizer

For the single branch
- The left and right features are used as the joint inputs:

  Lm = ‖fm(xl + δl, xr + δr) − fm(xl, xr)‖_n.

SLIDE 15

Stereo-Based Regularizer

New objective function:

  L = Lo + Lb + Lm,

where Lo is the original objective function.
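The two regularizers composing the new objective, sketched with identity and additive stand-in features (the real fl, fr, fm are network branches):

```python
import numpy as np

fl = fr = lambda x: x        # stand-in sibling-branch features
fm = lambda a, b: a + b      # stand-in single-branch feature

def regularizers(xl, xr, dl, dr, m=0.0):
    d_clean = np.linalg.norm(fl(xl) - fr(xr) + m)
    d_adv = np.linalg.norm(fl(xl + dl) - fr(xr + dr) + m)
    Lb = abs(d_adv - d_clean)                                    # sibling term
    Lm = np.linalg.norm(fm(xl + dl, xr + dr) - fm(xl, xr))       # single-branch term
    return Lb, Lm

# identical perturbations on both views leave the sibling distance unchanged
Lb, Lm = regularizers(np.zeros(2), np.zeros(2),
                      np.array([0.1, 0.0]), np.array([0.1, 0.0]))
```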

SLIDE 16

Local Smoothness Optimization

Adversarial Robustness through Local Linearization
- Encourage the loss to behave linearly in the vicinity of the training data.
- Approximate the loss function by its linear Taylor expansion in a small neighborhood.
- Taking fl(·) as an example, the first-order Taylor remainder hl(ε, xl) is given by:

  hl(ε, xl) = ‖fl(xl + δl) − fl(xl) − δl∇xl fl(xl)‖_n.

- Define γl(ε, xl) as the maximum of hl(ε, xl):

  γl(ε, xl) = max_{‖δl‖_p ≤ ε} hl(ε, xl).   (3)
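A quick numeric check of the remainder: for a smooth map it shrinks quadratically with the perturbation. The elementwise square with its diagonal Jacobian is a stand-in for fl:

```python
import numpy as np

def taylor_remainder(f, jac, x, delta):
    """h = || f(x + d) - f(x) - J(x) d ||, the first-order Taylor remainder."""
    return np.linalg.norm(f(x + delta) - f(x) - jac(x) @ delta)

f = lambda x: x ** 2
jac = lambda x: np.diag(2 * x)      # Jacobian of the elementwise square
x = np.array([1.0, 2.0])
h1 = taylor_remainder(f, jac, x, np.array([0.1, 0.1]))
h2 = taylor_remainder(f, jac, x, np.array([0.05, 0.05]))  # half the perturbation
```

Halving δ divides the remainder by four, which is exactly the second-order behavior the local-linearity regularizer penalizes.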

SLIDE 17

Local Smoothness Optimization

Relaxation of regularizers
- By the triangle inequality, ‖fl(xl + δl) − fl(xl)‖_n is relaxed as:

  ‖fl(xl + δl) − fl(xl)‖_n
    = ‖δl∇xl fl(xl) + (fl(xl + δl) − fl(xl) − δl∇xl fl(xl))‖_n
    ≤ ‖δl∇xl fl(xl)‖_n + ‖fl(xl + δl) − fl(xl) − δl∇xl fl(xl)‖_n
    ≤ ‖δl∇xl fl(xl)‖_n + γl(ε, xl).

SLIDE 18

Local Smoothness Optimization

Relaxation of regularizers
- Accordingly, the regularization term Lb is relaxed as:

  Lb = | ‖fl(xl + δl) − fr(xr + δr) + m‖_n − ‖fl(xl) − fr(xr) + m‖_n |
     ≤ ‖fl(xl + δl) − fl(xl)‖_n + ‖fr(xr + δr) − fr(xr)‖_n
     ≤ ‖δl∇xl fl(xl)‖_n + γl(ε, xl) + ‖δr∇xr fr(xr)‖_n + γr(ε, xr),

where γl(ε, xl) = max_{‖δl‖_p ≤ ε} hl(ε, xl) and γr(ε, xr) = max_{‖δr‖_p ≤ ε} hr(ε, xr).
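The first inequality of this chain follows from the (reverse) triangle inequality and so holds for any features; a numeric spot check with tanh stand-ins and random inputs:

```python
import numpy as np

rng = np.random.default_rng(0)
fl = fr = np.tanh          # smooth stand-in feature maps
m = 0.3                    # margin, broadcast across feature dimensions

for _ in range(100):
    xl, xr = rng.normal(size=4), rng.normal(size=4)
    dl, dr = rng.uniform(-0.1, 0.1, 4), rng.uniform(-0.1, 0.1, 4)
    # Lb for this sample
    lhs = abs(np.linalg.norm(fl(xl + dl) - fr(xr + dr) + m)
              - np.linalg.norm(fl(xl) - fr(xr) + m))
    # sum of per-view feature shifts (the first upper bound)
    rhs = np.linalg.norm(fl(xl + dl) - fl(xl)) + np.linalg.norm(fr(xr + dr) - fr(xr))
    assert lhs <= rhs + 1e-12
```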

SLIDE 19

Local Smoothness Optimization

Relaxation of regularizers
- The regularization term for the single branch is relaxed as:

  Lm = ‖fm(xl + δl, xr + δr) − fm(xl, xr)‖_n
     ≤ ‖δl∇xl fm(xl, xr) + δr∇xr fm(xl, xr)‖_n + γm(ε, xl, xr),

where γm(ε, xl, xr) is the maximum of the higher-order remainder hm(ε, xl, xr).

SLIDE 20

Local Smoothness Optimization

- They are defined as follows:

  hm(ε, xl, xr) = ‖fm(xl + δl, xr + δr) − fm(xl, xr) − δl∇xl fm(xl, xr) − δr∇xr fm(xl, xr)‖_n,
  γm(ε, xl, xr) = max_{‖δl‖_p ≤ ε, ‖δr‖_p ≤ ε} hm(ε, xl, xr).
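γm is a maximum over the ε-ball, which in practice must be approximated; a crude random-search estimate is sketched below. A linear fm, whose remainder is exactly zero, gives a checkable case:

```python
import numpy as np

def estimate_gamma_m(fm, grad_l, grad_r, xl, xr, eps, trials=200, seed=0):
    """Estimate the max of the remainder hm over the l-inf eps-ball by sampling."""
    rng = np.random.default_rng(seed)
    best = 0.0
    f0 = fm(xl, xr)
    for _ in range(trials):
        dl = rng.uniform(-eps, eps, xl.shape)
        dr = rng.uniform(-eps, eps, xr.shape)
        hm = np.linalg.norm(fm(xl + dl, xr + dr) - f0 - grad_l @ dl - grad_r @ dr)
        best = max(best, hm)
    return best

A, B = np.eye(2), 2 * np.eye(2)
fm = lambda a, b: A @ a + B @ b      # linear single branch: remainder vanishes
g = estimate_gamma_m(fm, A, B, np.zeros(2), np.zeros(2), eps=0.1)
```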

SLIDE 21

Local Smoothness Optimization

Objective Function
- The Taylor remainders defined above are combined as:

  Lh = hl(ε, xl) + hr(ε, xr) + hm(ε, xl, xr).

SLIDE 22

Local Smoothness Optimization

Objective Function
- The first-order gradient terms are combined as:

  L∇ = ‖δl∇xl fl(xl)‖_n + ‖δr∇xr fr(xr)‖_n + ‖δl∇xl fm(xl, xr) + δr∇xr fm(xl, xr)‖_n.

SLIDE 23

Local Smoothness Optimization

Objective Function
- Finally, together with the original loss function Lo, the optimization objective is defined as:

  min_θ  La = Lo + L∇ + max_{δl, δr} Lh,
  s.t. ‖δl‖_p ≤ ε, ‖δr‖_p ≤ ε.
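Assembling La from its pieces, with the Jacobian–perturbation products written explicitly; the inner max over Lh is passed in precomputed, and all shapes here are illustrative:

```python
import numpy as np

def la(Lo, Jl, Jr, Jml, Jmr, dl, dr, Lh_max):
    """L_a = L_o + L_grad + max over (dl, dr) of L_h (the max passed in precomputed).

    Jl, Jr: Jacobians of the sibling features fl, fr.
    Jml, Jmr: Jacobians of the single-branch feature fm w.r.t. each view.
    """
    L_grad = (np.linalg.norm(Jl @ dl) + np.linalg.norm(Jr @ dr)
              + np.linalg.norm(Jml @ dl + Jmr @ dr))
    return Lo + L_grad + Lh_max
```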

SLIDE 24

Experimental Settings

- Benchmark: KITTI vehicle dataset (Easy, Moderate, and Hard)*.
- Stereo-based object detection model: Stereo R-CNN†.
- Adversarial attack methods: FGSM and PGD.
- Baseline defense method: direct adversarial training with FGSM and PGD.

*Menze, Moritz, and Andreas Geiger. "Object scene flow for autonomous vehicles." CVPR, 2015.
†P. Li, X. Chen, and S. Shen. "Stereo R-CNN based 3D object detection for autonomous driving." CVPR, 2019.

SLIDE 25

Experimental Results

Adversarial Attacks

Table: Statistical Results of Adversarial Attacks

Model         |     AP2d (%)‡     |      AOS (%)§     |     AP3d (%)¶     |     APbv (%)‖
              | Easy   Mod.  Hard | Easy   Mod.  Hard | Easy   Mod.  Hard | Easy   Mod.  Hard
No Attack     | 99.28 91.09 78.62 | 98.42 89.43 76.94 | 54.10 34.44 28.15 | 68.24 46.84 39.34
FGSM, ε = 0.7 | 88.29 76.45 62.39 | 87.54 74.11 60.36 | 40.52 32.94 27.56 | 15.52 12.19 10.05
FGSM, ε = 2   | 76.82 60.49 49.67 | 74.73 57.84 47.35 | 26.21 21.35 16.81 | 13.64  7.70  6.14
PGD, ε = 0.7  | 69.55 58.94 48.04 | 66.72 56.04 45.59 | 22.52 18.88 15.32 |  7.02  5.53  4.29
PGD, ε = 2    | 53.01 43.11 34.21 | 51.48 40.23 31.80 |  9.60  7.61  6.23 |  3.82  2.22  1.95

‡AP2d: average detection precision of the 2D bounding box.
§AOS: average orientation similarity of the joint 3D detection.
¶AP3d: average detection precision of the 3D bounding box.
‖APbv: average localization precision in bird's-eye view.

SLIDE 26

Experimental Results

Defense Results
- Attack via FGSM and PGD.
- Defend via our method (SmoothStereo) and direct adversarial training.

Table: Statistical Results of Adversarial Defenses

Testing Images | Defense Method |     AP2d (%)      |      AOS (%)      |     AP3d (%)      |     APbv (%)
               |                | Easy   Mod.  Hard | Easy   Mod.  Hard | Easy   Mod.  Hard | Easy   Mod.  Hard
FGSM, ε = 0.7  | Direct + FGSM  | 87.58 81.54 71.53 | 87.25 80.11 62.42 | 41.95 30.62 28.89 | 21.57 19.62 16.56
               | SmoothStereo   | 88.38 82.74 73.94 | 88.89 81.87 63.63 | 45.51 31.01 26.61 | 24.50 20.88 18.26
FGSM, ε = 2    | Direct + FGSM  | 84.73 70.82 57.90 | 84.13 69.19 55.61 | 40.15 30.57 24.42 | 16.21 13.03 10.54
               | SmoothStereo   | 85.95 72.64 61.22 | 81.65 74.83 60.00 | 41.43 31.63 23.79 | 18.25 14.76 12.53
PGD, ε = 0.7   | Direct + PGD   | 73.37 61.82 56.66 | 73.04 60.46 50.04 | 27.47 20.08 18.74 | 13.77  7.10  9.30
               | SmoothStereo   | 75.67 61.58 59.73 | 73.43 62.27 52.82 | 24.88 20.90 16.99 | 12.44 11.73  9.46
PGD, ε = 2     | Direct + PGD   | 54.46 49.11 40.44 | 53.37 46.23 38.07 | 14.39 10.38  9.32 |  5.84  4.65  3.29
               | SmoothStereo   | 55.29 49.38 41.92 | 53.47 47.27 40.60 | 18.11 12.42  9.43 |  6.82  4.52  3.94

SLIDE 27

Experimental Results

Examples of results under FGSM attacks. From upper left to lower right: ground truth, FGSM attack with ε = 2, defense via direct adversarial training, and defense via our SmoothStereo.

SLIDE 28

Experimental Results

Examples of results under PGD attacks. From upper left to lower right: ground truth, PGD attack with ε = 2, defense via direct adversarial training, and defense via our SmoothStereo.

SLIDE 29

Thank You
