15-780 – Graduate Artificial Intelligence: Adversarial attacks and provable defenses
- J. Zico Kolter (this lecture) and Ariel Procaccia
Carnegie Mellon University, Spring 2018. Portions based upon joint work with Eric Wong.
Attack model: the adversary may perturb the input x by any Δ with ‖Δ‖∞ ≤ ε.
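The experiments later in the deck refer to FGSM, the fast gradient sign method, which approximates the worst-case perturbation in this ℓ∞ ball with a single signed gradient step, Δ = ε·sign(∇ₓ ℓ). A minimal sketch; the tiny two-layer network, weights, and example below are hypothetical, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical tiny 2-layer network and one labeled example (illustration only)
W1 = rng.normal(size=(8, 4)); b1 = rng.normal(size=8)
w2 = rng.normal(size=8);      b2 = 0.0
x  = rng.normal(size=4);      y  = 1.0          # label in {-1, +1}

def loss(xp):
    """Logistic loss of the network output against label y."""
    a = np.maximum(W1 @ xp + b1, 0.0)
    return np.log1p(np.exp(-y * (w2 @ a + b2)))

# gradient of the loss w.r.t. the INPUT, by manual backprop
z1 = W1 @ x + b1
a  = np.maximum(z1, 0.0)
out = w2 @ a + b2
dl_dout = -y / (1.0 + np.exp(y * out))
grad_x  = W1.T @ ((z1 > 0).astype(float) * w2 * dl_dout)

# FGSM: one signed gradient step to a corner of the eps-ball
eps = 0.1
delta = eps * np.sign(grad_x)
assert np.abs(delta).max() <= eps + 1e-12
```

The sign function pushes every coordinate to the boundary of the ℓ∞ ball, which is why FGSM is exact for linear models but only a heuristic for deep networks (PGD iterates this step).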
Robust optimization (adversarial training) objective:

    minimize_θ  Σ_i  max_{‖Δ‖∞ ≤ ε}  ℓ(h_θ(x_i + Δ) · y_i)

For a linear hypothesis h_θ(x) = θᵀx with labels y ∈ {−1, +1} and a monotone decreasing loss ℓ, the inner maximization can be solved exactly:

    max_{‖Δ‖∞ ≤ ε} ℓ(h_θ(x + Δ) · y) = ℓ(h_θ(x) · y − ε‖θ‖₁)

because maximizing the loss is the same as minimizing the margin:

    min_{‖Δ‖∞ ≤ ε} h_θ(x + Δ) · y
      = min_{‖Δ‖∞ ≤ ε} θᵀ(x + Δ) · y
      = θᵀx · y + min_{‖Δ‖∞ ≤ ε} θᵀΔ · y
      = θᵀx · y − ε‖θ‖₁,

with the minimum attained at Δ = −ε · y · sign(θ).
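The closed form for the linear case can be checked numerically: the perturbation Δ = −ε·y·sign(θ) attains the worst-case loss, and no other point in the ball does worse. A quick sketch (θ, x, y here are random illustrations, not values from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
theta = rng.normal(size=5)
x = rng.normal(size=5)
y = -1.0        # label in {-1, +1}
eps = 0.3

def loss(margin):
    # any monotone decreasing loss works; logistic loss as an example
    return np.log1p(np.exp(-margin))

# closed form: worst-case loss = loss(theta^T x * y - eps * ||theta||_1)
closed = loss(theta @ x * y - eps * np.abs(theta).sum())

# the maximizing perturbation is Delta* = -eps * y * sign(theta)
delta_star = -eps * y * np.sign(theta)
assert np.isclose(loss(theta @ (x + delta_star) * y), closed)

# no sampled Delta in the ball exceeds the closed-form worst case
samples = eps * rng.uniform(-1, 1, size=(10000, 5))
assert all(loss(theta @ (x + d) * y) <= closed + 1e-12 for d in samples)
```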
The adversarial problem for a k-layer ReLU network, written as an optimization over the activations z, ẑ:

    minimize_{z, ẑ}  cᵀ ẑ_k
    subject to   z_1 ∈ B_∞(x, ε)
                 ẑ_{i+1} = W_i z_i + b_i,   i = 1, …, k−1
                 z_i = max{ẑ_i, 0},         i = 2, …, k−1

This problem is non-convex because of the ReLU equality constraints. Given elementwise pre-activation bounds ℓ_i ≤ ẑ_i ≤ u_i, each ReLU constraint is replaced by its convex outer bound:

    (ẑ_i, z_i) ∈ 𝒟(ℓ_i, u_i),   i = 2, …, k−1

which turns the relaxed problem into a linear program whose optimum lower-bounds the true adversarial objective.
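The relaxation needs the pre-activation bounds ℓ_i, u_i. One simple way to obtain valid (if loose) bounds is interval arithmetic; this sketch assumes that approach and a small random network — it is not necessarily the bound computation used in the lecture, where tighter bounds can come from applying the same dual bound layer by layer:

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical small ReLU network (weights are illustrative)
W1 = rng.normal(size=(6, 4)); b1 = rng.normal(size=6)
W2 = rng.normal(size=(3, 6)); b2 = rng.normal(size=3)
x, eps = rng.normal(size=4), 0.1

def affine_bounds(W, b, lo, hi):
    """Elementwise bounds on W @ z + b given lo <= z <= hi (interval arithmetic)."""
    mid, rad = (lo + hi) / 2, (hi - lo) / 2
    center = W @ mid + b
    radius = np.abs(W) @ rad
    return center - radius, center + radius

l1, u1 = x - eps, x + eps                       # input box  ||Delta||_inf <= eps
l2, u2 = affine_bounds(W1, b1, l1, u1)          # bounds on zhat_2
l3, u3 = affine_bounds(W2, b2, np.maximum(l2, 0), np.maximum(u2, 0))  # through ReLU

# sanity check: the bounds contain the actual pre-activations over the ball
for _ in range(1000):
    xp = x + eps * rng.uniform(-1, 1, size=4)
    z2 = W1 @ xp + b1
    z3 = W2 @ np.maximum(z2, 0) + b2
    assert np.all(l2 <= z2 + 1e-9) and np.all(z2 <= u2 + 1e-9)
    assert np.all(l3 <= z3 + 1e-9) and np.all(z3 <= u3 + 1e-9)
```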
Rather than solving the relaxed LP directly, take its dual: any dual feasible point gives a guaranteed lower bound on the relaxed objective, and hence on the true adversarial objective. The dual variables ν (with free parameters α_{i,j} ∈ [0, 1]) are computed by a backward pass through the network:

    ν_k = −c
    ν̂_i = W_iᵀ ν_{i+1},   i = k−1, …, 1
    ν_{i,j} = 0                 for j ∈ I_i⁻  (u_{i,j} ≤ 0)
    ν_{i,j} = ν̂_{i,j}          for j ∈ I_i⁺  (ℓ_{i,j} ≥ 0)
    ν_{i,j} = (u_{i,j} / (u_{i,j} − ℓ_{i,j})) [ν̂_{i,j}]₊ + α_{i,j} [ν̂_{i,j}]₋   for j ∈ I_i  (ℓ_{i,j} < 0 < u_{i,j})

where [·]₊ = max(·, 0) and [·]₋ = min(·, 0). The resulting bound is

    J_ε(x, ν) = −Σ_{i=1}^{k−1} ν_{i+1}ᵀ b_i − xᵀ ν̂_1 − ε‖ν̂_1‖₁ + Σ_{i=2}^{k−1} Σ_{j ∈ I_i} ℓ_{i,j} [ν_{i,j}]₊

The backward pass has the form of a leaky-ReLU network, so the bound is evaluated with a single backward pass through the original network — no LP solver is needed.
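A sketch of the bound for a two-layer network (k = 3), using the linear choice α = u/(u − ℓ), under which the crossing-neuron case collapses to ν = (u/(u − ℓ))·ν̂. The network, input, and objective c below are random illustrations; first-layer pre-activation bounds are exact for the ℓ∞ ball:

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical 2-layer ReLU network (k = 3 in the notation above)
W1 = rng.normal(size=(4, 3)); b1 = rng.normal(size=4)
W2 = rng.normal(size=(2, 4)); b2 = rng.normal(size=2)
x, eps = rng.normal(size=3), 0.1
c = np.array([1.0, -1.0])              # objective, e.g. a logit difference

def f(xp):
    return W2 @ np.maximum(W1 @ xp + b1, 0.0) + b2

# exact pre-activation bounds l <= zhat_2 <= u over the eps-ball
center = W1 @ x + b1
radius = eps * np.abs(W1).sum(axis=1)
l, u = center - radius, center + radius

# dual backward pass, with the linear choice alpha = u / (u - l)
nu3 = -c
nuhat2 = W2.T @ nu3
slope = np.where(u <= 0, 0.0, np.where(l >= 0, 1.0, u / (u - l)))
nu2 = slope * nuhat2
nuhat1 = W1.T @ nu2

crossing = (l < 0) & (u > 0)
J = (-nu2 @ b1 - nu3 @ b2 - x @ nuhat1 - eps * np.abs(nuhat1).sum()
     + (l[crossing] * np.maximum(nu2[crossing], 0.0)).sum())

# J certifiably lower-bounds c^T f(x + Delta) over the whole ball
for _ in range(5000):
    d = eps * rng.uniform(-1, 1, size=3)
    assert J <= c @ f(x + d) + 1e-9
```

The empirical check at the end only samples the ball, but the bound itself holds for every point in it, including the corners that sampling rarely hits.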
Robust training: replace the exact (intractable) inner maximization in

    minimize_θ  Σ_{i=1}^m  max_{‖Δ‖∞ ≤ ε}  ℓ(h_θ(x_i + Δ), y_i)

with the certified upper bound supplied by the dual, and minimize that bound instead:

    minimize_θ  Σ_{i=1}^m  ℓ(−J_ε(x_i, g_θ(e_{y_i} 1ᵀ − I)), y_i)

where J_ε is the dual lower bound on the adversarial objective, g_θ denotes the dual backward pass, and the columns of e_{y_i} 1ᵀ − I encode the objectives c for every possible target class at once.
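In the linear case from earlier in the lecture, the inner maximization is exact, so robust training reduces to ordinary gradient descent on ℓ(θᵀx·y − ε‖θ‖₁). A sketch on toy data (the dataset and hyperparameters are illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(1)

# toy, roughly separable binary data (illustrative)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = np.sign(X @ w_true + 0.1 * rng.normal(size=200))

eps, lr = 0.1, 0.1
theta = np.zeros(5)

def robust_loss(theta):
    # exact worst-case logistic loss for a linear classifier
    margins = y * (X @ theta) - eps * np.abs(theta).sum()
    return np.log1p(np.exp(-margins)).mean()

for _ in range(500):
    margins = y * (X @ theta) - eps * np.abs(theta).sum()
    s = -1.0 / (1.0 + np.exp(margins))     # dloss/dmargin per example
    # dmargin/dtheta = y_i x_i - eps * sign(theta)  (subgradient of the L1 term)
    grad = (X * (s * y)[:, None]).mean(axis=0) - s.mean() * eps * np.sign(theta)
    theta -= lr * grad

assert robust_loss(theta) < np.log(2)      # improved over theta = 0
```

The ε‖θ‖₁ term acts like an L1 penalty applied per example at the margin, which is why robustly trained linear models tend toward sparse weights.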
[Figure: clean error vs. certified robust error bound, by method]

                           Error    Robust error bound
Standard deep network      1.10%    100%
Robust linear classifier   17%      44%
Our method                 1.80%    5.80%
Raghunathan et al., 2018   5%       35%
[Figure: error under attack, standard training vs. our method]

                    No attack   FGSM    PGD     Robust bound
Standard training   1.10%       50%     82%     100%
Our method          1.80%       3.90%   4.10%   5.80%