Adversarial Training and Provable Defenses: Bridging the Gap
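Input $x$ with perturbation region $S_0(x)$; network $h_\theta = h_\theta^k \circ h_\theta^{k-1} \circ \cdots \circ h_\theta^1$.
[Figure: three-layer network $h_\theta^1$ (Conv + ReLU), $h_\theta^2$ (Conv + ReLU), $h_\theta^3$ (Linear); an input $x' \in S_0(x)$ passes through activations $x'_1$, $x'_2$ to the output $x'_3 = h_\theta(x')$.]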
$c^T h_\theta(x') + d < 0, \ \forall x' \in S_0(x)$
[Figure: convex relaxations $C_0(x) = S_0(x)$, $C_1(x)$, $C_2(x)$, $C_3(x)$ propagated through $h_\theta^1$ (Conv + ReLU), $h_\theta^2$ (Conv + ReLU), $h_\theta^3$ (Linear).]
Check output condition: $c^T x'_3 + d < 0, \ \forall x'_3 \in C_3(x)$
Guarantees: $c^T h_\theta(x') + d < 0, \ \forall x' \in S_0(x)$
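The output check above can be illustrated with a minimal sketch. It makes two assumptions that are not in the slides: the network is a small fully connected ReLU network rather than the convolutional one in the figure, and the relaxation is an interval (box) bound, the simplest convex relaxation, standing in for whatever relaxation the slides use. The names `affine_bounds`, `relu_bounds` and `certify` are hypothetical.

```python
import numpy as np

def affine_bounds(lo, hi, W, b):
    """Propagate the box [lo, hi] through z -> W z + b."""
    center = (lo + hi) / 2.0
    radius = (hi - lo) / 2.0
    new_center = W @ center + b
    new_radius = np.abs(W) @ radius  # worst case over the box
    return new_center - new_radius, new_center + new_radius

def relu_bounds(lo, hi):
    """ReLU is monotone, so it can be applied to both bounds."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

def certify(x, eps, layers, c, d):
    """Return (certified, upper) where upper bounds c^T h(x') + d
    over all x' with ||x' - x||_inf <= eps (assumed region S_0(x))."""
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(layers):
        lo, hi = affine_bounds(lo, hi, W, b)
        if i < len(layers) - 1:  # no ReLU after the final linear layer
            lo, hi = relu_bounds(lo, hi)
    # Maximize c^T z + d over the output box, coordinate by coordinate.
    upper = float(np.sum(np.where(c > 0, c * hi, c * lo)) + d)
    return upper < 0, upper

# Toy example: two layers, property "first output stays positive".
layers = [
    (np.array([[1.0, -1.0], [0.5, 0.5]]), np.zeros(2)),
    (np.eye(2), np.zeros(2)),
]
x = np.array([1.0, 0.0])
c = np.array([-1.0, 0.0])
ok_small, upper_small = certify(x, 0.1, layers, c, 0.0)
ok_large, upper_large = certify(x, 0.6, layers, c, 0.0)
```

In this toy setup the property is certified for the small radius and not for the large one. A failed check only means the relaxation was too loose to prove the property, not that an actual adversarial input exists.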
$\min_\theta \ \mathbb{E}_{x,y \sim D} \ \max_{x' \in S_0(x)} \mathcal{L}(h_\theta(x'), y)$
[Figure: points labeled "lower" and "upper".]
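The inner maximization in this objective is usually approximated rather than solved exactly. The sketch below assumes projected gradient ascent (PGD) on an L-infinity ball of radius `eps` as the region $S_0(x)$, and a linear softmax classifier so that the gradient can be written by hand. Neither choice is stated in the slides, and `loss_and_grad` and `pgd_attack` are hypothetical names.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def loss_and_grad(x, y, W, b):
    """Cross-entropy loss of the linear classifier W x + b,
    and its gradient with respect to the input x."""
    p = softmax(W @ x + b)
    loss = -np.log(p[y])
    onehot = np.zeros_like(p)
    onehot[y] = 1.0
    grad_x = W.T @ (p - onehot)
    return loss, grad_x

def pgd_attack(x, y, W, b, eps, step, n_steps):
    """Approximate max over ||x' - x||_inf <= eps of the loss."""
    x_adv = x.copy()
    for _ in range(n_steps):
        _, g = loss_and_grad(x_adv, y, W, b)
        x_adv = x_adv + step * np.sign(g)         # ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back onto the ball
    return x_adv

W = np.eye(2)
b = np.zeros(2)
x = np.array([1.0, 0.0])
y = 0
x_adv = pgd_attack(x, y, W, b, eps=0.3, step=0.1, n_steps=10)
clean_loss, _ = loss_and_grad(x, y, W, b)
adv_loss, _ = loss_and_grad(x_adv, y, W, b)
```

Because PGD only finds some point in the region, the loss it reaches is a lower bound on the true inner maximum. The outer minimization would then take a gradient step on the parameters at `x_adv`.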
[Figure: points $x'_1$, $x'_2$, $x'_3$ shown against the regions $C_0(x) = S_0(x)$, $C_1(x)$, $C_2(x)$, $C_3(x)$ along the layers $h_\theta^1$, $h_\theta^2$, $h_\theta^3$. A point $x'_3 \in C_3(x)$ that violates $c^T x'_3 + d < 0$ means certification fails.]
[Figure: network $h_\theta^1$ (Conv + ReLU), $h_\theta^2$ (Conv + ReLU), $h_\theta^3$ (Linear) with regions $C_0(x) = S_0(x)$, $C_1(x)$, $C_2(x)$, $C_3(x)$, points $x'_1$, $x'_2$, $x'_3$, the loss $\mathcal{L}(x'_3, y)$, and a projection step.]
$C_l(x) = \{ a_l + A_l e \ : \ e \in [-1, 1]^{m_l} \}$, with $a_0 = x$ and $A_0 = \epsilon I$.
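Key idea: $x' = a_l + A_l e$
[Figure: two-dimensional example with noise coordinates $e_1$, $e_2$ and latent coordinates $x'_1$, $x'_2$ given by the affine expressions $e_1 + e_2$ and $2e_1 - e_2$.]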
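One way to read this parameterization: instead of searching over $x'$ directly, search over the noise vector $e$, where staying inside the region is just a clip to $[-1, 1]$. The sketch below assumes sign-gradient ascent with clipping and a caller-supplied gradient of the objective with respect to the latent point. The function name `latent_pgd`, the step size, and the toy values of `a`, `A`, `c`, `d` are all hypothetical.

```python
import numpy as np

def latent_pgd(a, A, grad_fn, step=0.25, n_steps=20, seed=0):
    """Maximize an objective over x' = a + A e with e in [-1, 1]^m.

    grad_fn(x_lat) returns the gradient of the objective at the
    latent point; the chain rule gives the gradient in e as A^T g.
    """
    rng = np.random.default_rng(seed)
    e = rng.uniform(-1.0, 1.0, size=A.shape[1])  # random start in the cube
    for _ in range(n_steps):
        x_lat = a + A @ e
        grad_e = A.T @ grad_fn(x_lat)
        # Ascent step, then projection onto the cube by clipping.
        e = np.clip(e + step * np.sign(grad_e), -1.0, 1.0)
    return a + A @ e, e

# Toy objective: the margin c^T x' + d, whose gradient is the constant c.
a = np.array([0.5, -0.2])
A = np.array([[2.0, -1.0], [1.0, 1.0]])
c = np.array([1.0, -3.0])
d = -1.0
x_lat, e_star = latent_pgd(a, A, lambda x_lat: c)
margin = float(c @ x_lat + d)
# For a linear objective the exact maximum is known in closed form.
exact = float(c @ a + np.abs(A.T @ c).sum() + d)
```

For this linear objective the search reaches the exact maximum, which makes the example easy to check. With a nonlinear remainder of the network behind the latent layer, the same loop only finds a local maximum.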
Method Accuracy (%) Certified Robustness (%)